Python text file manipulation: add the elapsed time in seconds to each line
I'm a Python beginner trying to solve the following problem. I have a text file in which every line starts like this:
<18:12:53.972>
<18:12:53.975>
<18:12:53.975>
<18:12:53.975>
<18:12:54.008>
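etc.
I want to prepend the elapsed time in seconds to each line, but only if the line starts with "…
First point: when memory is tight (and especially when you don't know whether enough memory is available), never read the complete file into memory.
Second point: learn to use Python's loops and the iteration protocol. The way to iterate over a list or any other iterable is: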
for item in some_iterable:
    do_something_with(item)
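This saves you from messing with indexes and getting them wrong ;)
Python file objects have the nice property that they are actually iterable, so the simplest way to iterate over a file's lines is: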
for line in my_opened_file:
    do_something_with(line)
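To make the memory point concrete, here is a minimal sketch (Python 3) that counts the timestamped lines while holding only one line in memory at a time. `io.StringIO` stands in for a file opened with `open(...)`, and `count_timestamped` is a hypothetical helper name used only for illustration.

```python
import io


def count_timestamped(stream):
    """Count the lines starting with '<', reading one line at a time."""
    count = 0
    for line in stream:  # the stream hands out its lines lazily
        if line.startswith("<"):
            count += 1
    return count


# io.StringIO stands in for a real file object here
sample = io.StringIO("<18:12:53.972> a\nnot a timestamp\n<18:12:54.008> b\n")
print(count_timestamped(sample))
```

The same loop works unchanged on a real file object, whatever the file size, because no more than the current line is ever loaded.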
Here is a simple yet effective and mostly pythonic (nb: Python 2.7.x) way of writing your program:
# -*- coding: utf-8 -*-
from __future__ import print_function  # lets the script run on Python 3 too

import os
import sys
import datetime
import re
import tempfile


def totime(timestr):
    """ returns a datetime object for a "HH:MM:SS" string """
    # we actually need datetime objects for subtraction
    # so let's use the first available bogus date
    # notes:
    # `timestr.split(":")` will return a list `["HH", "MM", "SS"]`
    # `map(int, ...)` will apply `int()` on each item
    # of the sequence (second argument) and return
    # the resulting list (an iterator on Python 3), ie
    # `map(int, ["01", "02", "03"])` => `[1, 2, 3]`
    return datetime.datetime(1900, 1, 1, *map(int, timestr.split(":")))


def process(instream, outstream):
    # some may consider that regexps are not that pythonic
    # but as far as I'm concerned it seems like a sensible
    # use case.
    time_re = re.compile(r"^<(?P<time>\d{2}:\d{2}:\d{2})\.")
    first = None

    # iterate over our input stream lines
    for line in instream:
        # should we handle this line at all ?
        # (nb a bit redundant but faster than re.match)
        # note: skipped lines are not copied to the output
        if not line.startswith("<"):
            continue

        # looks like a candidate, let's try and
        # extract the 'time' value from it
        match = time_re.search(line)
        if not match:
            # starts with '<' BUT not followed by 'HH:MM:SS.' ?
            # unexpected from the sample source but well, we
            # can't do much about it either
            continue

        # retrieve the captured "time" (HH:MM:SS) part
        current = totime(match.group("time"))

        # store the first occurrence so we can
        # compute the elapsed time
        if first is None:
            first = current

        # `(current - first)` yields a `timedelta` object
        # we now just have to retrieve its `seconds` attribute
        seconds = (current - first).seconds

        # inject the seconds before the line
        # and write the whole thing to our output stream
        newline = "{}{}".format(seconds, line)
        outstream.write(newline)


def usage(err=None):
    if err:
        print(err, file=sys.stderr)
    print("usage: python retime.py <filename>", file=sys.stderr)
    # unix standards process exit codes
    return 2 if err else 0


def main(*args):
    # our entry point...
    # gets the source filename, process it
    # (storing the results in a temporary file),
    # and if everything's ok replace the source file
    # by the temporary file.
    try:
        sourcename = args[0]
    except IndexError:
        return usage("missing <filename> argument")

    # `delete=False` prevents the tmp file from being
    # deleted on closing. It is opened in text mode and
    # created next to the source file, so that `os.rename`
    # does not have to cross filesystems.
    dest = tempfile.NamedTemporaryFile(
        mode="w", delete=False,
        dir=os.path.dirname(os.path.abspath(sourcename)))

    with open(sourcename) as source:
        try:
            process(source, dest)
        except Exception:
            dest.close()
            # `os.remove` expects a path, not a file object
            os.remove(dest.name)
            raise

    # ok done
    dest.close()
    os.rename(dest.name, sourcename)
    return 0


if __name__ == "__main__":
    # only execute main() if we are called as a script
    # (so we can also import this file as a module)
    sys.exit(main(*sys.argv[1:]))
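The program above reports whole seconds and leaves lines without a timestamp out of the output. As a hedged variant, not the answer's own method, the sketch below (Python 3) keeps the milliseconds by parsing with `datetime.strptime` and using `timedelta.total_seconds()`, and copies non-matching lines through unchanged. The name `add_elapsed`, the `"{:.3f} "` prefix format and the pass-through behaviour are assumptions made for illustration; `io.StringIO` stands in for real files.

```python
import io
import re
from datetime import datetime

# matches a leading "<HH:MM:SS.mmm>" timestamp, as in the sample lines
TIME_RE = re.compile(r"^<(\d{2}:\d{2}:\d{2}\.\d{3})>")


def add_elapsed(instream, outstream):
    """Prefix timestamped lines with elapsed seconds; copy other lines as-is."""
    first = None
    for line in instream:
        match = TIME_RE.match(line)
        if match is None:
            # assumption: lines without a timestamp are kept unchanged
            outstream.write(line)
            continue
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if first is None:
            first = current
        # total_seconds() keeps the fractional part, unlike `.seconds`
        elapsed = (current - first).total_seconds()
        outstream.write("{:.3f} {}".format(elapsed, line))


source = io.StringIO("<18:12:53.972> start\nplain line\n<18:12:54.008> next\n")
result = io.StringIO()
add_elapsed(source, result)
print(result.getvalue())
```

Like the original, this assumes all timestamps fall within the same day; a log that crosses midnight would need the date as well.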
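Is this possible in Python?
Yes, of course it's possible. In fact it's easy. Have you checked the functions in the datetime module?
What's your question? Have you tried anything that didn't work?
@KlausD. I added some code. Thanks.
@Chaker. Thanks, I've tried it and it looks good, but I can't get the code to work.
Thanks for the help. I don't know how to get it working, though. Could you have a look at my updated code? It almost works now, for a beginner :-) Thanks.
"I don't know how to get it working, though": well, it's rather well documented, though... Have you tried using it?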