Better way to copy only the comments from one file and prepend them to another file using Python

python file-io

Basically, I want to copy the comments from one file and prepend them to another data file.

The file 'data_with_comments.txt' is available on pastebin and looks like this:

# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
      14.2000     0.300000  8.00000e-05     0.999920
      14.2000     0.301000  4.00000e-05     0.999960
      14.2000     0.302000  2.00000e-05     0.999980
      14.2000     0.303000  2.00000e-05     0.999980
      14.2000     0.304000  2.00000e-05     0.999980
      14.2000     0.305000  3.00000e-05     0.999970
      14.2000     0.306000  5.00000e-05     0.999950

Now, I have another data file, 'test.txt', which looks like this:

300.0 1.53345164121e-32
300.1 1.53345164121e-32
300.2 1.53345164121e-32
300.3 1.53345164121e-32
300.4 1.53345164121e-32
300.5 1.53345164121e-32

Desired output:

# coating file for detector A/R
# column 1 is the angle of incidence (degrees)
# column 2 is the wavelength (microns)
# column 3 is the transmission probability
# column 4 is the reflection probability
300.0 1.53345164121e-32
300.1 1.53345164121e-32
300.2 1.53345164121e-32
300.3 1.53345164121e-32
300.4 1.53345164121e-32

One way to do this is:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author    : Bhishan Poudel
# Date      : Jun 18, 2016


# Imports
from __future__ import print_function
import fileinput


# read in comments from the file
infile = 'data_with_comments.txt'
comments = []
with open(infile, 'r') as fi:
    for line in fi.readlines():
        if line.startswith('#'):
            comments.append(line)

# reverse the list
comments = comments[::-1]
print(comments[0])
#==============================================================================


# prepend a list to a file
filename = 'test.txt'

for i in range(len(comments)):
    with file(filename, 'r') as original: data = original.read()
    with file(filename, 'w') as modified: modified.write(comments[i] + data)

With this approach we have to open the file many times, which is inefficient when the data file is very large.

Is there a better way?

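If the file contains comments only at the beginning, you can rely on the file being read lazily and process only its first lines until a non-comment is found. Once you hit a line that does not start with the '#' character, you can break out of the loop and let Python's with statement handle closing the file.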
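A minimal sketch of that idea, assuming the comments sit in one block at the top of the file. The helper name `read_leading_comments` and its `marker` parameter are made up for illustration and do not come from the original answer:

```python
def read_leading_comments(path, marker="#"):
    """Return the comment lines at the top of `path`.

    Reading stops at the first line that does not start with `marker`,
    so the rest of a large file is never read.
    """
    comments = []
    with open(path) as fh:
        for line in fh:  # the file object yields lines lazily
            if not line.startswith(marker):
                break  # first non-comment line: stop reading
            comments.append(line)
    return comments  # the with statement has already closed the file
```

Note that this only shortens reading the file that holds the comments. Prepending to the data file still requires rewriting that file in full.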

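Especially if the data file (test.txt here) is large, as the OP states, I would suggest the following (the file is opened only once for reading, and another file is opened for writing):

1. Create a temporary folder
2. Pre-fill a temporary file in it with the stripped (!) comment lines
3. Append the lines from the data file
4. Rename the temporary file to the data file
5. Remove the temporary folder, et voilà

Like this: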

#! /usr/bin/env python
from __future__ import print_function

import os
import shutil
import tempfile


infile = 'data_with_comments.txt'
with open(infile, 'r') as f_i:
    # keep only the comment lines, stripped of surrounding whitespace
    comments = [t.strip() for t in f_i if t.startswith('#')]

file_name = 'test.txt'
file_path = file_name  # simplification here

tmp_dir = tempfile.mkdtemp()  # create tmp folder (works on all platforms)
tmp_file_name = '_' + file_name  # determine the file name in temp folder

s_umask = os.umask(0o077)  # 0o077 is valid in Python 2.6+ and Python 3

tmp_file_path = os.path.join(tmp_dir, tmp_file_name)
try:
    with open(file_path, "rt") as f_prep, open(
            tmp_file_path, "wt") as f_tmp:
        f_tmp.write('\n'.join(comments) + '\n')
        for line in f_prep:  # iterate lazily instead of loading all lines
            f_tmp.write(line)
except IOError as e:
    print(e)  # or whatever you want to report, instead of aborting
else:
    # shutil.move also works if the temp folder is on another filesystem,
    # where a plain os.rename would fail
    shutil.move(tmp_file_path, file_path)
finally:
    try:  # so we have an empty folder in - nearly - any case
        os.remove(tmp_file_path)
    except OSError:
        pass
    os.umask(s_umask)
    os.rmdir(tmp_dir)

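Nothing fancy, and the line-by-line iteration may be, well ..., one should measure whether it performs well enough. In the cases where I had to write to "the top" of a file, this was mostly "good enuff", or one uses the shell, like: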

cat comments_only test.txt > foo && mv foo test.txt

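PS: To speed up the file reading and writing in the "append" phase, one should use matching block reads and writes, with a block size tuned to the underlying system calls, for maximal performance (since this is a one-to-one copy, there is no need to iterate line by line).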
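As a rough sketch of that block-wise copy, `shutil.copyfileobj` from the standard library copies between two open files in fixed-size chunks. The helper name `prepend_lines` and the 1 MiB `chunk_size` are assumptions for illustration. The best block size depends on the system and should be measured:

```python
import shutil


def prepend_lines(comment_lines, src_path, dst_path, chunk_size=1024 * 1024):
    """Write comment_lines to dst_path, then copy src_path after them in blocks."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in comment_lines:
            dst.write(line.encode("utf-8"))  # comment lines are text, the copy is binary
        # copy the data in chunk_size blocks, without splitting it into lines
        shutil.copyfileobj(src, dst, chunk_size)
```

Binary mode is used here so the data is copied byte for byte, and `dst_path` is expected to be a temporary file that is renamed over the data file afterwards.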

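You already have a great answer that uses a temporary directory, but it is also common to simply create a temporary file in the same directory as the target file. On systems where tmp is a separate mount point, you avoid an extra copy of the data when renaming the temp file. Note that there is no intermediate list of comments, which matters if the comment list is large.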

import os
import shutil

infile = 'data_with_comments.txt'
filename = 'test.txt'

tmpfile = filename + '.tmp'

try:
    # write wanted data to tempfile
    with open(tmpfile, 'w') as out_fp:
        # prepend comments from infile
        with open(infile) as in_fp:
            out_fp.writelines(filter(lambda l: l.startswith('#'), in_fp))
        # then append the contents of the original data file
        with open(filename) as in2_fp:
            shutil.copyfileobj(in2_fp, out_fp)
    # get rid of original data
    os.remove(filename)
    # replace with new data
    os.rename(tmpfile, filename)
finally:
    # cleanup on error
    if os.path.exists(tmpfile):
        os.remove(tmpfile)
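A variant of the same idea, sketched under the assumption of Python 3.3 or later: `tempfile.NamedTemporaryFile` picks a unique name in the target's directory, and `os.replace` swaps the file in a single step without removing the original first. The function name `prepend_comments` and its `marker` parameter are made up for illustration:

```python
import os
import shutil
import tempfile


def prepend_comments(comment_path, target_path, marker="#"):
    """Prepend the marker lines of comment_path to target_path via a temp file."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # create the temp file next to the target so the final rename stays
    # on one filesystem; delete=False because we rename it ourselves
    tmp = tempfile.NamedTemporaryFile(
        mode="w", dir=target_dir, suffix=".tmp", delete=False)
    try:
        with tmp:
            with open(comment_path) as in_fp:
                tmp.writelines(l for l in in_fp if l.startswith(marker))
            with open(target_path) as in2_fp:
                shutil.copyfileobj(in2_fp, tmp)
        os.replace(tmp.name, target_path)  # overwrites the original data file
    except BaseException:
        # cleanup on error
        if os.path.exists(tmp.name):
            os.remove(tmp.name)
        raise
```

One thing to keep in mind: the temp file is created with owner-only permissions, so the replaced file ends up with those permissions unless they are restored afterwards.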

Following Dilletant's idea

For multiple text files and a single comment file, we can do this with a shell script:

# in the directory I have one file called       : comments
# and many other files with the file extension  : .txt

for file in *.txt; do cat comments "$file" > foo && mv foo "$file"; done

This prepends the same comments to every .txt file in the directory.
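For readers who would rather stay in Python, the same loop could be sketched roughly as follows. The function name `prepend_to_all`, the `.tmp` suffix, and the `pattern` default are assumptions for illustration, and `os.replace` requires Python 3.3 or later:

```python
import glob
import os
import shutil


def prepend_to_all(comments_path, pattern="*.txt"):
    """Prepend the contents of comments_path to every file matching pattern."""
    changed = []
    for path in sorted(glob.glob(pattern)):
        if os.path.abspath(path) == os.path.abspath(comments_path):
            continue  # never prepend the comment file to itself
        tmp_path = path + ".tmp"  # hypothetical temp name next to the target
        with open(tmp_path, "w") as out_fp:
            with open(comments_path) as in_fp:
                shutil.copyfileobj(in_fp, out_fp)  # comments first
            with open(path) as in2_fp:
                shutil.copyfileobj(in2_fp, out_fp)  # then the original data
        os.replace(tmp_path, path)
        changed.append(path)
    return changed
```

Like the shell loop, this copies the whole comment file and does not filter for lines starting with '#'.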

Any hints on how to use lazy opening? I think the problem is not detecting whether the file needs these "comment" lines prepended, but how to do that even for large files ... Please see my answer for what is IMO the "classic" way of doing this, by using a temporary file, as in the shell:

cat comments_only test.txt > /tmp/foo && mv /tmp/foo test.txt

Oh, sorry, I misread the question: your data file is large and you want to prepend the comments to it, right? There is no way to do that without reading the whole data file ... That answers your question about lazy reading. Also, your approach re-reads the original file multiple times. Change this part: `for i in range(len(comments)): with file(filename, 'r') as original: data = original.read() with file(filename, 'w') as modified: modified.write(comments[i] + data)` to this: `with file(filename, 'r') as original: data = original.read() for i in range(len(comments)): with file(filename, 'w') as modified: modified.write(comments[i] + data)`