Python: a concise/elegant way to reformat a set of text files?


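I have written a Python script to process a set of ASCII files in a given directory. I would like to know whether there is a more concise and/or more "Pythonic" way to do it without losing readability.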

Python code, for example: Input: Processing:

In my build scripts, I have the following code:

inFile = open(sourceFile,'r')
outFile = open(targetFile,'w')
for line in inFile:
    line = doKeywordSubstitution(line)
    outFile.write(line)
inFile.close()
outFile.close()

I don't know of any way to make this more concise. Putting the line-changing logic in a separate function does look neater, though.
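One way to read that suggestion, as a minimal sketch: the per-line logic lives in its own function, and the files are opened in a with block so they are closed even if an error occurs. Here do_keyword_substitution, its KEYWORDS table, and process_file are hypothetical stand-ins; the body of doKeywordSubstitution is not shown in the answer.

```python
import os
import tempfile

# Hypothetical placeholder table; the real substitutions are not shown above.
KEYWORDS = {"@VERSION@": "1.0"}

def do_keyword_substitution(line):
    """Replace every known placeholder in one line of text."""
    for key, value in KEYWORDS.items():
        line = line.replace(key, value)
    return line

def process_file(source_file, target_file, transform=do_keyword_substitution):
    """Copy source_file to target_file, passing each line through transform."""
    with open(source_file, "r") as in_file, open(target_file, "w") as out_file:
        out_file.writelines(transform(line) for line in in_file)

# Small demonstration on temporary files.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "in.txt")
target = os.path.join(workdir, "out.txt")
with open(source, "w") as f:
    f.write("version @VERSION@\nno keywords here\n")
process_file(source, target)
with open(target) as f:
    result = f.read()
```

Passing the transform as an argument keeps the file handling reusable for other per-line rewrites.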

I may have missed the point of your code, but I don't see why you have

lines = iter(fileinput.input([indir+filename]))

The file handling could be written as:

with open(indir+filename,'r') as fin, open(outdir+filename,'w') as fout:
    #code

In Python 2.6, where a single with statement cannot open several files, you can nest them:

with open(indir+filename,'r') as fin:
    with open(outdir+filename,'w') as fout:
        #code

The line

lines = iter(fileinput.input([indir+filename]))

is unnecessary. You can simply iterate over the open file (fin in your example).
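A short sketch of iterating over a file object directly. io.StringIO and the sample text are assumptions made so the example runs without a file on disk; an object returned by open() is iterated the same way.

```python
import io

# io.StringIO stands in for a file opened with open(); both yield lines when iterated.
fin = io.StringIO("header\n1.0 2.0\n3.0 4.0\n")

header = next(fin)              # consume the first line explicitly
rows = [line for line in fin]   # iteration continues from the second line
```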

You can also write line.split(" ") instead of string.split(line, " "). With these changes, there is no need to import string and fileinput.
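For illustration, here is how the str.split method behaves with and without an explicit separator. The sample line is made up, and the double space is deliberate.

```python
line = "1.0 2.0  3.0\n"

# An explicit separator splits on every single space, so a double space
# produces an empty string and the newline stays on the last field.
by_space = line.split(" ")

# With no argument, split() treats any run of whitespace as one separator
# and drops leading and trailing whitespace, including the newline.
by_whitespace = line.split()
```

If the input columns may be padded with several spaces, the argument-free form is the safer choice before calling float() on each field.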

Edit: I didn't know you could use inline code. That's cool.

I don't understand why you use string.split(line, " ") rather than line.split(" ").

I might write the string-processing part like this:

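# float() tolerates the trailing newline on the last field, but formatting drops it
values = line.split(' ')
values[0] = '{0:6.2f}'.format(float(values[0]))
values[1:] = ['{0:10.6f}'.format(float(v)) for v in values[1:]]
fout.write(' '.join(values) + '\n')  # put the line ending back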

At least to me this looks better, but that may be subjective :)
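The same idea wrapped in a function so it can be tried on a single line. This is a sketch: format_line is a hypothetical name, the sample input is made up, and split() without an argument is used here so that repeated spaces and the trailing newline do not matter.

```python
def format_line(line):
    """Format the first field as 6.2f and the remaining fields as 10.6f."""
    values = line.split()
    first = '{0:6.2f}'.format(float(values[0]))
    rest = ['{0:10.6f}'.format(float(v)) for v in values[1:]]
    return ' '.join([first] + rest) + '\n'

formatted = format_line("1.5 2.25 3.125\n")
```

Note that ' '.join adds one separating space on top of the field widths, so each column after the first occupies 11 characters.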

I would use os.curdir instead of indir. What I would do is:

os.path.join(os.curdir, 'processed')
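A sketch of how these two pieces might fit together. Treating os.curdir as the input directory and "processed" as the output directory is an assumption about the question's layout, and data.txt is a made-up file name.

```python
import os

indir = os.curdir                              # '.' on POSIX and Windows
outdir = os.path.join(os.curdir, "processed")

# Build file paths the same way instead of concatenating strings.
input_path = os.path.join(indir, "data.txt")
output_path = os.path.join(outdir, "data.txt")
```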

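This looks fine, apart from a few minor changes that stem from how Python has changed over time.

You are mixing two different styles of next(): the old way is it.next(), and the new way is next(it). You should use the string method split() rather than the string module (that module is kept mostly for backward compatibility with Python 1.x). There is no need for the almost useless fileinput module, because open file handles are iterators too (the module dates from before Python's file handles were iterators).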
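A brief illustration of the built-in next(), using a made-up list of strings. The built-in form works on Python 2.6 and later, including Python 3.

```python
values = iter(["1.0", "2.0", "3.0"])

first = next(values)         # built-in next(): Python 2.6+ and Python 3
second = next(values)
# values.next() is the Python 2-only spelling; in Python 3 the method is
# named __next__ and is normally reached through next().

third = next(values)
done = next(values, None)    # an optional default avoids StopIteration
```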

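Edit: @codeape pointed out that glob() returns the full path. Your code would not work if indir were anything other than "/". I have changed the code below to use the correct listdir/os.path.join solution. I am also more familiar with "%" string interpolation than with string formatting.

Here is how I would write this in more idiomatic, modern Python: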

import os

def reformat(fin, fout):
    fout.write(next(fin)) # just copy the first line (the header) to output
    for line in fin:
        fields = line.split(' ')

        # Make a format string specific to the number of fields
        fmt = '%6.2f' + ('%10.6f' * (len(fields)-1)) + '\n'

        fout.write(fmt % tuple(map(float, fields)))

# indir and outdir are the input and output directories from the question
basenames = os.listdir(indir)  # get a list of input ASCII files to be processed
for basename in basenames:
    input_filename = os.path.join(indir, basename)
    output_filename = os.path.join(outdir, basename)
    with open(input_filename, 'r') as fin, open(output_filename, 'w') as fout:
        reformat(fin, fout)
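To see what reformat produces, it can be run on in-memory files. This is a sketch assuming Python 3's io.StringIO and a made-up two-line input; the function body is repeated so the example is self-contained.

```python
import io

def reformat(fin, fout):
    fout.write(next(fin))  # copy the header line unchanged
    for line in fin:
        fields = line.split(' ')
        fmt = '%6.2f' + ('%10.6f' * (len(fields) - 1)) + '\n'
        fout.write(fmt % tuple(map(float, fields)))

fin = io.StringIO("x y z\n1.5 2.25 3.125\n")
fout = io.StringIO()
reformat(fin, fout)
output = fout.getvalue()
```

Because reformat only needs objects that can be iterated and written to, the same function works unchanged on real files opened with open().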

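The Zen of Python says "There should be one-- and preferably only one --obvious way to do it." It is interesting that your way of doing things was "obviously" the right solution at various points over the past 10+ years, but it no longer is. :)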

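Looks good. Maybe put the comments, or at least some of them, on their own lines (I prefer lines of about 80 characters). Why does the input have 9 columns and the output 5? I think it is because you have a bug in

fout.write('{0:10.6f}'.format(float(val.next())))

which should be

fout.write('{0:10.6f}'.format(float(x)))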

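I believe glob returns file names that already include the directory part of the pattern, so you should use filename directly instead of indir+filename. Your code only works when indir is /. You should use os.path.join(outdir, os.path.basename(filename)) instead of outdir+filename. Never use string concatenation to manipulate directory paths; use os.path.join.

My comments on the question about glob.glob and os.path.join apply here too.

The code fout.write(fmt.format(*map(float, fields))) has a problem: the first field is repeated len(fields) times in the output. What happened in that line?

@user1069609: I used {0} for every item of the format instead of incrementing the index. Rather than generate a valid format string for format(), I switched to the older "%" string interpolation.

This is nice, but unfortunately I have Python 2.6.5, so I can't do this: with open(indir+filename,'r') as fin, open(outdir+filename,'w') as fout:

@user1069609 I have updated my answer with a Python 2.6 alternative.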
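The point about glob can be checked with a small sketch. The temporary directory, the processed subdirectory name, and a.txt are assumptions made for the demonstration.

```python
import glob
import os
import tempfile

indir = tempfile.mkdtemp()                   # temporary stand-in for the input directory
outdir = os.path.join(indir, "processed")    # output directory name is an assumption
open(os.path.join(indir, "a.txt"), "w").close()

# glob results already contain the directory part of the pattern.
matches = glob.glob(os.path.join(indir, "*.txt"))

for filename in matches:
    # Use filename as is for reading; keep only the base name for the output path.
    target = os.path.join(outdir, os.path.basename(filename))
```

Prefixing indir to each match again would duplicate the directory part, which is why the answers switch to os.listdir or os.path.basename.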