Python: merging columns of numbers of different lengths from separate text files into a single csv file

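Hi, I'm fairly new to Python programming and can't seem to get past this problem.

I have a directory containing 100 subfolders, each holding one text file (with no file extension), and the files are named exactly the same across all subfolders. Each file contains a single column of numbers, and the columns differ in length.

I want to merge all the numbers from every file into a single csv file, with each file's numbers in a separate column.

So I should end up with a matrix of 100 columns, one per file, where the columns have different lengths.

Example files:

File 1

File 2

3
55
22

I have this script: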

# import modules
import glob
import csv
import sys
import itertools

inf = glob.glob("*/*-ambig")

for f in inf:
    with open(f) as fin:
        with open(sys.argv[1], 'w') as fout:

            writer = csv.writer(fout, delimiter=',',  quotechar='', quoting=csv.QUOTE_NONE)
            headers = ('coverage', )
            writer.writerow(headers)

            for line in fin:
                columns = line.split("\n") # split each column on new line
                writer.writerow(itertools.izip_longest(*columns, fillvalue=['']))

However, I get this error:

Traceback (most recent call last):
  File "coverage_per_strain.py", line 21, in <module>
    writer.writerow(itertools.izip_longest(*columns, fillvalue=['']))
_csv.Error: sequence expected

Does anyone know what is wrong with my code? Can you see any other errors?

Thanks

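writer.writerow (from the csv module) expects a sequence as its argument, but itertools.izip_longest returns an iterator. Hence the error message.

You should be able to fix this with:

writer.writerow(list(itertools.izip_longest(*columns, fillvalue=[''])))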
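To make the role of izip_longest more concrete, here is a minimal sketch of how it turns columns of unequal length into rows. It is written for Python 3, where the function is called zip_longest. The names file1 and file2 and the value 7 are made up for illustration; only 3, 55 and 22 come from the question.

```python
from itertools import zip_longest  # named izip_longest in Python 2

# Hypothetical column contents; only 3, 55, 22 appear in the question.
file1 = ["3", "55", "22"]
file2 = ["7"]

# zip_longest pairs the i-th entry of every column and pads the shorter
# columns with fillvalue, so each resulting tuple is one CSV row.
rows = list(zip_longest(file1, file2, fillvalue=""))

for row in rows:
    print(",".join(row))
```

Note that the fill value here is an empty string rather than a list, so each padded cell ends up as an empty field.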

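Here is a solution I wrote before realizing that you are using Python 2.7. It only works on Python 3.3+, because it uses the very nice contextlib.ExitStack context manager, which was only added in that version (I also use Python 3's map). It is the listing below that imports contextlib.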
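As a small illustration of why ExitStack is convenient when the number of files is not known in advance, the sketch below opens a list of files and checks that they are all closed once the with block ends. The helper name open_all and the temporary files are assumptions made for the demo, not part of the answer.

```python
import contextlib
import os
import tempfile


def open_all(paths):
    """Open every path, read its first line, and close all files on exit."""
    with contextlib.ExitStack() as stack:
        # enter_context registers each file so the stack closes it later
        files = [stack.enter_context(open(p)) for p in paths]
        first_lines = [f.readline().strip() for f in files]
    # leaving the with block closed every file registered on the stack
    return first_lines, [f.closed for f in files]


# Throwaway input files, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a", "3\n55\n"), ("b", "7\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    first_lines, closed = open_all(paths)
```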

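Below is my attempt at porting it back to Python 2. I haven't tested this version. I use a try/finally pair to handle closing the files (and imap to do the stripping without reading all the files into memory up front). It is the listing below that imports izip_longest and imap.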
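The point about lazy stripping can be seen in a short sketch. It uses Python 3, where the built-in map behaves like Python 2's itertools.imap, and an io.StringIO object as a stand-in for an open file (an assumption made to keep the example self-contained).

```python
import io

# io.StringIO stands in for an open text file here.
fake_file = io.StringIO("3\n55\n22\n")

# map is lazy: no line is read or stripped until a value is requested.
stripped = map(str.strip, fake_file)

first = next(stripped)   # reads and strips only the first line
rest = list(stripped)    # reads and strips the remaining lines
```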

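After opening the file fin, you should use csv.reader() to read it. Just have a look at the csv module tutorial.

Which version of Python are you using? In Python 3, iterating over several files at once is easier to do safely than in Python 2.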
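For reference, a minimal sketch of what reading a one-column file with csv.reader could look like. The io.StringIO object again stands in for the opened file fin, which is an assumption for the sake of a runnable example.

```python
import csv
import io

# Stand-in for the opened input file.
fake_file = io.StringIO("3\n55\n22\n")

# csv.reader yields one list per line; a single-column file gives
# one-element lists, so take the first field of each non-empty row.
column = [row[0] for row in csv.reader(fake_file) if row]
```

For a file with a single plain column this gives the same values as stripping each line, so either approach should work here.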

# Python 3.3+ version (uses contextlib.ExitStack)
import glob
import csv
import sys
import contextlib

from itertools import zip_longest

in_filenames = glob.glob("*/*-ambig")

with contextlib.ExitStack() as stack:
    # every file entered on the stack is closed when the with block exits
    in_files = [stack.enter_context(open(filename)) for filename in in_filenames]
    out_file = stack.enter_context(open(sys.argv[1], "w", newline=""))

    writer = csv.writer(out_file, delimiter=',', quoting=csv.QUOTE_NONE)

    writer.writerow(('coverage',)) # do you want the header repeated for each column?

    # one row per line position; shorter files are padded with empty strings
    writer.writerows(zip_longest(*(map(str.strip, f) for f in in_files), fillvalue=""))
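On the question of headers, one possible variant is sketched below. It is not the answer's code: the function name merge_columns, the choice to label each column with its parent folder name, and the temporary test files are all assumptions. For simplicity it reads each file fully into memory and uses the csv module's default quoting instead of QUOTE_NONE.

```python
import csv
import os
import tempfile
from itertools import zip_longest


def merge_columns(in_filenames, out_filename):
    """Write one CSV column per input file, padding short columns with ''."""
    columns = []
    for filename in in_filenames:
        with open(filename) as f:
            columns.append([line.strip() for line in f])
    # Assumption: label each column with the name of its parent folder.
    headers = [os.path.basename(os.path.dirname(name)) for name in in_filenames]
    with open(out_filename, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(headers)
        writer.writerows(zip_longest(*columns, fillvalue=""))


# Throwaway folders and files, created only to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for folder, text in [("s1", "3\n55\n22\n"), ("s2", "7\n")]:
        os.makedirs(os.path.join(tmp, folder))
        path = os.path.join(tmp, folder, "x-ambig")
        with open(path, "w") as f:
            f.write(text)
        paths.append(path)
    out_path = os.path.join(tmp, "merged.csv")
    merge_columns(paths, out_path)
    with open(out_path, newline="") as f:
        merged = list(csv.reader(f))
```

With the real data, the list of paths would come from glob.glob("*/*-ambig") as in the listings on this page.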

# Python 2 port (untested)
import glob
import csv
import sys

from itertools import izip_longest, imap

in_filenames = glob.glob("*/*-ambig")

with open(sys.argv[1], "wb") as out_file:
    in_files = []
    try:
        for filename in in_filenames:
            in_files.append(open(filename))

        writer = csv.writer(out_file, delimiter=',', quoting=csv.QUOTE_NONE)

        writer.writerow(('coverage',))

        writer.writerows(izip_longest(*[imap(str.strip, f) for f in in_files],
                                      fillvalue=""))

    finally:
        for f in in_files:
            try:
                f.close()
            except Exception:   # ignore exceptions here, or the later files might not get closed!
                pass