Python: combining multiple CSV files

python csv pandas

I have 3 CSV files and I want to write all 3 of them into a single CSV file. For example, given

file1.csv, file2.csv, file3.csv

the desired output looks like this:

  a  b  c  d  e  f  g  h  i  j  k  l
  1  2  3  4 13 14 15 16  9 10 11 12
  5  6  7  8 17 18 19 20 21 22 23 24

The commenters are right: you should not simply ask for code. Still, I found the task interesting enough that I spent three minutes solving it:

import csv

allColumns = []
for dataFileName in [ 'a.csv', 'b.csv', 'c.csv' ]:
  with open(dataFileName) as dataFile:
    fileColumns = zip(*list(csv.reader(dataFile, delimiter=' ')))
    allColumns += fileColumns

allRows = zip(*allColumns)

with open('combined.csv', 'w', newline='') as resultFile:
  writer = csv.writer(resultFile, delimiter=' ')
  for row in allRows:
    writer.writerow(row)

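Note that this solution may not work well for large inputs, because it reads every file into memory before writing anything. It also assumes that all files have the same number of rows; if they do not, it may misbehave (on Python 3, zip silently stops at the shortest input).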
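The unequal-row-count caveat above can be handled with itertools.zip_longest. The following is only a minimal sketch, not part of the original answer: the function name combine_columns and the fill parameter are made up for illustration, and it still loads every file into memory.

```python
import csv
from itertools import zip_longest


def combine_columns(paths, out_path, delimiter=' ', fill=''):
    """Join the rows of several delimited files side by side.

    Files with fewer rows are padded with `fill` so that no rows
    from the longer files are dropped. (Hypothetical helper.)
    """
    tables = []
    for path in paths:
        with open(path, newline='') as f:
            tables.append(list(csv.reader(f, delimiter=delimiter)))

    # Widest row of each file, used to pad rows that are missing entirely
    widths = [max((len(row) for row in t), default=0) for t in tables]

    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out, delimiter=delimiter)
        # zip_longest yields None for a file that has run out of rows
        for rows in zip_longest(*tables):
            combined = []
            for row, width in zip(rows, widths):
                combined.extend(row if row is not None else [fill] * width)
            writer.writerow(combined)
```

Calling combine_columns(['a.csv', 'b.csv', 'c.csv'], 'combined.csv') would then produce one output row per row of the longest input.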

One idea is to use the zip function:

file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"

# Pair up the n-th line of each file and join the three pieces with spaces
merged_file = [i + " " + j + " " + k
               for i, j, k in zip(file1.split('\n'), file2.split('\n'), file3.split('\n'))]
for i in merged_file:
    print(i)
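The snippet above hard-codes three inputs. A small sketch of the same zip idea for any number of in-memory strings could look like this; merge_text_columns is a hypothetical name, and the inputs are assumed to have the same number of lines.

```python
def merge_text_columns(*texts, sep=' '):
    """Join the n-th line of every input text into one output line."""
    # zip(*...) pairs up the first lines, then the second lines, and so on
    return [sep.join(parts) for parts in zip(*(t.splitlines() for t in texts))]


file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"

for line in merge_text_columns(file1, file2, file3):
    print(line)
```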

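This assumes that all files have the same number of lines. The solution also works for large inputs, because only 3 rows (one from each file) are held in memory at a time: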

import csv
with open('foo1.txt') as f1, open('foo2.txt') as f2, \
     open('foo3.txt') as f3, open('out.txt', 'w', newline='') as f_out:

     writer = csv.writer(f_out, delimiter=' ')
     readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
     while True:
         try:
             writer.writerow([y for w in readers for y in next(w)])
         except StopIteration:
             break

A for-loop-based version of the code above; it first has to iterate over one of the files to get the number of lines:

import csv
with open('foo1.txt') as f1, open('foo2.txt') as f2, \
     open('foo3.txt') as f3, open('out.txt', 'w', newline='') as f_out:

     writer = csv.writer(f_out, delimiter=' ')
     lines = sum(1 for _ in f1) #Number of lines in f1
     f1.seek(0)                 #Move the file pointer to the start of file 
     readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
     for _ in range(lines):
         writer.writerow([y for w in readers for y in next(w)])
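On Python 3, zip is lazy, so iterating over the readers with zip gives the same one-row-per-file behaviour without the while/try loop or the line count. This is a sketch under that assumption rather than the answerer's own code; the function name merge_three and its parameters are illustrative.

```python
import csv


def merge_three(path1, path2, path3, out_path, delimiter=' '):
    """Stream three delimited files into one, joining rows side by side."""
    with open(path1, newline='') as f1, open(path2, newline='') as f2, \
         open(path3, newline='') as f3, \
         open(out_path, 'w', newline='') as f_out:

        writer = csv.writer(f_out, delimiter=delimiter)
        readers = [csv.reader(f, delimiter=delimiter) for f in (f1, f2, f3)]

        # zip pulls one row from each reader per step and stops as soon
        # as the shortest file is exhausted
        for rows in zip(*readers):
            writer.writerow([field for row in rows for field in row])
```

Like the original, this stops at the shortest file, so rows beyond that point in longer files are not written.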

The Python way (a slightly improved version of the code posted above).

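The Unix command-line way: if you are using a UNIX-type operating system and only care about merging the files, check what the command line already offers.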

Godspeed.

You can use a data-processing tool such as pandas:

import pandas as pd

df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
df3 = pd.read_csv('file3.csv')

df_combined = pd.concat([df1, df2, df3], axis=1)
df_combined.to_csv('output.csv', index=False)

This gives you the combined CSV file output.csv.

inputs = 'file1.csv', 'file2.csv', 'file3.csv'

with open('out.csv','w') as output:
    for line in zip(*map(open, inputs)):
        output.write('%s\n'%' '.join(i.strip() for i in line))

Edit: here is a more verbose version

inputs = 'file1.csv', 'file2.csv', 'file3.csv'

# open all input files
inputs = map(open, inputs)

with open('out.csv','w') as output:

    # iter over all the input files at the same time
    for line in zip(*inputs):

        # format the output line from input lines
        line = ' '.join(i.strip() for i in line)

        output.write('%s\n' % line)
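Both versions above open the input files with map(open, inputs) and never close them explicitly. One possible way to keep the same zip-over-files approach while closing everything is contextlib.ExitStack. This is only a sketch; merge_lines and its sep parameter are names invented here.

```python
from contextlib import ExitStack


def merge_lines(inputs, out_path, sep=' '):
    """Join the n-th line of every input file into one output line."""
    with ExitStack() as stack:
        # Every file registered on the stack is closed when the block exits
        files = [stack.enter_context(open(name)) for name in inputs]
        output = stack.enter_context(open(out_path, 'w'))

        # Iterate over all the input files at the same time
        for lines in zip(*files):
            output.write(sep.join(line.strip() for line in lines) + '\n')
```

Usage would mirror the original: merge_lines(('file1.csv', 'file2.csv', 'file3.csv'), 'out.csv').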

In addition to the first answer (which is the correct one), you can also process any number of CSV files in a folder like this:

import os
import pandas as pd

folder = r"C:\MyFolder"

# Read every .csv file in the folder into its own DataFrame
frames = [pd.read_csv(os.path.join(folder, name)) for name in os.listdir(folder) if name.endswith('.csv')]

merged = pd.concat(frames)
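Two details of the folder approach are worth spelling out: pd.concat stacks frames row-wise by default, and os.listdir returns names in arbitrary order. For the side-by-side layout asked for in the question, a sketch might look like the following. It assumes space-separated files (as in the csv-module answers) and uses a hypothetical helper name, merge_folder.

```python
from pathlib import Path

import pandas as pd


def merge_folder(folder, sep=' '):
    """Read every .csv file in `folder` and place their columns side by side."""
    # Sort the paths so the column order is deterministic
    paths = sorted(Path(folder).glob('*.csv'))
    frames = [pd.read_csv(path, sep=sep) for path in paths]
    # axis=1 joins the frames column-wise instead of stacking rows
    return pd.concat(frames, axis=1)
```

The result could then be written out with merged.to_csv('output.csv', sep=' ', index=False), where the output name is just an example.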

The first thing that comes to mind is to use the pandas module, as in waitingkuo's answer. But I suppose you could also use DictWriter:

import csv

# Initialize the output file; the header is the twelve column names a..l
header = list('abcdefghijkl')
outfile = open('final_output.csv', 'w', newline='')
output = csv.DictWriter(outfile, fieldnames=header)
output.writeheader()

# Compile the contents of all three files into a single dictionary, outputdict,
# mapping each column name to the list of values in that column
outputdict = {key: [] for key in header}
for fname in ['file1.csv', 'file2.csv', 'file3.csv']:
    with open(fname, newline='') as f:
        for line in csv.DictReader(f):
            for k in line:
                outputdict[k].append(line[k])

# Transfer the contents of outputdict into the csv file, one row at a time
for values in zip(*(outputdict[k] for k in header)):
    output.writerow(dict(zip(header, values)))
outfile.close()

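Can anyone provide the Python code?

- Wow, no. This is stackoverflow.com, not domyworksforyou.com. Show us your code and we will be happy to help.

- Then please read some Python tutorials on file handling and you can get started. To understand the basics, you need to study; you won't get anywhere by asking for code.

- It includes an extra newline; open out.txt as wb.

- @thefour Are you sure? I tested on both Python 2 and Python 3 and got no extra newline.

- I tried it on Python 2 and got an extra newline, so I used wb as Jon suggested, and it works well.

- Please provide more details.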
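For context on the extra-newline discussion: opening the output as wb is the Python 2 convention for the csv module. On Python 3 the csv documentation instead recommends opening the file in text mode with newline=''. A minimal sketch (write_rows is a made-up helper name):

```python
import csv


def write_rows(path, rows, delimiter=' '):
    """Write rows with the csv module, leaving line endings to csv itself."""
    # newline='' stops the text layer from translating the '\r\n' that
    # csv.writer emits, which is what produces blank lines on Windows
    with open(path, 'w', newline='') as f:
        csv.writer(f, delimiter=delimiter).writerows(rows)
```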
