Python: combining multiple CSV files (tags: python, csv, pandas)

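I have 3 CSV files and I want to write them side by side into a single CSV file. For example:

file1.csv

  a b c d
  1 2 3 4
  5 6 7 8

file2.csv

  e f g h
  13 14 15 16
  17 18 19 20

file3.csv

  i j k l
  9 10 11 12
  21 22 23 24

The desired output is as follows: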

  a b c d e   f g  h  i j  k  l
  1 2 3 4 13 14 15 16 9 10 11 12
  5 6 7 8 17 18 19 20 21 22 23 24

The commenters are right, you should not simply ask for code. Still, I found the task compelling enough that I spent three minutes solving it:

import csv

allColumns = []
for dataFileName in ['a.csv', 'b.csv', 'c.csv']:
  with open(dataFileName, newline='') as dataFile:
    # Transpose the rows of this file into columns
    fileColumns = zip(*csv.reader(dataFile, delimiter=' '))
    allColumns += fileColumns

# Transpose back: each row now spans the columns of all files
allRows = zip(*allColumns)

with open('combined.csv', 'w', newline='') as resultFile:
  writer = csv.writer(resultFile, delimiter=' ')
  writer.writerows(allRows)

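Note that this solution may not work well for large inputs, because it reads every file into memory. It also assumes that all files have the same number of rows; if they do not, it may break, since zip stops at the shortest input and silently drops the remaining rows.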
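The equal-row-count assumption can be relaxed. Here is a minimal sketch, assuming that padding short files with empty cells is acceptable. The helper name `merge_padded` and the `widths` bookkeeping are illustrative and not part of the answer. Like the code above, it still loads every file into memory.

```python
import csv
from itertools import zip_longest

def merge_padded(paths, out_path, delimiter=' '):
    """Write the rows of several delimited files side by side, padding short files."""
    tables = []
    for path in paths:
        with open(path, newline='') as f:
            tables.append(list(csv.reader(f, delimiter=delimiter)))
    # Width of each file (taken from its widest row) so padding keeps columns aligned
    widths = [max((len(row) for row in table), default=0) for table in tables]
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out, delimiter=delimiter)
        # zip_longest yields None for files that have run out of rows
        for rows in zip_longest(*tables):
            merged = []
            for row, width in zip(rows, widths):
                row = row or []
                merged.extend(row + [''] * (width - len(row)))
            writer.writerow(merged)
```

It could be called as `merge_padded(['a.csv', 'b.csv', 'c.csv'], 'combined.csv')`, reusing the file names from the answer.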

One idea is to use the zip function:

file1 = "a b c d\n1 2 3 4\n5 6 7 8"
file2 = "e f g h\n13 14 15 16\n17 18 19 20"
file3 = "i j k l\n9 10 11 12\n21 22 23 24"

merged_file = [i + " " + j + " " + k for i, j, k in zip(file1.split('\n'), file2.split('\n'), file3.split('\n'))]
for i in merged_file:
    print(i)

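Assuming all files have the same number of lines, this solution also works for large inputs, because only 3 lines (one per file) are held in memory at a time: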

import csv
with open('foo1.txt', newline='') as f1, open('foo2.txt', newline='') as f2, \
     open('foo3.txt', newline='') as f3, open('out.txt', 'w', newline='') as f_out:

     writer = csv.writer(f_out, delimiter=' ')
     readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
     while True:
         try:
             # Take one row from each reader and flatten them into one output row
             writer.writerow([y for w in readers for y in next(w)])
         except StopIteration:
             break

A for-loop version of the code above. It first has to iterate over one of the files to get the number of lines:

import csv
with open('foo1.txt', newline='') as f1, open('foo2.txt', newline='') as f2, \
     open('foo3.txt', newline='') as f3, open('out.txt', 'w', newline='') as f_out:

     writer = csv.writer(f_out, delimiter=' ')
     lines = sum(1 for _ in f1) # Number of lines in f1
     f1.seek(0)                 # Move the file pointer back to the start of the file
     readers = [csv.reader(x, delimiter=' ') for x in (f1, f2, f3)]
     for _ in range(lines):
         writer.writerow([y for w in readers for y in next(w)])
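
Both versions above rely on every file having the same number of lines. Below is a minimal sketch of a variant that still streams one row per file, but raises an error when the counts differ. The names `merge_strict` and `_MISSING` are illustrative and do not come from the answer.

```python
import csv
from itertools import zip_longest

_MISSING = object()  # sentinel marking a file that ran out of rows early

def merge_strict(paths, out_path, delimiter=' '):
    """Stream rows from all files side by side; fail if row counts differ."""
    files = [open(p, newline='') for p in paths]
    try:
        readers = [csv.reader(f, delimiter=delimiter) for f in files]
        with open(out_path, 'w', newline='') as out:
            writer = csv.writer(out, delimiter=delimiter)
            merged = zip_longest(*readers, fillvalue=_MISSING)
            for number, rows in enumerate(merged, 1):
                if any(row is _MISSING for row in rows):
                    raise ValueError('input files differ in length at row %d' % number)
                writer.writerow([cell for row in rows for cell in row])
    finally:
        # Close the inputs whether or not the merge succeeded
        for f in files:
            f.close()
```

Rows written before the mismatch is detected remain in the output file, so a caller may want to delete it when `ValueError` is raised.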

The Python way (a slightly improved version of the code posted above).

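The Unix command-line way: if you are on a Unix-like operating system and only care about merging the files, this can also be done directly from the command line.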

Godspeed.

You can use a data-processing tool, pandas:

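import pandas as pd

# sep=' ' assumes the space-separated layout shown in the question
df1 = pd.read_csv('file1.csv', sep=' ')
df2 = pd.read_csv('file2.csv', sep=' ')
df3 = pd.read_csv('file3.csv', sep=' ')

# axis=1 places the frames next to each other, column-wise
df_combined = pd.concat([df1, df2, df3], axis=1)
df_combined.to_csv('output.csv', sep=' ', index=False)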

You then get the combined CSV file, output.csv.

inputs = 'file1.csv', 'file2.csv', 'file3.csv'

with open('out.csv','w') as output:
    for line in zip(*map(open, inputs)):
        output.write('%s\n'%' '.join(i.strip() for i in line))

Edit: here is a more verbose version:

inputs = 'file1.csv', 'file2.csv', 'file3.csv'

# open all input files
inputs = map(open, inputs)

with open('out.csv','w') as output:

    # iter over all the input files at the same time
    for line in zip(*inputs):

        # format the output line from input lines
        line = ' '.join(i.strip() for i in line)

        output.write('%s\n' % line)
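
The versions above open the input files with `map(open, inputs)` and never close them explicitly. Here is a minimal sketch that closes every file deterministically using `contextlib.ExitStack` from the standard library. The function name `merge_lines` and its `sep` parameter are illustrative.

```python
from contextlib import ExitStack

def merge_lines(inputs, out_path, sep=' '):
    """Join the corresponding lines of several text files with `sep`."""
    with ExitStack() as stack:
        # enter_context registers each file so all are closed when the block exits
        files = [stack.enter_context(open(name)) for name in inputs]
        output = stack.enter_context(open(out_path, 'w'))
        # zip stops at the shortest file, exactly like the original version
        for lines in zip(*files):
            output.write(sep.join(line.strip() for line in lines) + '\n')
```

A call such as `merge_lines(('file1.csv', 'file2.csv', 'file3.csv'), 'out.csv')` mirrors the original snippet and works for any number of inputs.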

In addition to the first answer (which is the correct one), you can handle any number of CSV files in a folder like this:

import os
import pandas as pd

folder = r"C:\MyFolder"

# Read every .csv file in the folder into its own DataFrame
frames = [pd.read_csv(os.path.join(folder, name)) for name in os.listdir(folder) if name.endswith('.csv')]

merged = pd.concat(frames)
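
Note that `pd.concat` stacks frames vertically by default (`axis=0`), whereas the layout asked for in the question needs `axis=1`. A small sketch with made-up in-memory frames (the names `left` and `right` are illustrative) shows the difference:

```python
import pandas as pd

left = pd.DataFrame({'a': [1, 5], 'b': [2, 6]})
right = pd.DataFrame({'e': [13, 17], 'f': [14, 18]})

# axis=0 (default): rows are appended, cells without data become NaN
stacked = pd.concat([left, right])
# axis=1: columns are placed next to each other, rows aligned on the index
side_by_side = pd.concat([left, right], axis=1)

print(stacked.shape)       # (4, 4)
print(side_by_side.shape)  # (2, 4)
```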

The first thing that comes to mind is the pandas module, as in waitingkuo's answer. But I suppose you could also use DictWriter:

import csv

header = list('abcdefghijkl')

# Compile the columns of all three files into a single dictionary, outputdict
outputdict = {key: [] for key in header}
for fname in ['file1.csv', 'file2.csv', 'file3.csv']:
    with open(fname, newline='') as f:
        # delimiter=' ' assumes the space-separated layout shown in the question
        for line in csv.DictReader(f, delimiter=' '):
            for k in line:
                outputdict[k].append(line[k])

# Transfer the contents of outputdict into a csv file, one row at a time
with open('final_output.csv', 'w', newline='') as out:
    output = csv.DictWriter(out, fieldnames=header, delimiter=' ')
    output.writeheader()
    for values in zip(*(outputdict[k] for k in header)):
        output.writerow(dict(zip(header, values)))

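Comments:

"Can anyone provide Python code?" Wow, no. This is stackoverflow.com, not domyworksforyou.com. Show us your code and we will be happy to help.

Then please read some Python tutorials about file handling and you are good to go. To understand the basics you need to learn; you will not get anywhere by asking for code.

It includes an extra newline; open out.txt as wb.

@thefour Are you sure? I tested it on both Python 2 and Python 3 and got no extra newline.

I tried it on Python 2 and got an extra newline, so I used wb as Jon suggested, and it works well.

Please provide more details.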