Python: converting a list of dictionaries to CSV format

My output currently looks like this:

[{'date':'20140206','exchange':'cme','total_bytes':'15400000'},
{'date':'20140206','exchange':'phlx','total_bytes':'14100000'},
{'date':'20140206','exchange':'phlx','total_bytes':'13800000'},
{'date':'20140207','exchange':'cme','total_bytes':'15800000'},
{'date':'20140207','exchange':'cme','total_bytes':'14200000'},
{'date':'20140207','exchange':'phlx','total_bytes':'24100000'}]
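But I need it to look more like this:

date,exchange,total_bytes
20140206,cme,15400000
20140206,phlx,27900000
20140207,cme,30000000
20140207,phlx,24100000
Right now I have multiple rows for the same date and exchange, and I want to group them so there are no duplicate entries. That is, there should be only one phlx entry for 20140206, with the two byte values added together.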

Here is my code:

import csv
import pprint

endresult = []

# write csv_input to a csv file
with open('csv_input.csv','w') as file:
    for line in csv_input:
        file.write(line)

# manipulate text - remove the 0001 from the host name to get just the initials - ex. cme
text = open("csv_input.csv", "r")
text = ''.join([i for i in text]) \
    .replace("0001", "")
x = open("csv_input.csv","w")
x.writelines(text)
x.close()

# read csv file created and add column names
with open('csv_input.csv', 'r') as csv_file:
    reader = csv.DictReader(csv_file)
    for row in reader:
        endresult.append({
            'date': row['date'],
            'exchange': row['host'],
            'total_bytes': row['bytes']})
#print(row)

with open('last.csv', 'w', newline='') as txt_file:
    fieldnames = ['date','exchange','total_bytes']
    csv_dict_writer = csv.DictWriter(txt_file, fieldnames=fieldnames)
    csv_dict_writer.writeheader()
    for result in endresult:
        csv_dict_writer.writerow(result)

pprint.pprint(endresult)
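
  • Read the data from 'test.csv'
  • Use defaultdict to group rows with the same 'date' and 'exchange' values, appending each 'total_bytes' value to a list
  • Split the keys and values into a list of lists, summing 'total_bytes'
  • Write the list of lists to a csv file
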
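from collections import defaultdict
import csv

# read the csv
with open('test.csv', 'r') as f:
    data = list(csv.DictReader(f))

# extract and group information from data
dd = defaultdict(list)
for d in data:
    # the first two columns are date and host, the last one is the byte count
    date, proc, *_, tb = d.values()
    proc = proc.split('0001')[0]  # drop the '0001' suffix, e.g. 'cme0001' -> 'cme'
    dd[f'{date}_{proc}'].append(tb)  # any separator absent from date and host works

# add dd keys to a list, sum dd values and add them to the list
csv_list = [['date', 'exchange', 'total_bytes']]
for k, v in dd.items():
    d, e = k.split('_')
    tb = sum(map(int, v))
    csv_list.append([d, e, tb])

# write the list of lists to a file
with open('new_csv.csv', 'w', newline='') as f:
    write = csv.writer(f)
    write.writerows(csv_list)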

View of the resulting csv file:

date,exchange,total_bytes
20140206,cme,15400000
20140206,phlx,27900000
20140207,cme,30000000
20140207,phlx,24100000

If no imports are allowed:

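# read in the file
with open('test.csv', 'r') as f:
    data = [row.strip().split(',') for row in f.readlines()]

# add the list values of interest from data to a dict
dd = dict()
for i, d in enumerate(data):
    if i > 0:  # skip the header row
        # the first two columns are date and host, the last one is the byte count
        date, proc, *_, tb = d
        proc = proc.split('0001')[0]  # drop the '0001' suffix, e.g. 'cme0001' -> 'cme'
        key = f'{date}_{proc}'
        if not dd.get(key):  # if the key doesn't exist
            dd[key] = [tb]  # create the key-value pair
        else:
            dd[key].append(tb)

# add dd keys to a list, sum dd values and add them to the list
csv_list = [['date', 'exchange', 'total_bytes']]
for k, v in dd.items():
    d, e = k.split('_')
    tb = sum(map(int, v))
    csv_list.append([d, e, tb])

# write the list of lists to a file with plain writes,
# since the csv module is not imported in this version
with open('new_csv.csv', 'w', newline='') as f:
    for row in csv_list:
        f.write(','.join(map(str, row)) + '\n')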
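
Both versions above join the date and exchange into one string key and split it apart again later. A minimal sketch of the same grouping with a tuple key, which avoids that round trip, might look like the following. It assumes the rows are already in memory as dicts shaped like the question's endresult list; the function name sum_bytes is made up for illustration.

```python
def sum_bytes(rows):
    """Sum total_bytes per (date, exchange) pair, keeping first-seen order."""
    totals = {}
    for row in rows:
        key = (row['date'], row['exchange'])  # tuple key, no join/split needed
        totals[key] = totals.get(key, 0) + int(row['total_bytes'])
    return [[date, exchange, total] for (date, exchange), total in totals.items()]


# sample rows copied from the question
rows = [
    {'date': '20140206', 'exchange': 'cme', 'total_bytes': '15400000'},
    {'date': '20140206', 'exchange': 'phlx', 'total_bytes': '14100000'},
    {'date': '20140206', 'exchange': 'phlx', 'total_bytes': '13800000'},
    {'date': '20140207', 'exchange': 'cme', 'total_bytes': '15800000'},
    {'date': '20140207', 'exchange': 'cme', 'total_bytes': '14200000'},
    {'date': '20140207', 'exchange': 'phlx', 'total_bytes': '24100000'},
]

summed = sum_bytes(rows)
for line in summed:
    print(','.join(map(str, line)))
```

The returned list of lists can be passed to csv.writer(...).writerows() after a header row, just like csv_list above.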

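Several of the libraries that ship with Python help here:

  • operator.itemgetter efficiently builds the values used as sort keys
  • itertools.groupby groups iterated rows by similar keys
  • csv.DictReader and csv.DictWriter reference CSV data by named column

Note that input.csv below is your original csv data.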

import csv
from operator import itemgetter
from itertools import groupby

# Build a sort key by primary/secondary sort value
sorter = itemgetter('date','host')

# Read all the data
with open('input.csv','r',newline='') as fin:
    r = csv.DictReader(fin)
    data = sorted(r,key=sorter)

# build output lines grouped by the sort key
lines = []
for (date,host),group in groupby(data,sorter):
    lines.append({'date' : date,
                  'host' : host[:-4],
                  'total_bytes' : sum(int(data['bytes']) for data in group)})

# generate output
with open('output.csv','w',newline='') as fout:
    w = csv.DictWriter(fout,fieldnames='date host total_bytes'.split())
    w.writeheader()
    w.writerows(lines)
output.csv:

date,host,total_bytes
20140206,cme,15400000
20140206,phlx,27900000
20140207,cme,30000000
20140207,phlx,24100000
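
To see what itemgetter and groupby each contribute without touching any files, here is a small in-memory sketch. The rows reuse the byte values from the question; the 0001 host suffix and the host/bytes column names are assumptions taken from the code above, and the variable names are illustrative only.

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {'date': '20140207', 'host': 'cme0001', 'bytes': '15800000'},
    {'date': '20140206', 'host': 'phlx0001', 'bytes': '14100000'},
    {'date': '20140206', 'host': 'cme0001', 'bytes': '15400000'},
    {'date': '20140207', 'host': 'phlx0001', 'bytes': '24100000'},
    {'date': '20140206', 'host': 'phlx0001', 'bytes': '13800000'},
    {'date': '20140207', 'host': 'cme0001', 'bytes': '14200000'},
]

# itemgetter with two names returns a (date, host) tuple for each row
key = itemgetter('date', 'host')

# groupby only merges adjacent rows, so sort by the same key first
totals = [
    (date, host[:-4], sum(int(row['bytes']) for row in group))
    for (date, host), group in groupby(sorted(rows, key=key), key)
]

for total in totals:
    print(total)
```

Using the same itemgetter for both sorting and grouping keeps the two steps consistent, which is why the answer's code defines it once as sorter.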
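Also, if your input data is already sorted appropriately, the code can be simplified: instead of reading all of the data into memory and sorting it, process it row by row. For large amounts of data, reading the whole file into memory is impractical.

Note the use of .writerow(), which takes a single dict, versus .writerows(), which takes a list of dicts.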

import csv
from operator import itemgetter
from itertools import groupby

sorter = itemgetter('date','host')

with open('input.csv','r',newline='') as fin, \
     open('output.csv','w',newline='') as fout:

    r = csv.DictReader(fin)
    w = csv.DictWriter(fout,fieldnames='date host total_bytes'.split())
    w.writeheader()

    for (date,host),group in groupby(r,sorter):
        w.writerow({'date' : date,
                    'host' : host[:-4],
                    'total_bytes' : sum(int(data['bytes']) for data in group)})
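
One caveat with the streaming version: groupby starts a new group every time the key changes, so input that is not sorted by date and host yields repeated rows instead of one total per pair. A short sketch of that behaviour, using made-up rows and a hypothetical helper named totals:

```python
from itertools import groupby
from operator import itemgetter

key = itemgetter('date', 'host')

# the phlx key appears twice, but not on adjacent rows
unsorted_rows = [
    {'date': '20140206', 'host': 'phlx0001', 'bytes': '14100000'},
    {'date': '20140206', 'host': 'cme0001', 'bytes': '15400000'},
    {'date': '20140206', 'host': 'phlx0001', 'bytes': '13800000'},
]


def totals(rows):
    """Sum bytes for each run of adjacent rows sharing the same key."""
    return [(k, sum(int(r['bytes']) for r in g)) for k, g in groupby(rows, key)]


streamed = totals(unsorted_rows)
presorted = totals(sorted(unsorted_rows, key=key))

print(len(streamed))   # 3 groups: phlx is split in two
print(len(presorted))  # 2 groups: one per (date, host)
```

If the sort order of the input cannot be guaranteed, the earlier version that sorts first, or a dict-based accumulation, is the safer choice.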