Python: converting dictionaries to CSV format

python python-3.x

My output currently looks like this:

[{'date': '20140206', 'exchange': 'cme', 'total_bytes': '15400000'},
 {'date': '20140206', 'exchange': 'phlx', 'total_bytes': '14100000'},
 {'date': '20140206', 'exchange': 'phlx', 'total_bytes': '13800000'},
 {'date': '20140207', 'exchange': 'cme', 'total_bytes': '15800000'},
 {'date': '20140207', 'exchange': 'cme', 'total_bytes': '14200000'},
 {'date': '20140207', 'exchange': 'phlx', 'total_bytes': '24100000'}]

But I need it to look more like this:

date,exchange,total_bytes
20140206,cme,15400000
20140206,phlx,27900000
20140207,cme,30000000
20140207,phlx,24100000

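As of now, I have multiple rows with the same date and exchange. I want to group them so there are no duplicate entries, that is, each date should have a single entry per exchange (for example, one phlx entry per day), with the byte values added together.

Here is my code: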

import csv
import pprint

endresult = []

# write csv_input to a csv file
with open('csv_input.csv','w') as file:
    for line in csv_input:
        file.write(line)

# manipulate text - remove the 0001 from the host name to get just the initials - ex. cme
text = open("csv_input.csv", "r")
text = ''.join([i for i in text]) \
    .replace("0001", "")
x = open("csv_input.csv","w")
x.writelines(text)
x.close()

# read csv file created and add column names
with open('csv_input.csv', 'r') as csv_file:
    reader = csv.DictReader(csv_file)
    for row in reader:
        endresult.append({
            'date': row['date'],
            'exchange': row['host'],
            'total_bytes': row['bytes']})
#print(row)

with open('last.csv', 'w', newline='') as txt_file:
    fieldnames = ['date','exchange','total_bytes']
    csv_dict_writer = csv.DictWriter(txt_file, fieldnames=fieldnames)
    csv_dict_writer.writeheader()
    for result in endresult:
        csv_dict_writer.writerow(result)

pprint.pprint(endresult)

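Starting from `test.csv`:

Use `defaultdict` to group rows with the same `'date'` and `'exchange'` values, appending each `'total_bytes'` value to a list
Split the keys and values into a list of lists, and sum the `'total_bytes'` values
Write the list of lists to a CSV file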

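from collections import defaultdict
import csv

# read the csv
with open('test.csv', 'r') as f:
    data = list(csv.DictReader(f))

# extract and group information from data
dd = defaultdict(list)
for d in data:
    date, proc, _, _, tb = d.values()  # extract the values of interest
    # strip the host suffix to keep just the initials, e.g. cme
    # (the '0001' separator is assumed from the question's code)
    proc = proc.split('0001')[0]
    # join date and exchange into one key ('_' is an assumed separator)
    dd[f'{date}_{proc}'].append(tb)

# add dd keys to a list, sum dd values and add them to the list
csv_list = [['date', 'exchange', 'total_bytes']]
for k, v in dd.items():
    d, e = k.split('_')
    tb = sum(map(int, v))
    csv_list.append([d, e, tb])

# write the list of lists to a file
with open('new_csv.csv', 'w', newline='') as f:
    write = csv.writer(f)
    write.writerows(csv_list)

Contents of new_csv.csv:

date,exchange,total_bytes
20140206,cme,15400000
20140206,phlx,27900000
20140207,cme,30000000
20140207,phlx,24100000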
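
As a possible variation on the approach above (a sketch, not part of the original answer): keying the dictionary on a `(date, exchange)` tuple and using `defaultdict(int)` sums the bytes as the rows are read, so the keys never need to be joined into a string and split again. The `aggregate` helper and the `rows` list are hypothetical names, and the sample values are copied from the question.

```python
from collections import defaultdict

def aggregate(rows):
    """Sum total_bytes for each (date, exchange) pair, keeping first-seen order."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row['date'], row['exchange'])] += int(row['total_bytes'])
    return [[date, exchange, total] for (date, exchange), total in totals.items()]

# sample rows taken from the question
rows = [
    {'date': '20140206', 'exchange': 'cme', 'total_bytes': '15400000'},
    {'date': '20140206', 'exchange': 'phlx', 'total_bytes': '14100000'},
    {'date': '20140206', 'exchange': 'phlx', 'total_bytes': '13800000'},
    {'date': '20140207', 'exchange': 'cme', 'total_bytes': '15800000'},
    {'date': '20140207', 'exchange': 'cme', 'total_bytes': '14200000'},
    {'date': '20140207', 'exchange': 'phlx', 'total_bytes': '24100000'},
]
result = aggregate(rows)
```

The resulting list of lists can be passed to `csv.writer(...).writerows()` after prepending a header row.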

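If no imports are allowed

# read in the file
with open('test.csv', 'r') as f:
    data = [row.strip().split(',') for row in f.readlines()]

# add the list values of interest from data to a dict
dd = dict()
for i, d in enumerate(data):
    if i > 0:  # skip the header row
        date, proc, *_, tb = d  # ignore any columns in between
        # strip the host suffix ('0001' is assumed from the question's code)
        proc = proc.split('0001')[0]
        key = f'{date}_{proc}'  # '_' is an assumed separator
        if key not in dd:  # if the key doesn't exist
            dd[key] = [tb]  # create the key-value pair
        else:
            dd[key].append(tb)

# add dd keys to a list, sum dd values and add them to the list
csv_list = [['date', 'exchange', 'total_bytes']]
for k, v in dd.items():
    d, e = k.split('_')
    tb = sum(map(int, v))
    csv_list.append([d, e, tb])

# write the list of lists to a file (plain string joins, since csv is not imported)
with open('new_csv.csv', 'w') as f:
    for row in csv_list:
        f.write(','.join(map(str, row)) + '\n')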

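Several libraries that ship with Python help here:

operator.itemgetter efficiently builds a sort key from the named values
itertools.groupby groups iterated rows that share the same key
csv.DictReader lets you refer to CSV data by column name

Note that input.csv below is your original CSV data.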

import csv
from operator import itemgetter
from itertools import groupby

# Build a sort key by primary/secondary sort value
sorter = itemgetter('date','host')

# Read all the data
with open('input.csv','r',newline='') as fin:
    r = csv.DictReader(fin)
    data = sorted(r,key=sorter)

# build output lines grouped by the sort key
lines = []
for (date,host),group in groupby(data,sorter):
    lines.append({'date' : date,
                  'host' : host[:-4],
                  'total_bytes' : sum(int(data['bytes']) for data in group)})

# generate output
with open('output.csv','w',newline='') as fout:
    w = csv.DictWriter(fout,fieldnames='date host total_bytes'.split())
    w.writeheader()
    w.writerows(lines)

output.csv:

date,host,total_bytes
20140206,cme,15400000
20140206,phlx,27900000
20140207,cme,30000000
20140207,phlx,24100000
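
The sort in the script above matters because `itertools.groupby` only merges adjacent items whose keys are equal. The following small sketch shows the difference; the `rows` data is made up purely for illustration.

```python
from itertools import groupby
from operator import itemgetter

key = itemgetter('date', 'host')

# illustrative rows: the two phlx rows are separated by a cme row
rows = [
    {'date': '20140206', 'host': 'phlx0001', 'bytes': '1'},
    {'date': '20140206', 'host': 'cme0001', 'bytes': '2'},
    {'date': '20140206', 'host': 'phlx0001', 'bytes': '3'},
]

# Unsorted: the phlx rows are not adjacent, so they land in separate groups
unsorted_keys = [k for k, _ in groupby(rows, key)]

# Sorted: equal keys become adjacent, so each key appears exactly once
sorted_keys = [k for k, _ in groupby(sorted(rows, key=key), key)]
```

Here `unsorted_keys` contains the phlx key twice, while `sorted_keys` contains each key once.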

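In addition, if your input data is already suitably sorted, the code can be simplified: instead of reading the whole file into memory and sorting it, it can process the file row by row. For large amounts of data, reading the entire file into memory may be impractical.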

Note the use of `.writerow()`, which takes a single dict, versus `.writerows()`, which takes a list of dicts.

import csv
from operator import itemgetter
from itertools import groupby

sorter = itemgetter('date','host')

with open('input.csv','r',newline='') as fin, \
     open('output.csv','w',newline='') as fout:

    r = csv.DictReader(fin)
    w = csv.DictWriter(fout,fieldnames='date host total_bytes'.split())
    w.writeheader()

    for (date,host),group in groupby(r,sorter):
        w.writerow({'date' : date,
                    'host' : host[:-4],
                    'total_bytes' : sum(int(data['bytes']) for data in group)})
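
To make the difference between `.writerow()` and `.writerows()` concrete, here is a minimal sketch that writes to an in-memory buffer (`io.StringIO`) rather than a file. The buffer and the sample rows are assumptions for illustration only.

```python
import csv
import io

buf = io.StringIO()  # stands in for an open output file
w = csv.DictWriter(buf, fieldnames=['date', 'host', 'total_bytes'])
w.writeheader()

# writerow() takes one dict and writes one line
w.writerow({'date': '20140206', 'host': 'cme', 'total_bytes': 15400000})

# writerows() takes an iterable of dicts and writes one line per dict
w.writerows([
    {'date': '20140206', 'host': 'phlx', 'total_bytes': 27900000},
    {'date': '20140207', 'host': 'cme', 'total_bytes': 30000000},
])

lines = buf.getvalue().splitlines()
```

Calling `.writerow()` inside the loop is what lets the streaming version emit each group as soon as it is complete, without collecting all rows first.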