Appending dictionary values to a CSV file using Python


python csv pandas dictionary


My Python script produces a dictionary like this:

{u'19:00': 2, u'12:00': 1, u'06:00': 2, u'00:00': 0, u'23:00': 2, u'05:00': 2, u'11:00': 4, u'14:00': 2, u'04:00': 0, u'09:00': 7, u'03:00': 1, u'18:00': 6, u'01:00': 0, u'21:00': 5, u'15:00': 8, u'22:00': 1, u'08:00': 5, u'16:00': 8, u'02:00': 0, u'13:00': 8, u'20:00': 5, u'07:00': 11, u'17:00': 12, u'10:00': 8}

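TL;DR

I overcomplicated the problem by using the from_dict method to build a DataFrame from the dictionary. Thanks to @剑.
In other words, pd.DataFrame.from_dict is only needed when you want a DataFrame with all keys in one column and all values in another. In every other case, it is as simple as the approach described in the accepted answer.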

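The script also takes a variable, say FULLNAME (passed as an argument to the script), whose value is "John".
Each time I run the script, it gives me a dictionary and a name in the format above.
I want to write them to a CSV file in the following format for later analysis: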
FULLNAME | 00:00  |  01:00  |  02:00  | .....| 22:00  |  23:00  |
John     | 0      |  0      |  0      | .....| 1      |  2      |

The code I am using is as follows:
import collections
import pandas as pd

# ........................
# Other part of code, which produces the dictionary by name "data_dict"
# ........................

#Sorting the dictionary (And adding it to a ordereddict) in order to skip matching dictionary keys with column headers
data_dict_sorted = collections.OrderedDict(sorted(data_dict.items()))

# For the first time to produce column headers, I used .items() and rest of the following lines follows it.
# df = pd.DataFrame.from_dict(data_dict_sorted.items())

#For the second time onwards, I just need to append the values, I am using .values()
df = pd.DataFrame.from_dict(data_dict_sorted.values())

df2 = df.T # transposing because from_dict creates all keys in one column, and corresponding values in the next column.
df2.columns = df2.iloc[0] 
df3 = df2[1:]
df3["FULLNAME"] = args.name #This is how we add a value, isn't it?
df3.to_csv('test.csv', mode = 'a', sep=str('\t'), encoding='utf-8', index=False)

My code produces the following CSV:
00:00 | 01:00 | 02:00 | …….. | 22:00 | 23:00 | FULLNAME
0     | 0     | 0     | …….. | 1     | 2     | John
0     | 0     | 0     | …….. | 1     | 2     | FULLNAME
0     | 0     | 0     | …….. | 1     | 2     | FULLNAME

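My question is twofold:
Why is "FULLNAME" printed instead of "John" in the second iteration (that is, when the script is run a second time)? What am I missing?
Is there a better way to do this?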
How about this:
df = pd.DataFrame(data_dict, index=[0])
df['FullName'] = 'John'
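
A short sketch of why index=[0] appears in the snippet above: pandas cannot infer a row count from a dictionary whose values are all scalars, so the constructor raises a ValueError unless an index is supplied. The dictionary here is a shortened, made-up stand-in for data_dict, and the names scalar_error and df_alt are introduced only for illustration.

```python
import pandas as pd

# Shortened stand-in for data_dict (illustrative values only)
data_dict = {"00:00": 0, "01:00": 0, "23:00": 2}

# All-scalar values and no index: pandas cannot tell how many rows to build
try:
    pd.DataFrame(data_dict)
    scalar_error = None
except ValueError as exc:
    scalar_error = exc

# Supplying an index gives a single row labelled 0
df = pd.DataFrame(data_dict, index=[0])
df["FullName"] = "John"

# Wrapping the dictionary in a list is another way to get one row
df_alt = pd.DataFrame([data_dict])
```

Both forms give a one-row frame whose columns are the dictionary keys, so no transpose or manual column assignment is needed.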

Edit:

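It is a little hard to follow how you are doing this, but the problem seems to be in the line df.columns = df.iloc[0]. The code above needs neither explicit column names nor a transpose. If you want to add a dictionary on each iteration, try: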
data_dict['FullName'] = 'John'
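# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
# ignore_index=True already renumbers the rows, so no reset_index() is needed.
df = pd.concat([df, pd.DataFrame(data_dict, index=[0])], ignore_index=True)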

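If each row may have a different name, then df['FullName'] = 'John' will set the entire column to John. A better approach is to create a key named "FullName" in the dict, with the appropriate name as its value, so that the whole column is not given one uniform value, i.e.: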
data_dict['FullName'] = 'John'
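
Since the question describes one script run per row, one possible way to combine this with the CSV output is sketched below. It is only a sketch: the helper name append_row is hypothetical, and it assumes every run produces the same set of time-slot keys. The header is written only when the file does not exist yet or is empty, so later runs append data rows only.

```python
import os
import pandas as pd


def append_row(data_dict, full_name, path):
    """Append one row to a tab-separated file, writing the header only once."""
    # Name first, then the time slots in sorted order, to match the desired layout
    row = {"FULLNAME": full_name}
    row.update(sorted(data_dict.items()))
    df = pd.DataFrame(row, index=[0])

    # Write the header only when the file is missing or still empty
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    df.to_csv(path, mode="a", sep="\t", encoding="utf-8",
              index=False, header=write_header)
```

Calling append_row(data_dict, args.name, 'test.csv') once per run would then add one line per run beneath a single header row.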

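What does index=[0] do?

Whenever you pass a dictionary to pd.DataFrame, the value for each key needs to be list-like. In your case the values are integers, and scalars can only be passed if you also provide an index. index=[0] simply means that the row's index is 0. For multiple rows, this should be a list of indexes, which can be labels or numbers.

But I don't think this solves the problem I am facing here.

How did you get the second and third rows? I have edited the answer on the assumption that you add one dictionary to the existing df each time.

As mentioned in the code comments, I first run df = pd.DataFrame.from_dict(data_dict_sorted.items()), which writes the time slots (the dictionary keys) as column headers, followed by the values. The second time I run the script (this is what I mean by an iteration), I replace that line with df = pd.DataFrame.from_dict(data_dict_sorted.values()) so that only the values are appended, not the keys. The only problem is the "FULLNAME" column: when the script runs the second time, I get the value "FULLNAME" instead of "John".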
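
One plausible explanation for that symptom, sketched with a shortened, made-up dictionary: in the values-only variant, the single row of values is promoted to column labels and then sliced away, so the frame has no rows left. to_csv then writes only a header line, made up of the values followed by the literal label FULLNAME, and in append mode that line looks like a data row.

```python
import io
import pandas as pd

# Shortened stand-in for data_dict_sorted (illustrative values only)
data_dict_sorted = {"00:00": 0, "01:00": 1, "02:00": 2}

# Second-run variant from the question: values only, no keys
df = pd.DataFrame(list(data_dict_sorted.values()))  # 3 rows, 1 column
df2 = df.T                                          # 1 row, 3 columns
df2.columns = df2.iloc[0]   # the only data row becomes the column labels
df3 = df2.iloc[1:].copy()   # nothing is left after dropping that row
df3["FULLNAME"] = "John"    # adds a column to a frame with zero rows

# Write to an in-memory buffer instead of test.csv to inspect the output
buffer = io.StringIO()
df3.to_csv(buffer, sep="\t", index=False)
lines = buffer.getvalue().splitlines()
```

Here lines holds a single entry: the values joined by tabs and ending in FULLNAME. "John" never reaches the file because there is no row to hold it.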