Python: read a CSV file's header, check whether each column name matches a dictionary key, then write that key's value into the row


Basically, I will have a bunch of small dictionaries, such as:

dictionary_list = [
{"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
{"nine": "yes", "king": "yes","we": "yes", "nineteen": "yes"}
]
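
Then I have a CSV file with a large number of columns whose headers are also words. There may be 500 columns, each with a single word as its header, and I don't know the order in which the columns appear. I do know, however, that any word in my small dictionaries should match one of the column headers.

I want to iterate over the file's header, skipping the first 5 column headers, and for each header check whether its name can be found in the dictionary. If it can, add the value to the row; if not, add "no". This is done row by row, with one row per small dictionary. The result of applying the dictionaries above to this file would look like this:

So far, the only thing I have been able to try is the following, which does not work: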

f = open("file.csv", "r")
writer = csv.DictWriter(f)
for dict in dictionary_list: # this is the collection of little dictionaries
    # do some other stuff
    for r in writer: 
        #not sure how to skip 10 columns here. next() seems to work on rows 
        for col in r:
            if col in dict.keys():
                 writer.writerow(dict.values())
             else:
                 writer.writerow("no")

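pandas may help you here.

See the pandas website.

You can process the CSV file with the pandas.read_csv function and add data as needed with the DataFrame.append method (removed in pandas 2.0; current versions use pandas.concat instead).

Hope this helps.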
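
As a rough illustration of the pandas route, here is a minimal sketch. It does not use DataFrame.append; it builds one row per dictionary and reorders the columns to match the header. The header text is taken from the sample output on this page and is read from an in-memory string rather than a real file, and the "x" placeholder for the first 5 columns is an assumption borrowed from that same output.

```python
import io

import pandas as pd

dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "we": "yes", "nineteen": "yes"},
]

# Stand-in for the real CSV file: only the header line is needed.
header_text = ("row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,"
               "eighteen,nineteen,king,jack,ace,we,them,you,two\n")

# nrows=0 parses the header only and returns an empty frame with those columns.
headers = pd.read_csv(io.StringIO(header_text), nrows=0).columns.tolist()

# One row per dictionary; reindex puts the columns in header order and
# creates the missing ones, fillna turns every gap into "no".
df = pd.DataFrame(dictionary_list).reindex(columns=headers).fillna("no")

# Assumed placeholder value for the first 5 columns.
df[headers[:5]] = "x"

csv_text = df.to_csv(index=False)
print(csv_text)
```

To write to disk instead, pass a file path to to_csv.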

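Your question seems to be about making sure that the fields in the dictionary list exist in each record. If a field is already present in the record, set its value to "yes"; otherwise add the field to the record and set its value to "no".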

#!/usr/bin/env python3

import csv


dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "them": "yes", "nineteen": "yes"}
]

# Flatten all the dictionary keys into a unique set: the key names
# are used as field names and can't be duplicated.
field_check = {k for d in dictionary_list for k in d}

if __name__ == "__main__":

    with open("file.csv", "r", newline="") as f:
        reader = csv.DictReader(f)

        # Do not consider the first 10 columns.
        field_tail = set(reader.fieldnames[10:])

        # Initialize the yes and no fields once, as they
        # are the same for every row in the file.
        yes_fields = field_check & field_tail
        no_fields = field_check - yes_fields
        yes_dict = {k: "yes" for k in yes_fields}
        no_dict = {k: "no" for k in no_fields}
        for row in reader:
            row.update(yes_dict)
            row.update(no_dict)
            print(row)
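
To see how the number of skipped columns affects the result, the set logic can be tried without any file. This is only a minimal sketch: split_fields is a hypothetical helper that is not part of the answer, and the header is the one from the sample output on this page.

```python
def split_fields(headers, dictionaries, skip):
    """Return (yes_fields, no_fields) for the dictionary keys vs. headers[skip:]."""
    # Hypothetical helper, for illustration only.
    field_check = {k for d in dictionaries for k in d}
    field_tail = set(headers[skip:])
    yes_fields = field_check & field_tail
    no_fields = field_check - yes_fields
    return yes_fields, no_fields


headers = ("row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,"
           "eighteen,nineteen,king,jack,ace,we,them,you,two").split(",")
dictionary_list = [
    {"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
    {"nine": "yes", "king": "yes", "them": "yes", "nineteen": "yes"},
]

yes_10, no_10 = split_fields(headers, dictionary_list, skip=10)
yes_5, no_5 = split_fields(headers, dictionary_list, skip=5)
print(sorted(no_10))  # keys whose columns fall inside the skipped range
print(sorted(no_5))   # keys with no matching column after skipping 5
```

With skip=10, the keys eight, nine, and queen are reported as missing because their columns sit at positions 8 to 10 of this header, so the skip count has to match the real layout of the file.
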

Given an input file headers.csv that contains the header row shown in the output below, the following code generates your output:

import csv

dictionary_list = [{"eight": "yes", "queen": "yes", "we": "yes", "eighteen": "yes"},
                   {"nine": "yes", "king": "yes","we": "yes", "nineteen": "yes"}]

# Read the input header line as a list
with open('headers.csv',newline='') as f:
    reader = csv.reader(f)
    headers = next(reader)

# Generate the fixed values for the first 5 columns.
rowvals = dict(zip(headers[:5],['x'] * 5))

with open('file.csv', 'w', newline='') as f:
    # When writing a row, restval is the default value when it isn't in the dict row.
    # extrasaction='ignore' prevents complaining if all columns are not present in dict row.
    writer = csv.DictWriter(f,headers,restval='no',extrasaction='ignore')
    writer.writeheader()
    for dictionary in dictionary_list:
        D = dictionary.copy() # needed if the original shouldn't be modified.
        D.update(rowvals)
        writer.writerow(D)
Output:

row1,row2,row3,row4,row5,bad,good,eight,nine,queen,three,eighteen,nineteen,king,jack,ace,we,them,you,two
x,x,x,x,x,no,no,yes,no,yes,no,yes,no,no,no,no,yes,no,no,no
x,x,x,x,x,no,no,no,yes,no,no,no,yes,yes,no,no,yes,no,no,no
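
The two DictWriter options used above can be checked in isolation. The following is a small sketch with hypothetical column names (a, b, c) and a hypothetical extra key (zzz), writing to an in-memory buffer instead of a file.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.DictWriter(buffer, ["a", "b", "c"], restval="no",
                        extrasaction="ignore", lineterminator="\n")
writer.writeheader()
# "b" and "c" are missing from the row, so they are filled with restval.
# "zzz" is not a field name, so extrasaction="ignore" drops it silently.
writer.writerow({"a": "yes", "zzz": "yes"})
print(buffer.getvalue())

# With the default extrasaction="raise", the unknown key is an error.
strict = csv.DictWriter(io.StringIO(), ["a", "b", "c"], restval="no")
rejected = False
try:
    strict.writerow({"a": "yes", "zzz": "yes"})
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

This is why the answer can pass each small dictionary almost as-is: absent columns become "no" without any explicit loop over the header.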

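You should show all of the extra files you are working with. Add a snippet of the file as text rather than as an image of the data: paste the text data from the .csv so that we can cut and paste it if we want to reproduce the problem.

I will try this out. I will add an edit to the original question: each row of the file should basically correspond to one of the small dictionaries in the dictionary list. So I don't want to merge them all together, but rather treat them separately, row by row.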