Python: extract data from csvfile1 and write to csvfile2 based on the value in a column

python csv

I have data stored in a csv file:

ID;Event;Date
ABC;In;05/01/2015
XYZ;In;05/01/2016
ERT;In;05/01/2014
...     ...       ...
ABC;Out;05/01/2017

First, I am trying to extract all rows where Event is 'In' and save those rows in a new csv file. Here is the code I have tried so far:

[Update: 18 May 2017]

with open('csv_in', 'r') as f, open('csv_out','w') as f2:
    fieldnames=['ID','Event','Date']
    reader = csv.DictReader(f, delimiter=';', lineterminator='\n', 
    fieldnames=fieldnames)
    wr = csv.DictWriter(f2,dialect='excel',delimiter=';', 
    lineterminator='\n',fieldnames=fieldnames)
    rows = [row for row in reader if row['Event'] == 'In']
    for row in rows:
        wr.writerows(row)

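I get the following error: "ValueError: dict contains fields not in fieldnames: 'I', 'D'"

[/Update]

1/ Any ideas on how to fix this?

2/ As a next step, how would you go about doing a "lookup" on the ID (if an ID such as "ABC" occurs more than once) and extracting the "Date" value of the row where Event is "Out"?

Desired output: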

ID        Date         Exit date
ABC     05/01/2015     05/01/2017
XYZ     05/01/2016
ERT     05/01/2014

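Thanks in advance for your input.

PS: I cannot use pandas; only the standard library is allowed.

You can parse the original csv with the standard library like this: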

oldcsv=open('csv_in.csv','r').read().split('\n')

newcsv=[]

#this next part checks for events that are in

for line in oldcsv:
    if 'In' in line.split(';'):
       newcsv.append(line)

new_csv_file=open('new_csv.csv','w')
[new_csv_file.write(line+'\n') for line in newcsv]
new_csv_file.close()

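You can use the same approach for the lookup: change the keyword tested in the for loop, and if the newly generated list contains several entries for your ID, modify the condition so that it checks both keywords.

The error here is because you did not add the delimiter. Syntax:

csv.DictReader(f, delimiter=';')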
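As a side note, the 'I' and 'D' in the error message suggest that a key such as 'ID' was itself treated as a row. That is what happens when a single dict is passed to writerows(), which expects a list of rows, whereas writerow() takes one dict. The following is only a minimal sketch of part 1 under that assumption: the SAMPLE string and the filter_in_rows name are made up for illustration, and io.StringIO stands in for the real files.

```python
import csv
import io

# Hypothetical in-memory sample standing in for the 'csv_in' file.
SAMPLE = (
    "ID;Event;Date\n"
    "ABC;In;05/01/2015\n"
    "XYZ;In;05/01/2016\n"
    "ERT;In;05/01/2014\n"
    "ABC;Out;05/01/2017\n"
)


def filter_in_rows(text):
    """Return semicolon-delimited CSV text holding only the 'In' rows."""
    # No fieldnames argument: DictReader takes the names from the header row.
    reader = csv.DictReader(io.StringIO(text), delimiter=';')
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            delimiter=';', lineterminator='\n')
    writer.writeheader()
    rows = [row for row in reader if row['Event'] == 'In']
    # writerows() takes the whole list; writerow() would take a single dict.
    writer.writerows(rows)
    return out.getvalue()


result = filter_in_rows(SAMPLE)
print(result)
```

Letting DictReader read the header also avoids having the header line come back as a data row, which is what happens when fieldnames is passed explicitly for a file that already has a header.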

Part 2

import csv
import datetime

with open('csv_in', 'r') as f, open('csv_out', 'w') as f2:
    reader = csv.DictReader(f, delimiter=';')
    wr = csv.writer(f2, dialect='excel', lineterminator='\n')
    result = {}
    for row in reader:
        if row['ID'] not in result:
            # Assign values if the ID is not in the dictionary yet
            if row['Event'] == 'In':
                result[row['ID']] = {'IN': datetime.datetime.strptime(row['Date'], '%d/%m/%Y')}
            else:
                result[row['ID']] = {'OUT': datetime.datetime.strptime(row['Date'], '%d/%m/%Y')}
        else:
            # Compare the date with the one already stored for this ID
            if row['Event'] == 'In':
                # If 'IN' is not present, use the max value of datetime to compare
                result[row['ID']]['IN'] = min(result[row['ID']].get('IN', datetime.datetime.max), datetime.datetime.strptime(row['Date'], '%d/%m/%Y'))
            else:
                # Similarly, if 'OUT' is not present, use the min value of datetime to compare
                result[row['ID']]['OUT'] = max(result[row['ID']].get('OUT', datetime.datetime.min), datetime.datetime.strptime(row['Date'], '%d/%m/%Y'))
    # Format the results back to the desired representation
    for v1 in result.values():
        for k2, v2 in v1.items():
            v1[k2] = datetime.datetime.strftime(v2, '%d/%m/%Y')
    wr.writerow(['ID', 'Entry', 'Exit'])
    for row in result:
        # A missing 'IN' or 'OUT' yields None, which csv writes as an empty field
        wr.writerow([row, result[row].get('IN'), result[row].get('OUT')])

This code should work. I tested it on a small input.

How are the "columns" in your csv_in separated?

They are separated by ";". Example: ABC;05/01/2015;05/01/2017

You should use a context manager:

with open(filename, 'r') as f: ...
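To illustrate the context-manager suggestion, here is a rough sketch of the line-splitting filter from the first answer rewritten with `with`. The copy_in_lines name and the temporary demo files are assumptions made for this example; they do not appear in the thread.

```python
import os
import tempfile


def copy_in_lines(src_path, dst_path):
    """Copy every line that has an 'In' field from src_path to dst_path."""
    # Both files are closed automatically when the with-block exits,
    # even if an exception is raised part-way through.
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:
            if 'In' in line.rstrip('\n').split(';'):
                dst.write(line)


# Demo in a temporary directory so that no real file is touched.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'csv_in.csv')
    dst_path = os.path.join(tmp, 'new_csv.csv')
    with open(src_path, 'w') as f:
        f.write("ID;Event;Date\nABC;In;05/01/2015\nABC;Out;05/01/2017\n")
    copy_in_lines(src_path, dst_path)
    with open(dst_path) as f:
        copied = f.read()

print(copied)
```

Iterating over the file object also reads one line at a time instead of loading the whole file with read().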

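There is still a problem with part 1. When I change the code to csv.DictReader(f, delimiter=';'), the generated file csv_out contains only the header items, written out like this (with a line break between each item): D,a,t,e E,v,e,n,t I,D

Can you update the csv format in the OP? In the comment above you said it is separated by ";".

Updated the original message. Sorry.

Can you debug and print row right after the "for row in reader" line?

This is what I can see: 1) rows: [{'ID': 'ABC', ..., 'Event': 'In'}] 2) reader: and 3) row: {'ID': 'ABC', ..., 'Event': 'In'}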