Python: How to parse a CSV file that contains NULL values?


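I have a CSV file that contains binary fields, and when I read it with csv.reader(f) I get an error saying that the line contains a NULL byte.

I tried various solutions that I found online, but I still get the same error. I managed to read the file line by line and split each line on "," but some fields also contain "," themselves, so I would like to know how to read the file and extract the columns. A sample line looks like this: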

212344408,"cp233.net","net","cp233","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD.",1331,"DNS1.IIDNS.COM","DNS2.IIDNS.COM","2017-02-14","2018-02-14","2017-02-14","WANG MIN CHUN","wangminchun","WANG MIN CHUN","wangminchun","957596578@QQ.COM","QUANZHOUSHIANXIXIANCHANGKENGXIANGHUAMEICUN","QUAN ZHOU HI","FU,JIAN","362421","CN","+86.59523128184","+86.59523128184","%^^<AD>!^S\0<A8>E<98><AC>/^<A5><A0><C9>7","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM",0,"2017-03-14 21:33:15","2017-03-12 20:44:02",0,"whois_zone_snr","2017-03-14 21:33:15",\N

Any suggestions would be appreciated.
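To illustrate the problem described in the question, here is a small sketch comparing a naive split on commas with csv.reader. The line used here is a shortened version of the sample row (only four of its fields), chosen for readability; it is not the full record.

```python
import csv

# Shortened version of the sample row: two of the quoted fields contain commas.
line = '212344408,"cp233.net","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD."'

# Naive approach: every comma is treated as a separator, even inside quotes.
naive = line.split(",")

# csv.reader accepts any iterable of lines and respects the double quotes.
parsed = next(csv.reader([line]))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split produces more pieces than there are columns, while csv.reader keeps each quoted field in one piece.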

Pandas worked very well for my case: it can read the file and skip the lines that are broken because of the strange characters.

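import pandas as pd

# filename and header (the list of column names) are defined elsewhere.
# error_bad_lines/warn_bad_lines were removed in pandas 2.0; there, use on_bad_lines="warn".
df = pd.read_csv(filename, verbose=True, warn_bad_lines=True, error_bad_lines=False, names=header)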
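For readers who do not have pandas available, the idea of skipping broken lines can be approximated with the standard library. This is only a rough sketch, not what pandas does internally: the helper name parse_skipping_bad_lines and the expected_fields parameter are hypothetical, the sample lines are made up, and it assumes that no field contains an embedded line break, because each line is parsed on its own.

```python
import csv

def parse_skipping_bad_lines(lines, expected_fields):
    """Parse each line on its own; keep rows with the expected width, report the rest."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        try:
            row = next(csv.reader([line]))
        except (csv.Error, StopIteration):
            # The csv module could not parse this line at all.
            bad.append(number)
            continue
        if len(row) == expected_fields:
            good.append(row)
        else:
            # Wrong number of columns: treat the line as broken.
            bad.append(number)
    return good, bad

# Made-up input: the second line has too few fields.
lines = ['1,"a,b",x\n', '2,"c"\n', '3,"d",y\n']
good, bad = parse_skipping_bad_lines(lines, expected_fields=3)
print(good)
print(bad)
```

Unlike this sketch, pandas decides for itself which lines count as bad, so the two approaches will not necessarily skip the same rows.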

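Reading the file with the plain csv module (the csv.reader loop shown further down) works fine with your example. I even replaced one of the strings with NULL, and it was handled fine.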

test.csv:

212344408,"cp233.net","net","cp233","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD.",1331,"DNS1.IIDNS.COM","DNS2.IIDNS.COM","2017-02-14","2018-02-14","2017-02-14","WANG MIN CHUN","wangminchun","WANG MIN CHUN","wangminchun","957596578@QQ.COM","QUANZHOUSHIANXIXIANCHANGKENGXIANGHUAMEICUN","QUAN ZHOU HI","FU,JIAN","362421","CN","+86.59523128184","+86.59523128184","%^^<AD>!^S\0<A8>E<98><AC>/^<A5><A0><C9>7","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM",0,"2017-03-14 21:33:15","2017-03-12 20:44:02",0,"whois_zone_snr","2017-03-14 21:33:15",\N
212344408,NULL,"net","cp233","clientTransferProhibited,ClientDeleteProhibited","ENAME TECHNOLOGY CO., LTD.",1331,"DNS1.IIDNS.COM","DNS2.IIDNS.COM","2017-02-14","2018-02-14","2017-02-14","WANG MIN CHUN","wangminchun","WANG MIN CHUN","wangminchun","957596578@QQ.COM","QUANZHOUSHIANXIXIANCHANGKENGXIANGHUAMEICUN","QUAN ZHOU HI","FU,JIAN","362421","CN","+86.59523128184","+86.59523128184","%^^<AD>!^S\0<A8>E<98><AC>/^<A5><A0><C9>7","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM","WANG MIN CHUN","WANG MIN CHUN","957596578@QQ.COM",0,"2017-03-14 21:33:15","2017-03-12 20:44:02",0,"whois_zone_snr","2017-03-14 21:33:15",\N

If this is not the behavior you are seeing, can you provide a line that fails?

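Why is there a \N at the end? Can you show your csv object configuration?

@SatishGarg: That is a common notation for a NUL byte.

Is this Python 2 or 3? Have you tried the reader = csv.reader(line.translate({0: None}) for line in f) approach (that is, simply deleting the NUL bytes)?

Pandas is one way to go. The df above creates a DataFrame, which is a more lenient structure, so you should not run into the same error as with the csv module.
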
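import csv

# newline='' is what the csv documentation recommends for files passed to csv.reader (Python 3).
with open('test.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    # Each row is a list of strings, one per column.
    for row in reader:
        print(row)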
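
The suggestion from the comments, deleting the NUL characters before the csv module sees them, could be sketched as follows. This is a minimal example under a few assumptions: it targets Python 3, the helper name read_rows_without_nul is hypothetical, and the in-memory sample stands in for a real file that has already been opened in text mode.

```python
import csv
import io

def read_rows_without_nul(lines):
    """Yield parsed CSV rows after deleting NUL characters from each line."""
    # Generator expression: lines are cleaned lazily, one at a time.
    cleaned = (line.replace("\0", "") for line in lines)
    yield from csv.reader(cleaned)

# Made-up sample: the third field of the first row contains a NUL character.
sample = io.StringIO('1,"a,b","x\0y"\n2,"c","d"\n')
rows = list(read_rows_without_nul(sample))
print(rows)
```

With a real file, the same function can be given the open file object in place of the StringIO. If the binary fields are not valid text in the chosen encoding, the file may also need an explicit encoding or an errors argument when it is opened.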