Reading rows that were split across multiple lines with Python

Tags: python, python-3.x, pandas

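While reading the data, I noticed an inconsistency: a few rows are split across multiple lines (starting from row 6). My data and code are below. Can you help me?

Data:

Code: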


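I would use string manipulation (replace) to replace "\n|" with "|", so that each continuation line is joined back onto the record it belongs to.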
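
A minimal sketch of that replacement on a shortened two-record sample (the variable names raw, fixed and lines are illustrative, not from the original post). Each broken record continues on a line that begins with "|", so removing the newline in front of that "|" rejoins it:

```python
# Two records; the second one is broken before the "|BEA135" field,
# as in the sample data from the question.
raw = (
    "MP|3561042|||WQTI544|BEA148|16005: BANNOCK, ID|82839|I|103306|||D|1\n"
    "MP|3561250|||WQTI576\n"
    "|BEA135|48301: LOVING, TX|82|I|103308|||D|1\n"
)

# A newline immediately followed by "|" marks a continuation line,
# so dropping that newline glues the two halves back together.
fixed = raw.replace("\n|", "|")

lines = fixed.splitlines()
print(len(lines))
print(lines[1])
```

After the replacement every record sits on one line with the same number of "|" separators, which is what a delimiter-based parser expects.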


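Hi Andy, thanks for the quick solution to this problem, but it throws an error: df4_mk = pd.read_csv(StringIO(s.replace("\n|", "|")), header=None, delimiter='|', index_col=0, names=['record_type', 'unique_system_identifier', 'uls_file_number', 'ebf_number', ... TypeError: a bytes-like object is required, not 'str'

@BharatGuda Are you using Python 3? I think you might have to use BytesIO rather than StringIO, and .replace(b"\n|", b"|"). Strange that I'm not seeing this.

@BharatGuda Yep, it looks like you need from io import BytesIO on Python 2: pd.read_csv(BytesIO(s.replace("\n|", "|")), delimiter="|", header=None)  # plus other args

Andy Hayden, the job ran fine with no errors, but there are null values from column 6 onward.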
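
The TypeError quoted in the comments is consistent with the data being bytes rather than str: in Python 3, content read from a zip archive is bytes, and bytes.replace does not accept str arguments. A sketch under that assumption, using a tiny in-memory archive as a stand-in for the real one (the helper name read_repaired and the latin-1 encoding are assumptions, not from the original post):

```python
import io
import zipfile


def read_repaired(archive, member, encoding="latin-1"):
    """Return a text buffer with continuation lines rejoined.

    `archive` is an open zipfile.ZipFile; `encoding` is a guess and
    should be changed to whatever the real file uses.
    """
    raw = archive.read(member)          # bytes in Python 3
    fixed = raw.replace(b"\n|", b"|")   # bytes pattern for bytes data
    return io.StringIO(fixed.decode(encoding))


# Build a tiny in-memory zip that stands in for the real archive.
payload = (
    b"MP|3561042|||WQTI544|BEA148|16005: BANNOCK, ID|82839|I|103306|||D|1\n"
    b"MP|3561250|||WQTI576\n"
    b"|BEA135|48301: LOVING, TX|82|I|103308|||D|1\n"
)
blob = io.BytesIO()
with zipfile.ZipFile(blob, "w") as zf:
    zf.writestr("MP.dat", payload)

with zipfile.ZipFile(blob) as zf:
    buffer = read_repaired(zf, "MP.dat")

repaired_lines = buffer.getvalue().splitlines()
print(repaired_lines)
```

The returned buffer can be passed to pd.read_csv in place of a file name, with the same delimiter, header and names arguments as in the question.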
MP|3560039|||L000011396|BTA171|30071: PHILLIPS, MT|4253|I|103278|||D|1
MP|3561042|||WQTI544|BEA148|16023: BUTTE, ID|2891|I|103306|||D|1
MP|3561042|||WQTI544|BEA148|16077: POWER, ID|7817|I|103306|||D|1
MP|3561042|||WQTI544|BEA148|16011: BINGHAM, ID|45607|I|103306|||D|1
MP|3561042|||WQTI544|BEA148|16005: BANNOCK, ID|82839|I|103306|||D|1
MP|3561250|||WQTI576
|BEA135|48301: LOVING, TX|82|I|103308|||D|1
MP|3561250|||WQTI576
|BEA135|48443: TERRELL, TX|984|I|103308|||D|1
MP|3561250|||WQTI576
|BEA135|48173: GLASSCOCK, TX|1226|I|103308|||D|1
MP|3561250|||WQTI576
|BEA135|48243: JEFF DAVIS, TX|2342|I|103308|||D|1
MP|3561250|||WQTI576
|BEA135|48461: UPTON, TX|3355|I|103308|||D|1
MP|3561250|||WQTI576
|BEA135|48383: REAGAN, TX|3367|I|103308|||D|1
import pandas as pd

# zf1 is assumed to be an already-open zipfile.ZipFile that contains MP.dat (its creation is not shown in the post)
df4_mk = pd.read_csv(zf1.open('MP.dat'),header=None,delimiter='|', index_col=0, names=['record_type', 'unique_system_identifier', 'uls_file_number','ebf_number','call_sign',
                           'market_partition_code','defined_partition_area','defined_area_population','include_exclude_ind','partition_sequence_area_id',
                           'action_performed','census_figures','def_undef_ind','partition_sequence_number'],low_memory=False,
                           dtype={'record_type':str,'unique_system_identifier':int,'uls_file_number':str,'ebf_number':str,'call_sign': str,
                                  'market_partition_code':str,'defined_partition_area':str,'defined_area_population':int,'include_exclude_ind':str,
                                  'partition_sequence_area_id':int,'action_performed': str,'census_figures': int,'def_undef_ind': str,'partition_sequence_number':int })
In [11]: s = open('MP.dat').read()

In [12]: print(s.replace("\n|", "|"))
MP|3560039|||L000011396|BTA171|30071: PHILLIPS, MT|4253|I|103278|||D|1
MP|3561042|||WQTI544|BEA148|16023: BUTTE, ID|2891|I|103306|||D|1
MP|3561042|||WQTI544|BEA148|16077: POWER, ID|7817|I|103306|||D|1
MP|3561042|||WQTI544|BEA148|16011: BINGHAM, ID|45607|I|103306|||D|1
MP|3561042|||WQTI544|BEA148|16005: BANNOCK, ID|82839|I|103306|||D|1
MP|3561250|||WQTI576|BEA135|48301: LOVING, TX|82|I|103308|||D|1
MP|3561250|||WQTI576|BEA135|48443: TERRELL, TX|984|I|103308|||D|1
MP|3561250|||WQTI576|BEA135|48173: GLASSCOCK, TX|1226|I|103308|||D|1
MP|3561250|||WQTI576|BEA135|48243: JEFF DAVIS, TX|2342|I|103308|||D|1
MP|3561250|||WQTI576|BEA135|48461: UPTON, TX|3355|I|103308|||D|1
MP|3561250|||WQTI576|BEA135|48383: REAGAN, TX|3367|I|103308|||D|1

In [13]: from io import StringIO
    ...: pd.read_csv(StringIO(s.replace("\n|", "|")), delimiter='|', header=None) # plus other args
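
Putting the answer together with the column names from the question, as a sketch rather than a definitive implementation. It assumes Python 3 and pandas, inlines three sample records instead of opening MP.dat, and reads every column as a string to sidestep the empty fields in the sample. The regular expression also tolerates Windows line endings ("\r\n|"); that is an assumption about one possible cause of leftover empty columns, not something the thread confirms.

```python
import re
from io import StringIO

import pandas as pd

COLUMNS = [
    'record_type', 'unique_system_identifier', 'uls_file_number', 'ebf_number',
    'call_sign', 'market_partition_code', 'defined_partition_area',
    'defined_area_population', 'include_exclude_ind',
    'partition_sequence_area_id', 'action_performed', 'census_figures',
    'def_undef_ind', 'partition_sequence_number',
]

# Sample text standing in for open('MP.dat').read(); the last two
# records are split before "|BEA135", one with "\n" and one with "\r\n".
s = (
    "MP|3561042|||WQTI544|BEA148|16005: BANNOCK, ID|82839|I|103306|||D|1\n"
    "MP|3561250|||WQTI576\n"
    "|BEA135|48301: LOVING, TX|82|I|103308|||D|1\n"
    "MP|3561250|||WQTI576\r\n"
    "|BEA135|48443: TERRELL, TX|984|I|103308|||D|1\n"
)

# Remove any line break (LF or CRLF) that is directly followed by "|".
fixed = re.sub(r"\r?\n\|", "|", s)

df = pd.read_csv(StringIO(fixed), delimiter='|', header=None,
                 names=COLUMNS, dtype=str)

print(df.shape)
print(df[['call_sign', 'market_partition_code', 'defined_area_population']])
```

If columns from market_partition_code onward still come back empty on the real file, printing repr() of a few raw lines is a quick way to see which characters actually precede the leading "|".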