Python 3.x: removing junk ('[', ']', ', etc.) attached to values in a DataFrame
I have a curated JSON file that gets converted to a CSV file:
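import glob
import pandas as pd

appended_data = []
for file in glob.glob('data-part.json'):
    # error_bad_lines is a read_csv option; read_json does not accept it, so it is dropped here
    dfjson = pd.read_json(file, encoding='utf-8', lines=True, dtype=str)
    appended_data.append(dfjson)
# Combine the per-file frames and write them out without the index column
appended_data = pd.concat(appended_data)
appended_data.to_csv("data.csv", index=False)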
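However, when I open the converted CSV file, it looks like the first table below.
I need the CSV file to look like the second table below instead, because I have to run some searches on it.
How can I catch this junk ('[', ']', etc.) and normalize the data in the DataFrame? I tried the replace method: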
df.replace(regex={'^\[\'.'.'.'.\]$:'''^\[\]$':'''.\]$,''.\[\]$':''})
But nothing changed in the DataFrame.
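One possible reason nothing changed: `DataFrame.replace` returns a new DataFrame rather than modifying `df`, so the result has to be assigned back (or `inplace=True` passed). Below is a minimal sketch, assuming every affected cell is a string holding either `[]` or a single quoted value such as `['Helical']`. The sample frame and the name `cleaned` are made up for illustration and only mirror the rows shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the CSV contents from the question
df = pd.DataFrame(
    {
        "color": ["[]", "[]", np.nan, np.nan],
        "gear_type": ["['Helical']", "['Axle']", "['Front-Axle']", "['Bevel']"],
        "oil_type": ["['Synthetic']", "['High Mileage']", "['Synthetic']", "['Conventional']"],
        "material": ["['Composite']", "['Asphalt']", "['Vulcanised']", "['Carbon black']"],
        "date_purchase": ["20201505"] * 4,
    },
    dtype=object,  # plain Python strings / NaN in object columns
)

# Each key is a regex, each value its replacement.
# replace() returns a new DataFrame, so the result is assigned to a name.
cleaned = df.replace(
    {
        r"^\[\]$": np.nan,        # empty list literal -> missing value
        r"^\['(.*)'\]$": r"\1",   # ['value'] -> value (assumes one element per cell)
    },
    regex=True,
)

print(cleaned)
```

Cells that do not match either pattern, such as the `date_purchase` values, are left untouched. A cell with several elements, for example `['a', 'b']`, would not be cleaned correctly by this pattern.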
Current output:

color  gear_type       oil_type          material          date_purchase
[]     ['Helical']     ['Synthetic']     ['Composite']     20201505
[]     ['Axle']        ['High Mileage']  ['Asphalt']       20201505
nan    ['Front-Axle']  ['Synthetic']     ['Vulcanised']    20201505
nan    ['Bevel']       ['Conventional']  ['Carbon black']  20201505

Desired output:

color  gear_type   oil_type      material      date_purchase
nan    Helical     Synthetic     Composite     20201505
nan    Axle        High Mileage  Asphalt       20201505
nan    Front-Axle  Synthetic     Vulcanised    20201505
nan    Bevel       Conventional  Carbon black  20201505
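If some cells might hold more than one element, a regex can get fragile. An alternative sketch is to parse each cell as a Python literal with `ast.literal_eval` and join whatever the list contains. The helper name `clean_cell` and the choice of `", "` as a separator are assumptions for illustration, not something stated in the question.

```python
import ast


def clean_cell(value):
    """Turn a stringified list such as "['Helical']" into plain text."""
    if not isinstance(value, str):
        return value                  # leave NaN and numbers untouched
    text = value.strip()
    if not (text.startswith("[") and text.endswith("]")):
        return value                  # ordinary string, nothing to strip
    try:
        parsed = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return value                  # not a valid list literal, keep as-is
    if not isinstance(parsed, list):
        return value
    if not parsed:
        return None                   # empty list -> missing value
    # Join the elements; a single-element list yields just that element
    return ", ".join(str(item) for item in parsed)


for raw in ["['Helical']", "['High Mileage']", "[]", "20201505"]:
    print(repr(raw), "->", repr(clean_cell(raw)))
```

With a DataFrame, the helper could be applied cell by cell, for example `df = df.apply(lambda col: col.map(clean_cell))`, before writing the CSV again. Empty lists come back as `None`, which pandas treats as a missing value.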