Error when extracting a substring in Python
Tags: python, split, substring, python-3.5
I have a data frame that looks like this:
varA json_data
'string1' {"str":{"str":"string","str":12345,"str":"str","str":"str","xyz":"1234","zyx":"string1","str":[{"str":"str","str":"str"}],"str":["str","str"],"str":["str"]},"str":"str"}
'string2' {"str":{"str":"string","str":12345,"str":"str","str":"str","xyz":"4567","zyx":"string2","str":[{"str":"str","str":"str"}],"str":["str","str"],"str":["str"]},"str":"str"}
'string3' {"str":{"str":"string","str":12345,"str":"str","str":"str","xyz":"8910","zyx":"string3","str":[{"str":"str","str":"str"}],"str":["str","str"],"str":["str"]},"str":"str"}
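I need to create two new columns: the first filled with the number extracted from json_data, and the second filled with the "string1", "string2", and "string3" values. The resulting dataset should look like this: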
varA json_data xyz zyx
'string' ... 1234 string1
'string' ... 4567 string2
'string' ... 8910 string3
My code is as follows:
df['xyz'] = df['json_data'].str.split('xyz":')[1].split('","zyx')[0]
However, I get an error:
AttributeError: 'list' object has no attribute 'split'
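How can I fix it? Are there any alternatives?
Stop trying to split a list.
{"string4", "string5", "xyz":"1234", "zyx":"string10"} is not a valid Python expression: it is neither a set nor a dictionary. Please post the actual data frame.
Why not treat the JSON as JSON instead of manipulating it as a string?
@FeyziBagirov The code you posted has some obvious errors that are unrelated to the problem you are seeing (the same goes for the data), which makes it hard for people to help you. Please copy and paste the code and data you are actually testing to prevent this. In the meantime,
df['json_data'].str.split('xyz":').str[1].str.split('","zyx').str[0]
is a corrected version of the failing statement, but it does not seem to be the right way to solve the larger problem.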
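As a rough illustration of the "treat JSON as JSON" suggestion, here is a minimal sketch that parses each cell with json.loads and reads the two fields from the nested object. The key names "payload", "id", and "status" are hypothetical stand-ins for the anonymized "str" keys in the question. The data as posted repeats the same key, and json.loads keeps only the last value for a repeated key, so this assumes the real data uses distinct key names.

```python
import json

import pandas as pd

# Hypothetical sample data: "payload", "id" and "status" are made-up key
# names standing in for the anonymized "str" keys in the question.
df = pd.DataFrame({
    "varA": ["string1", "string2", "string3"],
    "json_data": [
        '{"payload":{"id":12345,"xyz":"1234","zyx":"string1"},"status":"ok"}',
        '{"payload":{"id":12345,"xyz":"4567","zyx":"string2"},"status":"ok"}',
        '{"payload":{"id":12345,"xyz":"8910","zyx":"string3"},"status":"ok"}',
    ],
})

# Parse every JSON string into a dict once, then reuse the parsed objects.
parsed = df["json_data"].apply(json.loads)

# Read the fields from the nested object; .get yields None when a key is absent.
df["xyz"] = parsed.apply(lambda d: d.get("payload", {}).get("xyz"))
df["zyx"] = parsed.apply(lambda d: d.get("payload", {}).get("zyx"))

print(df[["varA", "xyz", "zyx"]])
```

The extracted xyz values are still strings here. If numbers are needed, something like df["xyz"].astype(int) could be applied afterwards, assuming every row actually contains a numeric value.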