Python: converting a dict to a DataFrame
Tags: python, json, pandas
My data looks like this:
{u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}', u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
I want to convert it to a pandas DataFrame. But when I try
df = pd.DataFrame(response.items())
I get a DataFrame with two columns: the first contains the keys and the second contains each key's value:
0 1
0 "57e01311817bc367c030b390" {"ad_since": 2016, "indoor_swimming_pool": "No...
1 "57e01311817bc367c030b3a8" {"ad_since": 2012, "indoor_swimming_pool": "No...
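How can I get one column per key ("ad_since", "indoor_swimming_pool", and so on) and keep the first column, or get the id as the index?
You need to convert the column from type str to dict with .apply(literal_eval) or .apply(json.loads), and then use DataFrame.from_records.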
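A minimal sketch of that conversion, assuming the values are plain JSON objects as in the question. The shortened sample dict and the names raw, parsed_json and parsed_literal are illustrative, not from the original post. The last step uses the plain pd.DataFrame constructor on the list of dicts instead of from_records, which should give the same result for this data:

```python
import json
from ast import literal_eval

import pandas as pd

# Shortened sample in the same shape as the question:
# every value is a JSON string, every key is an id.
response = {
    '"57e01311817bc367c030b390"': '{"ad_since": 2016, "seaside": "No"}',
    '"57e01311817bc367c030b3a8"': '{"ad_since": 2012, "seaside": "No"}',
}

# One Series of raw strings, indexed by id.
raw = pd.Series(response)

# For this simple data either parser turns each string into a dict.
parsed_json = raw.apply(json.loads)
parsed_literal = raw.apply(literal_eval)

# Expand the dicts into one column per key, keeping the ids as the index.
df = pd.DataFrame(parsed_json.tolist(), index=parsed_json.index)
print(df)
```

Both parsers agree on this sample; they only differ on input that is valid JSON but not a valid Python literal.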
Since the values are strings, you can use json.loads and a list comprehension:
In [20]: d = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}', u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
In [21]: import json
In [22]: pd.DataFrame(dict([(k, [json.loads(e)[k] for e in d.values()]) for k in json.loads(list(d.values())[0])]), index=list(d.keys()))
Out[22]:
ad_since handicapped_access indoor_swimming_pool \
"57e01311817bc367c030b390" 2016 Yes No
"57e01311817bc367c030b3a8" 2012 Yes No
seaside
"57e01311817bc367c030b390" No
"57e01311817bc367c030b3a8" No
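Try read_json.
Did you try pd.DataFrame(response.items()) with the sample data? For me it doesn't work.
@jezrael Thanks for your comment, I edited my post.
@RichardRublev I tried it, but got the error TypeError: Expected String or Unicode.
@mitsi - Thanks. But I think two records were fine, and now there is only one; the second row of the DataFrame is missing. Can you add some JSON or a list of JSON?
Using the first method (with literal_eval) on the whole dataset, I got the error ValueError: malformed string, possibly because of special characters. But it works perfectly with the second method, json.loads.
Thank you, I'm glad I could help.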
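The failing data is not shown in the thread, so the cause of the literal_eval error is unknown. One plausible cause, offered here only as an assumption, is a JSON-only literal such as true, false or null, which json.loads accepts but literal_eval rejects. A small sketch with a hypothetical value:

```python
import json
from ast import literal_eval

# Hypothetical value: valid JSON, but `true` and `null` are not Python literals.
value = '{"ad_since": 2016, "seaside": true, "notes": null}'

# json.loads maps JSON true/null to Python True/None.
parsed = json.loads(value)

# literal_eval only accepts Python literals, so it raises ValueError here.
try:
    literal_eval(value)
    literal_eval_ok = True
except ValueError:
    literal_eval_ok = False

print(parsed, literal_eval_ok)
```

When the strings are known to be JSON, json.loads is the safer choice of the two.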
Full example of the json.loads approach:
import pandas as pd
import json
response = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}
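# Build a single column of JSON strings, indexed by id
df = pd.DataFrame.from_dict(response, orient='index')
# Parse each JSON string into a dict
df.iloc[:,0] = df.iloc[:,0].apply(json.loads)
# Expand the dicts into one column per key, keeping the ids as the index
print(pd.DataFrame.from_records(df.iloc[:,0].values.tolist(), index=df.index))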
ad_since handicapped_access indoor_swimming_pool \
"57e01311817bc367c030b3a8" 2012 Yes No
"57e01311817bc367c030b390" 2016 Yes No
seaside
"57e01311817bc367c030b3a8" No
"57e01311817bc367c030b390" No