Loading multiple JSON files containing Chinese characters in Python
I have multiple JSON files like the following, test.json:
{"_id":{"$oid":"5886dff9129a960d825fd574"},"game_type":6,"desk_id":41387,"round_count":2,"begin_time":{"$date":"2017-01-24T04:58:50.475Z"},"end_time":{"$date":"2017-01-24T05:02:33.959Z"},"club_id":11006,"club_name":"梧州麻将新手圈","owner_nick_name":"牌乐门","create_time":{"$date":"2017-01-24T05:02:49.860Z"},"items":[{"uid":16252,"nickname":"林家斌","win_gold":-4},{"uid":100074706,"nickname":" 年青*战场","win_gold":-4},{"uid":100175661,"nickname":" 所谓","win_gold":12},{"uid":100038017,"nickname":" 暖心","win_gold":-4}],"reason":"玩家退出房间,游戏结算","ok":true}
{"_id":{"$oid":"5886e996129a960d825fdf05"},"game_type":6,"desk_id":38913,"round_count":1,"begin_time":{"$date":"2017-01-24T05:41:26.135Z"},"end_time":{"$date":"2017-01-24T05:43:04.019Z"},"club_id":11006,"club_name":"梧州麻将新手圈","owner_nick_name":"牌乐门","create_time":{"$date":"2017-01-24T05:43:50.020Z"},"items":[{"uid":12028,"nickname":"林2--","win_gold":-2},{"uid":100080735,"nickname":" 圣裔","win_gold":6},{"uid":100087488,"nickname":" 平静","win_gold":-2},{"uid":100017168,"nickname":" 陈颖","win_gold":-2}],"reason":"玩家退出房间,游戏结算","ok":true}
{"_id":{"$oid":"5886ea68129a960d825fe04a"},"game_type":6,"desk_id":40381,"round_count":1,"begin_time":{"$date":"2017-01-24T05:45:40.833Z"},"end_time":{"$date":"2017-01-24T05:47:01.694Z"},"club_id":11006,"club_name":"梧州麻将新手圈","owner_nick_name":"牌乐门","create_time":{"$date":"2017-01-24T05:47:20.723Z"},"items":[{"uid":11987,"nickname":"转转","win_gold":-2},{"uid":100185361,"nickname":" 妞妞儿","win_gold":6},{"uid":100070056,"nickname":" 草木虫","win_gold":-2},{"uid":100195039,"nickname":" 三姑娘","win_gold":-2}],"reason":"玩家退出房间,游戏结算","ok":true}
I tried the following:
pd.concat([json_normalize(json.loads(line)) for line in open('test.json')])
But I get the following error:
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
----> 1 pd.concat([json_normalize(json.loads(line)) for line in open('test.json')])

c:\winpython-64bit-2.7.10.2\python-2.7.10.amd64\lib\json\__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    336             parse_int is None and parse_float is None and
    337             parse_constant is None and object_pairs_hook is None and not kw):
--> 338         return _default_decoder.decode(s)
    339     if cls is None:
    340         cls = JSONDecoder

c:\winpython-64bit-2.7.10.2\python-2.7.10.amd64\lib\json\decoder.pyc in decode(self, s, _w)
    364
    365
--> 366         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    367         end = _w(s, end).end()
    368         if end != len(s):

c:\winpython-64bit-2.7.10.2\python-2.7.10.amd64\lib\json\decoder.pyc in raw_decode(self, s, idx)
    380
    381         try:
--> 382             obj, end = self.scan_once(s, idx)
    383         except StopIteration:
    384             raise ValueError("No JSON object could be decoded")

UnicodeDecodeError: 'utf8' codec can't decode byte 0x9a in position 2: invalid start byte
I also tried the following:
import codecs
temp = []
with codecs.open('test.json', 'r') as f:
    for line in f:
        line = line.replace('\n','')
        temp.append(line)
map(json.loads,temp)
I got the same error.
But with a single JSON string like this:
json_normalize(json.loads('{"_id":{"$oid":"5886dff9129a960d825fd574"},"game_type":6,"desk_id":41387,"round_count":2,"begin_time":{"$date":"2017-01-24T04:58:50.475Z"},"end_time":{"$date":"2017-01-24T05:02:33.959Z"},"club_id":11006,"club_name":"梧州麻将新手圈","owner_nick_name":"牌乐门","create_time":{"$date":"2017-01-24T05:02:49.860Z"},"items":[{"uid":16252,"nickname":"林家斌","win_gold":-4},{"uid":100074706,"nickname":" 年青*战场","win_gold":-4},{"uid":100175661,"nickname":" 所谓","win_gold":12},{"uid":100038017,"nickname":" 暖心","win_gold":-4}],"reason":"玩家退出房间,游戏结算","ok":true}'))
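I get the table I want.
I want to concatenate all the tables into one big table, like the one above.
What is the correct way to do this?
On WinPython-3.6, this may work if you save the file as "UTF-8" in Notepad: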
import pandas as pd
from pandas.io import json
from pandas.io.json import json_normalize
pd.concat([json_normalize(json.loads(line)) for line in open('test.json', encoding="utf-8-sig")])
_id.$oid begin_time.$date club_id club_name create_time.$date desk_id end_time.$date game_type items ok owner_nick_name reason round_count
0 5886dff9129a960d825fd574 2017-01-24T04:58:50.475Z 11006 梧州麻将新手圈 2017-01-24T05:02:49.860Z 41387 2017-01-24T05:02:33.959Z 6 [{'uid': 16252, 'nickname': '林家斌', 'win_gold':... True 牌乐门 玩家退出房间,游戏结算 2
0 5886e996129a960d825fdf05 2017-01-24T05:41:26.135Z 11006 梧州麻将新手圈 2017-01-24T05:43:50.020Z 38913 2017-01-24T05:43:04.019Z 6 [{'uid': 12028, 'nickname': '林2--', 'win_gold'... True 牌乐门 玩家退出房间,游戏结算 1
0 5886ea68129a960d825fe04a 2017-01-24T05:45:40.833Z 11006 梧州麻将新手圈 2017-01-24T05:47:20.723Z 40381 2017-01-24T05:47:01.694Z 6 [{'uid': 11987, 'nickname': '转转', 'win_gold': ... True 牌乐门 玩家退出房间,游戏结算 1
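If the encoding of the file is not known in advance, the decoding step can be made explicit. The following is only a minimal sketch using the standard library: the helper name load_json_lines and the fallback list are assumptions for illustration, not part of the answer above. In particular, "gb18030" is a guess at a legacy encoding a Chinese Windows editor might have used; adjust it to whatever the files were really saved as.

```python
import json


def load_json_lines(path, encodings=("utf-8-sig", "gb18030")):
    """Parse a file holding one JSON object per line into a list of dicts.

    `encodings` is tried in order; the default list is an assumption.
    """
    # Read raw bytes so that decoding is explicit instead of depending on
    # the platform's default encoding.
    with open(path, "rb") as f:
        raw = f.read()

    for encoding in encodings:
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("could not decode %s with any of %r" % (path, encodings))

    # Split on "\n" only and skip blank lines such as a trailing newline.
    return [json.loads(line) for line in text.split("\n") if line.strip()]
```

With pandas 1.0 or later, the returned list can be passed directly to pd.json_normalize(records), which builds a single table with a fresh 0..n-1 index. When concatenating per-line frames as above, pd.concat(..., ignore_index=True) avoids the repeated 0 index seen in the output.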
Oh, that's great news. Thank you very much for the information.