Building a dictionary from data parsed in Python

python json parsing dictionary

I have some code here that parses some information from the web:

import lxml.html
from lxml.etree import XPath
import json



url = "http://gbgfotboll.se/information/?scr=table&ftid=51168"
date = '2014-09-27'
# use this in real mode: currentDate = (time.strftime("%Y-%m-%d"))
list = []
id = 0
score = ""
rows_xpath = XPath("//*[@id='content-primary']/table[3]/tbody/tr[td[1]/span/span//text()='%s']" % (date))
time_xpath = XPath("td[1]/span/span//text()[2]")
team_xpath = XPath("td[2]/a/text()")

html = lxml.html.parse(url)

for row in rows_xpath(html):
    time = time_xpath(row)[0].strip()
    team = team_xpath(row)[0]
    list.append("%d:"%id  + time + " " + team + " " + score)
    id += 1

print json.dumps(list)

which prints:

0:13:00 Romelanda UF - IK Virgo (empty score for now)
1:15:00 Kode IF - IK Kongah\xe4lla (empty score)
etc..

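My first sub-question: some of the parsed data will contain the letters å, ä and ö. How do I fix this so that the correct letters are printed? As you can see in the second line of the result, it prints Kongah\xe4lla, which should be Kongahälla.

The main question: how do I convert that list into dictionaries so that the final JSON output looks like this: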

{"id":"0", "time":"13:00", "teams":"Romelanda UF - IK Virgo", "score":"empty" }
etc...

Thank you.

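For your first question: \xe4 is not an ASCII character. It is the byte that encodes ä in windows-1252 (and in Latin-1), so if you want to print it, you can try decoding the string with an encoding such as windows-1252.

When I tried this, it worked for me: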

a='\xe4'
b=a.decode('windows-1252')
print b
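
The snippet above is Python 2, where a plain string is a byte string. A rough Python 3 equivalent is sketched below, assuming the bytes really are windows-1252 (the thread does not state the page's actual encoding, and the sample bytes are made up for illustration):

```python
# Python 3 sketch: bytes must be decoded to get text.
# Assumption: the data is windows-1252 encoded, where byte 0xE4 is "ä".
raw = b"IK Kongah\xe4lla"          # hypothetical bytes as they might arrive
text = raw.decode("windows-1252")  # bytes -> str
print(text)                        # IK Kongahälla
```

The same call with "latin-1" would give the same result for this byte, since the two encodings agree on 0xE4.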

For the second question, just change the loop to:

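for i, row in enumerate(rows_xpath(html)):
    time = time_xpath(row)[0].strip()
    team = team_xpath(row)[0]
    # append one dictionary per row instead of a formatted string
    list.append({"id": str(i), "time": time, "teams": team, "score": score})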

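I don't think you really want to name your list "list": it is not a reserved keyword, but it is the name of Python's built-in list type, and your variable shadows it. Good luck!

By the way, enumerate generates the index automatically. You can still keep your own id counter if you prefer; just do this: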

 list.append({"id":str(id), "time":time, "teams":team, "score":score})
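
To make the list-of-dictionaries idea concrete without hitting the network, here is a minimal Python 3 sketch. The names parsed_rows and rows_to_dicts are hypothetical, and the hard-coded pairs only stand in for what the XPath loop in the question would produce:

```python
import json

# Hypothetical stand-in for the (time, teams) pairs the XPath loop yields.
parsed_rows = [
    ("13:00", "Romelanda UF - IK Virgo"),
    ("15:00", "Kode IF - IK Kongahälla"),
]


def rows_to_dicts(rows, score=""):
    """Turn (time, teams) pairs into one dictionary per match."""
    return [
        {"id": str(i), "time": t, "teams": teams, "score": score}
        for i, (t, teams) in enumerate(rows)
    ]


matches = rows_to_dicts(parsed_rows)  # a list that holds dictionaries
print(json.dumps(matches))            # serialized as a JSON array of objects
```

Using a name such as matches rather than list also avoids shadowing the built-in type.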

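Thank you very much! This helped me a lot. But about your first answer: as you can see, the code parses information from the web, so I don't know which of those three letters will appear, or when and where. New information is parsed every day, so how can I check for and correct this on every parse?

@lisnb Also note that you cannot use .append when it comes to dictionaries.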
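
On the question of handling å, ä and ö wherever they turn up: one possible reading is that no per-letter check is needed, because the escape sequences are only a representation of the same text. The Python 3 sketch below uses made-up team names and assumes the strings have already been decoded to text. It shows that json.dumps escapes non-ASCII characters by default, that ensure_ascii=False keeps them readable, and that both forms load back to the same data:

```python
import json

# Hypothetical sample data containing å, ä and ö.
teams = ["Kode IF - IK Kongahälla", "Åsa IF - Öckerö IF"]

escaped = json.dumps(teams)                       # non-ASCII written as \uXXXX escapes
readable = json.dumps(teams, ensure_ascii=False)  # å, ä, ö kept as-is

# Either form decodes back to the original strings.
print(json.loads(escaped) == teams)
print(json.loads(readable) == teams)
```

Which form to emit depends on the consumer: any conforming JSON parser should accept both, while the readable form is easier to inspect by eye.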