Python: convert scraped data to a dictionary
I have an XML file. After running Beautiful Soup's findAll("named-query") on it and printing the result, I get something like this:
<named-query name="sdfsdfsdf">
<query>
---Query here...--
</query>
</named-query>
<named-query name="xkjlias">
<query>
---Query here...--
</query>
</named-query>
.
.
.
Is there a way to convert this into a dictionary, JSON, or a CSV-like format, such as:
name="sdfsdfsdf"
query=
name="xkjlias"
query=
Thanks in advance.
Try this:
# collect one dictionary per 'named-query' tag
results = []
for named_query in soup.find_all('named-query'):
    # get the value of the name attribute and store it in a fresh dict,
    # so that earlier entries are not overwritten
    data = {'name': named_query.attrs['name']}
    # traverse its children
    for child in named_query.children:
        # skip '\n', empty strings, and children whose .string is None
        text = (child.string or '').strip()
        if text:
            data['query'] = text
    results.append(data)
print(results)
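Code:
import json
from bs4 import BeautifulSoup

text = """
<named-query name="sdfsdfsdf">
<query>
---Query here...--
</query>
</named-query>
<named-query name="xkjlias">
<query>
---Query here 2...--
</query>
</named-query>
"""
soup = BeautifulSoup(text, 'html.parser')
# map each name attribute to the stripped text of its tag
queries = {nq.attrs['name']: nq.text.strip() for nq in soup.find_all('named-query')}
queries_json = json.dumps(queries)
print(queries)  # dict
print(queries_json)  # JSON
Output:
{'sdfsdfsdf': '---Query here...--', 'xkjlias': '---Query here 2...--'}
{"sdfsdfsdf": "---Query here...--", "xkjlias": "---Query here 2...--"}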
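The question also mentions a CSV-like format. A minimal sketch of that, using only the standard library: it assumes the name/query pairs have already been extracted into a list of dicts, and the `rows` sample data and the `rows_to_csv` helper are hypothetical names introduced here for illustration, not something from the original posts.

```python
import csv
import io

# Hypothetical sample rows shaped like the name/query pairs in the question;
# in practice they would come from the parsed <named-query> tags.
rows = [
    {"name": "sdfsdfsdf", "query": "---Query here...--"},
    {"name": "xkjlias", "query": "---Query here...--"},
]


def rows_to_csv(rows):
    """Return the rows as CSV text with a name,query header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "query"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

To write a file instead of a string, the same writer could be pointed at a file opened with open(path, "w", newline=""). The csv module takes care of quoting if a query happens to contain commas or line breaks.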