Web scraping: How to extract data from a JSON-LD code snippet?


I am trying to get the coordinates ("latitude" and "longitude") from this code:

> <script type="application/ld+json">
> {"@context":"http://schema.org","@graph":[
>   {"@type":"Place","address":
>       {"@type":"PostalAddress","streetAddress":"XX, XX"},"geo":
>       {"@type":"GeoCoordinates","latitude":50.08872,"longitude":20.0297}}]}
> </script>

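But the script I tried only gives me an earlier JSON-LD block (the first one in the full HTML).

I would even be happy to get the JSON-LD block as a plain string.

Thanks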

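import json
from bs4 import BeautifulSoup

data = """
<script type="application/ld+json">
{"@context":"http://schema.org","@graph":[
  {"@type":"Place","address":
      {"@type":"PostalAddress","streetAddress":"XX, XX"},"geo":
      {"@type":"GeoCoordinates","latitude":50.08872,"longitude":20.0297}}]}
</script>
"""

# Parse the HTML and take the text content of the first <script> tag
soup = BeautifulSoup(data, 'html.parser')
target = soup.select_one("script").string
# json.loads parses a JSON string (json.load expects a file object)
match = json.loads(target)
print(type(match))
print(match)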



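import json
import requests
from bs4 import BeautifulSoup
req = requests.get(link)  # link: URL of the page to scrape (define it yourself)
soup = BeautifulSoup(req.text, 'html.parser')
text_ = json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))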
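Note that `find` returns only the first matching tag, which is probably why the question's script kept returning the first JSON-LD block on the page. A minimal sketch of one way around that, assuming the page contains several `application/ld+json` blocks and only one of them holds a `GeoCoordinates` object: loop over `find_all` and search each parsed block. The sample HTML (including the `WebSite` block) and the `find_geo` helper are made up for illustration.

```python
import json
from bs4 import BeautifulSoup

# Hypothetical page with two JSON-LD blocks; only the second holds coordinates.
html = """
<script type="application/ld+json">
{"@context":"http://schema.org","@type":"WebSite","name":"Example"}
</script>
<script type="application/ld+json">
{"@context":"http://schema.org","@graph":[
  {"@type":"Place","address":
      {"@type":"PostalAddress","streetAddress":"XX, XX"},"geo":
      {"@type":"GeoCoordinates","latitude":50.08872,"longitude":20.0297}}]}
</script>
"""


def find_geo(node):
    """Recursively search parsed JSON for the first GeoCoordinates object."""
    if isinstance(node, dict):
        if node.get("@type") == "GeoCoordinates":
            return node
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_geo(child)
        if found is not None:
            return found
    return None


soup = BeautifulSoup(html, "html.parser")
geo = None
for script in soup.find_all("script", type="application/ld+json"):
    raw = script.string  # the JSON-LD block as a plain string
    if not raw:
        continue
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        continue  # skip malformed blocks
    geo = find_geo(parsed)
    if geo is not None:
        break

print(geo["latitude"], geo["longitude"])
```

Here `raw` is also the answer to "get the block as a string": it is the untouched text of each `<script>` tag before parsing.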

<class 'dict'>
{'@context': 'http://schema.org', '@graph': [{'@type': 'Place', 'address': {'@type': 'PostalAddress', 'streetAddress': 'XX, XX'}, 'geo': {'@type': 'GeoCoordinates', 'latitude': 50.08872, 'longitude': 20.0297}}]}
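Once the block is parsed into a dict like the one printed above, the coordinates are reachable with ordinary key and index lookups. A short sketch, assuming the `Place` entry is the first item in `@graph` as it is in the question's snippet (the variable names are arbitrary):

```python
# The parsed JSON-LD block, copied from the printed output
match = {
    '@context': 'http://schema.org',
    '@graph': [{
        '@type': 'Place',
        'address': {'@type': 'PostalAddress', 'streetAddress': 'XX, XX'},
        'geo': {'@type': 'GeoCoordinates', 'latitude': 50.08872, 'longitude': 20.0297},
    }],
}

# Assumes the Place entry is first in @graph, as in the question
geo = match['@graph'][0]['geo']
latitude, longitude = geo['latitude'], geo['longitude']
print(latitude, longitude)  # 50.08872 20.0297
```

If the position of the `Place` entry inside `@graph` can vary, filtering the list on `'@type'` is safer than indexing with `[0]`.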