Python: parsing a table that has no <table>/<td>/<tr> tags, with the data nested in <div> tags - BeautifulSoup, Selenium and webdriver_manager

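I am trying to get the whole table at this URL: url = "https://www.topuniversities.com/university-rankings/university-subject-rankings/2021/psychology". The problem is that there is no <table> tag, and no <tr> or <td> tags either. All the data in the rows sits inside nested "div" tags. The code I am using is: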

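from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from webdriver_manager.firefox import GeckoDriverManager
import time

url = "https://www.topuniversities.com/university-rankings/university-subject-rankings/2021/psychology"

# Selenium 4 takes the driver path through a Service object
# (older releases accepted executable_path= directly).
driver = webdriver.Firefox(service=Service(GeckoDriverManager().install()))
driver.maximize_window()
driver.get(url)

time.sleep(5)  # crude wait for the JavaScript-rendered rows to load
content = driver.page_source.encode('utf-8').strip()
soup = BeautifulSoup(content, "html.parser")

driver.quit()
print(soup)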
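Also, I only get the data from one column (the one named "Overall Score") inside the nested <div> tags. I also realized that the soup output only contains the data for the first 10 rows, but I am trying to get the data for all 302 rows.

I would really appreciate any advice you can give me.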

EDIT: After KunduK's answer, I finally got what I expected. This is the final code I used:

import pandas as pd
import requests
from bs4 import BeautifulSoup

res = requests.get('https://www.topuniversities.com/sites/default/files/qs-rankings-data/en/3519089_indicators.txt?1614801117').json()

df = pd.DataFrame(res["data"])
df = df[["uni", "region", "location", "city", "overall",
         "ind_69", "ind_70", "ind_76", "ind_77"]]
headers = {"uni":"University", "overall": "Overall Score", "ind_69": "H-index Citations",
           "ind_70": "Citations per Paper", "ind_76": "Academic Reputation", "ind_77": "Employer Reputation"}
df.rename(columns=headers, inplace=True)
# Each renamed cell holds an HTML snippet; keep only the text of its first <div>.
for column in headers.values():
    df[column] = df[column].apply(lambda value: BeautifulSoup(value, 'html.parser').find('div').text)
df
The resulting DataFrame looks like this: [image of the DataFrame]
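
The cleaning loop in the edit relies on each cell being a small HTML snippet rather than plain text. As a rough illustration of that step, here is a standard-library sketch that strips the tags from one value. It swaps in `html.parser` for BeautifulSoup and returns all text in the fragment (not only the first <div>), and both `strip_tags` and the sample markup are assumptions made for the example, not the site's actual payload.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(fragment: str) -> str:
    """Return the visible text of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts).strip()


# Hypothetical cell value; the real markup in the feed may differ.
sample_cell = '<div class="td-wrap">98.6</div>'
print(strip_tags(sample_cell))
```

If the cells turn out to contain several nested elements, the BeautifulSoup call from the edit is the more selective choice, since it targets one specific <div>.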
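I have checked the URL you provided. The data (received from an XHR request) appears to be split up by pagination, which is why you only see 10 of the entries.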

There are two options for handling this:

  • Simulate clicking the next-page button
  • Read the full data in JSON format from the XHR URL
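
The first option boils down to a loop: read the rows on screen, click "next", and stop when there is no next page. Below is a minimal sketch of that control flow only. `collect_all_pages`, `read_rows` and `go_to_next_page` are hypothetical names, and the browser is replaced by an in-memory stand-in, so the Selenium-specific parts (locating the rows and the button) are left to the reader.

```python
from typing import Callable, List


def collect_all_pages(
    read_rows: Callable[[], List[str]],
    go_to_next_page: Callable[[], bool],
    max_pages: int = 100,
) -> List[str]:
    """Gather rows page by page until there is no next page.

    read_rows       -- returns the rows currently rendered (hypothetical helper)
    go_to_next_page -- clicks "next"; returns False when it cannot (hypothetical helper)
    max_pages       -- safety cap so a broken "next" button cannot loop forever
    """
    rows: List[str] = []
    for _ in range(max_pages):
        rows.extend(read_rows())
        if not go_to_next_page():
            break
    return rows


# Stand-in for a browser: three pages of fake rows instead of a live Selenium driver.
fake_pages = [["row 1", "row 2"], ["row 3", "row 4"], ["row 5"]]
state = {"page": 0}


def read_rows() -> List[str]:
    return fake_pages[state["page"]]


def go_to_next_page() -> bool:
    if state["page"] + 1 >= len(fake_pages):
        return False
    state["page"] += 1
    return True


all_rows = collect_all_pages(read_rows, go_to_next_page)
print(len(all_rows))  # 5
```

With a real driver, the two callables would wrap the element lookups and the click. The second option avoids this loop entirely, which is probably why both answers lean towards it.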

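  • You don't need Selenium. If you go to the Network tab, you will see the link below, which returns the data as JSON. You need to iterate over it and pick out the values.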

    Code:

    import requests

    # XHR endpoint found in the browser's Network tab; it returns the whole ranking as JSON.
    res = requests.get("https://www.topuniversities.com/sites/default/files/qs-rankings-data/en/3519089.txt?1615516693?v=1616064930668").json()

    print("Total records :{}".format(len(res['data'])))
    for item in res['data']:
        print(item['country'])
        print(item['city'])
        print(item['score'])
        print("============")

    Output:

    Total records :302
    United States
    Cambridge
    98.6
    ============
    United States
    Stanford
    96.4
    ============
    United Kingdom
    Oxford
    95.5
    ============
    United Kingdom
    Cambridge
    94.8
    ============
    United States
    Berkeley
    92.3
    ============
    United States
    Los Angeles
    91.4
    ============
    United States
    New Haven
    90.9
    ============
    United States
    Ann Arbor
    89.5
    ============
    United States
    Cambridge
    89.3
    ============
    United Kingdom
    London
    89.2
    ============
    United States
    Philadelphia
    89.2
    ============
    United States
    New York City
    89.1
    ============
    United States
    New York City
    88.4
    ============
    United States
    Chicago
    88.2
    ============
    Netherlands
    Amsterdam
    87.7
    ============
    Singapore
    Singapore
    87.2
    ============
    Canada
    Vancouver
    87.2
    ============
    United States
    Princeton
    87
    ============
    Canada
    Toronto
    86.1
    ============
    United Kingdom
    London
    85.7
    ============
    Australia
    Parkville
    85.7
    ============
    United States
    Evanston
    85.5
    ============
    Belgium
    Leuven
    85.2
    ============
    United Kingdom
    London
    85.1
    ============
    Australia
    Sydney
    85.1
    ============
    Australia
    Brisbane
    84.4
    ============
    Singapore
    Singapore
    84.3
    ============
    United States
    Durham
    83.6
    ============
    Canada
    Montreal
    83.5
    ============
    Australia
    Sydney
    83.4
    ============
    Netherlands
    Utrecht
    82.9
    ============
    United States
    Champaign
    82.7
    ============
    United Kingdom
    Edinburgh
    82.5
    ============
    United Kingdom
    Manchester
    81.7
    ============
    Hong Kong SAR
    Hong Kong
    81.7
    ============
    United States
    Austin
    81.6
    ============
    United States
    Pittsburgh
    81.5
    ============
    Australia
    Canberra
    81.3
    ============
    Netherlands
    Rotterdam
    81.2
    ============
    United States
    East Lansing
    81.1
    ============
    Germany
    Berlin
    81
    ============
    Australia
    Perth
    81
    ============
    Germany
    Berlin
    80.9
    ============
    Netherlands
    Groningen
    80.9
    ============
    United States
    Ithaca
    80.7
    ============
    Hong Kong SAR
    Hong Kong
    80.4
    ============
    United States
    Madison
    80.4
    ============
    United States
    Columbus
    80.3
    ============
    Switzerland
    Zürich
    80.3
    ============
    United States
    San Diego
    80.2
    ============
    Australia
    Melbourne
    80.1
    ============
    Netherlands
    Leiden
    79.8
    ============
    United States
    Seattle
    79.8
    ============
    Netherlands
    Tilburg
    79.6
    ============
    United States
    Minneapolis
    79.5
    ============
    China (Mainland)
    Beijing
    79.4
    ============
    New Zealand
    Auckland
    79.3
    ============
    Netherlands
    Maastricht
    79.1
    ============
    United States
    University Park
    79.1
    ============
    United States
    Chapel Hill
    79.1
    ============
    Belgium
    Louvain-la-Neuve
    78.9
    ============
    Netherlands
    Nijmegen
    78.5
    ============
    United Kingdom
    Coventry
    78.5
    ============
    United States
    Nashville
    78.5
    ============
    Netherlands
    Amsterdam
    78.5
    ============
    United States
    Baltimore
    78.4
    ============
    United Kingdom
    Exeter
    78.3
    ============
    United States
    College Park
    78.3
    ============
    United Kingdom
    Cardiff
    78.2
    ============
    Germany
    Munich
    78.2
    ============
    Chile
    Santiago
    78.1
    ============
    New Zealand
    Kelburn, Wellington
    78.1
    ============
    United States
    Providence
    78
    ============
    Australia
    Sydney
    77.8
    ============
    Belgium
    Ghent
    77.8
    ============
    United States
    Boston
    77.3
    ============
    United States
    Los Angeles
    77.3
    ============
    Japan
    Tokyo
    77.1
    ============
    United Kingdom
    Birmingham
    77.1
    ============
    United Kingdom
    Bristol
    77
    ============
    New Zealand
    Dunedin
    77
    ============
    China (Mainland)
    Beijing
    76.9
    ============
    Italy
    Rome
    76.9
    ============
    Italy
    Padua
    76.9
    ============
    United States
    Charlottesville
    76.9
    ============
    Sweden
    Stockholm
    76.8
    ============
    Spain
    Madrid
    76.8
    ============
    United Kingdom
    York
    76.8
    ============
    United States
    Phoenix
    76.6
    ============
    Denmark
    Aarhus
    76.5
    ============
    ... and so on.
    
    Network tab: [screenshot]


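    Can you give a table example? I checked the URL and got "this", and I can't find the link you mentioned. By the way, is there any way to extract this link from the site URL without inspecting it manually?
    @franco_ramos: As I mentioned, if you go to the Network tab > select XHR and press Ctrl+R, you will get the link.
    @franco_ramos: If this helps future readers, please upvote the answer. Thanks.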
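
On the table request raised in the comments: one way to turn the records into a table without extra dependencies is to write them out as CSV. The sketch below assumes a payload with a "data" list whose items carry the `country`, `city` and `score` keys used in the answer's loop. The three sample rows are copied from the printed output, the value types in the real feed are not known, and `records_to_csv` is a hypothetical helper name.

```python
import csv
import io

# Sample records shaped like the answer's loop expects (keys: country, city, score).
# The three rows are copied from the output printed above; the real feed has 302.
sample = {
    "data": [
        {"country": "United States", "city": "Cambridge", "score": "98.6"},
        {"country": "United States", "city": "Stanford", "score": "96.4"},
        {"country": "United Kingdom", "city": "Oxford", "score": "95.5"},
    ]
}


def records_to_csv(payload: dict, columns: list) -> str:
    """Render payload["data"] as CSV text with the given columns, in order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently skip keys that are not requested
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(payload["data"])
    return buffer.getvalue()


print(records_to_csv(sample, ["country", "city", "score"]))
```

Passing `res` from the answer's code in place of `sample` should work the same way as long as every record has the requested keys. The pandas version in the question's edit is the better fit when further analysis is planned.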