Python: Why is the HTML I get from BeautifulSoup different from the HTML I see when I inspect the element? (python, html, beautifulsoup)

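I am building a username scraper, and I can't work out why HTML "disappears" when I parse it. Take this site as an example:

See how there is a tbody with a bunch of table rows inside it? When I parse the page and print the result to the shell, the tbody is empty: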

   <div style="background: #333; box-shadow: 0 0 2px #000; padding: 10px;">
    <table class="lktable" id="leaderboard_table" width="100%">
     <thead>
      <tr>
       <th style="width: 80px;">
        Rank
       </th>
       <th style="width: 80px;">
        Change
       </th>
       <th style="width: 100px;">
        Tier
       </th>
       <th>
        Summoner
       </th>
       <th style="width: 150px;">
        Top Champions
       </th>
      </tr>
     </thead>
     <tbody>
     </tbody>
    </table>
   </div>
  </div>

Why does this happen, and how can I fix it?

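This site needs JavaScript to work. JavaScript populates the table by making a web request, probably to a back-end API. That means the raw HTML, before any JavaScript has run, contains an empty table.

If we visit the site with JavaScript disabled, we can actually see this empty table in the page: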


BeautifulSoup does not cause this JavaScript to execute. Instead, look at alternative, more advanced libraries that can drive a real browser.
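To make the point concrete, here is a minimal sketch of what any HTML parser sees in a page like this. It uses the standard library's html.parser in place of BeautifulSoup so that it runs without third-party packages; BeautifulSoup would be handed the same static markup. The sample page and the BodyRowCounter class are made up for illustration.

```python
from html.parser import HTMLParser

# Static markup as a server might send it: the tbody stays empty until a
# script fills it in the browser. This sample page is illustrative only.
RAW_HTML = """
<table id="leaderboard_table">
  <thead><tr><th>Rank</th><th>Summoner</th></tr></thead>
  <tbody></tbody>
</table>
<script>
  // A browser would run this and add a row; a parser never executes it.
  document.querySelector("tbody").innerHTML = "<tr><td>1</td><td>Sadastyczny</td></tr>";
</script>
"""


class BodyRowCounter(HTMLParser):
    """Count <tr> start tags that appear inside a <tbody>."""

    def __init__(self):
        super().__init__()
        self.in_tbody = False
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self.in_tbody = True
        elif tag == "tr" and self.in_tbody:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody":
            self.in_tbody = False


counter = BodyRowCounter()
counter.feed(RAW_HTML)
print(counter.rows)  # prints 0: the script text is treated as data, not run
```

The row only exists after a browser executes the script, which is why the element inspector and the parsed source disagree.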

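As you can see in Chrome DevTools, the site sends 2 XHR requests to fetch the data and then uses JavaScript to display it.

That is because BeautifulSoup is an HTML parser. It does not execute JavaScript. You should use a tool that simulates a real browser instead.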


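In this case, though, you are better off using the API that the site itself uses to fetch the data. You can easily see which URLs it fetches from by looking at the Network tab: reload the page, select XHR, and use that information to build your own requests with an HTTP client library.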
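A rough sketch of replaying such a request with only the standard library follows. The endpoint URL, the query parameters, and the helper names are hypothetical placeholders; substitute whatever the Network tab actually shows. Sending the X-Requested-With header is an assumption about what the server may expect, not something confirmed for this site.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_xhr_request(base_url, params):
    """Build a GET request that resembles a browser's XHR call."""
    url = f"{base_url}?{urlencode(params)}"
    headers = {
        # Assumption: some back ends check this header on ajax endpoints.
        "X-Requested-With": "XMLHttpRequest",
        "Accept": "application/json",
    }
    return Request(url, headers=headers)


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical endpoint and parameters, copied in spirit from the Network tab.
req = build_xhr_request(
    "https://example.com/api/leaderboard",
    {"region": "eune", "page": 1},
)
print(req.full_url)
# data = fetch_json(req)  # uncomment once the real URL is filled in
```

Keeping the request construction separate from the network call makes it easy to compare the built URL against the one the browser sends before fetching anything.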

You can get all of the data in JSON format. All you need to do is parse a value out of a script tag in the original page source and pass it to:
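A minimal sketch of the extraction step, assuming the page embeds the value as a quoted JavaScript variable assignment. The variable name leaderboard_key, its value, and the sample markup are hypothetical and not taken from the site; inspect the real page source to find the actual name.

```python
import re

# Hypothetical page source: the variable name and value are made up.
PAGE_SOURCE = """
<html><body>
<script type="text/javascript">
  var leaderboard_key = "abc123";
</script>
</body></html>
"""


def extract_js_string(source, variable):
    """Return the quoted string assigned to `variable`, or None if absent."""
    # Match: name = "value" or name = 'value', with matching quote characters.
    pattern = r"\b" + re.escape(variable) + r"""\s*=\s*(["'])(.*?)\1"""
    match = re.search(pattern, source)
    return match.group(2) if match else None


key = extract_js_string(PAGE_SOURCE, "leaderboard_key")
print(key)  # prints abc123
```

The extracted value can then be sent as a parameter of the data request.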

The response gives you JSON containing all of the players' information, such as:

{'data': [{'division': '1',
           'global_ranking': '12',
           'league_points': '1217',
           'lks': '2961',
           'losses': '31',
           'most_played_champions': [{'assists': '238',
                                      'champion_id': '236',
                                      'creep_score': '7227',
                                      'deaths': '131',
                                      'kills': '288',
                                      'losses': '5',
                                      'played': '39',
                                      'wins': '34'},
                                     {'assists': '209',
                                      'champion_id': '429',
                                      'creep_score': '5454',
                                      'deaths': '111',
                                      'kills': '204',
                                      'losses': '3',
                                      'played': '27',
                                      'wins': '24'},
                                     {'assists': '155',
                                      'champion_id': '81',
                                      'creep_score': '4800',
                                      'deaths': '103',
                                      'kills': '168',
                                      'losses': '8',
                                      'played': '26',
                                      'wins': '18'}],
           'name': 'Sadastyczny',
           'previous_ranking': '2',
           'profile_icon_id': 7,
           'ranking': '1',
           'region': 'eune',
           'summoner_id': '42893043',
           'tier': '6',
           'tier_name': 'CHALLENGER',
           'wins': '128'},
          {'division': '1',
           'global_ranking': '30',
           'league_points': '1128',
           'lks': '2956',
           'losses': '180',
           'most_played_champions': [{'assists': '928',
                                      'champion_id': '24',
                                      'creep_score': '37601',
                                      'deaths': '1426',
                                      'kills': '1874',
                                      'losses': '64',
                                      'played': '210',
                                      'wins': '146'},
                                     {'assists': '501',
                                      'champion_id': '67',
                                      'creep_score': '16836',
                                      'deaths': '584',
                                      'kills': '662',
                                      'losses': '37',
                                      'played': '90',
                                      'wins': '53'},
                                     {'assists': '124',
                                      'champion_id': '157',
                                      'creep_score': '5058',
                                      'deaths': '205',
                                      'kills': '141',
                                      'losses': '14',
                                      'played': '28',
                                      'wins': '14'}],
           'name': 'Richor',
           'previous_ranking': '1',
           'profile_icon_id': 577,
           'ranking': '2',
           'region': 'eune',
           'summoner_id': '40385818',
           'tier': '6',
           'tier_name': 'CHALLENGER',
           'wins': '254'},
          {'division': '1',
           'global_ranking': '49',
           'league_points': '1051',
           'lks': '2953',
           'losses': '47',
           'most_played_champions': [{'assists': '638',
                                      'champion_id': '117',
                                      'creep_score': '11927',
                                      'deaths': '99',
                                      'kills': '199',
                                      'losses': '7',
                                      'played': '66',
                                      'wins': '59'},
                                     {'assists': '345',
                                      'champion_id': '48',
                                      'creep_score': '8061',
                                      'deaths': '99',
                                      'kills': '192',
                                      'losses': '11',
                                      'played': '43',
                                      'wins': '32'},
                                     {'assists': '161',
                                      'champion_id': '114',
                                      'creep_score': '5584',
                                      'deaths': '64',
                                      'kills': '165',
                                      'losses': '11',
                                      'played': '31',
                                      'wins': '20'}],
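Since the goal is a username scraper, the sketch below pulls the names out of a response shaped like the one above. The dictionary is a trimmed copy of the first two entries shown, and the summarize helper is a hypothetical name. Note that almost every numeric field arrives as a string, so it has to be converted before any arithmetic.

```python
# Trimmed copy of the first two entries from the output above.
response = {
    "data": [
        {"name": "Sadastyczny", "ranking": "1", "wins": "128", "losses": "31",
         "league_points": "1217", "tier_name": "CHALLENGER"},
        {"name": "Richor", "ranking": "2", "wins": "254", "losses": "180",
         "league_points": "1128", "tier_name": "CHALLENGER"},
    ]
}


def summarize(player):
    """Convert the string fields to numbers and compute a win rate."""
    wins = int(player["wins"])
    losses = int(player["losses"])
    games = wins + losses
    return {
        "rank": int(player["ranking"]),
        "name": player["name"],
        "win_rate": round(wins / games, 3) if games else 0.0,
    }


players = [summarize(p) for p in response["data"]]
usernames = [p["name"] for p in players]

print(usernames)
for p in players:
    print(p["rank"], p["name"], p["win_rate"])
```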

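The table content appears to be generated using JavaScript. BeautifulSoup does not execute JavaScript, so the table is empty. Take a look at Selenium; you may find some useful information there.

You don't need Selenium. Just mimic the ajax request and you can get all of the data in JSON format.

@PadraicCunningham, I came to the same conclusion. But he accepted a different answer.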