Python loop for web scraping


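I'm trying to write a loop that displays all the values inside the li tag so that I can build a DataFrame. So far I have only managed to isolate the markup with new = soup.find("div", class_="PlayerList"). If I use a standard for loop, it shows only one value instead of all of them.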

The output I want to show is:

Messi

Shooting 9

Passing 9

Tackle 4

   <pre>
   import requests
   import pandas as pd
   import numpy as np

   from urllib.request import urlopen
   from bs4 import BeautifulSoup

   # Download the page and parse it
   main_url = 'https://examplelistpython.000webhostapp.com/messi.html'
   result = requests.get(main_url)

   soup = BeautifulSoup(result.text, 'html.parser')
   print(soup.prettify())

   # find() returns only the first matching div
   new = soup.find("div", class_="PlayerList")
   print(new)
   </pre>

<ul class="List">
 <li>
  <div class="PlayerList">
   <div class="HeaderList">
    <span class="player">Messi</span>
   </div>
   <div class="PlayerStat">
    <span class="stat">Shooting   <span class="allStatContainer statShooting" data-stat="Shooting">
      9
     </span>
    </span>
   </div>
   <div class="PlayerStat">
    <span class="stat">Passing   <span class="allStatContainer statPassing" data-stat="Passing">
      9
     </span>
    </span>
   </div>
   <div class="PlayerStat">
    <span class="stat">Tackle   <span class="allStatContainer statTackle" data-stat="Tackle">
      4
     </span>
    </span>
   </div>
  </div>
 </li>
</ul>



Result:

   Player Shooting Passing Tackle
0  Messi        9       9      4

Does that website work? It shows an error when using requests' get().

Yes, it should work. Which code are you using?

# Assumes `soup` and `pd` from the question's code.
# find_all() returns every match, so each list holds one entry per player.
player = [i.text.strip() for i in soup.find_all("span", class_="player")]
shooting = [i.text.strip() for i in soup.find_all("span", class_="allStatContainer statShooting")]
passing = [i.text.strip() for i in soup.find_all("span", class_="allStatContainer statPassing")]
tackle = [i.text.strip() for i in soup.find_all("span", class_="allStatContainer statTackle")]

# One column per list; the lists must all have the same length
df = pd.DataFrame({'Player': player, 'Shooting': shooting, 'Passing': passing, 'Tackle': tackle})
print(df)
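
Since the question asks for a loop, here is a minimal sketch of a row-per-player variant. It is only one possible approach, not the answer's method. It assumes every player sits in its own div with class PlayerList and that each stat span carries a data-stat attribute, as in the markup above. The helper name parse_players, the inline SAMPLE_HTML, and the second entry "Player B" with its values are made up for illustration so that the loop has more than one block to visit.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Inline copy of the markup from the question. "Player B" and its numbers
# are placeholders (not real data), added only to exercise the loop.
SAMPLE_HTML = """
<ul class="List">
 <li>
  <div class="PlayerList">
   <div class="HeaderList"><span class="player">Messi</span></div>
   <div class="PlayerStat"><span class="stat">Shooting
     <span class="allStatContainer statShooting" data-stat="Shooting"> 9 </span></span></div>
   <div class="PlayerStat"><span class="stat">Passing
     <span class="allStatContainer statPassing" data-stat="Passing"> 9 </span></span></div>
   <div class="PlayerStat"><span class="stat">Tackle
     <span class="allStatContainer statTackle" data-stat="Tackle"> 4 </span></span></div>
  </div>
 </li>
 <li>
  <div class="PlayerList">
   <div class="HeaderList"><span class="player">Player B</span></div>
   <div class="PlayerStat"><span class="stat">Shooting
     <span class="allStatContainer statShooting" data-stat="Shooting"> 1 </span></span></div>
   <div class="PlayerStat"><span class="stat">Passing
     <span class="allStatContainer statPassing" data-stat="Passing"> 2 </span></span></div>
   <div class="PlayerStat"><span class="stat">Tackle
     <span class="allStatContainer statTackle" data-stat="Tackle"> 3 </span></span></div>
  </div>
 </li>
</ul>
"""


def parse_players(html):
    """Hypothetical helper: return one dict per div.PlayerList block."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # find_all() yields every PlayerList div; find() would stop at the first
    for block in soup.find_all("div", class_="PlayerList"):
        name = block.find("span", class_="player").get_text(strip=True)
        row = {"Player": name}
        # Any span that has a data-stat attribute becomes a column
        for stat in block.find_all("span", attrs={"data-stat": True}):
            row[stat["data-stat"]] = int(stat.get_text(strip=True))
        rows.append(row)
    return rows


rows = parse_players(SAMPLE_HTML)
df = pd.DataFrame(rows)
print(df)
```

Looping over each block keeps a player's name and stats together in one row, so a missing stat on one player would not shift the values of the others. With the live page, `result.text` from the question's code could be passed to parse_players in place of SAMPLE_HTML.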