Selecting only the stats from this data scrape in Python 3
python, web-scraping, beautifulsoup, python-requests, python-3.5
I am trying to scrape data from this link:
My goal is to be able to import this data into Excel. This is what I have in my code so far:
import sys
import requests
from bs4 import BeautifulSoup
r = requests.get('http://www.hoopsstats.com/basketball/fantasy/nba/opponentstats/16/12/eff/1-1')
soup = BeautifulSoup(r.text, "html.parser")
stats = soup.find_all('table', 'statscontent')
print(stats)
Here is the beginning of what is returned:
[<table bgcolor="#EBE9E9" cellpadding="0" cellspacing="0" class="statscontent" height="20" width="100%">
<tr id="myid1/0" onmouseout="hide_table_effect('myid1/0')" onmouseover="show_table_effect('myid1/0')">
<td width="3%"><center>1</center></td>
<td align="left" width="9%"><a href="/basketball/fantasy/nba/boston-celtics/team/profile/16/2">Boston</a></td>
<td width="3%"><center>66</center></td>
<td width="4%"><center>48.0</center></td>
<td width="4%"><center>19.6</center></td>
<td width="4%"><center>5.2</center></td>
<td width="4%"><center>7.2</center></td>
<td width="4%"><center>1.8</center></td>
<td width="4%"><center>0.5</center></td>
<td width="4%"><center>4.3</center></td>
<td width="4%"><center>4.1</center></td>
<td width="4%"><center>4.3</center></td>
<td width="4%"><center>0.9</center></td>
<td width="8%"><center>6.8-16.2</center></td>
<td width="3%"><center>.423</center></td>
<td width="7%"><center>1.6-5.0</center></td>
<td width="3%"><center>.324</center></td>
<td width="8%"><center>4.3-5.3</center></td>
<td width="3%"><center>.818</center></td>
<td width="5%"><center>19.8</center></td>
<td width="4%"><center>-6.7</center></td>
</tr>
</table>, <table bgcolor="#F8F8F8" cellpadding="0" cellspacing="0" class="statscontent" height="20" width="100%">
<tr id="myid1/1" onmouseout="hide_table_effect('myid1/1')" onmouseover="show_table_effect('myid1/1')">
<td width="3%"><center>2</center></td>
<td align="left" width="9%"><a href="/basketball/fantasy/nba/san-antonio-spurs/team/profile/16/27">San Antonio</a></td>
<td width="3%"><center>66</center></td>
<td width="4%"><center>47.9</center></td>
<td width="4%"><center>19.6</center></td>
<td width="4%"><center>5.0</center></td>
<td width="4%"><center>8.7</center></td>
<td width="4%"><center>1.8</center></td>
<td width="4%"><center>0.3</center></td>
I need the numbers where the "x" is, i.e. the text inside each <center>x</center> cell.
Also, ideally the data would be formatted so that it can easily be used in a CSV file.
This seems to do the job:
import sys
import requests
from bs4 import BeautifulSoup
r = requests.get('http://www.hoopsstats.com/basketball/fantasy/nba/opponentstats/16/12/eff/1-1')
soup = BeautifulSoup(r.text, "html.parser")
stats = soup.find_all('table', 'statscontent')
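# Each team lives in its own <table class="statscontent">.
for table in soup.find_all('table', 'statscontent'):
    # The stat values are wrapped in <center> tags; the team name is the link text.
    stats = [stat.text for stat in table.find_all('center')]
    team = [team.text for team in table.find_all('a')]
    print(team, stats)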
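It may be redundant or have other flaws, but it gets me what I wanted.
Can you show some more of the code for what you've done with stats? It looks like you're getting the right HTML, so now you need to parse it…
Sadly, I'm pretty lost at this point; I only started using Python a few days ago. Any suggestions for tutorials on parsing data? So far nothing has been very useful, and what I find is usually quite vague.
Although it's more about presidents than basketball, you'll like this: …
I'll definitely check it out.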
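Since the goal was a CSV file that Excel can open, here is a minimal sketch of how the team and stats lists from the loop in the answer might be written out with the standard csv module. The function name write_stats_csv and the file name stats.csv are my own placeholders, not part of the original code, and the sample row is copied from the Boston table shown in the question so that the example runs without hitting the site.

```python
import csv
import io


def write_stats_csv(rows, out_file):
    """Write one CSV line per team: the team name followed by its stat cells.

    `rows` is an iterable of (team_names, stats) pairs, shaped like the two
    lists built inside the loop in the answer. (Hypothetical helper.)
    """
    writer = csv.writer(out_file)
    for team_names, stats in rows:
        # find_all('a') returns a list, so join it in case a table has several links.
        writer.writerow([" ".join(team_names)] + list(stats))


# Sample values copied from the first table in the question's output.
sample_rows = [
    (["Boston"], ["1", "66", "48.0", "19.6", "5.2", "7.2", "1.8", "0.5",
                  "4.3", "4.1", "4.3", "0.9", "6.8-16.2", ".423",
                  "1.6-5.0", ".324", "4.3-5.3", ".818", "19.8", "-6.7"]),
]

# In-memory stand-in for a real file; for a file on disk you would use
# open("stats.csv", "w", newline="") instead (file name is an assumption).
buffer = io.StringIO()
write_stats_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To use it with the live page, you would collect (team, stats) tuples in a list inside the loop instead of printing them, then pass that list to the function. The values stay as strings here, so a cell such as 6.8-16.2 is written exactly as it appears on the page.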