Selecting only the stats from this data scrape in Python 3

python web-scraping

I'm trying to scrape data from this link:

My goal is to be able to import this data into Excel.

This is as far as I've gotten in my code:

import sys
import requests
from bs4 import BeautifulSoup

r = requests.get('http://www.hoopsstats.com/basketball/fantasy/nba/opponentstats/16/12/eff/1-1')


soup = BeautifulSoup(r.text, "html.parser")

stats = soup.find_all('table', 'statscontent')


print(stats)

Here is the beginning of what it returns:

[<table bgcolor="#EBE9E9" cellpadding="0" cellspacing="0" class="statscontent" height="20" width="100%">
<tr id="myid1/0" onmouseout="hide_table_effect('myid1/0')" onmouseover="show_table_effect('myid1/0')">
<td width="3%"><center>1</center></td>
<td align="left" width="9%"><a href="/basketball/fantasy/nba/boston-celtics/team/profile/16/2">Boston</a></td>
<td width="3%"><center>66</center></td>
<td width="4%"><center>48.0</center></td>
<td width="4%"><center>19.6</center></td>
<td width="4%"><center>5.2</center></td>
<td width="4%"><center>7.2</center></td>
<td width="4%"><center>1.8</center></td>
<td width="4%"><center>0.5</center></td>
<td width="4%"><center>4.3</center></td>
<td width="4%"><center>4.1</center></td>
<td width="4%"><center>4.3</center></td>
<td width="4%"><center>0.9</center></td>
<td width="8%"><center>6.8-16.2</center></td>
<td width="3%"><center>.423</center></td>
<td width="7%"><center>1.6-5.0</center></td>
<td width="3%"><center>.324</center></td>
<td width="8%"><center>4.3-5.3</center></td>
<td width="3%"><center>.818</center></td>
<td width="5%"><center>19.8</center></td>
<td width="4%"><center>-6.7</center></td>
</tr>
</table>, <table bgcolor="#F8F8F8" cellpadding="0" cellspacing="0" class="statscontent" height="20" width="100%">
<tr id="myid1/1" onmouseout="hide_table_effect('myid1/1')" onmouseover="show_table_effect('myid1/1')">
<td width="3%"><center>2</center></td>
<td align="left" width="9%"><a href="/basketball/fantasy/nba/san-antonio-spurs/team/profile/16/27">San Antonio</a></td>
<td width="3%"><center>66</center></td>
<td width="4%"><center>47.9</center></td>
<td width="4%"><center>19.6</center></td>
<td width="4%"><center>5.0</center></td>
<td width="4%"><center>8.7</center></td>
<td width="4%"><center>1.8</center></td>
<td width="4%"><center>0.3</center></td>

[
1.
66
48
19.6
5.2
7.2
1.8
0.5
4.3
4.1
4.3
0.9
6.8-16.2
.423
1.6-5.0
.324
4.3-5.3
.818
19.8
-6.7
, 
2.
66
47.9
19.6
5
8.7
1.8
0.3

I need the numbers between the "x".

Also, it would be ideal to have the data formatted so that it can easily be used in a CSV file.

This seems to do the job:

    import requests
    from bs4 import BeautifulSoup

    r = requests.get('http://www.hoopsstats.com/basketball/fantasy/nba/opponentstats/16/12/eff/1-1')
    soup = BeautifulSoup(r.text, "html.parser")

    # Each team's row is its own <table class="statscontent">.
    for table in soup.find_all('table', 'statscontent'):
        # The numeric cells are wrapped in <center>; the team name is the <a> link text.
        stats = [stat.text for stat in table.find_all('center')]
        team = [team.text for team in table.find_all('a')]

        print(team, stats)

It may be redundant or flawed in other ways, but it gets me what I wanted.
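If it helps, here is a rough sketch of a variation that keeps the team name in its original column by walking the `<td>` cells instead of collecting `<center>` and `<a>` separately. It runs on a shortened excerpt of the markup from the question rather than the live page, so `sample_html` and the helper name `extract_rows` are only illustrative, and I'm assuming the real page follows the same structure throughout.

```python
from bs4 import BeautifulSoup

# Shortened excerpt of the markup shown in the question (most cells omitted).
sample_html = """
<table class="statscontent"><tr>
<td><center>1</center></td>
<td><a href="/basketball/fantasy/nba/boston-celtics/team/profile/16/2">Boston</a></td>
<td><center>66</center></td>
<td><center>48.0</center></td>
</tr></table>
<table class="statscontent"><tr>
<td><center>2</center></td>
<td><a href="/basketball/fantasy/nba/san-antonio-spurs/team/profile/16/27">San Antonio</a></td>
<td><center>66</center></td>
<td><center>47.9</center></td>
</tr></table>
"""


def extract_rows(html):
    """Return one flat list of cell texts per statscontent table."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for table in soup.find_all("table", class_="statscontent"):
        # get_text(strip=True) drops the <center>/<a> wrappers and whitespace.
        rows.append([td.get_text(strip=True) for td in table.find_all("td")])
    return rows


rows = extract_rows(sample_html)
for row in rows:
    print(row)
```

Each row comes out as a plain list of strings, which is a convenient shape to hand to a CSV writer later.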

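Can you show some more of the code you've written that uses stats? It looks like you're getting the right HTML, so now you need to parse it…

Sadly, I'm pretty lost on this; I only started using Python a few days ago. Any suggestions for tutorials on parsing data? So far I haven't found anything very useful, and what I do find is usually quite vague.

Although it's more about presidents than basketball, you might like this: …

I'll definitely check it out.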
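On the CSV/Excel goal from the question: a minimal sketch using the standard-library `csv` module, assuming the scraped values have already been collected into one list per team. The `rows` below are just the first four values of the first two rows from the output in the question, and `rows_to_csv` is a made-up helper name. No header row is written, since the column names aren't shown in the question.

```python
import csv
import io

# First four values of the first two rows from the question's output.
rows = [
    ["1", "Boston", "66", "48.0"],
    ["2", "San Antonio", "66", "47.9"],
]


def rows_to_csv(rows, fileobj):
    """Write each row as one comma-separated line to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerows(rows)


# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
rows_to_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To get an actual file that Excel can open, the same helper could be called with something like `open("stats.csv", "w", newline="")` in place of the buffer (the filename is only an example).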