Python web scraping: extracting the attribute value rather than the text value from every td in a table

Tags: python, pandas, web-scraping, beautifulsoup, python-requests

I am trying to extract some data from a table, but the content I actually want is stored in the cells' attributes rather than in their text.

xml example:

'''

'''

With my current code I get a good-looking dataframe containing the headers and all of the information that is visible when viewing the table. However, I want the table to contain "Out: Concussion" rather than "O". I have tried a number of things but cannot figure it out. Please let me know whether my current approach can work or whether I am going about it completely wrong.
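To make the distinction concrete, here is a minimal sketch of the difference between a cell's text and one of its attributes. The markup below is an invented stand-in (the original sample was not preserved here), and the data-tip attribute name is taken from the answer further down:

from bs4 import BeautifulSoup

# Hypothetical single-cell table standing in for the real markup.
html = '<table><tr><td class="center" data-tip="Out: Concussion">O</td></tr></table>'
cell = BeautifulSoup(html, 'html.parser').find('td')

print(cell.text)             # 'O'                -> the visible text
print(cell.get('data-tip'))  # 'Out: Concussion'  -> the attribute value that is wanted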

This should help you:

import pandas as pd
from bs4 import BeautifulSoup
import requests

url = 'https://www.pro-football-reference.com/teams/atl/2017_injuries.htm'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
table = soup.find('table', attrs={'class': 'sortable', 'id': 'team_injuries'})
table_rows = table.find_all('tr')

final_data = []
for tr in table_rows:
    cells = tr.find_all(['th', 'td'])
    # Prefer the data-tip attribute (e.g. "Out: Concussion"); fall back to the
    # visible text (e.g. "O") for cells that do not carry it.
    row = [cell['data-tip'] if cell.has_attr('data-tip') else cell.text for cell in cells]
    final_data.append(row)

# Transpose the body rows, attach the header row as the index, then transpose
# back so the headers end up as the column labels.
m = final_data[1:]
final_dataa = [[m[j][i] for j in range(len(m))] for i in range(len(m[0]))]
df = pd.DataFrame(final_dataa, final_data[0]).T

df.to_csv("D:\\injuries.csv", index=False)
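As a side note (not part of the original answer): the transpose-then-.T round trip above only serves to attach the header row as column labels, so, assuming every row has the same length, the frame can equivalently be built in one step, continuing from the final_data list above:

# Equivalent, more direct construction.
df = pd.DataFrame(final_data[1:], columns=final_data[0])
df.to_csv("D:\\injuries.csv", index=False)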
Screenshot of the csv file (I did a little formatting to make it look neat):


Because it doesn't give me the information I want; I'm looking to pull the attribute out, not the text. This is huge! I can't believe I missed such a small step. Thank you very much.

For reference, the asker's original code, which only pulls the text values:
import pandas as pd
from bs4 import BeautifulSoup
import requests

url = 'https://www.pro-football-reference.com/teams/atl/2017_injuries.htm'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
table = soup.find('table', attrs={'class': 'sortable', 'id': 'team_injuries'})
table_rows = table.find_all('tr')

final_data = []
for tr in table_rows:
    cells = tr.find_all(['th', 'td'])
    # .text only returns the visible cell text (e.g. "O"), not the data-tip attribute.
    row = [cell.text for cell in cells]
    final_data.append(row)
df = pd.DataFrame(final_data[1:], final_data[0])
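The "small step" mentioned in the comments amounts to a one-line change in the row-building comprehension, switching from the visible text to the data-tip attribute used in the answer:

# Before: only the visible text of each cell
row = [cell.text for cell in cells]
# After: prefer the data-tip attribute and fall back to the text
row = [cell['data-tip'] if cell.has_attr('data-tip') else cell.text for cell in cells]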