
Python: How do I loop through all <th> tags in my web-scraping script?


So far I only get ['1'] as the output when I print with my current code below. I want to get 1-54 from the Rk column of the Team Batting table on the site.

How can I modify colNum so that it prints 1-54 from the Rk column? I'm pointing at the colNum line because I think that's where the problem is, but I could be wrong.

import pandas as pd
import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.baseball-reference.com/teams/NYY/2019.shtml')
soup = BeautifulSoup(page.content, 'html.parser')  # parse as HTML page, this is the source code of the page
week = soup.find(class_='table_outer_container')

items = week.find("thead").get_text() # grabs table headers
th = week.find("th").get_text() # grabs Rk only.

tbody = week.find("tbody")
tr = tbody.find("tr")

thtwo = tr.find("th").get_text()
colNum = [thtwo for thtwo in thtwo]
print(colNum)

Your error is in the last lines you mentioned: thtwo is the string '1' (the first row's Rk value), so the list comprehension iterates over that string and yields its single character. If I understand correctly, you want a list of all the values in the Rk column. To get every row, you have to use the find_all() function. I adjusted your code a bit to take the text of the first field of each row:

import pandas as pd
import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.baseball-reference.com/teams/NYY/2019.shtml')
soup = BeautifulSoup(page.content, 'html.parser')  # parse as HTML page, this is the source code of the page
week = soup.find(class_='table_outer_container')

items = week.find("thead").get_text()
th = week.find("th").get_text()

tbody = week.find("tbody")
tr = tbody.find_all("tr")  # find_all() returns every row, not just the first one
colnum = [row.find("th").get_text() for row in tr]  # text of the first cell (the Rk value) in each row

print(colnum)
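The difference between find() and find_all() that causes the bug can be seen with a small self-contained sketch (inline HTML instead of the live page, so no network access is needed; the table content here is made up for illustration):

```python
from bs4 import BeautifulSoup

html = """
<table>
  <tbody>
    <tr><th>1</th><td>a</td></tr>
    <tr><th>2</th><td>b</td></tr>
    <tr><th>3</th><td>c</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
tbody = soup.find("tbody")

# find() returns only the FIRST matching tag...
first = tbody.find("tr").find("th").get_text()  # '1'

# ...while find_all() returns a list of EVERY matching tag,
# which is what you need to walk the whole Rk column.
ranks = [row.find("th").get_text() for row in tbody.find_all("tr")]
print(ranks)  # ['1', '2', '3']
```

The same pattern applied to the real page yields one entry per row of the Team Batting table instead of just the first row.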