Python: incorrect HTML extraction with BeautifulSoup

This is my code for a first extraction, targeting the top-left part of the page:

import qgrid
import webbrowser
import requests
from bs4 import BeautifulSoup

page = requests.get('http://www.meteo.gr/cf.cfm?city_id=14')  # send the request to fetch the HTML file
soup = BeautifulSoup(page.content, 'html.parser')  # create a BeautifulSoup object from the HTML

four_days = soup.find(id="prognoseis")  # pinpoint the (outer) section I want to focus on

# Select specific elements, using four_days as the base.
periods = [p.get_text() for p in four_days.select(".perhour-rowmargin .innerTableCell-fulltime")]


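# Create a DataFrame via pandas to print it table-like.
import pandas as pd
weather = pd.DataFrame({"period ": periods})
print(weather)

I went through a good tutorial and started to get the hang of it. In the four_days object I hold the part of the HTML contained in "prognoseis", which is where the information I want lives. Then, for the periods object, I select the elements that contain the information I need, and as the second argument I specify exactly the text I want to extract.

The code runs but gives me an empty result.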

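You are adding dashes between the class names, but no such dashes exist. The elements you are selecting have two classes, perhour and rowmargin, but you are selecting on a nonexistent class, perhour-rowmargin. The same applies to the td elements; they have the separate classes innerTableCell and fulltime.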
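To see the difference concretely, here is a minimal sketch. The HTML is hypothetical markup that only mimics the class layout described above (it is not copied from the real page), and the chained `.perhour.rowmargin` form is standard CSS syntax that I am adding for comparison:

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the structure described above;
# the real page's HTML is not reproduced here.
html = """
<table id="prognoseis">
  <tr class="perhour rowmargin">
    <td class="innerTableCell fulltime">09:00</td>
  </tr>
  <tr class="perhour rowmargin">
    <td class="innerTableCell fulltime">12:00</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
four_days = soup.find(id="prognoseis")

# Looks for one class literally named "perhour-rowmargin": nothing matches.
dashed = four_days.select(".perhour-rowmargin .innerTableCell-fulltime")

# Chained class selectors (no space) require both classes on the same element.
chained = four_days.select(".perhour.rowmargin .innerTableCell.fulltime")

# Selecting on a single class per element is enough.
single = four_days.select(".perhour .fulltime")

print(len(dashed), len(chained), len(single))  # prints: 0 2 2
```

A space inside a selector means "descendant of", so it cannot be used to join two classes of the same element either.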

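Just select on one class or the other; the following returns the desired cells:

four_days.select(".perhour .fulltime")

You may also want to remove the extra newlines around each cell's data; add strip=True to the get_text() call: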

[p.get_text(strip=True) for p in four_days.select(".perhour .fulltime")]
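
As a rough illustration of what strip=True changes, here is a small sketch. The markup is again hypothetical and simply puts newlines and spaces around each cell's text, the way the answer describes:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: each cell's text is padded with newlines and spaces.
html = """
<table id="prognoseis">
  <tr class="perhour rowmargin">
    <td class="innerTableCell fulltime">
      09:00
    </td>
  </tr>
  <tr class="perhour rowmargin">
    <td class="innerTableCell fulltime">
      12:00
    </td>
  </tr>
</table>
"""

four_days = BeautifulSoup(html, "html.parser").find(id="prognoseis")
cells = four_days.select(".perhour .fulltime")

# Without strip=True the surrounding whitespace is kept.
raw = [p.get_text() for p in cells]

# With strip=True each piece of text is stripped before being joined.
periods = [p.get_text(strip=True) for p in cells]

print(repr(raw[0]))
print(periods)  # prints: ['09:00', '12:00']
```

The cleaned periods list can then go into the DataFrame exactly as in the question.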

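The elements have two classes, perhour and rowmargin; there is no single class named perhour-rowmargin.

lol, I didn't know that. Thanks @MartijnPieters, that solved it. As martin said, this is the answer and I accept it. I just didn't know how comments work. Thanks, you're a lifesaver.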