
Python: How do I retrieve Yahoo search results?


I am trying to search Yahoo for a query using the following code:

import requests
from bs4 import BeautifulSoup
query = "deep"
yahoo = "https://search.yahoo.com/search?q=" + query + "&n=" + str(10)
raw_page = requests.get(yahoo)
soup = BeautifulSoup(raw_page.text)
for link in soup.find_all(attrs={"class": "ac-algo fz-l ac-21th lh-24"}):
    print(link.text, link.get('href'))

But this does not work: the result is empty. How can I get the first 10 search results?

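Here are the main problems with your code:

  • When using Beautiful Soup, you should always specify a parser (for example,
    BeautifulSoup(raw_page.text, "lxml")).

  • The class you are searching for is wrong: it is
    " ac-algo fz-l ac-21th lh-24"
    rather than
    "ac-algo fz-l ac-21th lh-24"
    (note the leading space).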

Putting it together, your code should look like this:

import requests
from bs4 import BeautifulSoup

query = "deep"
yahoo = "https://search.yahoo.com/search?q=" + query + "&n=" + str(10)
raw_page = requests.get(yahoo)

soup = BeautifulSoup(raw_page.text, "lxml")
for link in soup.find_all(attrs={"class": " ac-algo fz-l ac-21th lh-24"}):
    print(link.text, link.get('href'))

Hope this helps.
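
A side note on the URL: the code above builds the query string by plain concatenation, which only works as long as the query contains no spaces or special characters. Below is a minimal sketch of encoding it with the standard library instead. `build_search_url` is a hypothetical helper name, and `q` and `n` are simply the parameter names used in the question, not a documented Yahoo API.

```python
from urllib.parse import urlencode

def build_search_url(query, count=10):
    """Return a search URL with the query safely percent-encoded."""
    # "q" and "n" are the parameter names from the question's URL
    # (an assumption carried over from the question, not a documented API).
    params = urlencode({"q": query, "n": count})
    return "https://search.yahoo.com/search?" + params

print(build_search_url("deep learning & python"))
```

With requests you can get the same effect by passing the dictionary directly, as in `requests.get("https://search.yahoo.com/search", params={"q": query, "n": 10})`, and letting the library do the encoding.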

You can use a CSS selector to find all the links, which should be faster.

import requests
from bs4 import BeautifulSoup

query = "deep"
yahoo = "https://search.yahoo.com/search?q=" + query + "&n=" + str(10)
raw_page = requests.get(yahoo)

soup = BeautifulSoup(raw_page.text,'lxml')

for link in soup.select(".ac-algo.fz-l.ac-21th.lh-24"):
    print (link.text, link['href'])
Output:

('Deep | Definition of Deep by Merriam-Webster', 'https://www.merriam-webster.com/dictionary/deep')
('Connecticut Department of Energy & Environmental Protection', 'https://www.ct.gov/deep/site/default.asp')
('Deep | Define Deep at Dictionary.com', 'https://www.dictionary.com/browse/deep')
('Deep - definition of deep by The Free Dictionary', 'https://www.thefreedictionary.com/deep')
('Deep (2017) - IMDb', 'https://www.imdb.com/title/tt4105584/')
('Deep Synonyms, Deep Antonyms | Merriam-Webster Thesaurus', 'https://www.merriam-webster.com/thesaurus/deep')
('Deep Synonyms, Deep Antonyms | Thesaurus.com', 'https://www.thesaurus.com/browse/deep')
('DEEP: Fishing - Connecticut', 'https://www.ct.gov/deep/cwp/view.asp?q=322708')
('Deep Deep Deep - YouTube', 'https://www.youtube.com/watch?v=oZhwagxWzOc')
('deep - English-Spanish Dictionary - WordReference.com', 'https://www.wordreference.com/es/translation.asp?tranword=deep')

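Thanks for your answer. Actually, I tried it with the space at first, but it didn't work. When I removed the space, it worked!!! Is there any option in soup to filter the links so that ads or YouTube links are ignored?

@Hadigharemannezhad As far as I know, you can't. If I were you, I would do it inside the for loop (using an if statement) :)

Who downvoted this?

Even faster is soup.select(".ac-algo")

@QHarr: Yes, I completely agree with you.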
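
The if-statement approach suggested in the comments could look roughly like the sketch below, which skips links by domain. `BLOCKED_DOMAINS` and `is_blocked` are hypothetical names chosen for illustration, and the sample data is taken from the output shown above so the snippet runs without any network access.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; adjust it to whatever domains you want to skip.
BLOCKED_DOMAINS = {"youtube.com", "youtu.be"}

def is_blocked(href):
    """Return True if href points at a blocked domain or one of its subdomains."""
    host = (urlparse(href).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)

# Sample (title, href) pairs copied from the output above.
results = [
    ("Deep (2017) - IMDb", "https://www.imdb.com/title/tt4105584/"),
    ("Deep Deep Deep - YouTube", "https://www.youtube.com/watch?v=oZhwagxWzOc"),
]

# Keep only the results whose link is not on the blocklist.
kept = [(title, href) for title, href in results if not is_blocked(href)]
print(kept)
```

Inside the scraping loop the same check would be `if is_blocked(link.get("href", "")): continue`. This only covers filtering by domain; how ads are marked up on the results page is not something this sketch assumes or handles.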