Python BeautifulSoup only shows the last result
I am currently trying to learn web scraping with Python. I did everything exactly as in the tutorial, but the loop doesn't work: when I test it with print, it only shows the last entry.
import requests
from bs4 import BeautifulSoup as soup
import lxml
url = "https://www.moviepilot.de/dvd/dvds-neu"
agent = {"User-Agent":'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
page = requests.get(url, headers=agent)
page_soup = soup(page.content, "html.parser")
ergebnisse = page_soup.findAll("li", {"class":"movie"})
for container in ergebnisse:
    filmname = container.a["title"]
print(filmname)
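Each pass through the loop overwrites the value of filmname, so only the last value ends up being printed. You need to append each film name to an empty list from inside the loop. Try this: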
import requests
from bs4 import BeautifulSoup as soup
import lxml
url = "https://www.moviepilot.de/dvd/dvds-neu"
agent = {"User-Agent":'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
page = requests.get(url, headers=agent)
page_soup = soup(page.content, "html.parser")
ergebnisse = page_soup.findAll("li", {"class":"movie"})
films = []
for container in ergebnisse:
    filmname = container.a["title"]
    films.append(filmname)
print(films)
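To see the overwriting effect without hitting the network, here is a minimal sketch. The titles list is a hypothetical stand-in for the scraped values, and last_only and seen are names made up for this illustration:

```python
# Hypothetical stand-in for the titles scraped from the page.
titles = ["Film A", "Film B", "Film C"]

# Reading the loop variable after the loop: only the last assignment survives.
for filmname in titles:
    pass
last_only = filmname

# Doing the work inside the loop body sees every value.
seen = []
for filmname in titles:
    seen.append(filmname)
    print(filmname)
```

If you only want to print each name, moving print(filmname) into the loop body is enough; the list is useful when you need the values afterwards.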
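Currently, your for loop keeps reassigning filmname, so only the value from the last iteration is stored. However, you can use a list comprehension to store all the values you need: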
ergebnisse = page_soup.findAll("li", {"class":"movie"})
names = [container.a["title"] for container in ergebnisse]
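One caveat: container.a is None when a list item contains no link, and indexing ["title"] then fails. The sketch below adds a guard to the comprehension. SAMPLE_HTML is hypothetical markup standing in for the real page, whose actual structure may differ:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the third item deliberately has no link.
SAMPLE_HTML = """
<ul>
  <li class="movie"><a title="Film A" href="/a">A</a></li>
  <li class="movie"><a title="Film B" href="/b">B</a></li>
  <li class="movie"><span>no link here</span></li>
</ul>
"""

page_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
ergebnisse = page_soup.find_all("li", {"class": "movie"})

# Keep only containers that have an <a> tag carrying a title attribute.
names = [
    container.a["title"]
    for container in ergebnisse
    if container.a is not None and container.a.has_attr("title")
]
print(names)
```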
You can also do the same thing with a CSS selector:
import requests
from bs4 import BeautifulSoup as soup
URL = "https://www.moviepilot.de/dvd/dvds-neu"
page = requests.get(URL, headers={"User-Agent":"Mozilla/5.0"})
page_soup = soup(page.content, "html.parser")
container = '\n'.join([item["title"] for item in page_soup.select("li.movie a")])
print(container)
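Note that item["title"] raises a KeyError for any matched link that has no title attribute. A possible variation, sketched under the assumption that only titled links matter, is to put the attribute into the selector itself. The extract_titles helper and the sample markup are hypothetical, not part of the original answer:

```python
from bs4 import BeautifulSoup


def extract_titles(html):
    """Return the title attribute of every titled link inside li.movie elements."""
    page_soup = BeautifulSoup(html, "html.parser")
    # a[title] matches only links that actually carry a title attribute.
    return [item["title"] for item in page_soup.select("li.movie a[title]")]


# Hypothetical markup: one titled link, one untitled link, one unrelated list item.
sample = (
    '<ul>'
    '<li class="movie"><a title="Film A" href="/a">A</a><a href="/more">more</a></li>'
    '<li class="other"><a title="Skip me" href="/s">S</a></li>'
    '</ul>'
)
titles = extract_titles(sample)
print("\n".join(titles))
```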