Python: Can't get all the information when I print two categories together
I have written some Python code to fetch movie names along with some additional information about each movie. The script works fine if I print the two items separately, using print(movie) in the middle of the script and print(addinfo) at the bottom.

However, when I try to print both together at the bottom, I only get the movie names that have additional information. (The additional information is retrieved from a link attached to each movie name; the problem is that most movie names do not have such a link.)

For example, if there are 5 movie names and only 3 of them have the additional link, printing them together gives me those 3 movie names with their additional information, whereas I should get all 5 movie names. I want the names without additional information to be printed as well. How can I fix this? Thanks in advance. I think the website address and HTML are not relevant, because the code runs fine; still, I am pasting the full script for your consideration.

The script I have tried:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

URL = "https://in.bookmyshow.com/vizag/movies"

res = requests.get(URL)
soup = BeautifulSoup(res.text, 'lxml')
for item in soup.select(".card-container"):
    movie = item.select_one(".__movie-name").text.strip()
    print(movie)  # I do not wish to print it here. I expect to print both (movie and addinfo) together
    blink = item.select_one(".book-button a")
    if blink:
        req = requests.get(urljoin(URL, blink['href']))
        soup = BeautifulSoup(req.text, "lxml")
        addinfo = ' '.join([item.select_one(".__venue-name").text.strip() for item in soup.select(".listing-info")])
        print(movie, addinfo)  # If I print both together, I only get the movies that have additional information
Output:
Tholi Prema Gokul A/C DTS: Vizag
Howrah Bridge INOX: CMR Central, Maddilapalem INOX: Varun Beach, Beach Road INOX: Vizag Chitralaya Mall Satyam A/C Dts: Gopalapatnam V Max: Vizag
Chalo Ganesh A/C Dts: Tagarapuvalasa INOX: CMR Central, Maddilapalem INOX: Varun Beach, Beach Road INOX: Vizag Chitralaya Mall Mukta A2 Cinemas: Vizag Central, Vizag Mohini Mini: Gajuwaka Mohini 70mm Dolby Atmos: Gajuwaka Narasimha a/c Dts: Gopalapatnam Sri Lakshmi Narasimha Picture Palace: Vizag Sri Venkateshwara Screen 1: Vizag Sarat Theater - 4K Dolby Atmos: Vizag
Touch Chesi Chudu INOX: CMR Central, Maddilapalem INOX: Varun Beach, Beach Road INOX: Vizag Chitralaya Mall Mukta A2 Cinemas: Vizag Central, Vizag Raja Cine Max 2K A/c Dts: Kothavalasa Sharada 4K: Vizag Sri Rama Picture Palace: Vizag Tata Picture Palace A/c Dts: Tagarapuvalasa V Max: Vizag
Bhaagamathie INOX: CMR Central, Maddilapalem INOX: Varun Beach, Beach Road INOX: Vizag Chitralaya Mall Jagadamba 4k: Vizag Kinnera Cinema: Maddilapalem Mukta A2 Cinemas: Vizag Central, Vizag Sri Ramulamma Theatre, Thagarapuvalasa: Vizag Sri Lakshmi Narasimha Picture Palace: Vizag Shankara A/C Dts: Gopalapatnam Sri Jaya A/c Dts: Kothavalasa
Padmaavat
Gang Gokul A/C DTS: Vizag Sri Parameswari Picture Palace: Kancharapalem
Jai Simha Mourya Theatre: Gopalapatnam Sree Leela Mahal: Vizag Saptagiri Theatre: Chittivalasa
Maze Runner: The Death Cure INOX: Varun Beach, Beach Road INOX: Vizag Chitralaya Mall Ramadevi 4K: Vizag
Jumanji: Welcome To The Jungle INOX: Vizag Chitralaya Mall
Hey Jude INOX: Varun Beach, Beach Road
Green Apple
Sollividava
Tagaru
Savarakathi
KEE
Prema Baraha
Befaam
Shadow
Rosapoo
Aapla Manus
Kalakalappu 2
Kumari 21 F
Karu
Kirrak Party
Gayatri
Inttelligent
KEY
Downup The Exit 796
Pad Man
The Boy and The World
The 15:17 to Paris
Leera The Soulmates
Aiyaary
Kanam
If you use an else block, another approach might look like this:
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

URL = "https://in.bookmyshow.com/vizag/movies"

res = requests.get(URL)
soup = BeautifulSoup(res.text, 'lxml')
for item in soup.select(".card-container"):
    movie = item.select_one(".__movie-name").text.strip()
    blink = item.select_one(".book-button a")
    if blink:
        # The movie has a booking link: follow it and collect the venue names.
        req = requests.get(urljoin(URL, blink['href']))
        detail_soup = BeautifulSoup(req.text, "lxml")
        addinfo = ' '.join([venue.select_one(".__venue-name").text.strip() for venue in detail_soup.select(".listing-info")])
        print(movie, addinfo)
    else:
        # No booking link: print the movie name on its own.
        print(movie)
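A variation on the same idea, as a minimal sketch that runs without network access: reset addinfo to an empty string at the start of every iteration, so a single output statement covers both cases. The `movies` list, the link path, and the `fetch_venues` helper are hypothetical stand-ins for the scraped cards and the second request, not part of the original script.

```python
# Hypothetical stand-in data: (movie name, booking link or None).
movies = [
    ("Tholi Prema", "/buytickets/tholi-prema"),
    ("Padmaavat", None),
    ("Green Apple", None),
]


def fetch_venues(link):
    """Hypothetical stand-in for requesting the link and parsing venue names."""
    venues_by_link = {"/buytickets/tholi-prema": ["Gokul A/C DTS: Vizag"]}
    return venues_by_link.get(link, [])


def format_rows(entries):
    """Build one output row per movie, with or without additional info."""
    rows = []
    for movie, link in entries:
        addinfo = ""  # reset on every iteration so nothing carries over
        if link:
            addinfo = " ".join(fetch_venues(link))
        # strip() drops the trailing space when addinfo is empty
        rows.append(f"{movie} {addinfo}".strip())
    return rows


for row in format_rows(movies):
    print(row)
```

Compared with the else branch, this keeps the printing in one place, which may be easier to extend if the rows are later written to a file instead of printed.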
Thanks for your reply, @Ali. If you look at the output above, you can see that every movie after "Hey Jude" is filled in with the same information, "INOX: Varun Beach, Beach Road", whereas they should all be blank. To be clearer: the movie names after "Hey Jude" have no additional information. That is the problem I am trying to solve.

I just updated my answer; I hope this solves the problem for you.

No way. It did solve the problem. You definitely deserve an upvote. Thanks for the fix.

No problem @Topto :D
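The repeated "INOX: Varun Beach, Beach Road" described in the comments is what one would expect when a variable is assigned only inside the if branch but printed on every iteration: it keeps the value from the last movie that had a link. The sketch below reproduces that effect with hypothetical data; the `entries` list and the `stale_rows` function are illustrative assumptions, not the code from the earlier revision of the answer.

```python
# Hypothetical data: (movie name, additional info or None).
entries = [
    ("Hey Jude", "INOX: Varun Beach, Beach Road"),
    ("Green Apple", None),
    ("Tagaru", None),
]


def stale_rows(items):
    """Reproduce the carry-over: addinfo is only updated when info exists."""
    rows = []
    addinfo = ""
    for movie, info in items:
        if info:
            addinfo = info  # not reset for movies without info
        rows.append((movie, addinfo))
    return rows


for movie, addinfo in stale_rows(entries):
    print(movie, addinfo)
```

Here "Green Apple" and "Tagaru" are printed with the venue that belongs to "Hey Jude". Printing the bare movie name in an else branch, as in the updated answer, avoids reusing the old value.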