Can't arrange and print certain fields from a webpage in a customized manner

Tags: python, python-3.x, web-scraping, beautifulsoup


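I've created a script to parse the movie name, cast, "Produced by" and "Casting By" fields from a webpage. I can parse all four fields from that page. What I can't do is arrange and print them in a customized manner once all four are involved. The script I've written so far prints the items exactly the way I want, but only for the movie name and cast. I'd like to include the Produced by and Casting By sections as well, which you can see in the image.

I've tried so far with: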

import requests
from bs4 import BeautifulSoup

link = 'https://www.imdb.com/title/tt0068646/fullcredits?ref_=tt_cl_sm#cast'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
    r = s.get(link)
    soup = BeautifulSoup(r.text,"lxml")
    movie_name = soup.select_one("h3[itemprop='name'] > a").get_text(strip=True)
    for item in soup.select("h4#cast + table.cast_list tr:has(:not(.castlist_label))"):
        performer = item.select_one("td:not(.primary_photo) > a[href^='/name/']").get_text(strip=True)
        character = ' '.join(item.select_one("td.character").text.split())
        print(movie_name,performer,character)
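
The output I'm getting so far covers only the movie name and cast.

I'd like to append the following results to the bottom of that output (taken from the two sections, Produced by and Casting By, that you can see in the image):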

The Godfather Gray Frederickson associate producer
The Godfather Al Ruddy producer (as Albert S. Ruddy) (produced by)
The Godfather Robert Evans studio executive (uncredited)
The Godfather Louis DiGiaimo (casting)
The Godfather Andrea Eastman (casting)
The Godfather Fred Roos (casting)

How can I make the script print the fields in the way shown above?

Answer:

The question's script can be extended with two more loops, one for the "Produced by" table and one for the "Casting By" table:

import requests
from bs4 import BeautifulSoup

link = 'https://www.imdb.com/title/tt0068646/fullcredits?ref_=tt_cl_sm#cast'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
    r = s.get(link)
    soup = BeautifulSoup(r.text,"lxml")
    movie_name = soup.select_one("h3[itemprop='name'] > a").get_text(strip=True)
    # Cast table: one row per performer (unchanged from the question).
    for item in soup.select("h4#cast + table.cast_list tr:has(:not(.castlist_label))"):
        performer = item.select_one("td:not(.primary_photo) > a[href^='/name/']").get_text(strip=True)
        character = ' '.join(item.select_one("td.character").text.split())
        print(movie_name,performer,character)
    # "Produced by" table: the <table> directly after the matching <h4>.
    # Note: newer soupsieve releases deprecate :contains in favour of :-soup-contains.
    for row in soup.select('h4:contains("Produced by") + table tr'):
        name = row.select_one('.name').get_text(strip=True)
        credit = row.select_one('.credit').get_text(strip=True)
        print(movie_name, name, credit)
    # "Casting By" table, handled the same way.
    for row in soup.select('h4:contains("Casting By") + table tr'):
        name = row.select_one('.name').get_text(strip=True)
        credit = row.select_one('.credit').get_text(strip=True)
        print(movie_name, name, credit)

Prints:

...
Krstný Otec Matthew Vlahakis Clemenza's Son (uncredited)
Krstný Otec Conrad Yama Fruit Vendor (uncredited)
Krstný Otec Gray Frederickson associate producer
Krstný Otec Al Ruddy producer (as Albert S. Ruddy) (produced by)
Krstný Otec Robert Evans studio executive (uncredited)
Krstný Otec Louis DiGiaimo (casting)
Krstný Otec Andrea Eastman (casting)
Krstný Otec Fred Roos (casting)

Note: Krstný Otec means The Godfather in Slovak (I get the Slovak version of the HTML because of the country my IP address is in).

Comments:

robots.txt: Thanks for your answer, Andrej Kesely. It certainly helps, but the thing is that I want to print them all at once rather than with separate print calls in different loops. Thanks.

Andrej Kesely: @robots.txt Maybe I don't understand. How do you want to print them at the same time? In your question you said you want these results at the bottom of the print...

robots.txt: I've already sorted it out. Thanks for your help.
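
One way to read the request in the comments (a single print pass instead of one `print` call per loop) is to collect each section's rows first and emit them together. The sketch below is only an illustration, not the code the asker ended up with: `credit_lines` is a hypothetical helper, and the sample rows are copied from the expected output in the question rather than scraped.

```python
from itertools import chain


def credit_lines(movie_name, *sections):
    """Yield one printable line per (name, credit) pair, in section order."""
    for name, credit in chain.from_iterable(sections):
        # Join only non-empty parts so a missing credit leaves no trailing space.
        yield " ".join(part for part in (movie_name, name, credit) if part)


# Sample rows taken from the question's expected output (assumption: in the
# real script these lists would be filled from the soup.select(...) loops).
cast = [("Conrad Yama", "Fruit Vendor (uncredited)")]
produced_by = [
    ("Gray Frederickson", "associate producer"),
    ("Al Ruddy", "producer (as Albert S. Ruddy) (produced by)"),
]
casting_by = [("Louis DiGiaimo", "(casting)")]

lines = list(credit_lines("The Godfather", cast, produced_by, casting_by))
print("\n".join(lines))  # a single print call for every section
```

In the scraping script, each loop would append `(name, credit)` to its list instead of printing, and the final `print` would run once after all three tables have been read.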
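
The localized title mentioned in the note (Krstný Otec instead of The Godfather) comes from the server picking a language for the response. A common thing to try is sending an `Accept-Language` header alongside the `User-Agent`. Whether the site honours it is an assumption here, and `english_headers` is a hypothetical helper name, not part of `requests`.

```python
def english_headers(user_agent, language="en-US"):
    """Build request headers that ask the server for a given language."""
    primary = language.split("-")[0]  # "en-US" -> "en"
    return {
        "User-Agent": user_agent,
        # Prefer the exact locale, then fall back to the bare language.
        "Accept-Language": f"{language},{primary};q=0.9",
    }


headers = english_headers("Mozilla/5.0")
print(headers["Accept-Language"])
```

With the session from the script, the headers could be applied via `s.headers.update(headers)` before calling `s.get(link)`.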