Python scraping stops at the first line

python web-scraping

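I need to scrape a website to get some information, such as the movie titles and their links. My code runs without errors, but it stops at the first line of the site. Here is my code. Thanks in advance for your help, and I'm sorry if this isn't a smart question, but I'm a beginner.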

import requests

from bs4 import BeautifulSoup

URL= 'http://www.simplyscripts.com/genre/horror-scripts.html'

def scarica_pagina(URL):
    page = requests.get(URL)
    html = page.text
    soup = BeautifulSoup(html, 'lxml')
    films = soup.find_all("div",{"id": "movie_wide"})
    for film in films:
        link = film.find('p').find("a").attrs['href']
        title = film.find('p').find("a").text.strip('>')
        print (link)
        print(title)

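Try the following. I modified your script slightly so that it does the job and reads a little better. Let me know if you run into any other problems: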

import requests
from bs4 import BeautifulSoup

URL = 'http://www.simplyscripts.com/genre/horror-scripts.html'

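def scarica_pagina(link):
    page = requests.get(link)
    soup = BeautifulSoup(page.text, 'lxml')
    # The page has a single element with id "movie_wide";
    # each <p> inside it is one entry, so loop over the paragraphs.
    for film in soup.find(id="movie_wide").find_all("p"):
        anchor = film.find("a", href=True)
        if anchor is None:  # skip paragraphs that contain no link
            continue
        print(anchor['href'], anchor.text)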

if __name__ == '__main__':
    scarica_pagina(URL)
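
As a variation on the script above, the same lookup can be written with a CSS selector and made to return the results instead of printing them. This is only a sketch under some assumptions: `extract_links` and `SAMPLE_HTML` are hypothetical names, the sample markup is invented to mimic the page, `html.parser` is used instead of `lxml` so that no extra parser is required, and `urljoin` is added in case some links are relative. Note that the selector collects every link inside each paragraph, not just the first one.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return (absolute_url, title) pairs for links inside #movie_wide paragraphs."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # "#movie_wide p a[href]" matches every <a> with an href inside a <p>
    # that is itself inside the element with id "movie_wide".
    for anchor in soup.select("#movie_wide p a[href]"):
        url = urljoin(base_url, anchor["href"])  # resolve relative links
        title = anchor.get_text(strip=True)      # drop surrounding whitespace
        results.append((url, title))
    return results


# Hypothetical markup that imitates the structure of the real page.
SAMPLE_HTML = """
<div id="movie_wide">
  <p><a href="/scripts/alien.html">Alien</a> some notes</p>
  <p>No link in this paragraph</p>
  <p><a href="http://example.com/jaws.pdf"> Jaws </a></p>
</div>
"""

BASE_URL = "http://www.simplyscripts.com/genre/horror-scripts.html"

if __name__ == "__main__":
    for url, title in extract_links(SAMPLE_HTML, BASE_URL):
        print(url, title)
```

To run it against the live site, pass `requests.get(URL).text` as `html` and `URL` as `base_url`.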

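Please tell us which website this is; everything looks fine here. It sounds like the problem is specific to the site. You can add the link by editing your question.

I did, thanks!

There is only one div#movie_wide, and it holds multiple <p>s, each containing <a>s.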
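
The point made in the last comment can be reproduced without touching the network. The sketch below uses invented markup (`SAMPLE_HTML` is hypothetical, and `html.parser` stands in for `lxml`) with one `div#movie_wide` and three paragraphs. Looping over the divs and taking the first paragraph of each, as the question does, yields a single link; looping over the paragraphs of that one div yields all of them.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one div with id "movie_wide" holding several <p><a> entries.
SAMPLE_HTML = """
<div id="movie_wide">
  <p><a href="first.html">First</a></p>
  <p><a href="second.html">Second</a></p>
  <p><a href="third.html">Third</a></p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The question's approach: iterate over the matching divs and read only
# the first <p> of each. There is just one such div, so one link comes back.
divs = soup.find_all("div", {"id": "movie_wide"})
per_div = [div.find("p").find("a")["href"] for div in divs]

# The answer's approach: take the single div, then iterate over all its <p>s.
container = soup.find(id="movie_wide")
per_paragraph = [p.find("a")["href"] for p in container.find_all("p")]

print(len(divs))      # 1
print(per_div)        # ['first.html']
print(per_paragraph)  # ['first.html', 'second.html', 'third.html']
```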