Python: scraping multiple URLs from a text file with BeautifulSoup

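My script.py reads urls.txt, which holds a list of URLs, one per line. I am trying to scrape all of the URLs in one run and extract a specific div. This div appears multiple times on each URL.

Here is my script: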

import requests
from bs4 import BeautifulSoup
from urllib import urlopen

with open('urls.txt') as inf:
    urls = (line.strip() for line in inf)
    for url in urls:
        site = urlopen(url)   
        soup = BeautifulSoup(site, "lxml")
        for item in soup.find_all("div", {"class": "vm-product-descr-container-1"}):
            print item.text

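The script only returns the content of the last URL in the list, instead of returning content from every URL in urls.txt.

My script does not raise any errors, so I am not sure where I went wrong.

Thanks for any input.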

This looks like a small indentation error. Take a look at this block:

for url in urls:
    site = urlopen(url)   
    soup = BeautifulSoup(site, "lxml")
    for item in soup.find_all("div", {"class": "vm-product-descr-container-1"}):
    print item.text

Change it to this:

for url in urls:
    site = urlopen(url)   
    soup = BeautifulSoup(site, "lxml")
    for item in soup.find_all("div", {"class": "vm-product-descr-container-1"}):
        print item.text

This way, the print statement runs on every iteration of the inner for loop.
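For readers on Python 3, the corrected loop can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original poster's code: it assumes Python 3 (urllib.request.urlopen, print as a function), swaps "lxml" for the built-in "html.parser" to avoid an extra dependency, and the helper names extract_descriptions and scrape_all are hypothetical.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Class name taken from the question.
TARGET_CLASS = "vm-product-descr-container-1"


def extract_descriptions(html, css_class=TARGET_CLASS):
    """Return the text of every <div> carrying css_class (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text() for div in soup.find_all("div", class_=css_class)]


def scrape_all(path="urls.txt"):
    """Print every matching div for every URL in the file (one URL per line)."""
    with open(path) as inf:
        for line in inf:
            url = line.strip()
            if not url:  # skip blank lines
                continue
            with urlopen(url) as site:
                html = site.read()
            for text in extract_descriptions(html):
                # Inside the inner loop: runs once per matching div.
                print(text)
```

Calling scrape_all() with urls.txt in the working directory should then print the divs for every URL, not just the last one. Splitting the parsing into its own function also makes it possible to check the extraction on a saved HTML string without touching the network.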

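@danidee, you may have just edited the error out of his code ^^

@RobBenz, how does the answer you accepted actually answer your question? It would mean you had a syntax error while saying you were getting the last line.

No, your answer would raise an indentation error.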
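The disagreement in the comments comes down to where the print sits. The sketch below needs no network access and uses made-up data in place of the scraped pages. It suggests that a print placed after both loops reproduces the reported symptom (only the last item, no error), whereas a print at the same level as the inner for header is rejected with an IndentationError before anything runs. The pages dictionary and all function names here are hypothetical.

```python
# Made-up stand-in for the scraped pages: URL -> texts of the matching divs.
pages = {
    "http://example.com/a": ["a1", "a2"],
    "http://example.com/b": ["b1", "b2"],
}


def collect_inside_inner_loop(pages):
    """Record an item on every iteration of the inner loop."""
    seen = []
    for url in pages:
        for item in pages[url]:
            seen.append(item)  # stands in for print(item)
    return seen


def collect_after_loops(pages):
    """Record only once, after both loops have finished."""
    seen = []
    for url in pages:
        for item in pages[url]:
            pass
    seen.append(item)  # item still holds the last value assigned
    return seen


def is_indentation_error(source):
    """Return True if compiling source raises IndentationError."""
    try:
        compile(source, "<demo>", "exec")
    except IndentationError:
        return True
    return False


# The "before" block from the answer, reduced to its shape.
broken = "for x in [1]:\n    for y in [2]:\n    print(y)\n"

print(collect_inside_inner_loop(pages))  # ['a1', 'a2', 'b1', 'b2']
print(collect_after_loops(pages))        # ['b2']
print(is_indentation_error(broken))      # True
```

So a script that runs silently and shows only the last result most likely had its print dedented out of the loops, not aligned with the inner for as in the answer's "before" block.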