Python: Unable to get a script to parse the rest of the results that appear after a certain text

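I'm trying to create a script in Python that scrapes the titles and links of different posts from a webpage when a certain condition is met. I want the script to print the rest of the results that come after a certain text, in this case "Alternative to Chromedriver". However, my current attempt prints only that text, "Alternative to Chromedriver".

How can I get the script to parse the rest of the results that appear after a certain text?

It happens because of your if statement: if the title is not "Alternative to Chromedriver", the loop continues to the next title, so the script prints only when title == 'Alternative to Chromedriver'. To print all titles, change the code as follows: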

for item in soup.select(".summary .question-hyperlink"):
    title = item.get_text(strip=True)
    link = item.get("href")
    if check_title == title:
        print(f'Found it: {title}, {link}')
    else:
        print(title,link)

#Web scrapting of image with Python /questions/61035199/web-scrapting-of-image-with-python
#Imported functions not working in puppeteer /questions/61035043/imported-functions-not-working-in-puppeteer
#Trying to build a webscraper following a tutorial and keep getting attribute error for findall /questions/61034690/trying-to-build-a-webscraper-following-a-tutorial-and-keep-getting-attribute-err
#Python Selenium Web Scraping Hidden Div /questions/61034439/python-selenium-web-scraping-hidden-div
#Found it: Alternative to Chromedriver, /questions/61034224/alternative-to-chromedriver
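Looking at the output above, the last title equals "Alternative to Chromedriver", so it is printed with "Found it"; every other item is printed as just its title and link.

Try: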
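
The asker's original loop is not shown here, so the following is only a sketch of the pattern the answer describes, run on in-memory data instead of a live page. The `posts` list, its last entry, and the `only_match` helper are assumptions made up for illustration.

```python
# Hypothetical stand-in for the scraped (title, link) pairs; the last entry is made up.
posts = [
    ("Python Selenium Web Scraping Hidden Div",
     "/questions/61034439/python-selenium-web-scraping-hidden-div"),
    ("Alternative to Chromedriver",
     "/questions/61034224/alternative-to-chromedriver"),
    ("A later post", "/questions/0/a-later-post"),
]
check_title = "Alternative to Chromedriver"


def only_match(posts, check_title):
    """Assumed shape of the faulty loop: skip every title that is not the target."""
    kept = []
    for title, link in posts:
        if title != check_title:
            # This skips non-matching titles both before AND after the match,
            # so only the matching post is ever kept.
            continue
        kept.append((title, link))
    return kept


print(only_match(posts, check_title))
```

Because the `continue` fires for every non-matching title, the posts that follow the match are dropped along with the ones before it.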

import requests
from bs4 import BeautifulSoup

URL = "https://stackoverflow.com/questions/tagged/web-scraping?tab=Newest"
check_title = "Alternative to Chromedriver"

res = requests.get(URL)
soup = BeautifulSoup(res.text,'html.parser')

# Initialise a flag to track where to start printing from 
start_printing = False

for item in soup.select(".summary .question-hyperlink"):
    title = item.get_text(strip=True)

    # Skip items until the target title is found; the flag is switched on only once,
    # and `continue` keeps the matching title itself out of the output
    if not start_printing and check_title == title:
        start_printing = True
        continue
    if start_printing:
        link = item.get("href")
        print(title,link)
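
As an alternative to the flag, the same "everything after the match" selection can be sketched with `itertools.dropwhile`. This is not taken from the answer; it runs on a made-up list of (title, link) pairs instead of a live page, and the `after_title` helper and the `sample` data are hypothetical names used only for illustration.

```python
from itertools import dropwhile, islice


def after_title(posts, check_title):
    """Return the (title, link) pairs that come after the first matching title."""
    # dropwhile discards items until the first title that equals check_title
    remaining = dropwhile(lambda post: post[0] != check_title, posts)
    # islice(..., 1, None) then skips the matching item itself
    return list(islice(remaining, 1, None))


# Hypothetical data; with the scraper above, the pairs would come from soup.select(...)
sample = [
    ("First post", "/questions/1/first-post"),
    ("Alternative to Chromedriver", "/questions/61034224/alternative-to-chromedriver"),
    ("Second post", "/questions/2/second-post"),
    ("Third post", "/questions/3/third-post"),
]

for title, link in after_title(sample, "Alternative to Chromedriver"):
    print(title, link)
```

If the title never appears, `after_title` returns an empty list, which matches the behaviour of the flag-based loop.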