Python: Can't get the script to parse the rest of the results that appear after a certain text

python python-3.x web-scraping

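I'm trying to write a script in Python that scrapes the titles and links of different posts from a web page when a certain condition is met. I want the script to print the rest of the results that come after a particular text, in this case "Alternative to Chromedriver". However, my current attempt prints only that one text, "Alternative to Chromedriver".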

How can I make the script parse the rest of the results that appear after a particular text?

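That happens because of your if statement: whenever the title is not "Alternative to Chromedriver", the loop continues to the next title, so a line is printed only when title == "Alternative to Chromedriver". To print all the titles, change the code as follows: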

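# Assumes soup and check_title are already defined as in the question
for item in soup.select(".summary .question-hyperlink"):
    title = item.get_text(strip=True)
    link = item.get("href")
    if check_title == title:
        # The matching post gets a distinct message
        print(f'Found it: {title}, {link}')
    else:
        # Every other post is still printed
        print(title, link)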

#Web scrapting of image with Python /questions/61035199/web-scrapting-of-image-with-python
#Imported functions not working in puppeteer /questions/61035043/imported-functions-not-working-in-puppeteer
#Trying to build a webscraper following a tutorial and keep getting attribute error for findall /questions/61034690/trying-to-build-a-webscraper-following-a-tutorial-and-keep-getting-attribute-err
#Python Selenium Web Scraping Hidden Div /questions/61034439/python-selenium-web-scraping-hidden-div
#Found it: Alternative to Chromedriver, /questions/61034224/alternative-to-chromedriver

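Looking at the output above, the last title is equal to "Alternative to Chromedriver", so that line prints "Found it"; every other line prints just the title and link.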
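To make the difference concrete without hitting the network, here is a minimal sketch on hard-coded data. The (title, link) pairs are copied from the output above. The question's original loop isn't shown here, so `only_matching` is only an assumed reconstruction based on the description (skip every title that doesn't match); both function names are made up for illustration.

```python
# Sample (title, link) pairs taken from the output above; no request is made.
posts = [
    ("Web scrapting of image with Python",
     "/questions/61035199/web-scrapting-of-image-with-python"),
    ("Python Selenium Web Scraping Hidden Div",
     "/questions/61034439/python-selenium-web-scraping-hidden-div"),
    ("Alternative to Chromedriver",
     "/questions/61034224/alternative-to-chromedriver"),
]
check_title = "Alternative to Chromedriver"


def only_matching(posts, check_title):
    """Assumed shape of the original loop: skip every non-matching title."""
    found = []
    for title, link in posts:
        if title != check_title:
            continue  # everything except the match is skipped
        found.append((title, link))
    return found


def all_posts_labelled(posts, check_title):
    """Shape of the if/else loop above: keep every post, flag the match."""
    lines = []
    for title, link in posts:
        if title == check_title:
            lines.append(f"Found it: {title}, {link}")
        else:
            lines.append(f"{title} {link}")
    return lines


print(only_matching(posts, check_title))
print("\n".join(all_posts_labelled(posts, check_title)))
```

With the `continue` version only the matching pair survives, while the if/else version keeps all three lines and marks the match.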

Try:

import requests
from bs4 import BeautifulSoup

URL = "https://stackoverflow.com/questions/tagged/web-scraping?tab=Newest"
check_title = "Alternative to Chromedriver"

res = requests.get(URL)
soup = BeautifulSoup(res.text,'html.parser')

# Flag that records whether the target title has been seen yet
start_printing = False

for item in soup.select(".summary .question-hyperlink"):
    title = item.get_text(strip=True)

    # Skip titles until the required text is found; flip the flag only once
    if not start_printing and check_title == title:
        start_printing = True
        continue  # do not print the matching title itself
    # From here on, print every post that comes after the match
    if start_printing:
        link = item.get("href")
        print(title, link)
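If you'd rather not manage the flag by hand, the same "everything after the match" logic can be sketched with `itertools.dropwhile`. This is only an alternative under some assumptions: the titles and links have already been collected into (title, link) tuples, `posts_after` is a made-up helper name, and the sample data below is placeholder data rather than real scraped results.

```python
from itertools import dropwhile, islice


def posts_after(posts, check_title):
    """Return the (title, link) pairs that follow the first exact title match."""
    # dropwhile discards pairs until a title matches; the match comes out first
    from_match = dropwhile(lambda post: post[0] != check_title, posts)
    # Skip the matching pair itself, like the `continue` in the loop above
    return list(islice(from_match, 1, None))


# Placeholder data for illustration only (not real scraped results)
sample = [
    ("First post", "/questions/1/first-post"),
    ("Alternative to Chromedriver", "/questions/2/alternative-to-chromedriver"),
    ("Third post", "/questions/3/third-post"),
    ("Fourth post", "/questions/4/fourth-post"),
]

for title, link in posts_after(sample, "Alternative to Chromedriver"):
    print(title, link)
```

When the title never appears, `posts_after` returns an empty list, which matches the flag-based loop printing nothing.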
