Selenium: displaying the text between two h2 tags with BeautifulSoup


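I am trying to learn scraping with selenium, parsing the page source with BS4's "html.parser". I can find all the tags matching the h2 tag and class name, but extracting the text inside them does not seem to work.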

import os
import re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup as soup

opts = webdriver.ChromeOptions()
opts.binary_location = os.environ.get('GOOGLE_CHROME_BIN', None)
opts.add_argument("--headless")
opts.add_argument("--disable-dev-shm-usage")
opts.add_argument("--no-sandbox")
browser = webdriver.Chrome(executable_path="chromedriver", options=opts)

url1='https://www.animechrono.com/date-a-live-series-watch-order'
browser.get(url1)
req = browser.page_source
sou = soup(req, "html.parser")
h = sou.find_all('h2', class_='heading-5')
p = sou.find_all('div', class_='text-block-5')
for i in range(len(h)):
    h[i] == h[i].getText()
for j in range(len(p)):
    p[j] = p[j].getText()
print(h)
print(p)
browser.quit()
My output:

[<h2 class="heading-5">Season 1</h2>, <h2 class="heading-5">Date to Date OVA</h2>, <h2 class="heading-5">Season 2</h2>, <h2 class="heading-5">Kurumi Star Festival OVA</h2>, <h2 class="heading-5">Date A Live Movie: Mayuri Judgement</h2>, <h2 class="heading-5">Season 3</h2>, <h2 class="heading-5">Date A Bullet: Dead or Bullet Movie</h2>, <h2 class="heading-5">Date A Bullet: Nightmare or Queen Movie</h2>]
['Episodes 1-12', 'Date to Date OVA', 'Episodes 1-10', 'Kurumi Star Festival OVA', 'Date A Live Movie: Mayuri Judgement', 'Episodes 1-12', 'Date A Bullet: Dead or Bullet Movie', 'Date A Bullet: Nightmare or Queen Movie']

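The loop over h in the question uses ==, which compares instead of assigning, so h is never modified. Add this line before browser.quit():

h = [elem.text for elem in h]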
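A minimal sketch of why a comparison loop leaves the list unchanged while an assignment loop or a list comprehension does not. `FakeTag` is a hypothetical stand-in for a BeautifulSoup tag (it only models `getText()`), used here so the example runs without a browser or a parser; it is not part of bs4.

```python
class FakeTag:
    """Hypothetical stand-in for a bs4 Tag; only getText() is modelled."""

    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


tags = [FakeTag("Season 1"), FakeTag("Season 2")]

# Comparison: the expression is evaluated and its result discarded,
# so the list still holds tag objects afterwards.
compared = list(tags)
for i in range(len(compared)):
    compared[i] == compared[i].getText()

# Assignment: each element is replaced by its text.
assigned = list(tags)
for i in range(len(assigned)):
    assigned[i] = assigned[i].getText()

# Same result as the assignment loop, written as a list comprehension.
comprehension = [tag.getText() for tag in tags]

print(isinstance(compared[0], FakeTag))
print(assigned)
print(comprehension)
```

The list comprehension is usually the easier form to read, since it builds a new list of strings rather than overwriting the tag list element by element.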

Full code:

import os
import re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup as soup

opts = webdriver.ChromeOptions()
opts.binary_location = os.environ.get('GOOGLE_CHROME_BIN', None)
opts.add_argument("--headless")
opts.add_argument("--disable-dev-shm-usage")
opts.add_argument("--no-sandbox")
browser = webdriver.Chrome(executable_path="chromedriver", options=opts)

url1='https://www.animechrono.com/date-a-live-series-watch-order'
browser.get(url1)
req = browser.page_source
sou = soup(req, "html.parser")
h = sou.find_all('h2', class_='heading-5')
p = sou.find_all('div', class_='text-block-5')
for j in range(len(p)):
    p[j] = p[j].getText()
h = [elem.text for elem in h]
print(h)
browser.quit()
Output:

['Season 1', 'Date to Date OVA', 'Season 2', 'Kurumi Star Festival OVA', 'Date A Live Movie: Mayuri Judgement', 'Season 3', 'Date A Bullet: Dead or Bullet Movie', 'Date A Bullet: Nightmare or Queen Movie']
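The code above also converts p to text but only prints h. If the goal is to show each heading next to its text block, something along these lines might work. This is only a sketch: the HTML string is made up for illustration (the real page's markup may differ), it reuses the class names from the question, and it assumes the two lists line up one-to-one.

```python
from bs4 import BeautifulSoup

# Assumed markup for illustration only; the real page may be structured differently.
html = """
<div><h2 class="heading-5">Season 1</h2><div class="text-block-5">Episodes 1-12</div></div>
<div><h2 class="heading-5">Season 2</h2><div class="text-block-5">Episodes 1-10</div></div>
"""

page = BeautifulSoup(html, "html.parser")

# Extract plain text from every matching tag.
headings = [tag.get_text(strip=True) for tag in page.find_all("h2", class_="heading-5")]
details = [tag.get_text(strip=True) for tag in page.find_all("div", class_="text-block-5")]

# Pair each heading with the text block at the same position.
pairs = list(zip(headings, details))
for title, detail in pairs:
    print(f"{title}: {detail}")
```

Note that zip stops at the shorter list, so a heading without a matching text block would be dropped silently; comparing len(headings) and len(details) first is a cheap sanity check.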


Hey! Could you accept my answer as the best answer? Thanks.