I'm trying to create a crawler for my WordPress site using Python 3
I am trying to copy and print the article titles and their associated href links.
import requests
from bs4 import BeautifulSoup


def page(current_page):
    current = "h2"
    while current == current_page:
        url = 'https://vishrantkhanna.com/?s=' + str(current)
        source_code = requests.get(url)
        plain_text = source_code.txt
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('h2', {'class': 'entry-title'}):
            href = "https://vishrantkhanna.com/" + link.get('href')
            title = link.string
            print(href)
            print(title)


page("h2")
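What are the expected and actual results of the code? What error or problem are you facing? Any logs or sample output would be very helpful to people.
If you are using an older version of requests, check this.
You need to extract the <a> tag from the heading: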
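import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

URL = 'https://vishrantkhanna.com/?s=1'

html = requests.get(URL).text
bs = BeautifulSoup(html, 'html.parser')

# Each post title is an <h2 class="entry-title"> that wraps the <a> link.
for link in bs.find_all('h2', {'class': 'entry-title'}):
    a = link.find('a', href=True)
    if a is None:
        continue  # heading without a link, nothing to print
    # urljoin leaves absolute hrefs untouched and resolves relative ones.
    href = urljoin(URL, a['href'])
    title = a.get_text(strip=True)
    print(href)
    print(title)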
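
Two smaller points from the question may also be worth a look: the `while current == current_page` loop never changes either variable, so it cannot end on its own, and gluing the site root onto an href by hand only works when the href is relative. Below is a minimal sketch of one way to handle both, using only the standard library. The helper names `search_url` and `absolute` are made up for illustration, and the `paged` parameter is an assumption: it is the page-number query variable WordPress commonly uses, but I have not checked how this particular site paginates its search results.

```python
from urllib.parse import urlencode, urljoin

BASE = "https://vishrantkhanna.com/"


def search_url(query, page=1):
    """Build a search URL for `query`.

    `paged` is assumed to be the page-number parameter (common on
    WordPress sites, not verified for this one).
    """
    params = {"s": query}
    if page > 1:
        params["paged"] = page
    return BASE + "?" + urlencode(params)


def absolute(href):
    """Resolve an href against BASE; absolute hrefs come back unchanged."""
    return urljoin(BASE, href)


if __name__ == "__main__":
    # A bounded loop over page numbers instead of a condition that never changes.
    for n in range(1, 4):
        print(search_url("python", n))
    print(absolute("/some-post/"))
    print(absolute("https://vishrantkhanna.com/some-post/"))
```

Each URL from `search_url` could then be fetched and parsed the same way as the single page above, stopping when a page returns no `entry-title` headings.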