I'm trying to create a crawler for my WordPress site using Python 3


I'm trying to copy and print the article titles and their associated href links.

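import requests
from bs4 import BeautifulSoup

def page(current_page):
    current = "h2"
    # Neither variable changes inside the loop, so this condition never becomes false.
    while current == current_page:
        url = 'https://vishrantkhanna.com/?s=' + str(current)
        source_code = requests.get(url)
        # requests.Response has no .txt attribute; the body text is in .text.
        plain_text = source_code.txt
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll('h2', {'class': 'entry-title'}):
            # An <h2> has no href attribute, so link.get('href') returns None here.
            href = "https://vishrantkhanna.com/" + link.get('href')
            title = link.string
            print(href)
            print(title)

page("h2")

What are the expected and actual results of the code? What error or problem are you running into? Any logs or sample output would be very helpful to people. If you are using an older version of requests, check this.

You need to extract the <a> tag from the heading:
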
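import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

BASE_URL = 'https://vishrantkhanna.com/'
URL = BASE_URL + '?s=1'

html = requests.get(URL).text
bs = BeautifulSoup(html, 'html.parser')
for link in bs.find_all('h2', {'class': 'entry-title'}):
    # The href lives on the <a> inside the heading, not on the <h2> itself.
    a = link.find('a', href=True)
    if a is None:
        continue  # skip headings that contain no link
    # urljoin handles both relative and absolute href values.
    href = urljoin(BASE_URL, a['href'])
    title = a.get_text(strip=True)
    print(href)
    print(title)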
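
The same extraction can also be sketched as a function that takes an HTML string, so it can be tried without hitting the site. This is only a rough sketch: the name extract_entries and the SAMPLE_HTML markup are made up for illustration and are not taken from the real page, and urljoin is used on the assumption that the hrefs may be either relative or absolute.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://vishrantkhanna.com/"


def extract_entries(html, base_url=BASE_URL):
    """Return (title, absolute_url) pairs for each <h2 class="entry-title">."""
    soup = BeautifulSoup(html, "html.parser")
    entries = []
    for heading in soup.find_all("h2", class_="entry-title"):
        # The link is a child of the heading; the <h2> itself has no href.
        a = heading.find("a", href=True)
        if a is None:
            continue  # heading without a link: nothing to report
        entries.append((a.get_text(strip=True), urljoin(base_url, a["href"])))
    return entries


# Hypothetical sample markup, invented for this example only.
SAMPLE_HTML = """
<h2 class="entry-title"><a href="/first-post/">First post</a></h2>
<h2 class="entry-title"><a href="https://vishrantkhanna.com/second-post/">Second post</a></h2>
<h2 class="entry-title">No link here</h2>
<h2 class="other"><a href="/ignored/">Ignored</a></h2>
"""

for title, href in extract_entries(SAMPLE_HTML):
    print(title, href)
```

To use it against the live site, pass in requests.get(URL).text instead of SAMPLE_HTML. Keeping the fetch and the parse separate makes it easier to see which of the two is failing.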