Python: my web scraper for The Pirate Bay returns no torrents. What could be the cause?

python web-scraping

My web scraper for The Pirate Bay returns no torrents. What could be the cause?

import requests
import lxml.html as html
import os
import datetime
import time

def thepiratebay(book):
    PB_MIRRORS = f'https://pirateproxy.surf/search.php?q={book}&all=on&search=Pirate+Search&page=0&orderby='
    LINKS_PATH = '//span[@class="list-item item-name item-title"]/a/@href'
    try:
        response = requests.get(PB_MIRRORS)
        if response.status_code == 200:
            home = response.content.decode('utf-8')
            parsed = html.fromstring(home)
            torrents = parsed.xpath(LINKS_PATH)
            complete_torrent = 'https://pirateproxy.surf'
            links_torrents = []
            for t in torrents:
                links_torrents.append(complete_torrent + t)
            print(f'THE PIRATE BAY: found {len(links_torrents)} torrents')
            return links_torrents
        else:
            raise ValueError('Error the mirror link doesnt work any more:  \n Change it in tbt.py ')
    except ValueError as ve:
        print(f'Error: {ve}')

The code returns no torrents. I suspected the XPath, but in Chrome it matches the links. The path is:

 //span[@class="list-item item-name item-title"]/a/@href

Output when searching for "small island":

What book are you looking for?: small island
THE PIRATE BAY: found 0 torrents
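
One way to narrow this down is to check whether the markup targeted by the XPath is present at all in the HTML that the script downloads, since Chrome inspects the page after its scripts have run. The following is a minimal sketch of such a check. The helper name `count_result_markers` and both sample strings are hypothetical and only illustrate the two situations.

```python
def count_result_markers(page_source, marker='class="list-item item-name item-title"'):
    """Count how often the class attribute targeted by the XPath occurs in raw HTML."""
    return page_source.count(marker)


# Hypothetical page sources, for illustration only.
# What the browser's inspector might show after the page's scripts have run:
rendered_by_browser = (
    '<span class="list-item item-name item-title">'
    '<a href="/description.php?id=28037371">Small Island</a></span>'
)
# What a plain HTTP client might receive if the list is filled in later:
fetched_by_script = '<ol id="torrents"></ol>'

print(count_result_markers(rendered_by_browser))
print(count_result_markers(fetched_by_script))
```

In the question's code, the same check could be run on `response.text` right after `requests.get(...)`. A count of zero would suggest that the XPath is fine and the downloaded HTML simply lacks the result rows.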

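The results appear to come from this API rather than from the page's initial HTML, which would explain why the XPath matches in Chrome but not in the response that requests receives:

GET https://pirateproxy.surf/api?url=/q.php?q={book}&cat=

All the links look like this:

/description.php?id=28037371

and the API above gives you the id. So you can use something like the following: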

import requests

search = "book"

r = requests.get("https://pirateproxy.surf/api",
    params = {
        "url": f"/q.php?q={search}&cat="
    })

links = [ 
    f'https://pirateproxy.surf/description.php?id={t["id"]}' 
    for t in r.json()
]
print(links)
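
If it helps, the same idea can be wrapped in a function that fits the shape of the question's `thepiratebay(book)`. This is only a sketch under assumptions: the names `build_links` and `search_torrents` are made up here, the mirror may no longer be online, and the JSON is assumed to be a list of objects with an `id` key, as in the snippet above.

```python
import requests

# Mirror taken from the question; it may no longer be online.
BASE_URL = "https://pirateproxy.surf"


def build_links(results, base_url=BASE_URL):
    """Turn the decoded JSON list into description-page URLs.

    Entries that are not dicts or have no "id" key are skipped.
    """
    return [
        f"{base_url}/description.php?id={item['id']}"
        for item in results
        if isinstance(item, dict) and "id" in item
    ]


def search_torrents(book, base_url=BASE_URL, timeout=10):
    """Query the search API and return description-page URLs (empty list on failure)."""
    try:
        response = requests.get(
            f"{base_url}/api",
            params={"url": f"/q.php?q={book}&cat="},
            timeout=timeout,
        )
        response.raise_for_status()  # non-2xx status codes become exceptions
        return build_links(response.json(), base_url)
    except (requests.RequestException, ValueError) as err:
        # Covers network errors, bad status codes and a body that is not JSON
        print(f"Error: {err}")
        return []
```

Keeping `build_links` separate from the network call means the URL construction can be checked without contacting the mirror, for example with `build_links([{"id": "28037371"}])`.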


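links_torrents.append(complete_torrent + torrents)

should be

links_torrents.append(complete_torrent + t)

Yes, I changed it, but the problem is that the torrents list is empty.

What exactly does your code output? Also, your PB_MIRRORS is not an f-string, yet you try to use formatting ({book}) in it, so that is also a bad link.

I fixed the f-string error, but it still gives 0 torrents.

Please update your code accordingly and post the output from the terminal.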
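
For reference, the two fixes discussed in these comments can be illustrated in isolation. This is a small sketch with made-up sample values (the shortened URL and the single href are placeholders, not real output). As noted in the comments, neither fix explains the empty result on its own.

```python
book = "small island"

# Plain string: the braces are kept literally, so the site is asked for "{book}".
plain_url = 'https://pirateproxy.surf/search.php?q={book}'
# f-string: the value of book is substituted when the string is created.
formatted_url = f'https://pirateproxy.surf/search.php?q={book}'

print(plain_url)
print(formatted_url)

# Appending inside the loop must use the loop variable t, not the whole list:
# str + list raises a TypeError.
complete_torrent = 'https://pirateproxy.surf'
torrents = ['/description.php?id=28037371']  # placeholder href
links_torrents = [complete_torrent + t for t in torrents]
print(links_torrents)
```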
