
How to extract data from linked pages on a website using Python


I have been trying to scrape data from web pages for a data analysis project, and I have successfully fetched data from a single page:

import requests
from bs4 import BeautifulSoup
import concurrent.futures
from urllib.parse import urlencode
from scraper_api import ScraperAPIClient

client = ScraperAPIClient('key')
results = client.get(url="https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate").text

print(results)
For example, on the site https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate I need to navigate into each course and fetch a field called duration from that course's page.
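In outline, the task splits into two steps: collect the course-page links from each results page, then request each link and read its duration field. A sketch of the first step, where the `marker` path fragment is an assumption about how Essex course URLs look and must be checked against the real search-results markup:

```python
from bs4 import BeautifulSoup

def course_links(html, marker="/courses/"):
    """Collect hrefs that look like course pages.

    `marker` is a guess at the path fragment shared by course URLs;
    verify it in the browser's inspector before relying on it.
    """
    soup = BeautifulSoup(html, "html.parser")
    return sorted({a["href"] for a in soup.find_all("a", href=True)
                   if marker in a["href"]})
```

The set comprehension deduplicates links that appear more than once on a results page.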

Try the following:

client = ScraperAPIClient('key')
results = []
for i in range(10):
    results.append(client.get(url=f"https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate&start_rank={i}").text)
print(results)

This loops through 10 result pages and appends each text response to the results list.
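The question's imports already pull in `concurrent.futures`, so the page fetches could also run in parallel. A minimal sketch under that assumption; the fetch callable is injected so any client, including `ScraperAPIClient`, can be plugged in:

```python
from concurrent.futures import ThreadPoolExecutor

BASE = ("https://www.essex.ac.uk/course-search"
        "?query=&f.Level%7CcourseLevel=Undergraduate&start_rank={}")

def fetch_all_pages(fetch, ranks):
    """Fetch several result pages concurrently.

    fetch: a callable taking a URL and returning the page text,
           e.g. lambda url: client.get(url=url).text
    ranks: the start_rank values to request (e.g. 0, 10, 20, ...)
    """
    urls = [BASE.format(r) for r in ranks]
    # pool.map preserves the input order of the URLs
    with ThreadPoolExecutor(max_workers=5) as pool:
        return list(pool.map(fetch, urls))
```

With ScraperAPIClient this would be called as `fetch_all_pages(lambda u: client.get(url=u).text, range(0, 100, 10))`.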

Try the following:

import requests
from bs4 import BeautifulSoup
import concurrent.futures
from urllib.parse import urlencode
from scraper_api import ScraperAPIClient

client = ScraperAPIClient('key')
total_pages = 12
for page_no in range(total_pages):
    # You control this page_no variable.
    # Open the website and watch how the API moves to the next page:
    # it depends on the 'start_rank' parameter at the end of the URL.
    # For example, start_rank=10 then start_rank=20 fetch one page after another.
    rank = page_no * 10
    results = client.get(url="https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate&start_rank={0}".format(rank)).text
    print(results)
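Neither answer extracts the duration itself. Once a course page's HTML is in hand, BeautifulSoup can pull the field out; the class-name filter below is an assumption (the real markup on essex.ac.uk must be checked in the browser's inspector), so treat this as a sketch:

```python
from bs4 import BeautifulSoup

def extract_duration(html):
    """Return the text of the first element whose class contains 'duration'.

    The class name is a guess -- inspect the actual course page
    to find the real markup before relying on this.
    """
    soup = BeautifulSoup(html, "html.parser")
    # class_ accepts a function; it is called once per class value,
    # and with None for tags that have no class attribute
    node = soup.find(class_=lambda c: c and "duration" in c.lower())
    return node.get_text(strip=True) if node else None
```

Combined with the pagination loop above, each course page fetched via `client.get(url=...).text` can be passed straight into this function.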
