How to extract data from linked pages of a website using Python
I have been trying to scrape data from web pages for a data analysis project, and I have managed to fetch the data from a single page:
import requests
from bs4 import BeautifulSoup
import concurrent.futures
from urllib.parse import urlencode
from scraper_api import ScraperAPIClient
client = ScraperAPIClient('key')
results = client.get(url = "https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate").text
print(results)
For example, on the site "https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate" I need to navigate into each course and get a value called duration from that course's page.
Try the following:
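from scraper_api import ScraperAPIClient
client = ScraperAPIClient('key')
results = []
for i in range(10):
    # start_rank takes the values 01, 11, 21, ... 91, one per results page
    results.append(client.get(url=f"https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate&start_rank={i}1").text)
print(results)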
This loops through 10 results pages and puts each text response in the results list.
Try the following:
import requests
from bs4 import BeautifulSoup
import concurrent.futures
from urllib.parse import urlencode
from scraper_api import ScraperAPIClient
client = ScraperAPIClient('key')
total_pages = 12
for page_no in range(total_pages):
    # You control this page_no variable.
    # Go to the website and see how it moves to the next page:
    # it depends on the 'start_rank' at the end of the URL.
    # For example, start_rank=10, start_rank=20 will get you one page after another.
    rank = page_no * 10
    results = client.get(url="https://www.essex.ac.uk/course-search?query=&f.Level%7CcourseLevel=Undergraduate&start_rank={0}".format(rank)).text
    print(results)
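The snippets above only fetch the result pages; the question also asks to follow each course link and read its duration. Below is a minimal sketch of that step, using only the standard library so it runs without extra packages. Several things in it are assumptions, not facts about the Essex site: that course links contain the hypothetical marker "/courses/", that a course page shows a "Duration" label immediately followed by its value, and that start_rank is 1-based with 10 results per page. The helper names (page_url, course_links, find_duration, scrape_durations) are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

BASE_URL = "https://www.essex.ac.uk/course-search"


def page_url(start_rank):
    """Build the search URL for one results page."""
    params = {
        "query": "",
        "f.Level|courseLevel": "Undergraduate",
        "start_rank": start_rank,
    }
    return BASE_URL + "?" + urlencode(params)


class LinkCollector(HTMLParser):
    """Collect unique href values of <a> tags that contain a marker."""

    def __init__(self, marker):
        super().__init__()
        self.marker = marker
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href and href not in self.links:
            self.links.append(href)


class TextCollector(HTMLParser):
    """Collect the non-empty text nodes of a page, in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def course_links(html, page, marker="/courses/"):
    """Return absolute course URLs found in a results page.

    The "/courses/" marker is an assumption; check the real links first.
    """
    collector = LinkCollector(marker)
    collector.feed(html)
    return [urljoin(page, href) for href in collector.links]


def find_duration(html, label="Duration"):
    """Return the text node right after the label, or None if not found.

    Assumes the value directly follows the label in the page text.
    """
    collector = TextCollector()
    collector.feed(html)
    for i, chunk in enumerate(collector.chunks[:-1]):
        if chunk.rstrip(":").lower() == label.lower():
            return collector.chunks[i + 1]
    return None


def scrape_durations(fetch, total_pages=1):
    """Map each course URL to its duration.

    fetch is any callable that takes a URL and returns the page HTML.
    """
    durations = {}
    for page_no in range(total_pages):
        # Assumed paging scheme: ranks 1, 11, 21, ...
        url = page_url(page_no * 10 + 1)
        for link in course_links(fetch(url), url):
            durations[link] = find_duration(fetch(link))
    return durations
```

With the client from the snippets above, this could be called as scrape_durations(lambda url: client.get(url=url).text, total_pages=12). Passing the fetch function in keeps the parsing logic independent of the HTTP client, so it can be tried on saved HTML before spending API calls. If the real pages are structured differently, the marker and the label lookup are the two places to adjust.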