Python: BeautifulSoup does not fetch all of the HTML
I am new to scraping and Python. I wrote the code below to scrape a web page, but the response does not contain all of the HTML: the data in the middle of the page is not extracted. I tried both lxml and html.parser, and it made no difference.
from bs4 import BeautifulSoup
import requests
url = 'http://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results/a'
response = requests.get(url)
soup = BeautifulSoup(response.content,'lxml')
print(soup)
I don't know the reason. I may be missing a key point or something else.
from bs4 import BeautifulSoup
import requests
url = 'http://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results/a'
response = requests.get(url)
soup = BeautifulSoup(response.content,'html.parser')
# Select the fund links inside the result list and print each link's title
for fund in soup.select("ul[class='list-unstyled list-indent'] > li > a"):
    print(fund.attrs['title'])
The result will be:
Aberdeen Asia Pacific and Japan Equity (Class I) Accumulation
Aberdeen Asia Pacific and Japan Equity Accumulation Inclusive
Aberdeen Asia Pacific Equity (Class I) Accumulation
Aberdeen Asia Pacific Equity (Class I) Income
.
.
.
AXA WF Framlington Robotech (Class F) Accumulation
AXA WF Framlington Robotech (Class F) Income
AXA WF Framlington UK (Class L) Accumulation
AXA WF Global Strategic Bonds (Class I H) Accumulation
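The selector in the answer compares the whole class attribute as one exact string, so it stops matching if the site reorders or adds a class. The following is a minimal offline sketch of a slightly more tolerant variant using CSS class selectors. SAMPLE_HTML and the fund_titles helper are hypothetical and only illustrate the idea; the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page markup; the real page may differ.
SAMPLE_HTML = """
<ul class="list-unstyled list-indent">
  <li><a href="/funds/a" title="Fund A Accumulation">Fund A</a></li>
  <li><a href="/funds/b" title="Fund B Income">Fund B</a></li>
  <li><a href="/funds/c">Link without a title</a></li>
</ul>
<ul class="list-indent extra list-unstyled">
  <li><a href="/funds/d" title="Fund D Income">Fund D</a></li>
</ul>
"""


def fund_titles(html, selector="ul.list-unstyled.list-indent > li > a"):
    """Return the title attribute of every link matched by the selector."""
    soup = BeautifulSoup(html, "html.parser")
    # Skip links that have no title instead of raising KeyError.
    return [a["title"] for a in soup.select(selector) if a.has_attr("title")]


# Class selectors ignore class order and extra classes, so both lists match.
print(fund_titles(SAMPLE_HTML))
# The exact-string attribute selector only matches the first list.
print(fund_titles(SAMPLE_HTML, "ul[class='list-unstyled list-indent'] > li > a"))
```

Filtering on has_attr("title") is a design choice for the sketch: indexing attrs['title'] directly raises KeyError on the first link that lacks the attribute.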
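The data may be loaded via JavaScript. You can use selenium to get the data.
Are you sure? I know selenium, but I think it is slower than BS4.
Are some of the links under "Funds" not being fetched, or is it other data on the page?
Yes, the links under "Funds" are not fetched.
@AyyanKhan Do you mean the links under the "Funds" menu?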
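If the missing part of the page really is filled in by JavaScript, as the comments suggest, one option is to let a browser render the page and then keep parsing with BeautifulSoup. The sketch below assumes Selenium 4 with a local Chrome install. The helper names, the headless flag, the timeout, and the selector passed to the wait are all assumptions for illustration and have not been checked against the live site.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url, wait_selector, timeout=10):
    """Load the page in headless Chrome and return the HTML after rendering."""
    # Selenium is imported here so the parsing helper below works without it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until at least one element matching the selector is in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def link_titles(html, selector):
    """Return the non-empty title attributes of the links matched by selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [a.get("title") for a in soup.select(selector) if a.get("title")]


def main():
    # URL taken from the question; the selector is an assumption about the markup.
    url = "http://www.hl.co.uk/funds/fund-discounts,-prices--and--factsheets/search-results/a"
    selector = "ul.list-unstyled.list-indent > li > a"
    html = fetch_rendered_html(url, wait_selector=selector)
    for title in link_titles(html, selector):
        print(title)
```

Calling main() runs the sketch against the live page. Starting a browser is slower than a plain requests call, which matches the concern raised in the comments, so it is worth first checking whether the wanted text already appears in response.text before switching.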