Python: pulling out hrefs with BeautifulSoup attrs


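I'm trying something new: pulling out all the hrefs from the <a> tags on a page. However, it isn't pulling out the hrefs, and I can't figure out why.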

import requests
from bs4 import BeautifulSoup

url = "https://www.brightscope.com/ratings/"
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')

for href in soup.findAll('a'):
    h = href.attrs['href']
    print(h)

You should check whether the key exists, because an <a> tag may not have an href attribute at all.

import requests
from bs4 import BeautifulSoup

url = "https://www.brightscope.com/ratings/"
page = requests.get(url)
print(page.text)
soup = BeautifulSoup(page.text, 'html.parser')

for a in soup.findAll('a'):
    if 'href' in a.attrs:
        print(a.attrs['href'])
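
As a possible alternative to the explicit membership check above, here is a minimal sketch of two other common BeautifulSoup idioms: filtering with href=True, and reading the attribute with .get(). The markup in SAMPLE_HTML and the helper names extract_hrefs and extract_hrefs_with_get are made up for illustration, so the example runs without a network request; they are not taken from the original page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
# The second anchor deliberately has no href attribute.
SAMPLE_HTML = """
<a href="/ratings/">Ratings</a>
<a name="top">Anchor without href</a>
<a href="https://example.com/about">About</a>
"""


def extract_hrefs(html):
    """Return the href of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    # href=True matches only tags on which the href attribute is present.
    return [a["href"] for a in soup.find_all("a", href=True)]


def extract_hrefs_with_get(html):
    """Same result, using .get(), which returns None for a missing attribute."""
    soup = BeautifulSoup(html, "html.parser")
    hrefs = []
    for a in soup.find_all("a"):
        href = a.get("href")  # None instead of a KeyError
        if href is not None:
            hrefs.append(href)
    return hrefs


print(extract_hrefs(SAMPLE_HTML))
# ['/ratings/', 'https://example.com/about']
```

Both helpers skip the anchor that has no href, so neither can raise the KeyError from the question. Which one reads better is largely a matter of taste.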


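Look at your page.text variable and make sure the elements are actually there. Many websites now load content dynamically after the initial page load, so a plain GET request will not return that data.

They are all there when I print the soup variable.

So is BS actually finding the <a> tags? Does your for loop have anything to loop over?

Yes. It finds the <a> tags and prints them all in the for loop, but when I try to extract the hrefs I get an error:

h = href.attrs['href']
KeyError: 'href'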


Nice! That was it.
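
The KeyError discussed in the comments can be reproduced offline, which may help when diagnosing a similar page. The sketch below is only illustrative: SAMPLE_HTML is invented markup, and anchors_missing_href and reproduces_key_error are hypothetical helper names. It lists the <a> tags that carry no href and confirms that indexing attrs['href'] fails on them.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one normal link and one named anchor with no href.
SAMPLE_HTML = """
<a href="/ratings/">Ratings</a>
<a name="top" class="anchor">Top</a>
"""


def anchors_missing_href(html):
    """Return the <a> tags that have no href attribute."""
    soup = BeautifulSoup(html, "html.parser")
    return [a for a in soup.find_all("a") if not a.has_attr("href")]


def reproduces_key_error(html):
    """Return True if the question's attrs['href'] lookup fails on this markup."""
    soup = BeautifulSoup(html, "html.parser")
    try:
        for a in soup.find_all("a"):
            a.attrs["href"]  # attrs is a plain dict, so a missing key raises
    except KeyError:
        return True
    return False


# Show which anchors would have triggered the error.
for tag in anchors_missing_href(SAMPLE_HTML):
    print(tag.name, tag.attrs)
```

Printing the offending tags makes it clear that the loop did find <a> elements, and that only the attribute lookup was failing.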