Getting meta tag content by name with Beautiful Soup and Python

python html web-scraping


I'm trying to get the metadata from this website. Here is the code:

import requests
from bs4 import BeautifulSoup

source = requests.get('https://www.svpboston.com/').text

soup = BeautifulSoup(source, features="html.parser")

title = soup.find("meta", name="description")
image = soup.find("meta", name="og:image")

print(title["content"] if title else "No meta title given")
print(image["content"]if title else "No meta title given")

But I get this error:

Traceback (most recent call last):
  File "C:/Users/User/PycharmProjects/Work/Web Scraping/Selenium/sadsaddas.py", line 9, in <module>
    title = soup.find("meta", name="description")
TypeError: find() got multiple values for argument 'name'

Any ideas?

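find() takes the tag name as its first positional argument, and that parameter is itself called name, so passing name= as a keyword collides with it. Instead, fetch all the meta tags and filter on their name attribute:

metas = soup.find_all("meta")
title = next((m for m in metas if m.get("name") == "description"), None)
image = next((m for m in metas if m.get("name") == "og:image"), None)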
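The idea of collecting every meta tag first can be taken one step further by building a lookup table. The following is only a sketch: the inline HTML and the meta_by_key name are made up for illustration, and it assumes each tag is identified by either a name or a property attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used so the example runs without a network call.
html = """
<html><head>
<meta name="description" content="An example page">
<meta property="og:image" content="https://example.com/cover.png">
<meta charset="utf-8">
</head></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each meta tag's identifying attribute (name or property) to its content.
meta_by_key = {}
for tag in soup.find_all("meta"):
    key = tag.get("name") or tag.get("property")
    if key and tag.has_attr("content"):
        meta_by_key[key] = tag["content"]

print(meta_by_key.get("description", "No description given"))
print(meta_by_key.get("og:image", "No image given"))
```

Tags without a content attribute, such as the charset declaration, are simply skipped.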

You can try it like this:

title = soup.find("meta", attrs={"name":"description"})
image = soup.find("meta", attrs={"name":"og:image"})
print(title)
print(image)
print(title["content"] if title else "No meta title given")
print(image["content"] if image else "No meta for image given")

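Or, from the documentation:

You can't use a keyword argument to search for HTML's "name" element, because Beautiful Soup uses the name argument to contain the name of the tag itself. Instead, you can give a value to "name" in the attrs argument.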
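The collision described in that passage can be reproduced offline. This is a minimal sketch that uses a made-up HTML snippet instead of the real site; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
html = '<html><head><meta name="description" content="An example page"></head></html>'
soup = BeautifulSoup(html, "html.parser")

# "meta" already fills the positional parameter called name,
# so the name= keyword is rejected as a duplicate.
error_message = None
try:
    soup.find("meta", name="description")
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Passing the HTML attribute through attrs avoids the clash.
description = soup.find("meta", attrs={"name": "description"})
print(description["content"])
```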

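To get a tag by a specific attribute, I suggest putting the attribute in a dictionary and passing that dictionary to .find as the attrs argument. You were also passing the wrong attribute to get the title and image: these meta tags should be matched on property rather than name. Here is the final code that gets what you need: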

import requests
from bs4 import BeautifulSoup

source = requests.get('https://www.svpboston.com/').text

soup = BeautifulSoup(source, features="html.parser")

# Open Graph tags are identified by the property attribute, not name
title = soup.find("meta", attrs={'property': 'og:title'})
image = soup.find("meta", attrs={'property': 'og:image'})

print(title["content"] if title is not None else "No meta title given")
print(image["content"] if image is not None else "No meta image given")
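Since some meta tags are keyed by name and others by property, a small helper that tries both can save guessing. The sketch below is not part of Beautiful Soup: meta_content is a hypothetical function, and the sample HTML is invented so the example runs without a network request.

```python
from bs4 import BeautifulSoup


def meta_content(soup, key, default=None):
    """Return the content of the first <meta> whose name or property equals key."""
    for attr in ("name", "property"):
        tag = soup.find("meta", attrs={attr: key})
        if tag is not None and tag.has_attr("content"):
            return tag["content"]
    return default


# Hypothetical sample markup standing in for a real page.
html = """
<html><head>
<meta name="description" content="An example page">
<meta property="og:title" content="Example title">
</head></html>
"""
soup = BeautifulSoup(html, "html.parser")

print(meta_content(soup, "description", "No description given"))
print(meta_content(soup, "og:title", "No meta title given"))
print(meta_content(soup, "og:image", "No meta image given"))
```

Returning a default instead of None keeps the calling code free of repeated "is not None" checks.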

Thanks a lot, man, it worked!
