Error when scraping the inner tags of an HTML element with Python


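Recently I have been working on an exercise in which I fetched the full source of a web page. I am interested in the <area> tags, and in particular in their onclick attribute. How can I extract the onclick attribute from a specific element? The data I extracted looks like this: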

<area class="borderimage" coords="21.32,14.4,933.96,180.56" href="javascript:void(0);" onclick="return show_pop('78545','51022929357','1')" onmouseover="borderit(this,'black','<b>इंदौर, गुरुवार, 10 मई , 2018  <b><br><bआप पढ़ रहे हैं देश का सबसे व...')" onmouseout="borderit(this,'white')" alt="<b>इंदौर, गुरुवार, 10 मई , 2018  <b><br><bआप पढ़ रहे हैं देश का सबसे व..." shape="rect">

If I understood you correctly, you need the onclick attributes of all <area> tags on the page.

Try something like this:

import requests
from bs4 import BeautifulSoup

TAG_NAME = 'area'
ATTR_NAME = 'onclick'

url = 'http://epaper.bhaskar.com/indore/129/10052018/mpcg/1/'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.text, 'html.parser')

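# there are 3 <area> tags on the page; collect the onclick value of each one
# (attrs={ATTR_NAME: True} skips any <area> without onclick, avoiding a KeyError)
area_onclick_attrs = [x[ATTR_NAME] for x in soup.find_all(TAG_NAME, attrs={ATTR_NAME: True})]
print(area_onclick_attrs)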

Output:

[
    "return show_pophead('78545','51022929357','1')", 
    "return show_pop('78545','51022928950','4')", 
    "return show_pop('78545','51022929357','1')",
]
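
If the goal is to get at the values inside each onclick call rather than the raw string, the strings above can be split further. The following is a minimal sketch, not part of the original answer: the helper name parse_onclick and the regular expression are assumptions for illustration, and it assumes every argument is a single-quoted literal with no embedded quotes, as in the output shown.

```python
import re

# Optional "return ", then a function name, then its parenthesised arguments.
ONCLICK_RE = re.compile(r"^\s*(?:return\s+)?(\w+)\((.*)\)\s*;?\s*$")


def parse_onclick(value):
    """Split an onclick string into (function_name, [arguments]).

    Hypothetical helper: returns None when the string does not look
    like a single function call.
    """
    match = ONCLICK_RE.match(value)
    if match is None:
        return None
    name, raw_args = match.groups()
    # Assumes single-quoted arguments without embedded quotes.
    args = re.findall(r"'([^']*)'", raw_args)
    return name, args


# The list printed by the snippet above, copied here so the example runs offline.
area_onclick_attrs = [
    "return show_pophead('78545','51022929357','1')",
    "return show_pop('78545','51022928950','4')",
    "return show_pop('78545','51022929357','1')",
]

parsed = [parse_onclick(value) for value in area_onclick_attrs]
for item in parsed:
    print(item)
```

Each entry becomes a tuple of the function name and a list of its string arguments, which is easier to filter or compare than the raw attribute text.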