Python BeautifulSoup: scraping the data-slug attribute


I want to get the data-slug value from page content like the following:

...
<div class="my_class" data-slug="I_want_to_scrap_it" data-title="Title">
<br> Some text </div>
...

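I found the element with find_all(class_="my_class"), but I don't know how to extract "I_want_to_scrap_it" from it. Of course, I could convert it to a string and take a substring, but there is probably a very simple BeautifulSoup way to do this.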

Thank you, and have a nice day.

Here is an example:

html = '''<div class="my_class" data-slug="I_want_to_scrap_it" data-title="Title">
<br> Some text </div>
'''

# solution using BeautifulSoup
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html5lib')

div = soup.select('div.my_class')[0]
data_slug = div.get('data-slug')
print(data_slug)
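
Since the question starts from find_all, the same lookup can also be sketched with it. This is a minimal sketch, not the answerer's code: the second div (without a data-slug) is a hypothetical addition to show what happens when the attribute is missing, and the built-in 'html.parser' is swapped in for 'html5lib' so that no extra parser package is required.

```python
from bs4 import BeautifulSoup

# The second <div> is a hypothetical extra element with no data-slug attribute.
html = '''<div class="my_class" data-slug="I_want_to_scrap_it" data-title="Title">
<br> Some text </div>
<div class="my_class" data-title="No slug here"></div>
'''

# 'html.parser' ships with Python, so nothing beyond bs4 has to be installed.
soup = BeautifulSoup(html, 'html.parser')

# class_ (trailing underscore) is needed because class is a reserved word in Python.
divs = soup.find_all('div', class_='my_class')

# .get() returns None when the attribute is missing instead of raising.
slugs = [div.get('data-slug') for div in divs]
print(slugs)

# Dictionary-style access also works, but raises KeyError if the attribute is absent.
first_slug = divs[0]['data-slug']
print(first_slug)

# Every attribute of a tag is available as a dict through .attrs.
print(divs[0].attrs)
```

Using .get() is usually the safer choice when some matched elements might not carry the attribute; indexing with square brackets is fine when the attribute is known to be present.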

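data-slug is an attribute, and you can retrieve it with the get() function.

It's "scrape", not "scrap"...

OK, thanks. That's hard now.

It also wouldn't hurt to let us see your code. That would help us get the picture.

Thank you, Alexandra!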