Python BeautifulSoup scraping: selecting elements by name and ID

python python-2.7

I'm using BeautifulSoup, but I'm not sure how to use find, find_all, and the other functions correctly.

If I have:

<div class="hey"></div>

Any ideas would be greatly appreciated :)

Take a look at the following code:

from bs4 import BeautifulSoup

html = """
<h3 id="me"></h3>
<li id="test1"></li>
<li custom="test2321"></li>
<li id="test1" class="tester"></li>
<ul class="here"></ul>
"""

# Naming the parser explicitly avoids the "no parser was explicitly specified" warning
soup = BeautifulSoup(html, "html.parser")

# Look at all the h3 tags and keep the ones whose id is "me".
# IDs are supposed to be unique, so the tag name is redundant here and
# soup.find_all(id="me") would be enough.
one = soup.find_all("h3", {"id": "me"})
print(one)

# Same as above: if something has an ID, just use the ID.
# (This sample HTML repeats id="test1", so two tags match.)
two = soup.find_all("li", {"id": "test1"})
print(two)

# Look at all the li tags and find the node with a custom attribute
three = soup.find_all("li", {"custom": "test2321"})
print(three)

# Both conditions must hold; with truly unique IDs, the ID alone would be enough
four = soup.find_all("li", {"id": "test1", "class": "tester"})
print(four)

# Look at ul tags and find the one with a class attribute of "here"
five = soup.find_all("ul", {"class": "here"})
print(five)

Output:

[<h3 id="me"></h3>]
[<li id="test1"></li>, <li class="tester" id="test1"></li>]
[<li custom="test2321"></li>]
[<li class="tester" id="test1"></li>]
[<ul class="here"></ul>]
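
Since the question asks about find as well as find_all, here is a minimal sketch of the difference between the two, plus the keyword shortcuts id= and class_=. It assumes bs4 with the built-in "html.parser"; the sample HTML and variable names are only for illustration.

```python
from bs4 import BeautifulSoup

html = """
<div class="hey"></div>
<li id="test1"></li>
<li id="test1" class="tester"></li>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching Tag, or None when nothing matches.
first_li = soup.find("li")
missing = soup.find("table")

# find_all() always returns a list-like result, possibly empty.
all_li = soup.find_all("li")

# Keyword shortcuts: id=... filters on the id attribute, and class_=...
# (trailing underscore, because "class" is a reserved word in Python)
# filters on the CSS class.
by_id = soup.find_all(id="test1")
by_class = soup.find_all("div", class_="hey")

print(first_li)
print(missing)
print(len(all_li))
print(by_id)
print(by_class)
```

In practice, find is convenient when you expect a single element (remember to check for None), while find_all suits loops over every match.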

The documentation should provide what you need.

From the help:

In [30]: soup.find_all?
Type:       instancemethod
String Form:
<bound method BeautifulSoup.find_all 
File:       /usr/lib/python2.7/site-packages/bs4/element.py
Definition: soup.find_all(self, name=None, attrs={}, recursive=True, text=None, limit=None, **kwargs)
Docstring:
Extracts a list of Tag objects that match the given
criteria.  You can specify the name of the Tag and any
attributes you want the Tag to have.

The value of a key-value pair in the 'attrs' map can be a
string, a list of strings, a regular expression object, or a
callable that takes a string and returns whether or not the
string matches for some custom definition of 'matches'. The
same is true of the tag name.
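
The docstring lists four kinds of value that an attrs entry (or the tag name) can take. The following is a rough sketch of each one against the sample HTML from above; the helper ends_with_digit is a made-up name for illustration, and "html.parser" is an assumed parser choice.

```python
import re

from bs4 import BeautifulSoup

html = """
<h3 id="me"></h3>
<li id="test1"></li>
<li custom="test2321"></li>
<li id="test1" class="tester"></li>
<ul class="here"></ul>
"""

soup = BeautifulSoup(html, "html.parser")

# A string: exact match on the attribute value.
exact = soup.find_all("li", attrs={"custom": "test2321"})

# A list of strings: match any one of the listed values.
any_of = soup.find_all(attrs={"id": ["me", "test1"]})

# A compiled regular expression: the value is searched with the pattern.
by_regex = soup.find_all(attrs={"id": re.compile(r"^test\d+$")})


# A callable: it receives the attribute value (None when the tag lacks
# the attribute) and returns a truthy value for a match.
def ends_with_digit(value):
    return value is not None and value[-1:].isdigit()


by_callable = soup.find_all(attrs={"custom": ends_with_digit})

# The tag name accepts the same kinds of filter.
lists = soup.find_all(["ul", "li"])
headings = soup.find_all(re.compile(r"^h\d$"))

print(exact)
print(any_of)
print(by_regex)
print(by_callable)
print(lists)
print(headings)
```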

I just put everything in attrs :P. It's the easiest way for me :P
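
Putting everything in attrs is not only a matter of taste: for some attribute names it is the only option. The sketch below compares the two spellings and shows two cases where keyword arguments cannot be used. The data-role and name attributes in the sample HTML are invented for the example.

```python
from bs4 import BeautifulSoup

html = """
<li id="test1" class="tester"></li>
<li data-role="item"></li>
<input name="email">
"""

soup = BeautifulSoup(html, "html.parser")

# For ordinary attribute names the attrs dict and keyword arguments
# select the same tags.
via_attrs = soup.find_all("li", attrs={"id": "test1", "class": "tester"})
via_kwargs = soup.find_all("li", id="test1", class_="tester")

# "data-role" is not a valid Python identifier, so it has to go in attrs.
data_items = soup.find_all(attrs={"data-role": "item"})

# name= is already find_all's tag-name parameter, so the HTML "name"
# attribute has to go in attrs as well.
email_inputs = soup.find_all("input", attrs={"name": "email"})

print(via_attrs == via_kwargs)
print(data_items)
print(email_inputs)
```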