Warning: file_get_contents(/data/phpspider/zhask/data//catemap/2/python/342.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Python 如何在Beautifulsoup中提取标记的子级?_Python_Beautifulsoup - Fatal编程技术网

Python 如何在Beautifulsoup中提取标记的子级?

Python 如何在Beautifulsoup中提取标记的子级?,python,beautifulsoup,Python,Beautifulsoup,我有以下代码,我想提取里面的内容 <p><strong>1. Start big</strong><br><br> Make a slam dunk right away. Boom! Just do it! Start strong! If you’re making a list article about poodle outerwear, don’t save the best for last: put that sporty

我有以下代码,我想提取里面的内容

<p><strong>1. Start big</strong><br><br>
Make a slam dunk right away. Boom! Just do it! Start strong! If you’re making a list article about poodle outerwear, don’t save the best for last: put that sporty little pool-vest idea right up there at the top. </p>

<p><strong>2. Hook them and hook them good</strong><br><br>
A recent study of lists (included in another article about the top ten research studies, natch), assembled by some guy you’ve never heard of from an obscure European university in his spare time, found that Web readers usually don’t make it past the first few items on a list. Sad, isn’t it? I bet you’re already thinking about stopping. Yes, it sucks to know people have shorter attention spans than an overly-caffeinated Himalayan fruit-fly. Make the first few count, okay?</p>

<p><strong>3. Stay on message</strong><br><br>
Let’s say you’re writing a list article about the top movies starring Naomi Watts that don’t suck. It’s a short list, if you remember anything about King Kong or her early indie films. I see this kind of thing pop up on <a href="http://www.foxnews.com" rel="nofollow">Fox News</a> and <a href="http://www.metacritic.com" rel="nofollow">Metacritic</a> once in awhile, and I usually can’t stop myself from clicking on them. You get into sort of a click-trance. In fact, hang on a second. I think there might be one on the top opening acts when The Bieb performs in space. Oh yes there is! Okay, back. So, in your article list of the top movies that use a Meatloaf song in the soundtrack, adding that one from Black Sabbath is just not proper usage. We want Meatloaf and Meatloaf only, people! Besides, Black Sabbath is for sissies.</p>
我应该添加什么来提取浓味

我已经尝试了
汤。从bs4导入BeautifulSoup中查找所有('p>strong')


第“”页
1.从大做起

马上灌篮。砰!就这么做!坚强地开始!如果你正在写一篇关于贵宾犬外套的文章,不要把最好的留到最后:把运动型小游泳衣的想法放在最上面

2.勾住他们,把他们勾好

最近对列表的研究(另一篇关于前十大研究的文章,natch)由一个你从未听说过的人在业余时间从一所默默无闻的欧洲大学召集,他发现网络读者通常无法通过列表上的前几项。很遗憾,不是吗?我打赌你已经在考虑停止。是的,知道人们的注意力跨度比一个过度咖啡因的喜马拉雅fru要短真是糟糕它飞起来了,先数一数,好吗

3.保持信息畅通

假设你正在写一篇关于Naomi Watts主演的顶级电影的列表文章,这些电影并不差劲。如果你还记得《金刚》或她早期的独立电影的话,这是一个简短的列表。我偶尔会看到这种东西突然出现,我通常无法阻止自己点击它们。你会进入一种点击恍惚的状态。在fact,等一下。我想当Bieb在太空中表演时,可能会有一个在顶部的开场表演。哦,是的,有!好的,回来。所以,在你的文章列表中,在配乐中使用肉糕歌曲的顶级电影中,添加来自黑色安息日的那一个是不正确的用法。我们只想要肉糕和肉糕,各位!此外,黑色安息日是给娘娘腔的

""" 汤=美汤(第“lxml”页) 对于汤中的内容物,选择('p>strong'): 打印(内容)
输出:

<strong>1. Start big</strong>
<strong>2. Hook them and hook them good</strong>
<strong>3. Stay on message</strong>
1。大起点
2。勾住他们,把他们勾好
3。保持信息畅通
对于CSS选择器,您需要使用
.select
方法,而不是
.find


您可以在
上找到bs4文档。选择
,以及一些来自w3schools的CSS选择器文档。

或者再次查找,
打印(content.find('strong'))
是的,特别是如果您想同时对
内容
做其他事情的话
from bs4 import BeautifulSoup

page = """
<p><strong>1. Start big</strong><br><br>
Make a slam dunk right away. Boom! Just do it! Start strong! If you’re making a list article about poodle outerwear, don’t save the best for last: put that sporty little pool-vest idea right up there at the top. </p>

<p><strong>2. Hook them and hook them good</strong><br><br>
A recent study of lists (included in another article about the top ten research studies, natch), assembled by some guy you’ve never heard of from an obscure European university in his spare time, found that Web readers usually don’t make it past the first few items on a list. Sad, isn’t it? I bet you’re already thinking about stopping. Yes, it sucks to know people have shorter attention spans than an overly-caffeinated Himalayan fruit-fly. Make the first few count, okay?</p>

<p><strong>3. Stay on message</strong><br><br>
Let’s say you’re writing a list article about the top movies starring Naomi Watts that don’t suck. It’s a short list, if you remember anything about King Kong or her early indie films. I see this kind of thing pop up on <a href="http://www.foxnews.com" rel="nofollow">Fox News</a> and <a href="http://www.metacritic.com" rel="nofollow">Metacritic</a> once in awhile, and I usually can’t stop myself from clicking on them. You get into sort of a click-trance. In fact, hang on a second. I think there might be one on the top opening acts when The Bieb performs in space. Oh yes there is! Okay, back. So, in your article list of the top movies that use a Meatloaf song in the soundtrack, adding that one from Black Sabbath is just not proper usage. We want Meatloaf and Meatloaf only, people! Besides, Black Sabbath is for sissies.</p>
"""

soup = BeautifulSoup(page, 'lxml')

for content in soup.select('p > strong'):
    print(content)
<strong>1. Start big</strong>
<strong>2. Hook them and hook them good</strong>
<strong>3. Stay on message</strong>