Python 2.7: How to remove HTML tags from a string in Python using BeautifulSoup

Programming newbie here :)

I want to print the prices from a website using BeautifulSoup. Here is my code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-


from bs4 import BeautifulSoup, SoupStrainer
from urllib2 import urlopen

url = "Some retailer's url"
html = urlopen(url).read()
product = SoupStrainer('span',{'style': 'color:red;'})
soup = BeautifulSoup(html, parse_only=product)
print soup.prettify()
It prints the prices like this:

<span style="color:red;">
 180
</span>
<span style="color:red;">
 1250
</span>
<span style="color:red;">
 380
</span>

What I would like to print instead is just the numbers, one per line:

180
1250
380
I tried print soup.text.strip(), but it returned
1801250380

Please help me print each price on its own line :)

Thanks a lot!

This prints each price on its own line:

>>> print "\n".join([p.get_text(strip=True) for p in soup.find_all(product)])
180
1250
380

This will get you the strings converted to a list of integers:

>>> [int(span.text) for span in soup.find_all('span')]
[180, 1250, 380]
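
For anyone who wants to try this without fetching a page, here is a minimal self-contained sketch. The `html` string is hypothetical sample markup standing in for the retailer's page, and the explicit `'html.parser'` argument is an assumption (the question does not name a parser). The parenthesised `print` runs on Python 2.7 as well as Python 3.

```python
from bs4 import BeautifulSoup, SoupStrainer

# Hypothetical sample markup standing in for the retailer's page.
html = (
    '<span style="color:red;">180</span>'
    '<span style="color:red;">1250</span>'
    '<span style="color:red;">380</span>'
)

# Same filter as in the question: only red <span> tags are parsed.
product = SoupStrainer('span', {'style': 'color:red;'})
soup = BeautifulSoup(html, 'html.parser', parse_only=product)

# .text concatenates all text nodes with no separator,
# which is why the prices run together.
joined = soup.text.strip()

# Taking the text of each span separately keeps the prices apart.
prices = [span.get_text(strip=True) for span in soup.find_all('span')]
one_per_line = "\n".join(prices)

# get_text() also accepts a separator, which gives the same result in one call.
also_one_per_line = soup.get_text("\n", strip=True)

print(one_per_line)
```

Here `joined` is the run-together string from the question, while `one_per_line` and `also_one_per_line` both hold one price per line.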

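If the tag changes from span to, say, div, this will stop working.

So you mean that if the structure of the code changes, you need to update your web scraping code? That should go without saying.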
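
One way to soften the dependence on the tag name raised in the comments is to match on the style attribute alone. This is only a sketch under assumptions: `red_prices` is a hypothetical helper, the two markup strings are made-up samples, and the code still breaks if the site changes the `style` value itself.

```python
from bs4 import BeautifulSoup


def red_prices(html):
    """Return the text of every element styled color:red; as integers."""
    soup = BeautifulSoup(html, 'html.parser')
    # No tag name is given, so span, div, or any other tag can match.
    tags = soup.find_all(attrs={'style': 'color:red;'})
    return [int(tag.get_text(strip=True)) for tag in tags]


# Hypothetical markup before and after a span -> div change.
with_span = '<span style="color:red;">180</span><span style="color:red;">1250</span>'
with_div = '<div style="color:red;">180</div><div style="color:red;">1250</div>'

print(red_prices(with_span))
print(red_prices(with_div))
```

Both calls return the same list, because the lookup never mentions the tag name.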