Python BeautifulSoup can't select specific tag
My problem comes up when parsing a website and then loading the data tree with BS. How can I look for the content of an <em> tag? I tried
for first in soup.find_all("li", class_="li-in"):
    print first.select("em.fl.in-date").string
    #or
    print first.select("em.fl.in-date").contents
But it doesn't work. Please help.
I'm looking for cars on tutti.ch.
This is my full code:
#Crawl tutti.ch
import urllib
thisurl = "http://www.tutti.ch/stgallen/fahrzeuge/autos"
handle = urllib.urlopen(thisurl)
html_gunk = handle.read()
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_gunk, 'html.parser')
for first in soup.find_all("li", class_="li-in"):
    if first.a.string and "Audi" and "BMW" in first.a.string:
        print "Geschafft: %s" % first.a.contents
        print first.select("em.fl.in-date").string
    else:
        print first.a.contents
When it finds a BMW or an Audi, it should check when the car was listed. The time is inside an em tag, like this:
Today
13:59
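select() always returns a list of matches, even when only one element matches, and a list has no .string or .contents attribute. Index into the result first and then read the text, i.e. first.select("em.fl.in-date")[0].text, assuming your selector is correct. You didn't provide the URL you're scraping, so I can't be sure. The same indexing pattern with find_all():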
>>> url = "http://stackoverflow.com/questions/38187213/python-beautifulsoup"
>>> from bs4 import BeautifulSoup
>>> import urllib2
>>> html = urllib2.urlopen(url).read()
>>> soup = BeautifulSoup(html)
>>> soup.find_all("p")[0].text
u'My problem is when parsing a website and then loading the data tree with BS. How can I look for the content of an <em> Tag? I tried '
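To check the indexing step without hitting the network, here is a minimal sketch. The HTML snippet, including the listing title and the em markup, is invented for illustration and is not copied from tutti.ch. It uses print() as a function so it also runs on Python 3.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this example (not the real tutti.ch page)
SAMPLE_HTML = """
<ul>
  <li class="li-in">
    <a href="#">BMW 320d</a>
    <em class="fl in-date">Heute 13:59</em>
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
item = soup.find("li", class_="li-in")

# select() returns a list of matches, even if there is only one
matches = item.select("em.fl.in-date")
# Index into the list first, then read the text of that tag
first_date = matches[0].text.strip()

# select_one() returns the first match directly, or None if nothing matches
same_date = item.select_one("em.fl.in-date").text.strip()

print(len(matches))
print(first_date)
```

If a listing might lack the em tag, select_one() plus a None check avoids the IndexError that [0] would raise on an empty result.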
Thanks a lot, Adam Barnes. Your code works great!!
"Audi" will always be true.
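The remark about "Audi" can be checked without any scraping. In an expression of the form title and "Audi" and "BMW" in title, the bare string "Audi" is just a non-empty, truthy value, so only "BMW" is actually tested against the title. A small sketch, where the listing titles are made up and mentions_brand is a hypothetical helper, not something from the original code:

```python
def original_check(title):
    # Python parses this as: title and "Audi" and ("BMW" in title)
    return bool(title and "Audi" and "BMW" in title)


def mentions_brand(title, brands=("Audi", "BMW")):
    # Hypothetical helper: test each brand against the title separately
    return bool(title) and any(brand in title for brand in brands)


# Invented example titles; None stands for a link without a plain string
titles = ["Audi A4 Avant", "BMW 320d", "VW Golf", None]

for t in titles:
    print("%r: original=%s fixed=%s" % (t, original_check(t), mentions_brand(t)))
```

With the original condition an Audi listing falls through to the else branch, because only the BMW membership test decides the outcome.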
#Crawl tutti.ch (Python 2)
import urllib
thisurl = "http://www.tutti.ch/stgallen/fahrzeuge/autos"
handle = urllib.urlopen(thisurl)
html_gunk = handle.read()
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_gunk, 'html.parser')
for first in soup.find_all("li", class_="li-in"):
    title = first.a.string
    # Test each brand separately: a bare "Audi" is always truthy
    if title and ("Audi" in title or "BMW" in title):
        print "Geschafft: %s" % first.a.contents
        # select() returns a list, so take the first match before reading .text
        print first.select("em.fl.in-date")[0].text
    else:
        print first.a.contents