Python: BeautifulSoup problem printing the string from findAll results
I'm trying to get thread titles from /r/AskReddit. The code below returns None instead of the thread title:
from BeautifulSoup import BeautifulSoup
import urllib2, json
site='http://www.reddit.com/r/AskReddit/'
soup=BeautifulSoup(urllib2.urlopen(site))
questions=soup.findAll('p',{"class":"title"})
for i in questions:
    print i.string
    break
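The title is in the string attribute of the a tag, not the p tag. The p tag has more than one child (the a link and a span), and string is None when a tag does not contain exactly one text child.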
Also, note the trailing space after title in the class value:
questions=soup.findAll('a',{"class":"title "})
The above was found by inspecting this HTML fragment:
<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">What's the best face you can pull? Before and after please.</a> <span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">self.AskReddit</a>)</span></p>
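Both observations can be checked against the fragment above without installing anything. The following is only a rough sketch in Python 3 using the standard library's html.parser, not the BeautifulSoup approach from the answer; the names FRAGMENT and TitleLinkParser are made up for illustration. It records the raw class value of each matching a tag and the text nested inside it.

```python
from html.parser import HTMLParser

# The HTML fragment quoted above, split across lines for readability.
FRAGMENT = (
    '<p class="title"><a class="title " '
    'href="http://www.reddit.com/r/AskReddit/comments/l5157/'
    'whats_the_best_face_you_can_pull_before_and_after/">'
    "What's the best face you can pull? Before and after please.</a> "
    '<span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">'
    'self.AskReddit</a>)</span></p>'
)


class TitleLinkParser(HTMLParser):
    """Hypothetical helper: collect text of <a> tags whose class list has 'title'."""

    def __init__(self):
        super().__init__()
        self.raw_classes = []  # class attribute values exactly as written
        self.titles = []       # text found inside each matching link
        self._in_title_link = False

    def handle_starttag(self, tag, attrs):
        raw_class = dict(attrs).get("class") or ""
        # split() ignores the trailing space, so "title " still matches.
        if tag == "a" and "title" in raw_class.split():
            self._in_title_link = True
            self.raw_classes.append(raw_class)
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_title_link = False

    def handle_data(self, data):
        # Text only counts while we are inside a matching <a> tag.
        if self._in_title_link:
            self.titles[-1] += data


parser = TitleLinkParser()
parser.feed(FRAGMENT)
parser.close()
print(parser.raw_classes)
print(parser.titles)
```

The first list shows the class value with its trailing space intact, and the second shows that the title text sits inside the a tag rather than directly inside p. The second a tag (self.AskReddit) has no class attribute, so it is skipped.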