Using the select function in BeautifulSoup returns a None value

python python-2.7 web-crawler

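I am using Python 2.7 and BeautifulSoup.

I am trying to get the content of div tags whose class names are almost identical, so I need to match the div's class name strictly.

Here is my code: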

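import urllib2

from bs4 import BeautifulSoup
from mechanize import Browser

links = ['Link1', 'Link2', 'Link3', 'Link4']  # ... etc.
for link in links:
    # The mechanize browser is configured here but is not used for the request below.
    mech = Browser()
    mech.set_handle_robots(False)
    mech.set_handle_equiv(False)
    hadr = {'User-Agent': 'Agent'}
    req = urllib2.Request(link, headers=hadr)
    try:
        pan = urllib2.urlopen(req)
        soup = BeautifulSoup(pan, "lxml")
        # Selectors exactly as in the question; tag1 is the one that comes back empty.
        tag1 = soup.select("div[class=profile-container abc-profile-container]")
        print "TAG_1", tag1
        tag2 = soup.select("div[class=profile-container]")
        print "TAG_2", tag2
    except Exception as e:
        print e
        print(type(e))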

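To clarify further: any link picked at random from the list contains a div with the tag1 class, yet its output is empty.

All the links I care about have a div matching

("div[class=profile-container abc-profile-container]")

so tag1 should match and work accordingly instead of giving an empty list as output.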
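One way to check that claim for a given page is to list the class combinations that actually occur on its div tags. The following is a minimal sketch, not the asker's code: the helper name div_class_sets and the sample HTML are made up for illustration, and it uses the built-in html.parser instead of lxml so that it runs without an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for one fetched page.
SAMPLE_HTML = """
<div class="profile-container">plain profile</div>
<div class="profile-container abc-profile-container">abc profile</div>
<div>no class at all</div>
"""


def div_class_sets(html):
    """Return the sorted class list of every div that has a class attribute."""
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for div in soup.find_all("div"):
        classes = div.get("class")  # a list of class names, or None if absent
        if classes:
            result.append(sorted(classes))
    return result


for classes in div_class_sets(SAMPLE_HTML):
    print(classes)
```

If the combination you expect never shows up in this listing, the selector is not the problem; the fetched page simply does not contain that div.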

Use . (the CSS class selector) in .select(), chaining one .class-name for each class the div must have.

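I tried that, but it gives me the following output:

TAG_1 [] TAG_2 [---actual content---]

The links that have the class from tag1 display nothing, as it shows for TAG_2 [].

Are you sure the soup contains a div tag with both the profile-container and abc-profile-container classes? I tested the tag1 select and the tag2 select on my test pages and they worked fine.

I pasted the whole output into Notepad, and it shows:

TAG_1 []
TAG_2 [---actual content from div.profile-container---]
TAG_1 [---actual content from div.profile-container.abc-profile-container---]
TAG_2 [---actual content in div.profile-container.abc-profile-container---]

Your output shows that you tested two links, and that the first link has .profile-container but not .abc-profile-container. What more do you want? According to your question, this is the output you wanted.

A link has either one of the classes mentioned above, but not both, so I made two separate tag_ variables for them. I cannot understand why it works with the second tag but not with the first, where the first link contains the profile-container abc-profile-container class.

Since you originally said that any form of guidance or help is appreciated: I suggest you look into traceback.print_exc instead of print(e), print(type(e)). It is amazingly informative.

Hi, thanks a lot, but I was advised to edit that out, so I did. I will definitely try it, though.

It attracts off-topic comments like mine :P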
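The traceback suggestion from the comments can be sketched as follows. This is only an illustration: parse_port is a hypothetical helper that exists purely to raise an error, and the snippet uses nothing beyond the standard library traceback module.

```python
import traceback


def parse_port(text):
    """Hypothetical helper that fails on bad input, used only to trigger an error."""
    return int(text)


try:
    parse_port("not-a-number")
except Exception:
    # Prints the exception type, the message, and the call stack to stderr.
    traceback.print_exc()
    # format_exc() returns the same report as a string, e.g. for logging.
    details = traceback.format_exc()
```

Unlike print(e), the report includes the file, line, and function where the error was raised, which makes it easier to tell a network failure apart from a parsing problem.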
# Selectors from the answer: match on CSS classes rather than on the exact class attribute string.
tag1 = soup.select("div.profile-container.abc-profile-container")
tag2 = soup.select("div.profile-container")
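To see how these two class selectors behave next to an exact attribute match, here is a small sketch run against made-up markup. The HTML, the texts helper, and the variable names are assumptions for illustration only; the :not() selector relies on the Soup Sieve backend that ships with Beautiful Soup 4.7 and later, so it may not work on the older version used in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one div with a single class, one with both classes,
# and one with both classes written in the opposite order.
HTML = """
<div class="profile-container">A</div>
<div class="profile-container abc-profile-container">B</div>
<div class="abc-profile-container profile-container">C</div>
"""

soup = BeautifulSoup(HTML, "html.parser")


def texts(selector):
    """Return the text of every element matched by a CSS selector."""
    return [el.get_text() for el in soup.select(selector)]


# Class selectors: the order of the classes in the markup does not matter.
both = texts("div.profile-container.abc-profile-container")
any_profile = texts("div.profile-container")

# Quoted attribute selector: compares the whole class string literally.
exact = texts('div[class="profile-container abc-profile-container"]')

# Divs with the first class but without the second one.
only_plain = texts("div.profile-container:not(.abc-profile-container)")

print(both)
print(any_profile)
print(exact)
print(only_plain)
```

With this sample, div.profile-container also matches the divs that carry both classes, which is consistent with the TAG_2 output discussed in the comments. If each variable should hold a disjoint set of divs, the :not() form is one option.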