Python: Why don't BeautifulSoup and lxml work?
Tags: python, request, html-parsing, response


I am using the mechanize library to log in to a website. I have checked, and the login works. The problem is that I can't use response.read() with BeautifulSoup or lxml.

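# BeautifulSoup
from bs4 import BeautifulSoup

# browser is the mechanize.Browser() instance that is already logged in
response = browser.open(url)
source = response.read()
soup = BeautifulSoup(source, 'html.parser')  # source.txt doesn't work either
some_list = set()
for link in soup.find_all('a', {'class': 'someClass'}):
    some_list.add(link)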

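This doesn't work: it doesn't find any tags at all. When I use requests.get(url), it works fine.

Nothing is printed. I know the problem is the return type of response, because the same parsing works fine with requests.get(). What can I do? Please provide sample code that uses response.read() for HTML parsing.

By the way, what is the difference between the response and request objects?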

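Thanks.

I found the solution. mechanize.Browser only emulates a browser: it fetches the raw HTML and does not execute JavaScript. The page I wanted to scrape adds the classes to its tags with JavaScript, so those classes are not present in the raw HTML. The best option is to use a webdriver. I used Selenium in Python. Here is the code: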

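# Selenium 4 API; Selenium 3 used firefox_profile=profile and find_elements_by_xpath()
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options

profile = webdriver.FirefoxProfile()
profile.set_preference('network.http.phishy-userpass-length', 255)
options = Options()
options.profile = profile
driver = webdriver.Firefox(options=options)

driver.get(url)
# Renamed from "list" so the built-in type is not shadowed
links = driver.find_elements(By.XPATH, '//a[@class="someClass"]')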

Note: you need to have Firefox installed. Alternatively, you can choose a different profile for whichever browser you want to use.
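To see why a static parser finds nothing on such a page, here is a minimal sketch using only the standard library's html.parser (the module that BeautifulSoup's 'html.parser' backend builds on). The page content, the ClassLinkCollector helper, and the class name are all made up for illustration; the real page may add its classes differently.

```python
from html.parser import HTMLParser

# Hypothetical page: the link has no class in the markup; a script adds it later.
RAW_HTML = """
<html><body>
  <a id="target" href="/item/1">Item 1</a>
  <script>
    document.getElementById('target').className = 'someClass';
  </script>
</body></html>
"""


class ClassLinkCollector(HTMLParser):
    """Collects href values of <a> tags, and separately those carrying wanted_class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.all_links = []
        self.matching_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        self.all_links.append(href)
        # The class attribute may be missing or hold several space-separated names.
        classes = (attributes.get("class") or "").split()
        if self.wanted_class in classes:
            self.matching_links.append(href)


collector = ClassLinkCollector("someClass")
collector.feed(RAW_HTML)
print(collector.all_links)       # the <a> tag itself is found
print(collector.matching_links)  # no tag carries the class in the raw HTML
```

The parser sees the script only as text and never runs it, so the class that a real browser would add is simply not there. A webdriver works because it lets the browser execute the script before the elements are queried.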

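A request is what the web client sends to the server. It contains details about the URL the client wants and the HTTP verb to use (GET, POST, etc.), and if you submit a form, the request usually also contains the data you entered in the form. A response is the web server's reply to the client's request. The response has a status code that indicates whether the request succeeded (usually 200 if there was no problem, or an error code such as 404 or 500). The response usually contains data, such as the HTML of a page or the binary data of a JPEG. The response also has headers that give more information about that data (for example, the "Content-Type" header states the format of the data).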

Quoted from @davidbuxton's answer on this.
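The parts described in the quoted explanation can be made concrete with a small sketch. It uses hand-written request and response bytes (the host, path, and body are invented) and a hypothetical _FakeSocket helper so that the standard library's http.client.HTTPResponse can parse them without any network access. This is only an illustration of the message structure, not how mechanize or requests work internally.

```python
import http.client
import io

# What a client sends: verb, path, protocol version, then headers.
RAW_REQUEST = (
    b"GET /page HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"\r\n"
)

# What a server replies: status line, headers, blank line, then the data.
BODY = b"<a class='someClass'>x</a>"
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    b"\r\n" + BODY
)


class _FakeSocket:
    """Hypothetical stand-in for a socket: hands HTTPResponse an in-memory file."""

    def __init__(self, data):
        self._file = io.BytesIO(data)

    def makefile(self, *args, **kwargs):
        return self._file


def parse_request_line(raw):
    """Split the first line of a raw request into (verb, path, version)."""
    verb, path, version = raw.split(b"\r\n")[0].decode("ascii").split(" ")
    return verb, path, version


def parse_response(raw):
    """Parse raw response bytes into an http.client.HTTPResponse."""
    response = http.client.HTTPResponse(_FakeSocket(raw))
    response.begin()  # reads the status line and the headers
    return response


verb, path, version = parse_request_line(RAW_REQUEST)
response = parse_response(RAW_RESPONSE)
status = response.status
content_type = response.getheader("Content-Type")
html = response.read()  # bytes, like mechanize's response.read()

print(verb, path)
print(status, content_type)
print(html)
```

Note that read() returns bytes that can be passed to a parser as they are; a response object on its own says nothing about whether JavaScript has run on the page.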

Good luck.
