Python urllib2 + BeautifulSoup


I'm trying to use BeautifulSoup in my current Python project. To keep this simple and clear, I'll reduce the complexity of my current script.

The script without BeautifulSoup:

import urllib2

    def check(self, name, proxy):
        urllib2.install_opener(
            urllib2.build_opener(
                urllib2.ProxyHandler({'http': 'http://%s' % proxy}),
                urllib2.HTTPHandler()
                )
            )

        req = urllib2.Request('http://example.com' ,"param=1")
        try:
            resp = urllib2.urlopen(req) 
        except:
            self.insert()
        try:
            if 'example text' in resp.read()
               print 'success'

The indentation is of course wrong; this is only a sketch of what I'm doing. Put simply, I send a POST request to "example.com", and if resp.read() contains "example text", the script prints success.

But what I really want is to check

if ' example ' in resp.read()

and then print the text of a right-aligned td from the example.com response, using

soup.find_all('td', {'align':'right'})[4]

The way I've added BeautifulSoup doesn't work. For example:

import urllib2
from bs4 import BeautifulSoup as soup

main_div = soup.find_all('td', {'align':'right'})[4]

    def check(self, name, proxy):
        urllib2.install_opener(
            urllib2.build_opener(
                urllib2.ProxyHandler({'http': 'http://%s' % proxy}),
                urllib2.HTTPHandler()
                )
            )

        req = urllib2.Request('http://example.com' ,"param=1")
        try:
            resp = urllib2.urlopen(req) 
            web_soup = soup(urllib2.urlopen(req), 'html.parser')
        except:
            self.insert()
        try:
            if 'example text' in resp.read()
               print 'success' + main_div

As you can see, I added 4 new lines/tweaks:

from bs4 import BeautifulSoup as soup

web_soup = soup(urllib2.urlopen(url), 'html.parser')

main_div = soup.find_all('td', {'align':'right'})[4]

as well as " + main_div " on the print line.

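However, it doesn't seem to work. While tweaking it I got several errors, including "local variable referenced before assignment" and "unbound method find_all() must be called with BeautifulSoup instance as first argument".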

Regarding the last code snippet:

from bs4 import BeautifulSoup as soup

web_soup = soup(urllib2.urlopen(url), 'html.parser')
main_div = soup.find_all('td', {'align':'right'})[4]

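You should call find_all on web_soup (the BeautifulSoup instance), not on soup (the class). Also make sure the url variable is defined before you use it: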

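import urllib2
from bs4 import BeautifulSoup as soup

url = "url to be opened"  # placeholder: replace with the real URL
# Build the soup instance from the response, then search that instance.
web_soup = soup(urllib2.urlopen(url), 'html.parser')
main_div = web_soup.find_all('td', {'align':'right'})[4]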
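
To show how the pieces could fit together, here is a minimal sketch. It assumes Python 3 (where the functionality of urllib2 lives in urllib.request) and the bs4 package. The helper names fetch_html and fifth_right_aligned_cell and the sample HTML are made up for illustration. The points to note are that the response body is read only once, the soup is built inside the function from that body, and find_all is called on the resulting instance.

```python
import urllib.request

from bs4 import BeautifulSoup


def fetch_html(url, data, proxy=None):
    """POST `data` (bytes) to `url`, optionally via an HTTP proxy; return the body as text."""
    handlers = []
    if proxy:
        handlers.append(urllib.request.ProxyHandler({'http': 'http://%s' % proxy}))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(urllib.request.Request(url, data)) as resp:
        # Read the body exactly once; a second read() would return nothing.
        return resp.read().decode('utf-8', errors='replace')


def fifth_right_aligned_cell(html, marker='example text'):
    """Return the text of the 5th <td align="right">, or None if marker or cell is missing."""
    if marker not in html:
        return None
    web_soup = BeautifulSoup(html, 'html.parser')  # an instance, not the class
    cells = web_soup.find_all('td', {'align': 'right'})
    if len(cells) < 5:
        return None
    return cells[4].get_text(strip=True)


# Offline demonstration with made-up HTML (no request is sent).
sample_html = (
    '<p>example text</p><table><tr>'
    + ''.join('<td align="right">cell %d</td>' % i for i in range(6))
    + '</tr></table>'
)
print(fifth_right_aligned_cell(sample_html))  # prints: cell 4
```

In a real run, something like fifth_right_aligned_cell(fetch_html('http://example.com', b'param=1', proxy)) would combine the two steps. Returning None instead of indexing blindly avoids an IndexError when the page has fewer than five matching cells.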

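Please try to provide a minimal example, and try to indent your Python code correctly. Regarding the last code sample, you should call find_all on the web_soup variable: web_soup.find_all('td', {'align': 'right'}).

I did say this was only a sketch and that the indentation was wrong, but I mainly wanted to know how to put it together regardless of indentation, since I can fix that myself. I also don't understand what you mean. Do you mean changing main_div = soup to main_div = web_soup?

If you want your question answered, you should provide a minimal example that lets others understand your problem. You provided 3 code snippets, all of them incomplete or wrongly indented, which makes it hard to reconstruct your problem.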
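
For reference, both kinds of error quoted in the question can be reproduced without any network access. The sketch below assumes Python 3 and uses a made-up stand-in class, FakeSoup, rather than bs4 itself; the exact wording of the messages differs between Python 2 and Python 3. The first error occurs when a name is assigned only inside a try block that fails, and the second when a method is called on a class instead of an instance.

```python
class FakeSoup(object):
    """Hypothetical stand-in for a parser class; not part of bs4."""

    def __init__(self, cells):
        self.cells = cells

    def find_all(self, name):
        return [c for c in self.cells if c == name]


def read_after_failed_try():
    try:
        resp = int('not a number')  # raises, so resp is never bound
    except ValueError:
        pass  # swallowing the error lets execution continue
    return resp  # UnboundLocalError: resp was never assigned


def call_on_class():
    # find_all needs an instance; calling it on the class leaves an argument missing.
    return FakeSoup.find_all('td')


def call_on_instance():
    return FakeSoup(['td', 'tr', 'td']).find_all('td')


for func in (read_after_failed_try, call_on_class, call_on_instance):
    try:
        print(func.__name__, '->', func())
    except (UnboundLocalError, TypeError) as exc:
        print(func.__name__, 'raised', type(exc).__name__)
```

In the question's script the same pattern appears twice: resp is used after an except branch that does not return, and find_all is called on soup (the imported class) rather than on web_soup.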