Proxies and URLs in Python urllib2
This is my code, but it gives me errors that I can't resolve. The same code works fine with a single URL and a single proxy, but it does not run when the proxies and URLs are read from files.
    import urllib2
    import time

    #bangalore, boston,china
    with open('urls.txt') as f:
        urls = [line.strip() for line in f]
    print "list of urls",urls

    with open('proxies.txt') as proxies:
        for proxy in proxies:
            print proxy
            proxy = proxy.rstrip()
            print proxy
            proxy_handler = urllib2.ProxyHandler(proxy)
            opener = urllib2.build_opener(proxy_handler)
            urllib2.install_opener(opener)
            try:
                for url in urls:
                    request=urllib2.Request(url)
                    start=time.time()
                    try:
                        print "from try block"
                        response=urllib2.urlopen(urls[0])
                        response.read(1)
                        ttfb = time.time() - start
                        print "Latency:", ttfb
                        print "Status Code:", response.code
                        print "Headers:", response.headers
                        print "Redirected url:", response.url
                    except urllib2.URLError as e:
                        print "From except"
                        print "Error Reason:", e.reason
                        print "Error Message:", e.message
                        # print "Redirected URL:", e.url
                    except urllib2.HTTPError as e:
                        print e.reason
            except Exception,e:
                print e
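urllib2.ProxyHandler expects a dict, not a string. Replace the line proxy = proxy.rstrip() with:
    proxy = json.loads(proxy.rstrip())
(and add import json at the top).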
Lines in urls.txt look like this:
http://www.google.com
Lines in proxies.txt look like this:
{"http": "http://ip:port"}
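The suggestion above can be sketched as follows. This is only an illustration: the helper names parse_proxy_line and build_opener_for are made up for this example, the proxy address is a placeholder, and the import fallback is an assumption added so the sketch also runs on Python 3 (the question itself uses Python 2.7).

```python
import json

try:
    import urllib2  # Python 2, as in the question
except ImportError:
    # Assumption: fall back to the Python 3 equivalent so the sketch runs on either version.
    import urllib.request as urllib2


def parse_proxy_line(line):
    """Turn one JSON line from proxies.txt into a dict (hypothetical helper)."""
    return json.loads(line.strip())


def build_opener_for(line):
    """Build an opener that routes requests through the proxy described by `line`."""
    # ProxyHandler takes a dict mapping a scheme to a proxy URL, not a raw string.
    proxy_handler = urllib2.ProxyHandler(parse_proxy_line(line))
    return urllib2.build_opener(proxy_handler)


# Placeholder line in the format shown above; the address is not a real proxy.
sample_line = '{"http": "http://127.0.0.1:8000"}\n'
proxy = parse_proxy_line(sample_line)
opener = build_opener_for(sample_line)
```

Nothing here touches the network. Calling opener.open(url) would send the request through the proxy for that line only, without changing the global opener the way urllib2.install_opener does.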
Also, as per my comment on your post, this will always refer to the first URL:
response=urllib2.urlopen(urls[0])
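A minimal sketch of the corrected loop, assuming the goal is to time the first byte of every URL rather than only the first one. The function names and the open_url parameter are hypothetical; open_url stands for any callable that opens a URL, such as urllib2.urlopen or an opener's open method.

```python
import time


def time_to_first_byte(open_url, url):
    """Return the seconds taken to open `url` and read its first byte."""
    start = time.time()
    response = open_url(url)
    response.read(1)
    return time.time() - start


def measure_all(open_url, urls):
    """Measure every URL in `urls`, keyed by URL."""
    results = {}
    for url in urls:
        # Pass the loop variable `url`, not urls[0], so each URL is actually fetched.
        results[url] = time_to_first_byte(open_url, url)
    return results
```

With the question's code this could be called as measure_all(urllib2.urlopen, urls) once the opener has been installed.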
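urls.txt looks like: "", and proxies.txt looks like: {'http':'ipaddress:8000'}
If you want to load a string into the proxy handler, try using json.loads on each line of proxies.txt to create a dict object. I think the format should be {"http":"}. There may be other problems as well.
Shouldn't response=urllib2.urlopen(url[0]) be response=urllib2.urlopen(url)?
Yes, I think that was a typo. It should be response=urllib2.urlopen(url), as you said. I have already tried the json.loads() approach mentioned above. It gives me this error: obj, end = self.scan_once(s, idx) ValueError: Expecting property name: line 1 column 2 (char 1)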
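A likely cause of that ValueError, assuming the lines in proxies.txt use single quotes as in the {'http':'ipaddress:8000'} example from the comments: JSON only accepts double-quoted strings, so json.loads rejects the line at the first quote. The sketch below compares the two quoting styles; the try_json helper is made up for this example, and ast.literal_eval is shown as one possible alternative for single-quoted Python dict literals, not as something the original answer proposed.

```python
import ast
import json

single_quoted = "{'http':'ipaddress:8000'}"   # format quoted in the comments
double_quoted = '{"http": "ipaddress:8000"}'  # same data as valid JSON


def try_json(text):
    """Return the parsed dict, or None if `text` is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:
        # Python 2 raises ValueError; Python 3's JSONDecodeError is a subclass of it.
        return None


json_single = try_json(single_quoted)      # fails: single quotes are not valid JSON
json_double = try_json(double_quoted)      # succeeds
literal = ast.literal_eval(single_quoted)  # parses the Python-style literal safely
```

Either rewrite proxies.txt with double quotes so json.loads accepts it, or keep the file as it is and parse each line with ast.literal_eval.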