Python: multithreading for a URL-opening process

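I have finished writing a script that checks whether a URL requires WWW basic authentication and prints the result for the user. The script is as follows: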

#!/usr/bin/python

# Importing libraries
from urllib2 import urlopen, HTTPError
import socket
import urllib2
import threading
import time

# Setting up variables
url = open("oo.txt",'r')
response = None
start = time.time()

# Executing commands
start = time.time()
for line in url:
    try:
        response = urlopen(line, timeout=1)
    except HTTPError as exc:
        # A 401 unauthorized will raise an exception
        response = exc
    except socket.timeout:
        print ("{0} | Request timed out !!".format(line))
    except urllib2.URLError:
        print ("{0} | Access error !!".format(line))

    auth = response and response.info().getheader('WWW-Authenticate')
    if auth and auth.lower().startswith('basic'):
        print "requires basic authentication"
    elif socket.timeout or urllib2.URLError:
        print "Yay"
    else:
        print "Not requires basic authentication"

print "Elapsed Time: %s" % (time.time() - start)

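I need help with a few small changes to the script. I want it to check the URLs 10 at a time and write the results for all URLs to a text file in one go. I have read about multithreading and multiprocessing, but I could not find an example that matches my case and keeps the code simple.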

I also have a problem with the output when a timeout or URL error occurs: the script prints the result on two lines, like this:

http://www.test.test
 | Access error !!

I want it on one line. Why is it displayed on two?

Can anyone help with this?

Thanks in advance.

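The concurrent.futures package provides functionality that makes it very easy to use concurrency in Python. You define a function, check_url, that should be called for each URL. Then you can use the map function to apply that function to every URL in parallel and iterate over the return values.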

#! /usr/bin/env python3

import concurrent.futures
import urllib.error
import urllib.request
import socket

def load_urls(pathname):
    with open(pathname, 'r') as f:
        return [ line.rstrip('\n') for line in f ]

class BasicAuth(Exception): pass

class CheckBasicAuthHandler(urllib.request.BaseHandler):
    def http_error_401(self, req, fp, code, msg, hdrs):
        if hdrs.get('WWW-Authenticate', '').lower().startswith('basic'):
            raise BasicAuth()
        return None

def check_url(url):
    try:
        opener = urllib.request.build_opener(CheckBasicAuthHandler())
        with opener.open(url, timeout=1) as u:
            return 'requires no authentication'
    except BasicAuth:
        return 'requires basic authentication'
    except socket.timeout:
        return 'request timed out'
    except urllib.error.URLError as e:
        return 'access error ({!r})'.format(e.reason)

if __name__ == '__main__':
    urls = load_urls('/tmp/urls.txt')
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for url, result in zip(urls, executor.map(check_url, urls)):
            print('{}: {}'.format(url, result))
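
The question also asks why the output appears on two lines and how to get all results into one text file. The likely cause of the two lines is that each line read from a file keeps its trailing newline, so formatting the raw line prints the URL, a line break, and then the message. The following is a minimal sketch of one way to handle both points, not a definitive implementation. The helper names clean_lines, check_all and format_report are hypothetical and do not appear in the original answer, and any function that takes a URL and returns a string (such as check_url above) can be passed as the check argument.

```python
import concurrent.futures


def clean_lines(lines):
    """Strip surrounding whitespace (including the trailing newline) and skip blank lines."""
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line]


def check_all(urls, check, max_workers=10):
    """Call check(url) for every URL using at most max_workers threads.

    executor.map yields results in the same order as the input URLs.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(zip(urls, executor.map(check, urls)))


def format_report(results):
    """Build the report text: one 'url | result' line per URL."""
    return ''.join('{} | {}\n'.format(url, result) for url, result in results)
```

Assuming the URLs are in oo.txt as in the question, the report could then be written in one go with something like: open the input file, pass it through clean_lines, call check_all(urls, check_url), and write format_report(results) to a file such as results.txt (a file name chosen here only for illustration).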

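Thanks, but sorry, I don't understand the "def" functions. What code should I put inside this def? And what is the solution to the line problem I mentioned?

I have gone to the effort of writing out the whole program. It is based on Python 3, though.

Hi, thanks for the code. It works, but there is one problem: when a page requires auth, it returns the value "access error", because the auth error is included in urllib.error.URLError. Can you tell me how to solve this?

Checking for basic auth is a bit complicated, because urllib is a higher-level API that handles status codes internally. I have updated the answer for urllib, but it might be better to use http.client.

Related: (can easily be simplified for your case)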
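
The last comment suggests http.client as an alternative but does not show it. The following is a rough sketch of what that might look like, under several assumptions: it sends a HEAD request (some servers answer HEAD differently from GET), it does not follow redirects, and it only looks at the start of the WWW-Authenticate header value. The helper name is_basic_auth is hypothetical, and this check_url is an illustrative variant, not the answer author's code.

```python
import http.client
import socket
import urllib.parse


def is_basic_auth(status, www_authenticate):
    """Return True for a 401 response whose challenge starts with the Basic scheme."""
    return status == 401 and (www_authenticate or '').lower().startswith('basic')


def check_url(url, timeout=1):
    """Send one HEAD request with http.client and classify the response."""
    parts = urllib.parse.urlsplit(url)
    conn_class = (http.client.HTTPSConnection if parts.scheme == 'https'
                  else http.client.HTTPConnection)
    conn = None
    try:
        conn = conn_class(parts.netloc, timeout=timeout)
        path = parts.path or '/'
        if parts.query:
            path += '?' + parts.query
        conn.request('HEAD', path)
        response = conn.getresponse()
        # http.client does not raise on 401, so the status can be inspected directly.
        if is_basic_auth(response.status, response.getheader('WWW-Authenticate')):
            return 'requires basic authentication'
        return 'no basic authentication (HTTP status {})'.format(response.status)
    except socket.timeout:
        return 'request timed out'
    except (OSError, http.client.HTTPException) as e:
        # Covers DNS failures, refused connections, and malformed responses.
        return 'access error ({!r})'.format(e)
    finally:
        if conn is not None:
            conn.close()
```

Because http.client returns the response object for any status code, a 401 no longer surfaces as an exception, which avoids the "access error" result described in the comments. This check_url has the same one-argument shape as the one in the answer, so it could be passed to executor.map in the same way.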