How quickly does Python requests close sockets? (python, multithreading, sockets, python-requests)


I'm trying to do an operation with Python requests. Here is my code:

import threading
import resource
import time
import sys
import requests

# Maximum open-file limit, used to cap the number of threads.
maxOpenFileLimit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]  # For example, it shows 50.
threadLimiter = maxOpenFileLimit

# Will use one session for every thread.
requestSessions = requests.Session()
# Make the connection pool bigger to prevent [Errno -3] when sockets are stuck in CLOSE_WAIT.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=(maxOpenFileLimit + 100))
requestSessions.mount('http://', adapter)
requestSessions.mount('https://', adapter)

def threadAction(a1, a2):
    global number
    time.sleep(1)  # My actions with requests for each thread.
    number = number + 1  # Note: not thread-safe without a lock.
    print(number)

number = 0  # Count of completed actions

ThreadActions = []  # Action tasks.
for i in range(50):  # I have 50 websites I need to do in parallel threads.
    a1 = i
    for n in range(10):  # Every website I need to do in 10 threads.
        a2 = n
        ThreadActions.append(threading.Thread(target=threadAction, args=(a1, a2)))


for item in ThreadActions:
    # But I can't run more than 50 threads at once, because of maxOpenFileLimit.
    while True:
        # Thread limiter, a busy-wait analogue of BoundedSemaphore.
        if threading.active_count() < threadLimiter:
            item.start()
            break
        else:
            continue

for item in ThreadActions:
    item.join()
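The busy-wait limiter above can be replaced by a real threading.BoundedSemaphore, which blocks instead of spinning. A minimal sketch of that approach, with the 50-slot limit hardcoded for illustration and a short sleep standing in for the actual requests work:

```python
import threading
import time

maxOpenFileLimit = 50  # assumed limit; the original reads it from resource.getrlimit

# acquire() blocks until a slot is free, so no CPU is burned spinning.
limiter = threading.BoundedSemaphore(maxOpenFileLimit)
lock = threading.Lock()
number = 0  # count of completed actions

def threadAction(a1, a2):
    global number
    try:
        time.sleep(0.01)  # stands in for the per-site requests work
        with lock:        # make the counter update thread-safe
            number += 1
    finally:
        limiter.release()  # free the slot even if the action raises

threads = []
for i in range(50):
    for n in range(10):
        limiter.acquire()  # blocks while 50 threads are already running
        t = threading.Thread(target=threadAction, args=(i, n))
        t.start()
        threads.append(t)

for t in threads:
    t.join()
print(number)  # 500
```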

But the problem is that after I start 50 threads, the thread limiter starts waiting for some thread to finish its work. And here is the problem: after the script reaches the limiter,

lsof -i | grep python | wc -l

shows far fewer than 50 active connections, but before the limiter it showed all of them.

Your limiter is a tight loop that takes up most of your processing time. Use a thread pool to limit the number of worker threads:

import multiprocessing.pool
import threading
import time
import resource
import requests

maxOpenFileLimit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]

# Will use one session for every thread.
requestSessions = requests.Session()
# Make the connection pool bigger to prevent [Errno -3] when sockets are stuck in CLOSE_WAIT.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=(maxOpenFileLimit + 100))
requestSessions.mount('http://', adapter)
requestSessions.mount('https://', adapter)

number = 0  # Count of completed actions
numberLock = threading.Lock()

def threadAction(args):
    global number
    a1, a2 = args  # map() passes one argument, so pack both into a tuple
    time.sleep(1)  # My actions with requests for each thread.
    with numberLock:  # a bare `number = number + 1` would not be thread-safe
        number += 1

pool = multiprocessing.pool.ThreadPool(50)  # 50 worker threads; chunksize belongs to map(), not here

ThreadActions = []  # Action tasks.
for i in range(50):  # I have 50 websites I need to do in parallel threads.
    for n in range(10):  # Every website I need to do in 10 tasks.
        ThreadActions.append((i, n))

pool.map(threadAction, ThreadActions, chunksize=1)
pool.close()
pool.join()
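Since threadAction takes two arguments, an alternative to packing them into one tuple parameter is ThreadPool.starmap (Python 3), which unpacks each tuple for you. A small self-contained sketch, with arithmetic standing in for the request:

```python
from multiprocessing.pool import ThreadPool

def threadAction(a1, a2):
    return a1 * 10 + a2          # stands in for the per-site request

tasks = [(i, n) for i in range(3) for n in range(2)]

with ThreadPool(4) as pool:      # at most 4 worker threads
    # starmap unpacks each (a1, a2) tuple into the two parameters
    results = pool.starmap(threadAction, tasks, chunksize=1)

print(sorted(results))  # [0, 1, 10, 11, 20, 21]
```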

Your thread limiter runs in a tight loop that eats up most of your processing time. Try something like sleep(.1) to slow it down. Better yet, use a queue bounded at 50 requests and let your threads read from it. Once that is working, look up how to raise the open-file limit from within Python. And of course, make sure you aren't busy-looping where you don't need to and that your code is properly multiplexed.

But why, after the script hits the limit, does lsof -i | grep python | wc -l show a much lower number? And is multiprocessing faster than threading? How does it affect processor load?

It's a tradeoff, and it works differently on Windows than on Linux. With multiprocessing, data has to be serialized between parent and child (on Windows, usually more context has to be serialized, because the child does not get a clone of the parent's memory space), but in exchange you no longer have to worry about being single-threaded through the GIL. Higher CPU load and/or lower data overhead favors multiprocessing; if you are mostly I/O-bound, a thread pool works fine too.
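The bounded-queue approach suggested in the comments can be sketched as follows: a Queue capped at 50 pending tasks, with worker threads pulling from it, so the producer blocks instead of spinning when the queue is full. The worker count and the sleep standing in for the request are illustrative assumptions:

```python
import queue
import threading
import time

NUM_WORKERS = 50  # assumed worker count, matching the 50-thread cap in the question
tasks = queue.Queue(maxsize=50)  # bounded queue: put() blocks when it is full
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work for this worker
            return
        a1, a2 = item
        time.sleep(0.001)         # stands in for the requests call
        with results_lock:        # appending to a shared list needs a lock
            results.append((a1, a2))

workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

for i in range(50):
    for n in range(10):
        tasks.put((i, n))   # blocks instead of busy-waiting when 50 tasks are pending

for _ in workers:
    tasks.put(None)         # one sentinel per worker
for w in workers:
    w.join()

print(len(results))  # 500
```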