How quickly does Python requests close sockets? (python, multithreading, sockets, python-requests)


I'm trying to do an operation with Python requests. Here is my code:

import threading
import resource
import time
import sys
import requests

# Maximum open-file limit, used to cap the number of threads.
maxOpenFileLimit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]  # For example, it shows 50.
threadLimiter = maxOpenFileLimit

# Will use one session for every thread.
requestSessions = requests.Session()
# Make the connection pool bigger to prevent [Errno -3] when sockets are stuck in CLOSE_WAIT.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=(maxOpenFileLimit + 100))
requestSessions.mount('http://', adapter)
requestSessions.mount('https://', adapter)

def threadAction(a1, a2):
    global number
    time.sleep(1)  # My actions with requests for each thread.
    number = number + 1  # Note: not thread-safe without a lock.
    print(number)

number = 0  # Count of completed actions

ThreadActions = []  # Action tasks.
for i in range(50):  # I have 50 websites I need to do in parallel threads.
    a1 = i
    for n in range(10):  # Every website I need to do in 10 threads.
        a2 = n
        ThreadActions.append(threading.Thread(target=threadAction, args=(a1, a2)))


for item in ThreadActions:
    # But I can't run more than 50 threads at once, because of maxOpenFileLimit.
    while True:
        # Thread limiter, a busy-wait analogue of BoundedSemaphore.
        if threading.active_count() < threadLimiter:
            item.start()
            break
        else:
            continue

for item in ThreadActions:
    item.join()
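The busy-wait limiter above can be replaced by a real threading.BoundedSemaphore, which blocks instead of spinning. A minimal sketch of that approach, with the 50-slot limit hardcoded for illustration and a short sleep standing in for the actual requests work:

```python
import threading
import time

maxOpenFileLimit = 50  # assumed limit; the original reads it from resource.getrlimit

# acquire() blocks until a slot is free, so no CPU is burned spinning.
limiter = threading.BoundedSemaphore(maxOpenFileLimit)
lock = threading.Lock()
number = 0  # count of completed actions

def threadAction(a1, a2):
    global number
    try:
        time.sleep(0.01)  # stands in for the per-site requests work
        with lock:        # make the counter update thread-safe
            number += 1
    finally:
        limiter.release()  # free the slot even if the action raises

threads = []
for i in range(50):
    for n in range(10):
        limiter.acquire()  # blocks while 50 threads are already running
        t = threading.Thread(target=threadAction, args=(i, n))
        t.start()
        threads.append(t)

for t in threads:
    t.join()
print(number)  # 500
```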

But the problem is that after I start 50 threads, the thread limiter starts waiting for some thread to finish its work. And here is the problem: after the script reaches the limiter,

lsof -i | grep python | wc -l

shows far fewer than 50 active connections, but before the limiter it showed all of them.

Your limiter is a tight loop that takes up most of your processing time. Use a thread pool to limit the number of worker threads:

import multiprocessing.pool
import threading
import time
import resource
import requests

maxOpenFileLimit = resource.getrlimit(resource.RLIMIT_NOFILE)[0]

# Will use one session for every thread.
requestSessions = requests.Session()
# Make the connection pool bigger to prevent [Errno -3] when sockets are stuck in CLOSE_WAIT.
adapter = requests.adapters.HTTPAdapter(pool_maxsize=(maxOpenFileLimit + 100))
requestSessions.mount('http://', adapter)
requestSessions.mount('https://', adapter)

number = 0  # Count of completed actions
numberLock = threading.Lock()

def threadAction(args):
    global number
    a1, a2 = args  # map() passes one argument, so pack both into a tuple
    time.sleep(1)  # My actions with requests for each thread.
    with numberLock:  # a bare `number = number + 1` would not be thread-safe
        number += 1

pool = multiprocessing.pool.ThreadPool(50)  # 50 worker threads; chunksize belongs to map(), not here

ThreadActions = []  # Action tasks.
for i in range(50):  # I have 50 websites I need to do in parallel threads.
    for n in range(10):  # Every website I need to do in 10 tasks.
        ThreadActions.append((i, n))

pool.map(threadAction, ThreadActions, chunksize=1)
pool.close()
pool.join()
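Since threadAction takes two arguments, an alternative to packing them into one tuple parameter is ThreadPool.starmap (Python 3), which unpacks each tuple for you. A small self-contained sketch, with arithmetic standing in for the request:

```python
from multiprocessing.pool import ThreadPool

def threadAction(a1, a2):
    return a1 * 10 + a2          # stands in for the per-site request

tasks = [(i, n) for i in range(3) for n in range(2)]

with ThreadPool(4) as pool:      # at most 4 worker threads
    # starmap unpacks each (a1, a2) tuple into the two parameters
    results = pool.starmap(threadAction, tasks, chunksize=1)

print(sorted(results))  # [0, 1, 10, 11, 20, 21]
```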

Your thread limiter runs in a tight loop that eats up most of your processing time. Try something like sleep(.1) to slow it down. Better yet, use a queue bounded at 50 requests and let your threads read from it. Once that is working, look up how to raise the open-file limit from within Python. And of course, make sure you aren't busy-looping where you don't need to and that your code is properly multiplexed.

But why, after the script hits the limit, does lsof -i | grep python | wc -l show a much lower number? And is multiprocessing faster than threading? How does it affect processor load?

It's a tradeoff, and it works differently on Windows than on Linux. With multiprocessing, data has to be serialized between parent and child (on Windows, usually more context has to be serialized, because the child does not get a clone of the parent's memory space), but in exchange you no longer have to worry about being single-threaded through the GIL. Higher CPU load and/or lower data overhead favors multiprocessing; if you are mostly I/O-bound, a thread pool works fine too.
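The bounded-queue approach suggested in the comments can be sketched as follows: a Queue capped at 50 pending tasks, with worker threads pulling from it, so the producer blocks instead of spinning when the queue is full. The worker count and the sleep standing in for the request are illustrative assumptions:

```python
import queue
import threading
import time

NUM_WORKERS = 50  # assumed worker count, matching the 50-thread cap in the question
tasks = queue.Queue(maxsize=50)  # bounded queue: put() blocks when it is full
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work for this worker
            return
        a1, a2 = item
        time.sleep(0.001)         # stands in for the requests call
        with results_lock:        # appending to a shared list needs a lock
            results.append((a1, a2))

workers = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for w in workers:
    w.start()

for i in range(50):
    for n in range(10):
        tasks.put((i, n))   # blocks instead of busy-waiting when 50 tasks are pending

for _ in workers:
    tasks.put(None)         # one sentinel per worker
for w in workers:
    w.join()

print(len(results))  # 500
```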