Thread local storage in Python

python multithreading

How do I use thread local storage in Python?

Related

- This thread appears to be more focused on when variables are shared
- Alex Martelli gives a nice solution

As noted in the question, Alex Martelli gives a solution. This function lets us use a factory function to generate a default value for each thread.

#Code originally posted by Alex Martelli
#Modified to use standard Python variable name conventions
import threading
threadlocal = threading.local()    

def threadlocal_var(varname, factory, *args, **kwargs):
  v = getattr(threadlocal, varname, None)
  if v is None:
    v = factory(*args, **kwargs)
    setattr(threadlocal, varname, v)
  return v
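As a rough usage sketch (not part of the original answer), the helper can be called from several threads to confirm that each thread gets its own value and that repeated calls within one thread return the same object. The variable name "scratch", the worker names, and the results dictionary are assumptions made up for this illustration.

```python
import threading

threadlocal = threading.local()

def threadlocal_var(varname, factory, *args, **kwargs):
    # Same helper as above: build the value on first access in each thread.
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v

results = {}  # thread name -> (same object on both calls?, contents); illustration only

def worker():
    name = threading.current_thread().name
    first = threadlocal_var("scratch", list)   # "scratch" is a hypothetical name
    second = threadlocal_var("scratch", list)  # second call reuses the stored list
    first.append(name)
    results[name] = (first is second, list(second))

threads = [threading.Thread(target=worker, name="worker-%d" % i) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

for name in sorted(results):
    print(name, results[name])
# worker-0 (True, ['worker-0'])
# worker-1 (True, ['worker-1'])
# worker-2 (True, ['worker-2'])
```

One caveat of this design: the helper treats None as "not set yet", so a factory that legitimately returns None would be called again on every access.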

You can also write

import threading
mydata = threading.local()
mydata.x = 1

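mydata.x will only exist in the current thread. Thread local storage is useful, for instance, if you have a thread worker pool and each thread needs access to its own resource, such as a network or database connection. Note that the threading module uses the regular concept of threads (which have access to the process-global data), but because of the global interpreter lock these threads bring little benefit for CPU-bound work. The separate multiprocessing module creates a new sub-process for each worker, so any global is local to that worker.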
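The worker-pool case could look roughly like the sketch below, which assumes an in-memory SQLite database as a stand-in for a real network or database connection. The names get_connection, task, and opened are hypothetical; the point is only that each pool thread lazily opens its own connection and reuses it for later tasks.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

local = threading.local()  # created once, at module level
opened = []                # names of threads that opened a connection (illustration only)

def get_connection():
    # Open one connection per worker thread, on first use in that thread.
    conn = getattr(local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(":memory:")
        local.conn = conn
        opened.append(threading.current_thread().name)
    return conn

def task(n):
    # Every task uses the connection that belongs to the thread running it.
    conn = get_connection()
    return conn.execute("SELECT ? * ?", (n, n)).fetchone()[0]

with ThreadPoolExecutor(max_workers=2) as pool:
    squares = list(pool.map(task, range(6)))

print(squares)      # [0, 1, 4, 9, 16, 25]
print(len(opened))  # 1 or 2, depending on how the pool schedules the tasks
```

A real pool would also close each connection when its worker shuts down; that part is left out to keep the sketch short.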

threading module

Here is a simple example:

import threading
from threading import current_thread

threadLocal = threading.local()

def hi():
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("Nice to meet you", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

hi(); hi()

This will print out:

Nice to meet you MainThread
Welcome back MainThread
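To see the per-thread effect rather than just the main thread, the same check can be run from extra threads. This is a sketch under assumptions: the thread names worker-a and worker-b and the messages list are made up here, and the threads are run one at a time so the order of the messages is deterministic.

```python
import threading

threadLocal = threading.local()
messages = []  # collected instead of printed directly, to keep the order easy to inspect

def hi():
    name = threading.current_thread().name
    if getattr(threadLocal, "initialized", None) is None:
        messages.append("Nice to meet you " + name)
        threadLocal.initialized = True
    else:
        messages.append("Welcome back " + name)

def greet_twice():
    hi()
    hi()

greet_twice()  # runs in the main thread
for name in ("worker-a", "worker-b"):
    t = threading.Thread(target=greet_twice, name=name)
    t.start()
    t.join()   # wait before starting the next thread

print("\n".join(messages))
```

Each thread is greeted as new exactly once, because the initialized attribute set by one thread is not visible to the others.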

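One important thing that is easily overlooked: a threading.local() object only needs to be created once, not once per thread and not once per function call. The global or class level are ideal locations.

Here is why: threading.local() actually creates a new instance each time it is called (just like any factory or class call would), so calling threading.local() multiple times constantly overwrites the original object, which in all likelihood is not what one wants. When any thread accesses an existing threadLocal variable (or whatever it is called), it gets its own private view of that variable.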
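The "private view" can be illustrated with a small sketch: one threading.local() object created at module level, written to by the main thread and by a worker. The names shared and seen_in_worker are assumptions for this example.

```python
import threading

shared = threading.local()          # created exactly once, at module level
shared.value = "set in main thread"

seen_in_worker = []

def worker():
    # The worker starts with an empty view: the main thread's attribute is not there.
    seen_in_worker.append(hasattr(shared, "value"))
    shared.value = "set in worker"
    seen_in_worker.append(shared.value)

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen_in_worker)  # [False, 'set in worker']
print(shared.value)    # set in main thread
```

The worker's assignment does not disturb the main thread's value, even though both threads use the same shared object.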

This won't work as intended:

import threading
from threading import current_thread

def wont_work():
    threadLocal = threading.local()  # oops, this creates a new local object on every call!
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("First time for", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

wont_work(); wont_work()

Will result in this output:

First time for MainThread
First time for MainThread

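multiprocessing module

All global variables are effectively thread local here, since the multiprocessing module creates a new process for each worker instead of a thread.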

Consider this example, where the processed counter behaves like thread local storage (each worker process has its own copy):
from multiprocessing import Pool
from random import random
from time import sleep
import os

processed=0

def f(x):
    sleep(random())
    global processed
    processed += 1
    print("Processed by %s: %s" % (os.getpid(), processed))
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    print(pool.map(f, range(10)))

It will output something like this:
Processed by 7636: 1
Processed by 9144: 1
Processed by 5252: 1
Processed by 7636: 2
Processed by 6248: 1
Processed by 5252: 2
Processed by 6248: 2
Processed by 9144: 2
Processed by 7636: 3
Processed by 5252: 3
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

... of course, the process IDs, the counts for each process, and the order will vary from run to run.
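For contrast, here is a sketch of what happens with threads instead of processes: an ordinary global is shared by all threads (and needs a lock for safe updates), while a threading.local() attribute gives each thread its own count. The names lock, per_thread, and per_thread_totals are assumptions for this illustration.

```python
import threading

processed = 0                   # ordinary global: shared by every thread in the process
lock = threading.Lock()
per_thread = threading.local()  # each thread sees its own attributes on this object
per_thread_totals = {}          # thread name -> that thread's private count

def f(items):
    global processed
    per_thread.count = 0
    for _ in items:
        with lock:              # the shared counter needs a lock to be updated safely
            processed += 1
        per_thread.count += 1   # private to this thread, so no lock is needed
    per_thread_totals[threading.current_thread().name] = per_thread.count

threads = [threading.Thread(target=f, args=(range(5),), name="t%d" % i) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(processed)                          # 20
print(sorted(per_thread_totals.items()))  # [('t0', 5), ('t1', 5), ('t2', 5), ('t3', 5)]
```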
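Thread-local storage can most easily be thought of as a namespace (with values accessed via attribute notation). The difference is that each thread transparently gets its own set of attributes/values, so that one thread doesn't see the values from another thread.
Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables. Each one is a separate namespace.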
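A minimal sketch of the "separate namespace" point, assuming two module-level instances with the made-up names settings and cache:

```python
import threading

settings = threading.local()  # hypothetical namespace no. 1
cache = threading.local()     # hypothetical namespace no. 2

settings.name = "main-settings"
cache.name = "main-cache"

worker_view = []

def worker():
    # Neither namespace carries over the main thread's attributes.
    worker_view.append((hasattr(settings, "name"), hasattr(cache, "name")))
    settings.name = "worker-settings"
    # Setting an attribute on one namespace does not touch the other.
    worker_view.append((settings.name, hasattr(cache, "name")))

t = threading.Thread(target=worker)
t.start()
t.join()

print(worker_view)                # [(False, False), ('worker-settings', False)]
print(settings.name, cache.name)  # main-settings main-cache
```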
Here is a simple example:
import threading

class Worker(threading.Thread):
    ns = threading.local()
    def run(self):
        self.ns.val = 0
        for i in range(5):
            self.ns.val += 1
            print("Thread:", self.name, "value:", self.ns.val)

w1 = Worker()
w2 = Worker()
w1.start()
w2.start()
w1.join()
w2.join()

Output:
Thread: Thread-1 value: 1
Thread: Thread-2 value: 1
Thread: Thread-1 value: 2
Thread: Thread-2 value: 2
Thread: Thread-1 value: 3
Thread: Thread-2 value: 3
Thread: Thread-1 value: 4
Thread: Thread-2 value: 4
Thread: Thread-1 value: 5
Thread: Thread-2 value: 5

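Note how each thread maintains its own counter, even though the ns attribute is a class member (and hence shared between the threads).
The same example could have used an instance variable or a local variable, but that wouldn't show much, as there is no sharing then (a dict would work just as well). There are cases where you need thread-local storage as instance variables or local variables, but they tend to be relatively rare (and pretty subtle).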
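One such case might look like the sketch below, where a threading.local() is held as an instance member so that every (instance, thread) pair has its own state. The Tally class and its add method are hypothetical and only meant to illustrate the idea.

```python
import threading

class Tally:
    """Hypothetical class: each instance keeps a separate running total per thread."""

    def __init__(self):
        self._local = threading.local()  # one namespace per Tally instance

    def add(self, n):
        # Start from 0 the first time the calling thread uses this instance.
        self._local.total = getattr(self._local, "total", 0) + n
        return self._local.total

a = Tally()
b = Tally()
results = {}

def worker(name, amount):
    a.add(amount)
    # Two calls on a, one on b: the totals are tracked per instance and per thread.
    results[name] = (a.add(amount), b.add(1))

threads = [threading.Thread(target=worker, args=("t%d" % i, i + 1)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_total = a.add(100)  # the main thread has its own total on the same instance

print(sorted(results.items()))  # [('t0', (2, 1)), ('t1', (4, 1)), ('t2', (6, 1))]
print(main_total)               # 100
```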
My way of doing thread local storage across modules/files. The following has been tested in Python 3.5 -
# fileA.py
import threading
import fileB

def functionOne():
    # Start a thread whose target function lives in another module
    thread = threading.Thread(target=fileB.functionTwo)
    thread.start()

# fileB.py
import threading
import fileC

def functionTwo():
    currentThread = threading.current_thread()
    dictionary = currentThread.__dict__
    dictionary["localVar1"] = "store here"   # Thread local storage
    fileC.function3()

# fileC.py
import threading

def function3():
    currentThread = threading.current_thread()
    dictionary = currentThread.__dict__
    print(dictionary["localVar1"])           # Access thread local storage

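In fileA, I start a thread whose target function is in another module/file.
In fileB, I set a local variable on that thread.
In fileC, I access the thread local variable of the current thread.
Additionally, just print the 'dictionary' variable to see the default values that are available, such as kwargs, args, etc.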
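The three files can be condensed into a single-file sketch of the same idea, which may be easier to try out. The function names function_two and function_three, the thread names, and the seen list are assumptions for this illustration; vars(thread) is just another way to reach the thread object's __dict__.

```python
import threading

seen = []  # values read back inside each thread (illustration only)

def function_two():
    # Store a value in the current Thread object's attribute dictionary.
    thread = threading.current_thread()
    vars(thread)["localVar1"] = "stored by " + thread.name
    function_three()

def function_three():
    # Called in the same thread, so it finds the value stored above.
    seen.append(vars(threading.current_thread())["localVar1"])

for name in ("first", "second"):
    t = threading.Thread(target=function_two, name=name)
    t.start()
    t.join()

print(seen)                                             # ['stored by first', 'stored by second']
print("localVar1" in vars(threading.current_thread()))  # False: the main thread never set it
```

Note that keys stored this way share a namespace with the Thread object's own attributes, so a module-level threading.local() imported by each module is usually a less intrusive way to get the same effect.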
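If you're going to do this, what you really want is probably defaultdict + ThreadLocalDict, but I don't think there is a stock implementation of this. (defaultdict should really be part of dict, e.g. dict(default=int), which would eliminate the need for a "ThreadLocalDefaultDict".)
@Glenn, the problem with dict(default=int) is that the dict() constructor takes kwargs and adds them to the dict. So if that were implemented, people would not be able to specify a key called 'default'. But I actually think that is a small price to pay for an implementation like the one you show. After all, there are other ways to add a key to a dictionary.
@Evan - I agree that this design would be better, but it would break backwards compatibility.
@Glenn, I use this approach for plenty of thread-local variables that are not defaultdicts, if that's what you mean. If you mean it has an interface similar to the one defaultdict should have (providing optional positional and named args to the factory function: every time you can store a callback, you should be able to optionally pass args for it!-), then, sorta, except that I typically use different factories and args for different variable names. The approach I give also works fine on Python 2.4 (don't ask...!-)
@Casebash: Shouldn't the call threadlocal = threading.local() be inside the threadlocal_var() function, so that it gets the local for the thread that is calling it?
I'm not sure what you're asking; threading.local is documented, and you've more or less pasted the documentation below...
@Glenn I pasted the documentation in one of my answers. I quoted Alex's solution in the other. I was just trying to make this content more accessible.
Imagine criticizing some helpful volunteer for reformatting critical documentation into a mobile-accessible StackOverflow answer, when it was previously only readable by manually typing obscure Python statements (e.g., import _threading_local as tl\nhelp(tl)) into an interactive CLI REPL.
Rather than putting such code in its own answer, why not edit your question?
@Evan: Because there are two basic approaches, which are really separate answers.
"Note that the threading module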