How do I use thread-local variables with ThreadPoolExecutor in Python?

python multithreading

I want each thread to have some variables of its own. With threading.Thread this can be done elegantly, like this:

import struct
import threading

class TTT(threading.Thread):
    def __init__(self, lines, ip, port):
        threading.Thread.__init__(self)
        self._lines = lines
        self._sock = initsock(ip, port)  # one socket per thread instance
        self._sts = 0
        self._cts = 0

    def run(self):
        for line in self._lines:
            query = genquery(line)  # assumed to return bytes
            length = len(query)
            head = 0xFFFFFFFE  # an integer, as the 'I' format requires
            q = struct.pack('II%ds' % length, head, length, query)
            self._sock.send(q)
            self._sock.recv(4)
            length, = struct.unpack('I', self._sock.recv(4))
            result = b''
            remain = length
            while remain:
                t = self._sock.recv(remain)
                result += t
                remain -= len(t)
            print(result)

As you can see, these variables are independent in each thread.

With concurrent.futures.ThreadPoolExecutor, however, it does not seem so easy. How can I do this elegantly with ThreadPoolExecutor, without falling back on global variables?

New edit

class Processor(object):
    def __init__(self, host, port):
        self._sock = self._init_sock(host, port)

    def __call__(self, address, adcode):
        self._send_data(address, adcode)
        result = self._recv_data()
        return json.loads(result)

def main():
    args = parse_args()
    adcode = {"shenzhen": 440300}[args.city]

    if args.output:
        fo = open(args.output, "w", encoding="utf-8")
    else:
        fo = sys.stdout
    with open(args.file, encoding=args.encoding) as fi, fo,\
        ThreadPoolExecutor(max_workers=args.processes) as executor:
        reader = csv.DictReader(fi)
        writer = csv.DictWriter(fo, reader.fieldnames + ["crfterm"])
        test_set = AddressIter(args.file, args.field, args.encoding)
        func = Processor(args.host, args.port)
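        # Submit eagerly and iterate in submission order so each row is paired
        # with its own result; as_completed() yields in completion order.
        futures = [executor.submit(func, x, adcode) for x in test_set]
        for row, future in zip(reader, futures):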
            result = future.result()
            row["crfterm"] = join_segs_tags(result["segs"], result["tags"])
            writer.writerow(row)

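The simplest thing would be to use a layout very similar to what you have now. Use a plain object instead of a Thread, and implement the logic in __call__ instead of run: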

class TTT:
    def __init__(self, lines, ip, port):
        self._lines = lines
        self._sock = initsock(ip, port)
        self._sts = 0
        self._cts = 0

    def __call__(self):
        ...
        # do stuff to self

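Adding a __call__ method to a class lets you call its instances just like regular functions. In fact, regular functions are objects that have this method. You can now pass a bunch of TTT instances to either map or submit.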
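As a rough illustration of the callable-object approach, the sketch below submits instances directly to an executor. LineCounter is a hypothetical stand-in for TTT that counts characters instead of talking to a socket; only the pattern carries over.

```python
from concurrent.futures import ThreadPoolExecutor


class LineCounter:
    """Hypothetical stand-in for TTT: holds its own state and is callable."""

    def __init__(self, lines):
        self._lines = lines
        self._count = 0  # per-instance state, touched only by the task running this instance

    def __call__(self):
        for line in self._lines:
            self._count += len(line)
        return self._count


tasks = [LineCounter(["ab", "cde"]), LineCounter(["x"]), LineCounter([])]

with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() accepts any callable, including an instance that defines __call__.
    futures = [executor.submit(task) for task in tasks]
    totals = [future.result() for future in futures]

print(totals)
```

Because each instance is submitted exactly once, its attributes behave like variables private to that task, much as they did with one Thread subclass instance per thread.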

Alternatively, you can absorb the initialization into the task function:

def ttt(lines, ip, port):
    sock = initsock(ip, port)
    sts = cts = 0
    ...

Now you can call submit with the right argument list, or call map with an iterable of values for each parameter.
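The two calling styles could look roughly like this. The body of ttt and the addresses and ports are placeholders invented for the example; a real task would open its socket with initsock as in the answer.

```python
from concurrent.futures import ThreadPoolExecutor


def ttt(lines, ip, port):
    # Hypothetical body: everything created here is local to this call.
    sts = cts = 0
    for line in lines:
        sts += 1
        cts += len(line)
    return (ip, port, sts, cts)


with ThreadPoolExecutor(max_workers=2) as executor:
    # submit(): one call, with the arguments passed after the function.
    single = executor.submit(ttt, ["a", "bc"], "10.0.0.1", 8000).result()

    # map(): one iterable per parameter; results come back in input order.
    batches = [["a"], ["bb", "cc"]]
    ips = ["10.0.0.1", "10.0.0.2"]
    ports = [8000, 8001]
    mapped = list(executor.map(ttt, batches, ips, ports))

print(single)
print(mapped)
```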

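For this example, I prefer the former approach because it opens the port outside the executor. Error reporting in executor tasks can sometimes be tricky, and I would rather keep the error-prone operation of opening a port as transparent as possible.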
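To see why errors inside executor tasks can be easy to miss, consider this small sketch. open_port is a hypothetical stand-in for an error-prone setup step such as initsock; the exception it raises is stored on the future and only surfaces when the result is requested.

```python
from concurrent.futures import ThreadPoolExecutor


def open_port(port):
    # Hypothetical stand-in for an error-prone setup step such as initsock().
    if port < 1024:
        raise PermissionError(f"cannot open privileged port {port}")
    return port


with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(open_port, 80)

# Nothing has been raised in the main thread so far; the exception sits on the future.
error = future.exception()

try:
    future.result()  # re-raises the stored exception in the calling thread
    reraised = False
except PermissionError:
    reraised = True

print(type(error).__name__, reraised)
```

If the result of a future is never requested, the failure goes unnoticed, which is one argument for doing fragile setup before handing work to the pool.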

Edit

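Based on your related question, I believe the real question you are asking is about function-local variables (which are automatically thread-local as well) not being shared between function calls on the same thread. You can always pass references between function calls, however.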
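A minimal sketch of that point: the state dictionary below is created inside the task, so every call gets a fresh one, and the helper sees it only because a reference is passed in. The names task, record and state are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def record(state, item):
    # The helper can update the caller's state only through the reference it receives.
    state["seen"].append(item)


def task(items):
    state = {"seen": []}  # local variable: a fresh dict for every call
    for item in items:
        record(state, item)
    return state["seen"]


with ThreadPoolExecutor(max_workers=2) as executor:
    results = list(executor.map(task, [[1, 2], [3], [4, 5, 6]]))

print(results)
```

No call ever sees another call's list, even when the pool reuses the same worker thread for several tasks.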

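Your function can actually be a callable object. Show what you are doing with the thread pool now and I will tell you how to fix it.

@MadPhysicist Wait a minute.

@MadPhysicist Are these details enough?

Sure. I would like to know more about how you tried to use the thread pool, but my answer is basically accurate.

I added my new code, with __call__ and submit, but how should I use this class? In my code, Processor is initialized only once; if I do executor.submit(Processor(args.host, args.port), x, adcode), it will be initialized on every call.

@roger #1: That actually means all your threads share state; you are not asking what you think you are. #2: You can always do proc = Processor(...); submit(proc, ...)

Yes, in my code all threads share the same state, and with proc = Processor(...); executor.submit(proc, ...) that also means proc is not bound to a thread. What I still want is for the _sock variable to be a thread-bound variable.

I have combined my questions and asked a new question here.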
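One possible way to get the thread-bound connection discussed in these comments, offered as a sketch rather than as the answer the thread settled on, is to combine threading.local with the initializer and initargs parameters of ThreadPoolExecutor (available since Python 3.7). FakeConnection, init_worker and process are hypothetical names; a real version would create the socket with initsock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_local = threading.local()  # each worker thread sees its own attributes on this object


class FakeConnection:
    """Hypothetical stand-in for a socket; a real version would call initsock(host, port)."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self.owner = threading.get_ident()  # remember which thread created it

    def query(self, text):
        return text.upper()


def init_worker(host, port):
    # Runs once in each worker thread when the pool starts that thread.
    _local.conn = FakeConnection(host, port)


def process(text):
    conn = _local.conn  # the connection created by this thread's initializer
    return conn.query(text), conn.owner == threading.get_ident()


with ThreadPoolExecutor(max_workers=3, initializer=init_worker,
                        initargs=("localhost", 9000)) as executor:
    results = list(executor.map(process, ["a", "b", "c", "d"]))

print(results)
```

The second element of each result confirms that a task only ever uses the connection created by the thread it runs on, so one connection is opened per worker instead of one per task or one shared by all.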