Concurrency when reading and writing a JSON file from multiple threads in Python

python json multithreading concurrency


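I want to read and write a JSON file from multiple threads in Python.

Each thread:

Initial setup) open(file_path, "w+") (if the file is empty, just dump an empty JSON object)

When writing, while holding a threading.Lock:

1) load the JSON file into memory

2) update the in-memory JSON with the new key and value

3) dump the current (in-memory) JSON back to the file

Because a lock is held while writing, I thought reading and writing the file would be safe even with multiple threads running. But it raises an error.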

import json
import threading

class Writer(object):
    def __init__(self, path):
        self._lock = threading.Lock()
        self.path = path
        self.current_json = None
        self._init_opend_file()

    def _init_opend_file(self):
        with self._lock:
            self._opened_file = open(self.path, "w+")
            if self._opened_file.read() == "":
                json.dump({}, self._opened_file)
            else:
                pass

    def write(self, key, value):
        with self._lock:
            self._opened_file.seek(0)
            self.current_json = json.load(self._opened_file)
            self.current_json[key] = value
            self._opened_file.seek(0)
            self._opened_file.truncate()
            json.dump(self.current_json, self._opened_file)

if __name__ == "__main__":
    path = r"D:\test.json"

    def run(name, range_):
        writer = Writer(path)
        for i in range(range_):
            writer.write(name, i)

    t1 = threading.Thread(target=run, args=("one", 1000))
    t2 = threading.Thread(target=run, args=("two", 2000))
    t1.start()
    t2.start()

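I expect to get {"one": 1000, "two": 2000} in test.json, but I got {"one": 1}"two": 1}. It looks as if several threads access the file at the same time and write different content, but I do not understand why that happens when I use threading.Lock().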

Exception in thread Thread-2:
Traceback (most recent call last):
  File "D:\Anaconda3_64\envs\atom\lib\threading.py", line 917, in _bootstrap_inner
    self.run()
  File "D:\Anaconda3_64\envs\atom\lib\threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "D:/Dropbox/000_ComputerScience/000_개발/Quant/separator/json_test.py", line 37, in run
    writer.write(name, i)
  File "D:/Dropbox/000_ComputerScience/000_개발/Quant/separator/json_test.py", line 24, in write
    self.current_json = json.load(self._opened_file)
  File "D:\Anaconda3_64\envs\atom\lib\json\__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "D:\Anaconda3_64\envs\atom\lib\json\__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "D:\Anaconda3_64\envs\atom\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\Anaconda3_64\envs\atom\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

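This happens because the two threads do not share the same lock: every call to run creates its own Writer, and every Writer creates its own threading.Lock, so neither lock ever blocks the other thread. Share a single lock (or a single Writer instance) between the threads.

To manage the threads you can use ThreadPoolExecutor, or extend the class as

class Writer(threading.Thread):

ThreadPoolExecutor takes care of creating, scheduling, and joining the worker threads. It does not synchronize access to a shared file, so writes that must not interleave still need a lock.

For details, see the documentation for ThreadPoolExecutor and threading.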

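Example with ThreadPoolExecutor:
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor

write_lock = threading.Lock()  # serializes writes to the shared file object

def data_write(z):
    sleep_wait = random.randint(0, 2)
    print("sleeping:", sleep_wait, ", data:", z)
    time.sleep(sleep_wait)
    # print() emits the text and the newline separately, so guard it with the lock
    with write_lock:
        print('{field: %s}' % z, file=f)
    return z

with open('test', 'a') as f:
    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    with ThreadPoolExecutor(max_workers=3) as executor:
        # executor.map yields the results in the order of the inputs
        results = list(executor.map(data_write, data))
    print(results)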
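The point about the two threads not sharing a lock can be sketched as follows. This is a minimal illustration, not the asker's original class: SharedWriter, run, and demo are hypothetical names, and the file is reopened on every write purely to keep the sketch short. The only change that matters is that a single instance, and therefore a single threading.Lock, is created once and handed to both threads.

```python
import json
import os
import tempfile
import threading


class SharedWriter:
    """Hypothetical helper: one instance, hence one lock, shared by all threads."""

    def __init__(self, path):
        self._lock = threading.Lock()
        self.path = path
        # Create the file once, up front, holding an empty JSON object.
        with open(self.path, "w") as f:
            json.dump({}, f)

    def write(self, key, value):
        # The whole read-modify-write cycle runs under the same lock.
        with self._lock:
            with open(self.path, "r") as f:
                data = json.load(f)
            data[key] = value
            with open(self.path, "w") as f:
                json.dump(data, f)


def run(writer, name, last):
    # range(last + 1) so that the final stored value is `last` itself.
    for i in range(last + 1):
        writer.write(name, i)


def demo(path):
    writer = SharedWriter(path)  # created once, outside the threads
    threads = [
        threading.Thread(target=run, args=(writer, "one", 100)),
        threading.Thread(target=run, args=(writer, "two", 200)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "test.json")
    print(demo(path))
```

Creating the writer inside run, as in the question, would give each thread a private lock again, which is exactly the situation described above.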

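It is better to keep things simple. Your class only does some writing, so a plain function is enough. The "w+" mode you are using also truncates the file as soon as it is opened, so you never get to see its previous state. Open the file with "r+" instead and call truncate() only after the old content has been read. There is only one lock, and it is acquired and released inside the write function. Finally, range(1000) only gives you values up to 999 ;) Here is the result: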
import json
import os
import threading

lock = threading.Lock()  # the single lock shared by every thread

def write(path, key, value):
    # "with lock" releases the lock even if reading or parsing raises
    with lock:
        # "r+" keeps the existing content; use "w+" only if the file does not exist yet
        mode = "r+" if os.path.exists(path) else "w+"
        with open(path, mode) as opened_file:
            content = opened_file.read()
            current_json = json.loads(content) if content else {}
            current_json[key] = value
            # rewind and empty the file before dumping the updated object
            opened_file.seek(0)
            opened_file.truncate(0)
            json.dump(current_json, opened_file)

if __name__ == "__main__":
    path = r"test.json"

    def run(name, range_):
        for i in range(range_):
            write(path, name, i)

    # 1001 and 2001 so that the last values written are 1000 and 2000
    t1 = threading.Thread(target=run, args=("one", 1001))
    t2 = threading.Thread(target=run, args=("two", 2001))

    t1.start()
    t2.start()
    t1.join()
    t2.join()
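The remark about "w+" truncating the file on open can be checked with a small sketch. The helper name content_seen_on_open and the temporary file are assumptions made for illustration only. It also suggests where the JSONDecodeError in the question likely comes from: parsing the empty string that is left after a truncation.

```python
import json
import os
import tempfile


def content_seen_on_open(mode):
    """Write some JSON text, reopen the file with `mode`, and return what read() sees."""
    path = os.path.join(tempfile.mkdtemp(), "demo.json")
    with open(path, "w") as f:
        f.write('{"one": 1}')
    with open(path, mode) as f:
        return f.read()


if __name__ == "__main__":
    # "w+" empties the file when it is opened; "r+" leaves the content in place.
    print(repr(content_seen_on_open("w+")))
    print(repr(content_seen_on_open("r+")))

    # Parsing the empty string is what raises "Expecting value".
    try:
        json.loads(content_seen_on_open("w+"))
    except json.JSONDecodeError as exc:
        print("JSONDecodeError:", exc)
```

In the question, every new Writer reopens the same path with "w+", so one thread can empty the file just before the other thread calls json.load on it.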