
Python: storing/retrieving/updating a large number of arbitrary objects


I have millions of records that I need to store, retrieve, and delete frequently. Each record has a key, but the value does not translate easily into a dictionary, because it is an arbitrary Python object returned by a method from a module I did not write. I know that hierarchical data structures like JSON work better as dictionaries, but I'm not sure JSON is the right database here in any case.

I am considering putting each entry into its own separate file. Is there a better way?
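For reference, the one-file-per-entry approach being considered could be sketched like this (the directory, key names, and helper names are hypothetical, purely for illustration):

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical storage directory for the per-record files.
store = Path(tempfile.mkdtemp())

def save(key, obj):
    # One pickle file per record, named after its key.
    with open(store / f"{key}.pkl", "wb") as f:
        pickle.dump(obj, f)

def load(key):
    # Read the record's file back and unpickle it.
    with open(store / f"{key}.pkl", "rb") as f:
        return pickle.load(f)

def delete(key):
    (store / f"{key}.pkl").unlink()

save("rec1", {"anything": [1, 2, 3]})
print(load("rec1"))  # -> {'anything': [1, 2, 3]}
```

This works, but with millions of records it creates millions of tiny files, which is exactly the disk-space and filesystem-overhead problem the answers below address.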

Use the shelve module.

You can use it like a dictionary, much as with json, but it stores the objects using pickle.

From the official Python documentation:

import shelve

d = shelve.open(filename)  # open -- the low-level dbm library may add a
                           # suffix to the file name

d[key] = data   # store data at key (overwrites old data if
                # using an existing key)
data = d[key]   # retrieve a COPY of data at key (raises KeyError
                # if no such key)
del d[key]      # delete data stored at key (raises KeyError
                # if no such key)
flag = key in d          # true if the key exists
klist = list(d.keys())   # a list of all existing keys (slow!)

# as d was opened WITHOUT writeback=True, beware:
d['xx'] = [0, 1, 2, 3]  # this works as expected, but...
d['xx'].append(4)       # *this doesn't!* -- d['xx'] is STILL [0, 1, 2, 3]!

# having opened d without writeback=True, you need to code carefully:
temp = d['xx']      # extracts the copy
temp.append(4)      # mutates the copy
d['xx'] = temp      # stores the copy right back, to persist it

# or, d = shelve.open(filename, writeback=True) would let you just code
# d['xx'].append(4) and have it work as expected, BUT it would also
# consume more memory and make the d.close() operation slower.

d.close()       # close it
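In modern Python, shelve.open also works as a context manager, which avoids forgetting the close() call. A minimal sketch (the path and keys are made up for illustration; the write-back pattern is the same one the docs excerpt above warns about):

```python
import os
import shelve
import tempfile

# Hypothetical file path; shelve may add a suffix depending on the
# dbm backend it picks on your platform.
path = os.path.join(tempfile.mkdtemp(), "records")

with shelve.open(path) as d:          # closed automatically on exit
    d["k1"] = {"payload": [1, 2, 3]}  # store any picklable object
    copy = d["k1"]                    # retrieve a COPY
    copy["payload"].append(4)         # mutating the copy...
    d["k1"] = copy                    # ...must be written back explicitly

# Reopen to confirm the data persisted to disk.
with shelve.open(path) as d:
    print(d["k1"])  # -> {'payload': [1, 2, 3, 4]}
```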

I would evaluate using a key/value database such as BerkeleyDB, Kyoto Cabinet, or similar. That gives you all the fancy features, plus better handling of disk space: on a filesystem with 4096-byte blocks, a million files occupy ~4 GB as a lower bound, regardless of how small the objects are, and the total only grows once objects exceed 4096 bytes.
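If pulling in BerkeleyDB or Kyoto Cabinet is too heavy, the same key/value idea can be sketched with the standard-library sqlite3 module and pickled blobs. This is only a minimal stand-in, not either library's actual API; the table name and helper functions are assumptions for illustration:

```python
import pickle
import sqlite3

# A single-file (here in-memory) key/value store: one table mapping
# text keys to pickled object blobs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value BLOB)")

def put(key, obj):
    # Upsert the pickled object under its key.
    conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)",
                 (key, pickle.dumps(obj)))

def get(key):
    row = conn.execute("SELECT value FROM kv WHERE key = ?",
                       (key,)).fetchone()
    if row is None:
        raise KeyError(key)
    return pickle.loads(row[0])

def delete(key):
    conn.execute("DELETE FROM kv WHERE key = ?", (key,))

put("rec1", {"any": "object"})
print(get("rec1"))  # -> {'any': 'object'}
```

Unlike the one-file-per-record layout, millions of records share one database file, so the per-file block-size overhead described above disappears.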


So it pickles everything into a single file?