Best way to store a large number of floats in a file using Python?

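I have a program that generates a very large sequence of floating-point numbers, typically tens of millions of them. I need a good way to store them in a file. I will write them sequentially and read them back with Python. The floats are in a one-dimensional array, like this: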

[39534.543, 834759435.3445643, 1.003024032, 0.032543, 434.0208...]

(These numbers are just examples; I made them up by mashing the keyboard.)

The code that generates the numbers:

for x in range(16384):
    for y in range(16384):
        value = <equation with x and y>  # placeholder; "value" avoids shadowing the built-in float
        <write value to file>

You can use the struct.pack function to store the floats as 64-bit doubles:

from struct import pack, unpack

array = [39534.543, 834759435.3445643, 1.003024032, 0.032543, 434.0208]

with open('store', 'wb') as file:
    # One 'd' (8-byte double) per value, in the machine's native byte order
    file.write(pack('d' * len(array), *array))

so that you can later retrieve the array's values with struct.unpack:

with open('store', 'rb') as file:
    packed = file.read()
    array = unpack('d' * (len(packed) // 8), packed) # 8 bytes per double
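With tens of millions of values, building one huge format string and passing every float as a separate argument can be heavy on memory. Here is a minimal sketch of an alternative, assuming the standard-library array module is acceptable in place of struct. It writes one row at a time as raw 8-byte doubles and reads everything back in a single call. The compute function is a hypothetical stand-in for the question's equation, and the small grid size and temporary path are chosen only for illustration.

```python
import os
import tempfile
from array import array


def compute(x, y):
    # Hypothetical stand-in for the question's "<equation with x and y>".
    return x * 0.5 + y / 3.0


def write_floats(path, size):
    """Write size*size doubles to path, one row at a time."""
    with open(path, 'wb') as file:
        for x in range(size):
            row = array('d', (compute(x, y) for y in range(size)))
            row.tofile(file)  # raw 8-byte doubles, native byte order


def read_floats(path):
    """Read every double stored in path back into an array('d')."""
    values = array('d')
    count = os.path.getsize(path) // values.itemsize
    with open(path, 'rb') as file:
        values.fromfile(file, count)
    return values


size = 64  # the question uses 16384; kept small here so the sketch runs quickly
path = os.path.join(tempfile.mkdtemp(), 'store')
write_floats(path, size)
values = read_floats(path)
```

Like the struct version, the file takes exactly 8 bytes per float and is only portable between machines with the same byte order.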

Some of your numbers look too short to be random. So with compression you may be able to store them in fewer than 8 bytes per float. For example:

Store:

import lzma

array = [39534.543, 834759435.3445643, 1.003024032, 0.032543, 434.0208]

with open('store', 'wb') as file:
    file.write(lzma.compress(repr(array).encode()))

Load:
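One way to read the data back, assuming the file was written as in the store step, is to decompress it and parse the list literal with ast.literal_eval. This is a sketch rather than the answer's original code; the helper names store_floats and load_floats and the temporary path are made up for illustration.

```python
import ast
import lzma
import os
import tempfile


def store_floats(path, values):
    # Same approach as the store step: compress the list's text representation.
    with open(path, 'wb') as file:
        file.write(lzma.compress(repr(values).encode()))


def load_floats(path):
    # Decompress, then safely parse the "[1.0, 2.0, ...]" literal back into a list.
    with open(path, 'rb') as file:
        text = lzma.decompress(file.read()).decode()
    return ast.literal_eval(text)


array = [39534.543, 834759435.3445643, 1.003024032, 0.032543, 434.0208]
path = os.path.join(tempfile.mkdtemp(), 'store')
store_floats(path, array)
loaded = load_floats(path)
```

Because repr of a Python float round-trips exactly, the loaded list compares equal to the original. Parsing a literal with tens of millions of elements may need a lot of memory, so it is worth testing at the real size.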

Even with random data, I get slightly fewer than 8 bytes per float on average:

>>> import lzma, random
>>> n = 10**5
>>> a = [random.random() for _ in range(n)]
>>> len(lzma.compress(repr(a).encode())) / n
7.98948

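Admittedly, this is fairly slow, at least on my random data. It may be faster for non-random data. Or try a lower compression level or a different compressor.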
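To compare compression levels and compressors, something like the following sketch could be used. It assumes random test data like the above, and picks lzma presets and zlib levels purely as examples. Sizes and timings depend on the data and the machine, so no figures are quoted here.

```python
import lzma
import random
import time
import zlib

random.seed(0)  # fixed seed so repeated runs compress the same data
n = 10**4
data = repr([random.random() for _ in range(n)]).encode()

# Candidate compressors; the particular presets and levels are arbitrary examples.
candidates = {
    'lzma default': lambda raw: lzma.compress(raw),
    'lzma preset 1': lambda raw: lzma.compress(raw, preset=1),
    'zlib level 6': lambda raw: zlib.compress(raw, 6),
    'zlib level 1': lambda raw: zlib.compress(raw, 1),
}

bytes_per_float = {}
for name, compress in candidates.items():
    start = time.perf_counter()
    compressed = compress(data)
    elapsed = time.perf_counter() - start
    bytes_per_float[name] = len(compressed) / n
    print(f'{name}: {bytes_per_float[name]:.3f} bytes per float, {elapsed:.3f} s')
```

Running it on a sample of the real data should give a better picture than random numbers, since real values may be far more compressible.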

The pickle module's documentation also mentions compression, so that might be worth a try.
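A minimal sketch of that idea, assuming it is acceptable to pickle the whole list at once and to write it through lzma.open so the pickle stream is compressed on the fly. The temporary path is for illustration only, and whether this beats the other approaches in size or speed would need measuring.

```python
import lzma
import os
import pickle
import tempfile

values = [39534.543, 834759435.3445643, 1.003024032, 0.032543, 434.0208]
path = os.path.join(tempfile.mkdtemp(), 'store.pickle.xz')

# Write: pickle serializes the list, lzma.open compresses the byte stream.
with lzma.open(path, 'wb') as file:
    pickle.dump(values, file, protocol=pickle.HIGHEST_PROTOCOL)

# Read: only unpickle files you created yourself; pickle is unsafe on untrusted data.
with lzma.open(path, 'rb') as file:
    restored = pickle.load(file)
```

Pickle stores each float as a binary 8-byte double, so the values come back exactly as they were written.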

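How do you measure "best"? Have you done any research or tried anything?

Most convenient, but still reasonably fast and with a small file size. I tried writing them as plain text, but the files were very large.

A "very large" sequence resulted in a "very large" file. Surprising.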