The fastest way to write large data to a file in Python

python numpy file-io io fortran


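I have a large array of size (10^8 x 5). I want to write this data to a file so that it can be used by other programs.

Following advice given elsewhere, I write to disk only once rather than calling `write()` repeatedly. However, given the size of my data, this is both slow and produces a very large file (~47 GB).

I realise similar alternatives exist, but I need to be able to read the output file with Fortran. The output does not need to be human-readable.

What are the best options / best practices?

Thanks.

Some example code: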

    import numpy as np
    from scipy.interpolate import interp1d

    # Load the data
    data = np.loadtxt('path/to/file/file.txt')

    # Extract the data - let's just say it is x, y, z coordinates
    x = data[:, 0]
    y = data[:, 1]
    z = data[:, 2]

    # Do some interpolation to increase the 'resolution' of the xyz coordinates

    # Sample index 0, 1, ..., len(x) - 1 serves as the interpolation axis
    baseline = np.arange(len(x), dtype=float)

    N = 10000  # set resolution relative to baseline
    IntBaseline = np.linspace(0, baseline[-1], len(baseline) * N)

    # Build one linear interpolator per coordinate
    gx = interp1d(baseline, x)
    gy = interp1d(baseline, y)
    gz = interp1d(baseline, z)

    interpolated_x = gx(IntBaseline)
    interpolated_y = gy(IntBaseline)
    interpolated_z = gz(IntBaseline)

    # Now write everything to an array and save

    outfile = np.zeros((len(interpolated_x), 3))
    outfile[:, 0] = interpolated_x
    outfile[:, 1] = interpolated_y
    outfile[:, 2] = interpolated_z

    # Text output: one formatted row per line
    np.savetxt('Interpolated_Outfile.txt', outfile)

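It depends on how the data is generated. Without details of what you are doing, this is not easy to answer (note that the question you linked deals with specific code, which is what led to the `numpy.savetxt` answer there).

Consider using the HDF5 format. It is quite fast and, AFAIK, is supported in Fortran. If your array can be converted to a Pandas DataFrame, it is very easy to use.

What about `numpy.ndarray.tofile`? Whichever route you take, make sure it is a binary one.

Thanks everyone for the comments. In case it helps, I have now added a code example. I will look into HDF5 and `np.ndarray.tofile`.

Both `loadtxt` and `savetxt` read/write text files line by line. `savetxt` simply iterates over the rows of the array, formats them, and writes them. `pandas` has a faster `csv` reader; I don't know about its writer. But that does not help with the interpolation. I don't see how `tofile` helps with a Fortran read; it is just as Python-specific as `np.save`.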
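The text-versus-binary point raised in the comments can be illustrated with a minimal sketch. It assumes a small random array standing in for the real data, and the file names and temporary directory are hypothetical. It writes the same array once with `np.savetxt` and once with `ndarray.tofile`, compares the file sizes, and reads the binary file back with `np.fromfile`. The raw file has no header, so the reader must already know the dtype and the number of columns.

```python
import os
import tempfile

import numpy as np

# Small stand-in for the real array; the shape is an assumption for illustration.
rng = np.random.default_rng(0)
outfile = rng.random((1000, 3))

# Hypothetical output locations inside a temporary directory.
tmpdir = tempfile.mkdtemp()
txt_path = os.path.join(tmpdir, "Interpolated_Outfile.txt")
bin_path = os.path.join(tmpdir, "Interpolated_Outfile.bin")

# Text output: every value is formatted as a string, one row per line.
np.savetxt(txt_path, outfile)

# Raw binary output: the values are written in C (row-major) order
# as float64 in native byte order, with no header or metadata.
outfile.astype(np.float64).tofile(bin_path)

txt_size = os.path.getsize(txt_path)
bin_size = os.path.getsize(bin_path)
print(f"text: {txt_size} bytes, binary: {bin_size} bytes")

# Reading back requires supplying the dtype and the column count by hand.
restored = np.fromfile(bin_path, dtype=np.float64).reshape(-1, 3)
```

Because the file is just a flat sequence of 8-byte floats, a Fortran program could in principle read it with stream access into a double-precision array dimensioned `(3, n)`, so that each column on the Fortran side corresponds to one row of the NumPy array. This is an assumption about the Fortran side and has not been tested here.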
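Since the question already depends on SciPy, another option worth considering is `scipy.io.FortranFile`, which writes Fortran-style unformatted sequential records. The sketch below is only an illustration: the tiny array and the file name are made up, and the record layout of unformatted files is compiler-dependent, so whether a given Fortran compiler reads the result unchanged is an assumption that needs checking. Each coordinate column is written as its own record.

```python
import os
import tempfile

import numpy as np
from scipy.io import FortranFile

# Tiny stand-in array (4 points, 3 coordinates); purely illustrative.
outfile = np.arange(12, dtype=np.float64).reshape(4, 3)

# Hypothetical output path inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "interpolated.unf")

# Write one unformatted record per coordinate column.
f = FortranFile(path, "w")
f.write_record(np.ascontiguousarray(outfile[:, 0]))
f.write_record(np.ascontiguousarray(outfile[:, 1]))
f.write_record(np.ascontiguousarray(outfile[:, 2]))
f.close()

# Read the records back in the same order to check the round trip.
f = FortranFile(path, "r")
x_back = f.read_reals(dtype=np.float64)
y_back = f.read_reals(dtype=np.float64)
z_back = f.read_reals(dtype=np.float64)
f.close()
```

On the Fortran side, the matching read would presumably be an `open` with `form='unformatted'` followed by one `read` per record. For arrays as large as the one in the question, splitting the data into several smaller records may be necessary, because the record-length markers have a limited size.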