Python: Reading a binary file with NumPy fromfile and a given offset


I have a binary file containing records of a plane's position. Each record looks like this:

0x00: Time, float32
0x04: X, float32 // X axis position
0x08: Y, float32 // Y axis position
0x0C: Elevation, float32
0x10: float32*4 = Quaternion (x,y,z axis and w scalar)
0x20: Distance, float32 (unused)

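So each record is 36 bytes long (nine float32 values of 4 bytes each; the last field starts at 0x20 and ends at 0x24).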

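I would like to get the records as a NumPy array.

At offset 1859 there is an unsigned 32-bit integer (4 bytes) giving the number of elements in the array. In my case it is 12019.

I don't care (for now) about the header data (everything before offset 1859).

The array itself only starts at offset 1863 (= 1859 + 4).

I defined my own NumPy dtype like this: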

dtype = np.dtype([
    ("time", np.float32),
    ("PosX", np.float32),
    ("PosY", np.float32),
    ("Alt", np.float32),
    ("Qx", np.float32),
    ("Qy", np.float32),
    ("Qz", np.float32),
    ("Qw", np.float32),
    ("dist", np.float32),
])
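A quick way to double-check the layout is to ask NumPy for the size of one record. The sketch below re-declares the same fields under the hypothetical name record_dtype; nine float32 fields should come to 36 bytes, with dist starting at byte offset 0x20.

```python
import numpy as np

# Same nine float32 fields as in the dtype above (record_dtype is a
# hypothetical name used only for this check).
record_dtype = np.dtype([
    ("time", np.float32),
    ("PosX", np.float32),
    ("PosY", np.float32),
    ("Alt", np.float32),
    ("Qx", np.float32),
    ("Qy", np.float32),
    ("Qz", np.float32),
    ("Qw", np.float32),
    ("dist", np.float32),
])

record_size = record_dtype.itemsize          # bytes per record
dist_offset = record_dtype.fields["dist"][1]  # byte offset of the last field

print(record_size, hex(dist_offset))  # 36 0x20
```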

I read the file with fromfile:

a_bytes = np.fromfile(filename, dtype=dtype)

But I don't see any parameter of fromfile that lets me pass an offset.

You can open the file with Python's standard open, seek past the header, and then pass the file object to fromfile. Something like this:

import numpy as np
import os

dtype = np.dtype([
    ("time", np.float32),
    ("PosX", np.float32),
    ("PosY", np.float32),
    ("Alt", np.float32),
    ("Qx", np.float32),
    ("Qy", np.float32),
    ("Qz", np.float32),
    ("Qw", np.float32),
    ("dist", np.float32),
])

# Skip the header and the 4-byte element count; records start at offset 1863.
with open("myfile", "rb") as f:
    f.seek(1863, os.SEEK_SET)
    data = np.fromfile(f, dtype=dtype)

print(data)
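Building on the same idea, the element count stored at offset 1859 can be read first and passed as count, so that any bytes after the array are ignored. This is only a sketch under stated assumptions: the function name read_records is hypothetical, and little-endian byte order is assumed because the question does not say which one the file uses. It also relies on the offset keyword of np.fromfile, which is available in NumPy 1.17 and later and removes the need for an explicit seek on the data itself.

```python
import numpy as np

FIELDS = ("time", "PosX", "PosY", "Alt", "Qx", "Qy", "Qz", "Qw", "dist")

# Assumption: little-endian float32 ("<f4"); adjust if the file is big-endian.
RECORD_DTYPE = np.dtype([(name, "<f4") for name in FIELDS])


def read_records(filename, count_offset=1859):
    """Read the uint32 element count at count_offset, then that many records."""
    with open(filename, "rb") as f:
        f.seek(count_offset)
        n = int.from_bytes(f.read(4), "little")  # assumed little-endian uint32

    # offset= requires NumPy >= 1.17; the records start right after the count.
    return np.fromfile(filename, dtype=RECORD_DTYPE, count=n,
                       offset=count_offset + 4)
```

With the file from the question this would be called as data = read_records("myfile"), and data["PosX"] would then give the X positions as a float32 array.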

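I was facing a similar problem, but none of the answers above satisfied me. I needed to implement something like a virtual table holding a very large number of binary records, potentially more than would fit in memory as a single NumPy array. So my question was how to read and write a small set of integers between a binary file and a NumPy array: a subset of the file into a subset of the array.

This is a solution that worked for me: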

import numpy as np
recordLen = 10 # number of int64's per record
recordSize = recordLen * 8 # size of a record in bytes
memArray = np.zeros(recordLen, dtype=np.int64) # a buffer for 1 record

# Create a binary file and open it for write+read
with open('BinaryFile.dat', 'w+b') as f:
    # Writing the array into the file as record recordNo:
    recordNo = 200 # the index of a target record in the file
    f.seek(recordSize * recordNo)
    raw = memArray.tobytes()  # avoid shadowing the built-in name bytes
    f.write(raw)

    # Reading a record recordNo from file into the memArray
    f.seek(recordSize * recordNo)
    raw = f.read(recordSize)
    memArray = np.frombuffer(raw, dtype=np.int64).copy()
    # Note copy() added to make the memArray mutable
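A possible variation, offered as a sketch rather than a replacement: a C-contiguous NumPy array exposes a writable buffer, so a binary file's readinto can fill an existing array in place and write can take the array directly. That avoids the intermediate bytes object and the copy(). The helper names write_record and read_record are hypothetical.

```python
import numpy as np

RECORD_LEN = 10                                          # int64 values per record
RECORD_SIZE = RECORD_LEN * np.dtype(np.int64).itemsize   # bytes per record


def write_record(f, record_no, record):
    """Write one record (a C-contiguous int64 array) at its slot in the file."""
    f.seek(RECORD_SIZE * record_no)
    f.write(record)  # arrays are accepted as bytes-like objects


def read_record(f, record_no, out):
    """Fill the preallocated array `out` with record `record_no`, in place."""
    f.seek(RECORD_SIZE * record_no)
    n = f.readinto(out)
    if n != RECORD_SIZE:
        raise EOFError(f"expected {RECORD_SIZE} bytes, got {n}")
    return out
```

Because read_record reuses the caller's buffer, a loop over many records only ever holds one record in memory at a time.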

Thanks, that solved my problem. I had also just come across

data = np.memmap(filename, dtype=dtype, mode='r', offset=offset_array, shape=N)

If it is a large file, memmap is probably the best option.
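To make the memmap suggestion concrete, here is a minimal sketch. It assumes the layout from the question (a uint32 count at byte 1859, records immediately after it) and little-endian data, which the thread does not confirm; map_records is a hypothetical name. np.memmap maps the file instead of loading it, so only the parts of the array that are actually accessed are read from disk.

```python
import numpy as np

FIELDS = ("time", "PosX", "PosY", "Alt", "Qx", "Qy", "Qz", "Qw", "dist")

# Assumption: little-endian float32 fields.
RECORD_DTYPE = np.dtype([(name, "<f4") for name in FIELDS])


def map_records(filename, count_offset=1859):
    """Memory-map the record array that follows the uint32 count."""
    with open(filename, "rb") as f:
        f.seek(count_offset)
        n = int.from_bytes(f.read(4), "little")  # assumed little-endian uint32

    # mode="r" maps the file read-only; offset is given in bytes.
    return np.memmap(filename, dtype=RECORD_DTYPE, mode="r",
                     offset=count_offset + 4, shape=(n,))
```

Slicing the result, for example map_records("myfile")[1000:1010], touches only the bytes backing those ten records.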