Python: Reading data generated by a Fortran 90 program into NumPy arrays

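I have a binary file that I want to read into Python. The file consists of three parts: a list of wavenumbers, a list of temperatures, and a list of opacities as a function of temperature and pressure. I would like to import the first two as vectors a and b, and the third as a 2D array c such that c[x, y] corresponds to a[x] and b[y].

Existing Fortran 90 code does this as follows: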

integer   nS, nT
parameter (nS = 3000)
parameter (nT = 9)

real(8) wn_arr(nS)      ! wavenumber [cm^-1]
real(8) temp_arr(nT)    ! temperature [K]
real(8) abs_arr(nS,nT)  ! absorption coefficient [cm^-1 / amagat^2]

open(33,file=trim(datadir)//'CO2_dimer_data',form='unformatted')
read(33) wn_arr
read(33) temp_arr
read(33) abs_arr
close(33)

I tried the following Python code:

import numpy as np
import scipy.io

f = scipy.io.FortranFile('file', 'r')
a_ref = f.read_reals(np.float64)  # wavenumber (cm**-1)
b = f.read_reals(np.float64)      # temperature (K)
c = f.read_reals(np.float64).reshape((3000, 9))

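However, this produces incorrect results. I suspect this is because Fortran writes arrays to the file in a different order than Python expects. But simply adding order='F' to the reshape call does not work. I suspect this is because abscoeff_ref has already been flattened by the time it is read in.

Any ideas?

To give you an idea of what I meant in my second comment, I made a mock-up:

testwrite.f90, compiled with gfortran 4.8.4: it basically writes an unformatted sequential file containing the arrays you specified (just much smaller, so that they can be compared by eye), filled with arbitrary data. It also prints the arrays.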

implicit none
integer   nS, nT, i ,j
parameter (nS = 10)
parameter (nT = 3)

real(8) wn_arr(nS)      ! wavenumber [cm^-1]
real(8) temp_arr(nT)    ! temperature [K]
real(8) abs_arr(nS,nT)  ! absorption coefficient [cm^-1 / amagat^2]

wn_arr = (/ (i, i=1,nS) /)
temp_arr = (/ (270+i, i=1,nT) /)
abs_arr = reshape( (/ ((10*j+i, i=1,nS), j=1,nT) /), (/nS, nT/))

print*, wn_arr
print*, '-----------------'
print*, temp_arr
print*, '-----------------'
print*, abs_arr
print*, '-----------------'
print*, 'abs_arr(5,3) = ', abs_arr(5,3)

open(33,file='test.out',form='unformatted')
write(33) wn_arr
write(33) temp_arr
write(33) abs_arr
close(33)

end

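The Python script, tested with Python 2.7.6, then reads the file written above and prints the arrays as well. For me, the output of both programs is identical. YMMV.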

import numpy as np

rec_delim = 4  # This value depends on the Fortran compiler
nS = 10
nT = 3

with open('test.out', 'rb') as infile:
    infile.seek(rec_delim, 1)  # begin record
    wn_arr = np.fromfile(file=infile, dtype=np.float64, count=nS)
    infile.seek(rec_delim, 1)  # end record
    infile.seek(rec_delim, 1)  # begin record
    temp_arr = np.fromfile(file=infile, dtype=np.float64, count=nT)
    infile.seek(rec_delim, 1)  # end record
    infile.seek(rec_delim, 1)  # begin record
    abs_arr = np.fromfile(file=infile, dtype=np.float64,
                          count=nS * nT).reshape((nS, nT), order='F')
    infile.seek(rec_delim, 1)  # end record

print(wn_arr)
print(temp_arr)
print(abs_arr)
# The array has the same shape, but Fortran starts index (per default at least)
# at 1 and Python at 0:
print('abs_arr(5,3) = ' + str(abs_arr[4,2]))

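Brief explanation: I open the file in a with block (good practice in Python) and then step through it using my knowledge of how the file was written. This makes the code non-portable. infile.seek(4, 1) moves Python's read pointer forward by 4 bytes from the current position (option 1), because I know that the file starts with a 4-byte begin-record marker (gfortran).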
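With the 4-byte gfortran layout assumed here, the begin-record marker is not just padding: it holds the byte length of the record, and the end-record marker repeats it. A minimal sketch that reads and checks the markers instead of skipping them blindly follows. The helper names `read_record` and `fake_record` and the in-memory test data are made up for illustration, and 4-byte little-endian markers are an assumption that must match your compiler:

```python
import io
import struct

import numpy as np


def read_record(stream, dtype=np.float64, marker_fmt="<i"):
    """Read one Fortran sequential unformatted record from a binary stream.

    Assumes each record is framed by two identical length markers
    (4-byte little-endian by default; this is an assumption).
    """
    marker_size = struct.calcsize(marker_fmt)
    (nbytes,) = struct.unpack(marker_fmt, stream.read(marker_size))
    payload = stream.read(nbytes)
    (nbytes_end,) = struct.unpack(marker_fmt, stream.read(marker_size))
    if nbytes != nbytes_end:
        raise ValueError("record markers disagree; wrong marker size or byte order?")
    return np.frombuffer(payload, dtype=dtype)


def fake_record(values):
    """Build the bytes of one record framed the way the assumed layout does it."""
    payload = np.asarray(values, dtype=np.float64).tobytes()
    marker = struct.pack("<i", len(payload))
    return marker + payload + marker


# Stand-in for a file opened with open(..., 'rb').
stream = io.BytesIO(fake_record([1.0, 2.0, 3.0]) + fake_record([271.0, 272.0]))
wn_arr = read_record(stream)
temp_arr = read_record(stream)
print(wn_arr, temp_arr)
```

If the two markers of a record disagree, the assumed marker size or byte order is probably wrong for the file at hand, which is a more useful failure than silently reading shifted data.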

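Then I use numpy.fromfile to read count=10 float64 values, which is the wavenumber array.

Next, I have to skip the end-record and begin-record markers. This could of course also be done in one step with infile.seek(8, 1).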

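Then I read the temperature array, skip the end-record and begin-record markers again, and read the 2D array. The data in the file does not know that it is two-dimensional, so I need to reshape it using Fortran order. The last .seek() is superfluous; I just wanted to emphasize the structure.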
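To see what the Fortran-order reshape does, here is a small sketch that rebuilds the mock-up values in memory (no file involved; the variable names `flat`, `abs_f`, `abs_c` and `abs_t` are only illustrative) and compares the two reshape orders:

```python
import numpy as np

nS, nT = 10, 3

# Values in the order Fortran stores abs_arr: the first index varies fastest.
flat = np.array([10 * j + i for j in range(1, nT + 1) for i in range(1, nS + 1)],
                dtype=np.float64)

abs_f = flat.reshape((nS, nT), order="F")  # matches abs_arr(i, j) in Fortran
abs_c = flat.reshape((nS, nT))             # default C order: values end up misplaced
abs_t = flat.reshape((nT, nS)).T           # equivalent to the order='F' reshape

print(abs_f[4, 2])  # Fortran abs_arr(5,3) -> 35.0
print(abs_c[4, 2])  # -> 25.0, not the element Fortran calls abs_arr(5,3)
```

Both arrays have shape (nS, nT), so a wrong order does not raise an error; it only shows up when individual elements are compared against the Fortran output.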

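Again, I strongly recommend that you do not build a larger system on code like this. It is fine for a one-off, but terrible for anything you will have to reuse or share.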

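Fortran unformatted files are a portability nightmare. Generally, you can only expect them to work when they are read and written with the same compiler on the same hardware (because the record delimiters are not standardized). It looks like scipy.io.FortranFile assumes that the records were written with gfortran on an x86_64 architecture. Is that the compiler/architecture you are using?

The answer depends on how often you need to read this file and how portable the whole system needs to be. The biggest problem, as mgilson already said, is that the file is in unformatted sequential format, which stores not only the data but also begin-record and end-record markers around each record (roughly, each written variable), and their size depends on the compiler. If you only want to read this one file and don't care about portability, try writing the parsing code yourself. I have no experience with scipy.io.FortranFile, but its options seem limited, and without the file it is hard to help you write the code. I have, however, read unformatted (direct access, not sequential) Fortran files with numpy.fromfile. You have to use the file.seek() method to skip the begin-record and end-record markers, and play with their size until you find the right one (start with 4 or 1 bytes, those should be the most common).

Could you show us what the incorrect results look like? What size do the arrays have when read in? What are their values?

Related/duplicate: searching will also turn up more, for C and C++ as well.

Thanks, this looks helpful! I will try to implement it.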