Julia: how to simply read a binary data table that contains string columns?

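I am trying to write and read many tabular data tables in a binary file. The data are of types `Integer`, `Float64` and `ASCIIString`. Writing them was not difficult: I `lpad` the `ASCIIString` values so that every `ASCIIString` column has the same length. Now I am facing the read side, and I would like to read each table with a single call to the `read` function, for example: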

read(myfile,Tuple{[UInt16;[Float64 for i=1:10];UInt8]...}, dim) # => works

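Edit: I do not use the line above in my actual solution, because I found that `sizeof(Tuple{Float64,Int32}) != sizeof(Float64) + sizeof(Int32)`.

But how do I include an `ASCIIString` field in my `Tuple` type? Consider this simplified example: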

file=open("./testfile.txt","w");
ts1="5char";
ts2="7 chars";
write(file,ts1,ts2);
close(file);
file=open("./testfile.txt","r");
data=read(file,typeof(ts1)); # => Error
close(file);

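Julia is right, because `typeof(ts1) == ASCIIString` and an `ASCIIString` is a variable-length array, so Julia cannot know how many bytes it has to read. Which type should I use there instead? Is there a type that represents a constant-length string, or a fixed number of bytes or chars? Or is there a better solution altogether?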
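For readers on current Julia (1.x), where `String` has replaced the older `ASCIIString`, the fixed-width idea can be sketched by padding on write and reading back an explicit byte count. This is only a minimal sketch, not the approach used in the thread: the column width `width = 8` and the temporary file are assumptions made for illustration, and it only holds for ASCII text, where one character is one byte.

```julia
# Write two strings padded to a fixed width, then read them back by byte count.
width = 8                            # hypothetical fixed column width in bytes

path, io = mktemp()                  # temporary file used only for this sketch
for s in ("5char", "7 chars")
    write(io, lpad(s, width))        # left-pad with spaces to exactly `width` bytes
end
close(io)

fields = open(path, "r") do f
    # read(f, n) returns a Vector{UInt8} holding up to n bytes
    [String(read(f, width)) for _ in 1:2]
end
rm(path)

unpadded = lstrip.(fields)           # drop the padding again
```

Because the byte count is passed explicitly, the reader never needs a type that encodes the string length.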

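Edit

I should add more complete sample code that includes my latest progress. My latest solution is to read part of the data (one or more rows) into a buffer, allocate memory for one row of data, then `reinterpret` the bytes and copy the resulting values from the buffer to the output location: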

# reinterpret bytes from buffer as `ty` values, copy them into out,
# and return the next read position (Julia 0.4 syntax)
function reinterpretarray!{ty}(out::Vector{ty}, buffer::Vector{UInt8}, pos::Int)
  count=length(out)
  out[1:count]=reinterpret(ty,buffer[pos:count*sizeof(ty)+pos-1])
  return count*sizeof(ty)+pos
end
file=open("./testfile.binary","w");
#generate test data 
infloat=ones(20);
instr=b"MyData";
inint=Int32[12];
#write tuple 
write(file,([infloat...],instr,inint)...);
close(file);

file=open("./testfile.binary","r");
#read data into a buffer
buffer=readbytes(file,sizeof(infloat)+sizeof(instr)+sizeof(inint));
close(file);
#allocate memory
outfloat=zeros(20)
outstr=b"123456"
outint=Int32[1]
outdata=(outfloat,outstr,outint)
#copy and convert
pos=1
for elm in outdata
  pos=reinterpretarray!(elm, buffer, pos)
end
assert(outdata==(infloat,instr,inint))
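
One way to avoid the intermediate buffer and the extra copy, sketched here for current Julia (1.x) rather than the 0.4 syntax used above, is to preallocate typed output arrays and let `read!` fill them in place. The temporary file and the `Vector{UInt8}("MyData")` stand-in for `b"MyData"` are assumptions made for this sketch, and the data is stored in the machine's native byte order, so the file is not portable across architectures.

```julia
# Round trip: 20 Float64 values, a 6-byte ASCII field, and one Int32.
infloat = ones(20)
instr   = Vector{UInt8}("MyData")    # byte vector playing the role of b"MyData"
inint   = Int32[12]

path, io = mktemp()                  # temporary file used only for this sketch
write(io, infloat)                   # arrays of plain bits types are written as raw bytes
write(io, instr)
write(io, inint)
close(io)

# Preallocate the outputs with the right element types and lengths.
outfloat = zeros(20)
outstr   = zeros(UInt8, 6)
outint   = zeros(Int32, 1)

open(path, "r") do f
    # read! fills each array in place from the stream, in the order written.
    read!(f, outfloat)
    read!(f, outstr)
    read!(f, outint)
end
rm(path)

@assert (outfloat, outstr, outint) == (infloat, instr, inint)
```

Each `read!` call reads exactly `sizeof` of its target array, so no position bookkeeping is needed.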

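However, my experience with other languages tells me that a better, more convenient and faster solution must exist. I would like to do this with pointers and references, because I do not like copying data from one place to another.
Thanks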

You can use `Array{UInt8}` as a substitute type for `ASCIIString`; it is the type of the underlying data.
ts1="5chars"
print(ts1.data) #Array{UInt8}
someotherarray=ts1.data[:] #copies as new array
someotherstring=ASCIIString(someotherarray)
assert(someotherstring == ts1)

Note that I am seeing `UInt8` on an x86_64 system, which may not be your case. To be safe, you should use `Array{eltype(ts1.data)}`.
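
The snippet above relies on the `.data` field and the `ASCIIString` type from older Julia versions. As a rough equivalent for current Julia (1.x), the same round trip between a string and its bytes can be sketched with `codeunits` and `String`; the variable names below are illustrative only.

```julia
ts1 = "5chars"
bytes = Vector{UInt8}(codeunits(ts1))    # copy of the string's bytes as a plain byte vector
roundtrip = String(copy(bytes))          # String(v) takes ownership of v, so pass a copy
@assert roundtrip == ts1
@assert eltype(bytes) == UInt8
```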
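In your simplified example, `data=ASCIIString(readbytes(file,length(ts1)))` would output `5char`, but that does not seem to work for your original use case.

Yes, I tried to use `readbytes` instead of `read`, but the actual data is a `Vector{mytype}`, where `mytype` is a `Tuple` of different (all immutable) types: ints, floats and `Char[ConstLength]`, and I do not know how to do the conversion part. But I am sure there is a way to cast a pointer to the bytes already read into a `Vector{mytype}`, because every data row has the same size.

If I understand your description, the type of the data is `Array{Tuple{UInt16,Array{Float64,1},UInt8,ASCIIString},1}`, with contents like `(0x0001,[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0],0x01,"first")`? How do you write that to a binary file with a single `write` call?

I actually write it element by element: `for i in array; write(file,i); end`

I have found the missing data type: I think the type I was looking for is `bitstype`. For example, I can create a new custom bitstype of the size I need, `bitstype 8*5 Char5`, and then use my `Char5` type to read each 5-char field. But this raises other problems: 1- how to convert it to `ASCIIString`; 2- `sizeof(Tuple{Float64,Char5}) != sizeof(Float64) + sizeof(Char5)`.

But when reading an `Array{UInt8}` from a stream you run into the same problem as with `ASCIIString`, because the size is unknown. I think the right type is `bitstype`, but that brings new problems, as I commented earlier.

felipe probably just misunderstood you, @RezaAfzalan. You would do better to use a more concrete example rather than a misleading "simplified" one ;)

Yes, I did... I thought the question was how to read into an `ASCIIString`. Maybe I should delete this? I kept it because it is related, but I now know it is not the answer.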
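
The size mismatch mentioned in the comments comes from alignment padding: a tuple's in-memory size can be larger than the sum of its fields. One way to sidestep both that and the fixed-length string problem is to read each row field by field. The sketch below assumes current Julia (1.x) and a hypothetical packed row layout (a `UInt16` id, 10 `Float64` values, a `UInt8` flag and a 5-byte ASCII name); the helper names `write_row` and `read_row` are invented for illustration, and an `IOBuffer` stands in for the binary file.

```julia
# Alignment padding makes the in-memory tuple larger than its packed fields
# (the sizes below are what a typical 64-bit system reports).
@assert sizeof(Float64) + sizeof(Int32) == 12
@assert sizeof(Tuple{Float64,Int32}) == 16

# Hypothetical packed row: UInt16 id, 10 Float64 values, UInt8 flag, 5-byte ASCII name.
function write_row(io::IO, row)
    write(io, row.id)
    write(io, collect(row.values))       # 10 * 8 bytes
    write(io, row.flag)
    write(io, lpad(row.name, 5))         # pad the name to exactly 5 bytes (ASCII only)
end

function read_row(io::IO)
    id   = read(io, UInt16)
    vals = Tuple(read!(io, Vector{Float64}(undef, 10)))
    flag = read(io, UInt8)
    name = String(read(io, 5))           # fixed byte count replaces a fixed-length string type
    return (id=id, values=vals, flag=flag, name=name)
end

rows_in = [(id=UInt16(i), values=ntuple(j -> Float64(i + j), 10), flag=UInt8(i), name="row_$i")
           for i in 1:3]

io = IOBuffer()                          # stands in for the binary file
foreach(r -> write_row(io, r), rows_in)
@assert position(io) == 3 * (2 + 80 + 1 + 5)   # 88 packed bytes per row, no padding

seekstart(io)
rows_out = [read_row(io) for _ in 1:3]
@assert rows_out == rows_in
```

Reading field by field costs a few more calls per row than a single pointer cast would, but the file layout stays independent of how Julia pads tuples in memory.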