How do I read a file line by line in Julia?


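How do I open a text file and read it line by line? I am interested in two different cases:

Getting all the lines in an array at once

Processing one line at a time

For the second case, I do not want to hold all the lines in memory at once.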

To read a file into memory all at once as an array of lines, simply call the readlines function:

julia> words = readlines("/usr/share/dict/words")
235886-element Array{String,1}:
 "A"
 "a"
 "aa"
 ⋮
 "zythum"
 "Zyzomys"
 "Zyzzogeton"

By default this discards the newline characters, but if you want to keep them you can pass the keyword argument keep=true:

julia> words = readlines("/usr/share/dict/words", keep=true)
235886-element Array{String,1}:
 "A\n"
 "a\n"
 "aa\n"
 ⋮
 "zythum\n"
 "Zyzomys\n"
 "Zyzzogeton\n"
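To see the effect of keep on a file you control, here is a minimal, self-contained sketch. The temporary file and its contents are invented for illustration and are not part of the original answer.

```julia
# Write a small throwaway file; mktemp returns its path and an open handle.
# (The file contents here are hypothetical sample data.)
path, io = mktemp()
write(io, "alpha\nbeta\ngamma\n")
close(io)

stripped = readlines(path)            # newlines removed (the default)
kept = readlines(path, keep=true)     # newlines preserved

# chomp removes a single trailing newline, so it undoes keep=true line by line.
@assert stripped == chomp.(kept)

rm(path)  # clean up the temporary file
```

Keeping the newlines is mostly useful when the lines will be written back out unchanged; for most text processing the default is more convenient.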

If you already have an open file object, you can also pass it to the readlines function:

julia> open("/usr/share/dict/words") do io
           readline(io) # throw out the first line
           readlines(io)
       end
235885-element Array{String,1}:
 "a"
 "aa"
 "aal"
 ⋮
 "zythum"
 "Zyzomys"
 "Zyzzogeton"

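This also demonstrates the readline function, which reads a single line from an open I/O object or, when given a file name, opens the file and reads the first line from it: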

julia> readline("/usr/share/dict/words")
"A"
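Each readline call on the same I/O object consumes one line, so later reads pick up where the previous one stopped. A small sketch, using an in-memory IOBuffer as a stand-in for an open file (an assumption made only so the example is self-contained):

```julia
# An in-memory stream with three lines; hypothetical sample data.
io = IOBuffer("first\nsecond\nthird\n")

line1 = readline(io)      # consumes "first"
line2 = readline(io)      # consumes "second"
rest = readlines(io)      # collects whatever is left

at_end = eof(io)          # true once nothing remains to read
after_end = readline(io)  # at end of stream, readline returns an empty string
```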

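If you do not want to load the entire file contents at once (or if you are processing streaming data, for example from a network socket), you can use the eachline function to get an iterator that produces one line at a time: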

julia> for word in eachline("/usr/share/dict/words")
           if length(word) >= 24
               println(word)
           end
       end
formaldehydesulphoxylate
pathologicopsychological
scientificophilosophical
tetraiodophenolphthalein
thyroparathyroidectomize
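Because eachline yields one line at a time, a running result can be computed without holding the whole file in memory. A minimal sketch, assuming a small temporary file whose contents are invented for illustration; longest_line is a hypothetical helper name, not something from Julia's standard library:

```julia
# Create a throwaway file with a few lines of sample data.
path, io = mktemp()
write(io, "short\na much longer line\nmid-size\n")
close(io)

# Keep only the longest line seen so far; lines that are not kept
# are never stored anywhere.
function longest_line(filename)
    longest = ""
    for line in eachline(filename)
        if length(line) > length(longest)
            longest = line
        end
    end
    return longest
end

result = longest_line(path)
rm(path)  # clean up the temporary file
```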

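Like readlines, the eachline function can also be given an open file handle to read lines from. You can also "roll your own" iterator by opening the file and calling readline repeatedly: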

julia> open("/usr/share/dict/words") do io
           while !eof(io)
               word = readline(io)
               if length(word) >= 24
                   println(word)
               end
           end
       end
formaldehydesulphoxylate
pathologicopsychological
scientificophilosophical
tetraiodophenolphthalein
thyroparathyroidectomize

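This is equivalent to what eachline does for you. You will rarely need to do this yourself, but the capability is there if you need it. For details on reading a file character by character, see this question and answer: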