Writing to multiple files at once in Julia

How can I print to multiple files at once in Julia? Is there a cleaner way than:

for f in [open("file1.txt", "w"), open("file2.txt", "w")]
    write(f, "content")
    close(f)
end

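Based on your question, I assume you do not mean writing in parallel (which probably would not speed things up anyway, since the operation is likely IO-bound).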

Your solution has one small problem: if write throws an exception, there is no guarantee that f gets closed.

Here are three alternatives that make sure the file is closed even when an error occurs:

for fname in ["file1.txt", "file2.txt"]
    open(fname, "w") do f
        write(f, "content")
    end
end

for fname in ["file1.txt", "file2.txt"]
    open(f -> write(f, "content"), fname, "w")
end

foreach(fn -> open(f -> write(f, "content"), fn, "w"),
        ["file1.txt", "file2.txt"])

They all give the same result, so the choice is a matter of taste (and you can derive more variations along the same lines).
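One further variation, as a minimal sketch: Base also has a write method that takes a file name instead of an open stream and opens, writes, and closes the file itself, so it can be broadcast over a list of names. The temporary directory and the variable names below are only illustrative assumptions.

```julia
# Sketch: write the same content to several files via broadcasting.
# write(filename::AbstractString, x) opens the file, writes x, and closes it.
dir = mktempdir()                                    # scratch directory for the demo
fnames = joinpath.(dir, ["file1.txt", "file2.txt"])  # illustrative file names
nbytes = write.(fnames, "content")                   # bytes written per file: [7, 7]
```

This is compact, but it gives less control than the open ... do form if more than one write per file is needed.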

All of these approaches rely on the following method of the open function:

 open(f::Function, args...; kwargs...)

  Apply the function f to the result of open(args...; kwargs...)
  and close the resulting file descriptor upon completion.
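Conceptually, this method behaves roughly like a try/finally wrapper around the user function. The following is only a sketch of that idea, not the actual Base implementation; with_open is a hypothetical name, and the simulated failure is there just to show that the stream is closed even when the function throws.

```julia
# Hypothetical helper (not part of Base) mimicking open(f::Function, args...; kwargs...)
function with_open(f::Function, args...; kwargs...)
    io = open(args...; kwargs...)
    try
        return f(io)
    finally
        close(io)  # runs on normal return and also when f throws
    end
end

path = joinpath(mktempdir(), "file1.txt")
captured = Ref{IO}()           # keeps a handle so we can inspect the stream afterwards
try
    with_open(path, "w") do io
        captured[] = io
        write(io, "content")
        error("simulated failure")  # the error still propagates to the caller
    end
catch
    # swallow the simulated error for the purpose of this demo
end
stream_closed = !isopen(captured[])
```

Here stream_closed ends up true, which is the guarantee the question's original loop lacks.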

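Note that if an exception actually is thrown somewhere, processing will still terminate (the only guarantee is that the file descriptor gets closed). To make sure that every write operation is actually attempted, you can do the following: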

for fname in ["file1.txt", "file2.txt"]
    try
        open(fname, "w") do f
            write(f, "content")
        end
    catch ex
        # here decide what should happen on error
        # you might want to investigate the value of ex here
    end
end
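As one possible way to fill in the catch branch, the sketch below records the exception for each file that failed and carries on with the rest. The function name try_write_all and the deliberately invalid path are assumptions made for illustration only.

```julia
# Hypothetical helper: attempt every write and collect the failures.
function try_write_all(fnames, content)
    failures = Dict{String,Exception}()
    for fname in fnames
        try
            open(fname, "w") do f
                write(f, content)
            end
        catch ex
            failures[fname] = ex  # remember what went wrong for this file
        end
    end
    return failures
end

dir = mktempdir()
good = joinpath(dir, "ok.txt")
bad = joinpath(dir, "missing_subdir", "fail.txt")  # parent directory does not exist
failures = try_write_all([bad, good], "content")
```

The failing file comes first on purpose: the second file is still written, and failures contains an entry only for the first one.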

See the Julia manual for documentation on try/catch.

If you really do want to write in parallel (using multiple processes), you can do it as follows:

using Distributed, BenchmarkTools  # BenchmarkTools provides @btime
addprocs(4) # using, say, 4 worker processes

function ppwrite()
    @sync @distributed for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end

For comparison, the serial version would be

function swrite()
    for i in 1:10
        open("file$(i).txt", "w") do f
            write(f, "content")
        end
    end
end

On my machine (SSD + quad-core), this gives a speedup of about 70%:

julia> @btime ppwrite();
  3.586 ms (505 allocations: 25.56 KiB)

julia> @btime swrite();
  6.066 ms (90 allocations: 6.41 KiB)

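Note, however, that these timings may change drastically for real content, which may have to be transferred to the different processes. They also probably will not scale, because IO typically becomes the bottleneck at some point.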

Update: larger (string) content

julia> using Distributed, Random, BenchmarkTools

julia> addprocs(4);

julia> global const content = [string(rand(1000,1000)) for _ in 1:10];

julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
ppwrite (generic function with 1 method)

julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)

julia> @btime swrite()
  63.024 ms (110 allocations: 6.72 KiB)

julia> @btime ppwrite()
  23.464 ms (509 allocations: 25.63 KiB) # ~ 2.7x speedup

Doing the same with the string representations of larger 10000x1000 matrices (3 of them instead of 10) results in

julia> @time swrite()
  7.189072 seconds (23.60 k allocations: 1.208 MiB)

julia> @time swrite()
  7.293704 seconds (37 allocations: 2.172 KiB)

julia> @time ppwrite();
 16.818494 seconds (2.53 M allocations: 127.230 MiB) # > 2x slowdown of first call

julia> @time ppwrite(); # ~30% slowdown on the second call
  9.729389 seconds (556 allocations: 35.453 KiB)

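Just adding a coroutine (task-based) version that performs the IO concurrently, like the multi-process version, but also avoids copying and transferring the data. Note that awrite and ppwrite below write the literal string "content" rather than content[i], so their timings are not directly comparable with those of swrite.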

julia> using Distributed, Random

julia> global const content = [randstring(10^8) for _ in 1:10];

julia> function swrite()
           for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, content[i])
               end
           end
       end
swrite (generic function with 1 method)

julia> @time swrite()
  1.339323 seconds (23.68 k allocations: 1.212 MiB)

julia> @time swrite()
  1.876770 seconds (114 allocations: 6.875 KiB)

julia> function awrite()
           @sync for i in 1:10
               @async open("file$(i).txt", "w") do f
                   write(f, "content")
               end
           end
       end
awrite (generic function with 1 method)

julia> @time awrite()
  0.243275 seconds (155.80 k allocations: 7.465 MiB)

julia> @time awrite()
  0.001744 seconds (144 allocations: 14.188 KiB)

julia> addprocs(4)
4-element Array{Int64,1}:
 2
 3
 4
 5

julia> function ppwrite()
           @sync @distributed for i in 1:10
               open("file$(i).txt", "w") do f
                   write(f, "content")
               end
           end
       end
ppwrite (generic function with 1 method)

julia> @time ppwrite()
  1.806847 seconds (2.46 M allocations: 123.896 MiB, 1.74% gc time)
Task (done) @0x00007f23fa2a8010

julia> @time ppwrite()
  0.062830 seconds (5.54 k allocations: 289.161 KiB)
Task (done) @0x00007f23f8734010
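For a like-for-like comparison with swrite, a task-based version would write the per-file payload instead of a fixed string. A minimal sketch, assuming small hypothetical payloads and a temporary directory (the names payloads and awrite_payloads are illustrative, and no timings are claimed for it):

```julia
# Sketch: task-based writer that writes a different payload to each file.
payloads = [repeat(string(i), 1000) for i in 1:3]  # small illustrative payloads
dir = mktempdir()

function awrite_payloads(dir, payloads)
    # @sync waits until every task started with @async has finished
    @sync for i in eachindex(payloads)
        @async begin
            open(joinpath(dir, "file$(i).txt"), "w") do f
                write(f, payloads[i])
            end
        end
    end
end

awrite_payloads(dir, payloads)
```

Because the tasks share the memory of one process, payloads does not need to be copied to any workers.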

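Good answer. I have added a truly multi-process parallel version below.

But, as you said, in your example most of the time is not spent writing data to disk, because "content" is a very small chunk of data :). It would be very interesting if you added a test that writes more data (for comparison on the same machine).

I tried, see the updated post, but I could not get the parallel version to be slower :) Any ideas?

OK, more data does make the parallel version slower.

Yes, at some point the CPU stops being the bottleneck and IO becomes the bottleneck :).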