How can I improve the speed of nested arrays in Julia?

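The function nested_arrays below generates a nested array of "depth" n. However, even for small values of n (2, 3, etc.), running it and displaying the output takes quite a long time.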

julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)

julia> nested_arrays(1)
1-element Array{Int64,1}:
 1

julia> nested_arrays(2)
1-element Array{Array{Int64,1},1}:
 [1]

julia> nested_arrays(3)
1-element Array{Array{Array{Int64,1},1},1}:
 Array{Int64,1}[[1]]

julia> nested_arrays(10)
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}:
 Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]
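
Interestingly, when using the @time macro or a ; at the end of the line, computing the result takes relatively little time. Instead, actually displaying the result in the REPL takes up most of the time.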

This odd behavior does not show up in Python, for example:

In [1]: def nested_lists(n):
   ...:     if n == 1:
   ...:         return [1]
   ...:     return [nested_lists(n - 1)]
   ...: 

In [2]: nested_lists(10)
Out[2]: [[[[[[[[[[1]]]]]]]]]]

In [3]: %time nested_lists(100)
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 37.7 µs
Out[3]: [[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
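
Why is this function so slow in Julia? Is Julia recompiling the display function for each different type T in Array{T,1}? If so, why?

Can this code be made faster, or is this simply not something to do in Julia? In practical terms, my main concern is loading, for example, a complex nested JSON file, where simply using an n-dimensional array is not possible.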

Yes, this is entirely due to compilation time. You can see this by @time-ing the display. The second display is fast:

julia> nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)

julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
 Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
 11.682721 seconds (8.83 M allocations: 371.698 MB, 1.82% gc time)

julia> @time display(nested_arrays(15));
1-element Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1},1},1}:
 Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1},1}[Array{Array{Array{Array{Array{Array{Int64,1},1},1},1},1},1}[Array{Array{Array{Array{Array{Int64,1},1},1},1},1}[Array{Array{Array{Array{Int64,1},1},1},1}[Array{Array{Array{Int64,1},1},1}[Array{Array{Int64,1},1}[Array{Int64,1}[[1]]]]]]]]]]]]]]
  0.001688 seconds (2.38 k allocations: 102.766 KB)
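
The same comparison can be made without flooding the terminal by rendering into an in-memory buffer. The following is only a sketch, assuming a fresh session; the depth of 8 and the variable names are illustrative, and the exact timings will depend on the Julia version and the machine:

```julia
nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]

x = nested_arrays(8)
io = IOBuffer()  # in-memory sink, so nothing is printed to the terminal

# First call: includes compiling show for every nested element type.
t_first = @elapsed show(io, MIME("text/plain"), x)

# Second call: the compiled methods are reused.
t_second = @elapsed show(io, MIME("text/plain"), x)

println("first: ", t_first, " s, second: ", t_second, " s")
```

If the explanation above holds, the first timing should be noticeably larger than the second.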
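
So why is it so slow? Display here recurses through all the arrays, printing the arrays nested inside one another. This recursively calls show with 14 different types: one for an array nested 14 deep, then one for its element nested 13 deep, then 12 deep, and so on! Each of these show methods is compiled independently. Compiling specialized methods for specific element types is a key part of how Julia generates efficient code. It means Julia can specialize every operation performed on each element, without any run-time type checks or dispatch. Unfortunately, in this case it gets in the way.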
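
To see that every nesting level really has its own type, the element types can be walked one level at a time. This is a small sketch; nesting_types is a hypothetical helper written for illustration, not part of Julia:

```julia
nested_arrays(n) = n == 1 ? [1] : [nested_arrays(n - 1)]

# Hypothetical helper: collect the array type found at each nesting level
# by repeatedly taking the element type until it is no longer an array.
function nesting_types(x)
    types = Type[]
    T = typeof(x)
    while T <: AbstractArray
        push!(types, T)
        T = eltype(T)
    end
    return types
end

types = nesting_types(nested_arrays(4))
foreach(println, types)                      # one line per nesting level

length(unique(types))                        # 4: one distinct type per level
length(nesting_types(Any[Any[Any[1]]]))      # 1: every level is the same Any array type
```

Each distinct type in the first list is a separate specialization candidate for show, whereas the Any-typed version only ever presents a single array type.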

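You can work around this by using Any[] arrays; in the context of a JSON file this makes a lot of sense, since you don't know whether it will contain strings, arrays, numbers, and so on. This is much faster because the show method only needs to be compiled once for Any[] arrays and is then used recursively: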

# new session
julia> nested_arrays(n) = n == 1 ? Any[1] : Any[nested_arrays(n - 1)]
nested_arrays (generic function with 1 method)

julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
 Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
  1.571632 seconds (767.12 k allocations: 32.472 MB, 1.04% gc time)

julia> @time display(nested_arrays(15));
1-element Array{Any,1}:
 Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]
  0.000606 seconds (839 allocations: 30.859 KB)

julia> @time display(nested_arrays(100));
1-element Array{Any,1}:
 Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[Any[1]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
  0.002523 seconds (17.76 k allocations: 579.297 KB)
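
For JSON-like data the same idea might look as follows. This is only a sketch under stated assumptions: the doc contents are made up for illustration and depth is a hypothetical helper, not taken from any JSON package:

```julia
# A JSON-like document held in loosely typed containers (illustrative data).
doc = Dict{String,Any}(
    "name"   => "example",
    "values" => Any[1, 2.5, "three", Any[4, Any[5]]],
)

# Hypothetical helper: nesting depth of Any-typed vectors.
# A single method for Vector{Any} handles every level of nesting.
depth(x) = 0                                   # numbers, strings, and other leaves
depth(x::Vector{Any}) = isempty(x) ? 1 : 1 + maximum(depth, x)

depth(doc["values"])   # 3
```

Because every level has the same container type, recursive functions such as depth (or show) are compiled for that one type and then reused, regardless of how deep the data goes.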

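I'd like to add that this is a case where Julia's tendency to compile specialized versions of functions, which is usually what makes Julia fast, is the wrong choice: it would be better to compile just a single, slow, generic version of the show function for arrays. Python always does this, and in this case it happens to be the right thing to do. In the future, the specialization heuristics could easily be made smarter without changing any language semantics.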