Scala: Do any functional-language compilers or runtimes optimize chained iterations?

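Do any functional-language compilers or runtimes reduce all chained iterations to a single one where applicable? From the programmer's perspective we can optimize functional code with constructs such as laziness and streams, but I am interested in the other side of the story. My functional example is written in Scala, but please do not limit your answers to that language.

Functional way:

I would like the compiler to optimize it to the equivalent of the following imperative code:

/* just one iteration */
long sum, i;

for (i = 1L, sum = 0L; i

In theory, as one commenter wrote, the compiler could reduce this to its result at compile time. That is conceivable with macros, but unlikely in the general case.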

If you insert a `.view` call, you get lazy semantics in Scala, so only one iteration is performed, although not as simply as in the imperative code:

val lz = (1L to 1000000L).view.filter(_ % 2 == 0) // SeqView (lazy)!
lz.sum


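Also, your assumption that there would otherwise be three iterations is wrong. `(1L to 1000000L)` creates a `NumericRange`, which does not involve any iteration over the elements. So `.view` saves you one iteration.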

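Haskell is by definition a non-strict language, and all implementations I know of use lazy evaluation to provide the non-strict semantics.

The analogous code (with parameters for the start and the end, so that no compile-time evaluation is possible) computes the sum in a single traversal and in small, constant memory. `[low .. high]` is syntactic sugar for `enumFromTo low high`, and the definition of `enumFromTo` for `Int` is basically:
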

enumFromTo x y
    | y < x     = []
    | otherwise = go x
      where
        go k = k : if k == y then [] else go (k+1)
and `sum` is:

sum     l       = sum' l 0
  where
    sum' []     a = a
    sum' (x:xs) a = sum' xs (a+x)
Even without any optimization, evaluation proceeds as follows:

sum' (filter even (enumFromTo 1 6)) 0
-- Now it must be determined whether the first argument of sum' is [] or not
-- For that, the application of filter must be evaluated
-- For that, enumFromTo must be evaluated
~> sum' (filter even (1 : go 2)) 0
-- Now filter knows which equation to use, unfortunately, `even 1` is False
~> sum' (filter even (go 2)) 0
~> sum' (filter even (2 : go 3)) 0
-- 2 is even, so
~> sum' (2 : filter even (go 3)) 0
~> sum' (filter even (go 3)) (0+2)
-- Once again, sum asks whether filter is done or not, so filter demands another value or []
-- from go
~> sum' (filter even (3 : go 4)) 2
~> sum' (filter even (go 4)) 2
~> sum' (filter even (4 : go 5)) 2
~> sum' (4 : filter even (go 5)) 2
~> sum' (filter even (go 5)) (2+4)
~> sum' (filter even (5 : go 6)) 6
~> sum' (filter even (go 6)) 6
~> sum' (filter even (6 : [])) 6
~> sum' (6 : filter even []) 6
~> sum' (filter even []) (6+6)
~> sum' [] 12
~> 12
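This is of course less efficient than a loop: a list cell has to be produced for every element of the enumeration, and another for every element that passes the filter, only to be consumed immediately by `sum`.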
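The demand-driven evaluation traced above can be reproduced with a small sketch. It reuses the shapes of the two definitions quoted earlier under the hypothetical names `enumFromTo'` and `sumAcc` (renamed only to avoid clashing with the Prelude; they are not the library's actual source). The second `print` is an added illustration that cells are produced only when a consumer asks for them.

```haskell
module Main (main) where

-- Same shape as the Int definition of enumFromTo quoted above,
-- renamed (hypothetical name) so it does not clash with the Prelude.
enumFromTo' :: Int -> Int -> [Int]
enumFromTo' x y
    | y < x     = []
    | otherwise = go x
  where
    go k = k : if k == y then [] else go (k + 1)

-- Same shape as the accumulator-based sum quoted above,
-- renamed (hypothetical name).
sumAcc :: [Int] -> Int
sumAcc l = go l 0
  where
    go []     a = a
    go (x:xs) a = go xs (a + x)

main :: IO ()
main = do
    -- Same input as the hand evaluation: 2 + 4 + 6.
    print (sumAcc (filter even (enumFromTo' 1 6)))
    -- Only as many cells as `take` demands are produced,
    -- even though the range runs up to maxBound.
    print (take 3 (filter even (enumFromTo' 1 maxBound)))
```

Running it prints `12` and then `[2,4,6]`; the second result comes back immediately because the rest of the range is never demanded.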

Let's check that the memory usage is indeed small:

module Main (main) where

import System.Environment (getArgs)

main :: IO ()
main = do
    args <- getArgs
    let (low, high) = case args of
                        (a:b:_) -> (read a, read b)
                        _       -> error "Want two args"
    print $ sum $ filter even [low :: Int .. high]
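It allocates about 40 MB for the 500,000 list cells (1), plus a little extra, but the maximum residency is only about 44 KB. Running it with an upper limit of 10 million, the total allocation (and the running time) grows by a factor of 10 (minus the constant part), while the maximum residency stays the same.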

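(1) GHC fuses the enumeration and the filter, so for type `Int` only the even numbers in the range are generated. Unfortunately it cannot fuse away the `sum`, because that is a left fold and GHC's fusion framework only fuses right folds.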
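For reference, the left fold mentioned in the footnote can be spelled out explicitly with `Data.List.foldl'`, which forces the accumulator at every step. This is only a sketch under the hypothetical name `sumEvens`; it makes no claim about whether a given GHC version fuses the fold with the producer, which depends on the compiler version and optimization flags.

```haskell
module Main (main) where

import Data.List (foldl')

-- Strict left fold over the filtered range (hypothetical helper name).
-- The accumulator is forced at each step, so no chain of thunks builds up.
sumEvens :: Int -> Int -> Int
sumEvens low high = foldl' (+) 0 (filter even [low .. high])

main :: IO ()
main = print (sumEvens 1 1000000)
```

Compiling with `-O2` and running with `+RTS -s` is one way to compare the allocation figures for a particular compiler against those reported in this answer.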

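To fuse away the `sum` as well, one has to do a lot of work teaching GHC how to do it with rewrite rules. Fortunately, that has already been done for many algorithms in the `vector` package, and if we use it: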

module Main where

import qualified Data.Vector.Unboxed as U
import System.Environment (getArgs)

val :: Int -> Int -> Int
val low high = U.sum . U.filter even $ U.enumFromN low (high - low + 1)

main :: IO ()
main = do
    args <- getArgs
    let (low, high) = case args of
                        (a:b:_) -> (read a, read b)
                        _       -> error "Want two args"
    print $ val low high
Here is the Core that GHC produces for (the worker of) `val`, if anyone is interested:

Rec {
Main.main_$s$wfoldlM'_loop [Occ=LoopBreaker]
  :: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
[GblId, Arity=3, Caf=NoCafRefs, Str=DmdType LLL]
Main.main_$s$wfoldlM'_loop =
  \ (sc_s303 :: GHC.Prim.Int#)
    (sc1_s304 :: GHC.Prim.Int#)
    (sc2_s305 :: GHC.Prim.Int#) ->
    case GHC.Prim.># sc1_s304 0 of _ {
      GHC.Types.False -> sc_s303;
      GHC.Types.True ->
        case GHC.Prim.remInt# sc2_s305 2 of _ {
          __DEFAULT ->
            Main.main_$s$wfoldlM'_loop
              sc_s303 (GHC.Prim.-# sc1_s304 1) (GHC.Prim.+# sc2_s305 1);
          0 ->
            Main.main_$s$wfoldlM'_loop
              (GHC.Prim.+# sc_s303 sc2_s305)
              (GHC.Prim.-# sc1_s304 1)
              (GHC.Prim.+# sc2_s305 1)
        }
    }
end Rec }
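Read back into source Haskell, the Core above is a three-argument loop over an accumulator, a count of remaining elements and the current value. The following is a hand-written sketch of that loop, not GHC output; `sumEvensLoop` and `go` are hypothetical names, and the bang patterns stand in for the unboxed `Int#` arguments.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main (main) where

-- Hand-written counterpart of the loop in the Core listing
-- (hypothetical names). Arguments mirror the Core: accumulator,
-- number of elements still to visit, current value.
sumEvensLoop :: Int -> Int -> Int
sumEvensLoop low high = go 0 (high - low + 1) low
  where
    go :: Int -> Int -> Int -> Int
    go !acc !remaining !cur
      | remaining <= 0   = acc                                    -- range exhausted
      | cur `rem` 2 == 0 = go (acc + cur) (remaining - 1) (cur + 1) -- even: add it
      | otherwise        = go acc (remaining - 1) (cur + 1)         -- odd: skip it

main :: IO ()
main = print (sumEvensLoop 1 1000000)
```

With the arguments `1` and `1000000` it prints `250000500000`, the same result as the list-based program.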

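A few years ago I published two blog posts on this topic:


Note that the specialization and optimization done by the Scala compiler have improved a lot since then (and probably those in HotSpot too), so the results may be better today.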

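We could even optimize it to compile-time evaluation: `val sum = 250000500000`. Maybe some compilers do that?

@leems That is correct, but I am interested in the case where the values are not known at compile time.

Can you give pseudocode for what you would like the compiler to generate?

@tsenart Since you seem interested, I have written a somewhat detailed answer. I hope it is accessible even without much knowledge of Haskell.

You are looking for "deforestation" or "fusion" optimizations. These have been studied extensively in the context of Haskell, but more restricted forms also exist in other FP-ish compilers.

Your solution is indeed optimized as far as the final result is concerned. My question, as stated above, is about the compiler performing the equivalent optimization on the first code snippet, in any functional language. Regarding your P.S.: I have not decompiled the generated Java bytecode, so I cannot be sure, but the list has to be constructed and initialized somehow. I believe that at a lower level this must be done by iterating. Please correct me if I am wrong.

To represent `(1L to 1000000L)` you only need to store `1L`, `1000000L` and the type. Why iterate? It may happen when you try to print that object (`toString`), of course.

Yes, but what about strict languages?

Well, I think it is also possible for compilers of strict languages to rewrite this into a loop. But I don't know whether any compiler for a strict functional language has been taught to do so.

The question specifically mentions laziness, and the OP is not interested in it; the example is in Scala.

I understood that the OP is not interested in manually inserted laziness, since in that case one could just as well write the loop and be done with it, especially after the OP asked me to elaborate on my comment.

Loop fusion has been implemented in OCaml and in at least one proprietary strict Haskell compiler.
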
Runtime statistics (`+RTS -s`) for the list-based program, `sumEvens`:

$ ./sumEvens +RTS -s -RTS 1 1000000
250000500000
      40,071,856 bytes allocated in the heap
          12,504 bytes copied during GC
          44,416 bytes maximum residency (2 sample(s))
          21,120 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0        75 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0002s    0.0003s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.01s  (  0.01s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.01s  (  0.01s elapsed)

  %GC     time       6.1%  (7.6% elapsed)

  Alloc rate    4,367,976,530 bytes per MUT second

  Productivity  91.8% of total user, 115.8% of total elapsed

Runtime statistics (`+RTS -s`) for the `vector`-based program, `sumFilter`:

$ ./sumFilter +RTS -s -RTS 1 10000000
25000005000000
          72,640 bytes allocated in the heap
           3,512 bytes copied during GC
          44,416 bytes maximum residency (1 sample(s))
          17,024 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0         0 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
  Gen  1         1 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.01s  (  0.01s elapsed)
  GC      time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.01s  (  0.01s elapsed)

  %GC     time       1.0%  (1.2% elapsed)

  Alloc rate    7,361,805 bytes per MUT second

  Productivity  97.7% of total user, 111.5% of total elapsed