Reducing a list on the fly in Haskell

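Suppose I have a function f that takes some input and produces a number. Inside f, a list is built from the input and then reduced (for example with foldl' g) to produce the final number. Since the intermediate list is going to be reduced anyway, it should be possible to apply the reducing function g without ever materializing the intermediate list. The goal is to limit the memory used to store (or represent, if "store" is not quite the right word) the list.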
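
As a minimal sketch of the pattern described above (sumOfSquares is a hypothetical example, not a function from the question), a lazily produced list can be handed directly to a strict left fold. Because nothing else refers to the list, its cells can be collected as soon as the fold has consumed them:

```haskell
import Data.List (foldl')

-- Hypothetical example: sum of the squares of 1..n.
-- The comprehension is produced lazily and consumed by foldl' one
-- element at a time, so the whole list need not be resident at once.
sumOfSquares :: Int -> Int
sumOfSquares n = foldl' (+) 0 [x * x | x <- [1 .. n]]

main :: IO ()
main = print (sumOfSquares 1000)  -- prints 333833500
```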

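To illustrate, the function foldPairProduct below takes O(N1*N2) space for the intermediate list (the space actually consumed may be more complicated because of the expressions involved and lazy evaluation, but I assume it is proportional or worse). Here N1 and N2 are the sizes of the two input lists.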

foldPairProduct :: (Num a, Ord a)  => (a -> a -> a) -> [a] -> [a] -> a
foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]
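
One way to make the absence of the intermediate list explicit is to write the reduction as two nested strict folds. This is only a sketch under assumptions: foldPairProductNested is a hypothetical name, and unlike the foldl1 version it needs an explicit starting value z. The elements are combined in the same order as in the comprehension (x in the outer loop, y in the inner loop). The list ys is still traversed once per element of xs, so it stays in memory.

```haskell
import Data.List (foldl')

-- Hypothetical variant: fold over all products x*y with nested strict
-- left folds, starting from an explicit initial value z, so that no
-- list of products is ever written down.
foldPairProductNested :: Num a => (a -> a -> a) -> a -> [a] -> [a] -> a
foldPairProductNested f z xs ys =
  foldl' (\acc x -> foldl' (\acc' y -> f acc' (x * y)) acc ys) z xs

main :: IO ()
main = print (foldPairProductNested (+) 0 [1 .. 3] [1 .. 2 :: Int])  -- prints 18
```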
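
The situation is worse for foldCrossProduct, whose implementation is similar to foldPairProduct except that it accepts several lists as input. The space complexity of the intermediate list (still in the imperative-language sense) is O(N1*N2*…*Nk), where k is the length of the [[a]] argument.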

foldCrossProduct :: Num a => (a -> a -> a) -> [[a]]  -> a
foldCrossProduct f xss = foldl1 f (crossProduct xss)

crossProduct :: Num a => [[a]] -> [a]
crossProduct [] = []
crossProduct (xs:[]) = xs
crossProduct (xs:xss) = [x * y | x <- xs, y <- crossProduct xss] 

foldPairProduct [1..10000] [1..n]

foldPairProduct [1..n] [1..n]

foldCrossProduct (max) [[1..n],[1..100],[1..1000]]

foldCrossProduct (max) [[1..100],[1..n],[1..1000]]

foldPairProduct' [1..n] [1..n]

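(OK, I was wrong: it will not work in constant space, because one of the lists is used multiple times, so it most likely has linear space complexity.)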

Have you tried compiling your test program with optimizations enabled? Your foldPairProduct looks fine to me, and I would expect it to run in constant space.

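Added: yes, it works in constant space (3 MB total memory in use), as the test.hs session reproduced further down shows.

There are specific optimizations for lists that are created, modified and then consumed. Because Haskell is pure and non-strict, many laws hold, such as
map f . map g == map (f . g)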
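
The law can be spot-checked on a single input. The functions f and g below are arbitrary illustrative choices, and one check is of course not a proof; the point is only that the right-hand side does a single traversal and builds no intermediate list.

```haskell
-- Spot check of the law  map f . map g == map (f . g)  on one input.
-- f and g are arbitrary illustrative functions.
f :: Int -> Int
f = (+ 1)

g :: Int -> Int
g = (* 2)

main :: IO ()
main = print ((map f . map g) [1 .. 5] == map (f . g) [1 .. 5])  -- prints True
```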

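If the compiler for some reason fails to recognize the code and generates suboptimal code (even after passing the -O flag), I would look into stream fusion in detail to see what is blocking it.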

foldPairProduct :: (Num a, Ord a)  => (a -> a -> a) -> [a] -> [a] -> a
foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]
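
With the default settings the run reports "3,717,333,376 bytes copied during GC"; this drops to "20,445,728 bytes copied during GC" for xs == ys = [1 .. 10000] :: [Int] and f = (+), with the time going from "GC time 4.88s" to "GC time 0.07s".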

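But that depends on the strictness analyser doing its job, which it does well if the type the function is used at is known during compilation, e.g. Int, and the combining function is known to be strict. If the code is not specialised, or if the combining function is not known to be strict, the fold will produce a thunk of O(length xs * length ys) size. Using the stricter foldl1' can alleviate that problem.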
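
A minimal sketch of that suggestion, assuming nothing beyond swapping foldl1 for foldl1' from Data.List (the name foldPairProductStrict is hypothetical). foldl1' forces the accumulator to weak head normal form at every step, which for a type such as Int means the running result is fully evaluated rather than left as a growing thunk:

```haskell
import Data.List (foldl1')

-- Same comprehension as foldPairProduct, but reduced with the strict
-- foldl1'. Like foldl1, it fails on an empty list of products.
foldPairProductStrict :: Num a => (a -> a -> a) -> [a] -> [a] -> a
foldPairProductStrict f xs ys = foldl1' f [x * y | x <- xs, y <- ys]

main :: IO ()
main = print (foldPairProductStrict (+) [1 .. 100] [1 .. 100 :: Int])  -- prints 25502500
```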

foldPairProduct' :: Num a => (Maybe a -> Maybe a -> Maybe a) -> [a] -> [a] -> Maybe a  
foldPairProduct' _ _ [] = Nothing
foldPairProduct' _ [] _ = Nothing
foldPairProduct' f (x:xs) (y:ys) = 
  foldl1 f [Just $ x*y, foldPairProduct' f [x] ys, foldPairProduct' f xs [y], 
            foldPairProduct' f xs ys]
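
This version runs head-on into the problem of insufficient strictness: the compiler cannot make the value wrapped by the Just constructor strict here, since it may not be needed for the overall result, so the fold typically produces a thunk of O(length xs * length ys) size under the Just. Of course, for some f, such as const, it behaves fine as it is. To be a good memory citizen when all values are used, you must use a sufficiently strict combining function f that also forces the value under the Just in the result (if it is a Just); using foldl1' also helps. With that, it can have O(length ys + length xs) space complexity (the lists xs and ys are used multiple times, so they are kept in memory and reused).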
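
A combining function of the kind described above could look like the following sketch. strictLift is a hypothetical helper, and treating Nothing as "no value" is an assumption of this example. It evaluates the combined value before wrapping it in Just, so forcing the result to weak head normal form (which foldl1' does at each step) also forces the number underneath.

```haskell
import Data.List (foldl1')

-- Hypothetical helper: lift a binary function to Maybe, forcing the
-- combined value before wrapping it in Just. Nothing means "no value".
strictLift :: (a -> a -> a) -> Maybe a -> Maybe a -> Maybe a
strictLift g (Just a) (Just b) = let c = g a b in c `seq` Just c
strictLift _ Nothing  mb       = mb
strictLift _ ma       Nothing  = ma

main :: IO ()
main = print (foldl1' (strictLift (+)) [Just 1, Nothing, Just (2 :: Int), Just 3])
-- prints Just 6
```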


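For foldCrossProduct, swapping the order of the generators in the last equation of crossProduct, so that the comprehension reads [x * y | y <- crossProduct xss, x <- xs], helps. Then crossProduct xss need not be in memory all at once, so it can be produced and consumed incrementally; only xs must be remembered, since it is used multiple times. For the recursive calls, the first of the remaining lists must be shared, which gives an overall O(N1+…+Nk-1) space complexity.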
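
Put together, the reordered version might look like the sketch below. The generator order is the one discussed above; using foldl1' instead of the question's foldl1 is an extra assumption of this example, added so that the accumulator is forced at each step.

```haskell
import Data.List (foldl1')

-- crossProduct with the generators swapped: the recursive result is the
-- outer generator, so it is consumed incrementally, while the short
-- list xs is the one that is traversed repeatedly.
crossProduct :: Num a => [[a]] -> [a]
crossProduct []       = []
crossProduct [xs]     = xs
crossProduct (xs:xss) = [x * y | y <- crossProduct xss, x <- xs]

-- Assumption: foldl1' replaces the original foldl1.
foldCrossProduct :: Num a => (a -> a -> a) -> [[a]] -> a
foldCrossProduct f xss = foldl1' f (crossProduct xss)

main :: IO ()
main = print (foldCrossProduct max [[1 .. 3], [1 .. 4], [1 .. 5 :: Int]])  -- prints 60
```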

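I believe the point of Haskell's laziness is that even if something is expressed as a list, the list is not necessarily stored if it is not needed. Have you actually measured the space consumption?

By observing the bytes copied during GC.

Why are you so concerned with the number of bytes copied during GC? Maximum residency or total memory in use should be a more accurate measure of space usage. Your program works with immutable data, so a lot of allocation is fine. The GC copying a lot may be a problem, but it has nothing to do with memory usage.

I found that this has to do with fusion. This may be a theoretical question.

Bytes copied is not that interesting. Turn on more statistics and check whether the maximum heap used grows. That is the number to worry about.

Specifically, it uses the second argument multiple times. For n = 10000000, foldPairProduct (+) [1..10] [1..n] uses 1362 MB of memory, while foldPairProduct (+) [1..n] [1..10] uses 1 MB.

@sabauma I wonder whether it is possible to prevent the sharing entirely.

I don't know of a simple way to do that. This question may be of interest. That said, doing so requires recomputing the list every time, so you may largely be trading time for space.

I am now starting to understand and appreciate lazy evaluation. Combined with strictness it is very powerful: high-level programming with the performance of a low-level language.

Is GHC's behaviour on this code what is called "stream/loop fusion" and "deforestation", as mentioned in the other answers?

I would like to know the rules for getting sufficient strictness, because I would usually use custom types and functions. For example, in foldCrossProduct, f is custom.

n = 100
        1,284,024 bytes allocated in the heap
           15,440 bytes copied during GC
           32,336 bytes maximum residency (1 sample(s))
           19,920 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

n = 1000
      120,207,224 bytes allocated in the heap
          114,848 bytes copied during GC
           68,336 bytes maximum residency (1 sample(s))
           24,832 bytes maximum slop
                1 MB total memory in use (0 MB lost due to fragmentation)

n = 10000
   12,001,432,024 bytes allocated in the heap
    5,708,472,592 bytes copied during GC
          428,336 bytes maximum residency (5000 sample(s))
           99,960 bytes maximum slop
                3 MB total memory in use (0 MB lost due to fragmentation)

n = 100000
1,200,013,672,824 bytes allocated in the heap
  816,574,713,664 bytes copied during GC
        4,028,336 bytes maximum residency (100002 sample(s))
          770,264 bytes maximum slop
               14 MB total memory in use (0 MB lost due to fragmentation)
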
n = 100
     105,131,320 bytes allocated in the heap 
      38,697,432 bytes copied during GC     
         427,832 bytes maximum residency (34 sample(s)) 
         209,312 bytes maximum slop 
               3 MB total memory in use (0 MB lost due to fragmentation)

n = 1000
   1,041,254,480 bytes allocated in the heap 
     374,148,224 bytes copied during GC 
         427,832 bytes maximum residency (334 sample(s))
         211,936 bytes maximum slop 
               3 MB total memory in use (0 MB lost due to fragmentation)

n = 10000
  10,402,479,240 bytes allocated in the heap 
   3,728,429,728 bytes copied during GC     
         427,832 bytes maximum residency (3334 sample(s))
         215,936 bytes maximum slop
               3 MB total memory in use (0 MB lost due to fragmentation)
n = 100
     105,131,344 bytes allocated in the heap 
      38,686,648 bytes copied during GC  
         431,408 bytes maximum residency (34 sample(s)) 
         205,456 bytes maximum slop 
               3 MB total memory in use (0 MB lost due to fragmentation)

n = 1000
   1,050,614,504 bytes allocated in the heap
     412,084,688 bytes copied during GC 
       4,031,456 bytes maximum residency (53 sample(s)) 
       1,403,976 bytes maximum slop
              15 MB total memory in use (0 MB lost due to fragmentation)    
n = 10000
    quit after over 1362 MB total memory in use (0 MB lost due to fragmentation)    
n = 100
 4,351,176 bytes allocated in the heap
    59,432 bytes copied during GC    
    74,296 bytes maximum residency (1 sample(s))
    21,320 bytes maximum slop                  
         1 MB total memory in use (0 MB lost due to fragmentation)

n = 1000
 527,009,960 bytes allocated in the heap 
  45,827,176 bytes copied during GC 
     211,680 bytes maximum residency (1 sample(s)) 
      25,760 bytes maximum slop 
           2 MB total memory in use (0 MB lost due to fragmentation)
shum@shum-laptop:/tmp/shum$ cat test.hs 

foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]

n :: Int
n = 10000

main = print $ foldPairProduct (+) [1..n] [1..n]
shum@shum-laptop:/tmp/shum$ ghc --make -fforce-recomp -O test.hs 
[1 of 1] Compiling Main             ( test.hs, test.o )
Linking test ...
shum@shum-laptop:/tmp/shum$ time ./test +RTS -s
2500500025000000
  10,401,332,232 bytes allocated in the heap
   3,717,333,376 bytes copied during GC
         428,280 bytes maximum residency (3335 sample(s))
         219,792 bytes maximum slop
               3 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     16699 colls,     0 par    4.27s    4.40s     0.0003s    0.0009s
  Gen  1      3335 colls,     0 par    1.52s    1.52s     0.0005s    0.0012s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.23s  (  2.17s elapsed)
  GC      time    5.79s  (  5.91s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    8.02s  (  8.08s elapsed)

  %GC     time      72.2%  (73.2% elapsed)

  Alloc rate    4,659,775,665 bytes per MUT second

  Productivity  27.8% of total user, 27.6% of total elapsed


real    0m8.085s
user    0m8.025s
sys 0m0.040s
shum@shum-laptop:/tmp/shum$
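
-- Last equation of crossProduct with the generators swapped, so that
-- crossProduct xss is the outer generator and is consumed incrementally:
crossProduct (xs:xss) = [x * y | y <- crossProduct xss, x <- xs]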