On-the-fly reduction of a list in Haskell

Suppose I have a function f that takes some input and produces a number. Inside f, a list is created from the input and is then reduced (for example with foldl' g) to produce the final output number. Since the intermediate list is going to be reduced anyway, it should be possible to apply the reduction function g without ever representing the intermediate list. The goal here is to limit the memory used to store (or represent, if "store" is not quite the right word) the list.

To illustrate, this function foldPairProduct takes O(N1*N2) space for the intermediate list (because of expressions and lazy evaluation the space consumed may be more complicated, but I assume it is proportional or worse). Here N1 and N2 are the sizes of the two input lists.
foldPairProduct :: (Num a, Ord a) => (a -> a -> a) -> [a] -> [a] -> a
foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]
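As a rough illustration of the goal stated above (applying the reduction without naming the intermediate list), here is a minimal sketch using nested strict folds. The name foldPairProductNested and the explicit starting accumulator z are assumptions made for this example; they are not part of the original post, which uses foldl1. Note that ys is still traversed once per element of xs, so it stays in memory.

```haskell
import Data.List (foldl')

-- Hypothetical variant, not from the original post: reduces all products x*y
-- with nested strict left folds, so the list of products is never written down.
-- Unlike foldl1, it takes an explicit starting accumulator z.
foldPairProductNested :: Num a => (b -> a -> b) -> b -> [a] -> [a] -> b
foldPairProductNested g z xs ys = foldl' outer z xs
  where
    -- For each x, continue the running accumulator across every y.
    outer acc x = foldl' (\acc' y -> g acc' (x * y)) acc ys

main :: IO ()
main = print (foldPairProductNested (+) 0 [1 .. 100] [1 .. 100 :: Int])
-- prints 25502500
```

Whether this actually uses less memory than the list comprehension depends on the compiler and optimisation level, so it is worth measuring with +RTS -s rather than assuming.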
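The intermediate list is an even bigger concern in foldCrossProduct, whose implementation is similar to foldPairProduct except that it accepts multiple lists as input. The space complexity of the intermediate list (still in the imperative-language sense) is O(N1*N2*…*Nk), where k is the length of the [[a]].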
foldCrossProduct :: Num a => (a -> a -> a) -> [[a]] -> a
foldCrossProduct f xss = foldl1 f (crossProduct xss)
crossProduct :: Num a => [[a]] -> [a]
crossProduct [] = []
crossProduct (xs:[]) = xs
crossProduct (xs:xss) = [x * y | x <- xs, y <- crossProduct xss]
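foldPairProduct [1..10000] [1..n]
foldPairProduct [1..n] [1..n]
foldCrossProduct (max) [[1..n],[1..100],[1..1000]]
foldCrossProduct (max) [[1..100],[1..n],[1..1000]]
foldPairProduct' [1..n] [1..n]
(OK, I was wrong: it will not work in constant space, because one of the lists is used multiple times, so it most likely has linear space complexity.)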
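Did you try compiling your test program with optimisations enabled? Your foldPairProduct looks fine to me, and I would expect it to work in constant space.
ADD:
Yes, it works in constant space (3 MB total memory in use), as the output of the test.hs session shows.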
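There is a specific optimisation for lists that are created, modified, and consumed, called stream fusion. Because Haskell is pure and non-strict, a number of laws hold, such as map f . map g == map (f . g).
If for some reason the compiler does not recognise the code and generates suboptimal code (after passing the -O flag), I would look into stream fusion in detail to see what is preventing it.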
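The law map f . map g == map (f . g) can be illustrated with a small hand-written example. This is only a sketch of what fusing two passes into one means; the names twoPass and onePass are hypothetical, and whether GHC performs an equivalent rewrite on a given program depends on the optimisation flags and the rewrite rules that apply.

```haskell
import Data.List (foldl')

-- Two conceptual passes: map (+ 1), then map (* 2), then a strict sum.
twoPass :: [Int] -> Int
twoPass = foldl' (+) 0 . map (* 2) . map (+ 1)

-- Hand-fused version: map f . map g == map (f . g), and the map is then
-- folded into the reduction, so no intermediate list is named at all.
onePass :: [Int] -> Int
onePass = foldl' (\acc x -> acc + (x + 1) * 2) 0

main :: IO ()
main = print (twoPass [1 .. 1000] == onePass [1 .. 1000])
-- prints True
```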
foldPairProduct :: (Num a, Ord a) => (a -> a -> a) -> [a] -> [a] -> a
foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]
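With a non-default runtime setting, the number of bytes copied during GC drops from the default setting's
3,717,333,376 bytes copied during GC
to
20,445,728 bytes copied during GC
and the time from GC time 4.88s to GC time 0.07s, for xs == ys = [1..10000] :: [Int] and f = (+).
But that depends on the strictness analyser doing its job, which it does well if the type it is used at is known during compilation, e.g. Int, and the combining function is known to be strict. If the code is not specialised, or if the combining function is not known to be strict, the fold will produce a thunk of size O(length xs * length ys). Using the stricter foldl1' can alleviate that problem.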
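A minimal sketch of the foldl1' suggestion follows. The name foldPairProductStrict is an assumption for this example; the only change from foldPairProduct is the strict fold from Data.List, which forces the accumulator at each step instead of relying on the strictness analyser.

```haskell
import Data.List (foldl1')

-- Same comprehension as foldPairProduct, reduced with the strict foldl1'
-- so the running result is evaluated as the list is consumed.
foldPairProductStrict :: Num a => (a -> a -> a) -> [a] -> [a] -> a
foldPairProductStrict f xs ys = foldl1' f [x * y | x <- xs, y <- ys]

main :: IO ()
main = print (foldPairProductStrict (+) [1 .. 1000] [1 .. 1000 :: Int])
-- prints 250500250000
```

Like foldl1, foldl1' fails on an empty list, so both inputs are assumed to be non-empty here.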
foldPairProduct' :: Num a => (Maybe a -> Maybe a -> Maybe a) -> [a] -> [a] -> Maybe a
foldPairProduct' _ _ [] = Nothing
foldPairProduct' _ [] _ = Nothing
foldPairProduct' f (x:xs) (y:ys) =
foldl1 f [Just $ x*y, foldPairProduct' f [x] ys, foldPairProduct' f xs [y],
foldPairProduct' f xs ys]
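This definition runs head-on into the problem of inadequate strictness: the compiler cannot make the value wrapped by the Just constructor strict here, because it might not be needed for the overall result, so the fold will usually produce a thunk of size O(length xs * length ys) under the Just. Of course, for some f, such as const, it would behave fine as it is. To be a good memory citizen when all values are used, you have to use a sufficiently strict combining function f, one that also forces the value under the Just in the result (if it is a Just); using foldl1' also helps. That way, it can have O(length ys + length xs) space complexity (the lists xs and ys are used multiple times, so they are kept for reuse).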
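One way to read the advice about a sufficiently strict combining function is sketched below. Both strictPlus and foldPairProductS are hypothetical names introduced for this example: strictPlus forces the sum before wrapping it in Just, and foldPairProductS keeps the recursion of foldPairProduct' but uses foldl1' and forces each product.

```haskell
import Data.List (foldl1')

-- Hypothetical combining function: adds under Just and forces the sum
-- before wrapping it, so no chain of unevaluated additions builds up.
strictPlus :: Num a => Maybe a -> Maybe a -> Maybe a
strictPlus (Just a) (Just b) = let s = a + b in s `seq` Just s
strictPlus Nothing  r        = r
strictPlus l        Nothing  = l

-- Same recursion as foldPairProduct', but with foldl1' and a forced product.
foldPairProductS :: Num a => (Maybe a -> Maybe a -> Maybe a) -> [a] -> [a] -> Maybe a
foldPairProductS _ _ [] = Nothing
foldPairProductS _ [] _ = Nothing
foldPairProductS f (x:xs) (y:ys) =
  foldl1' f [ p `seq` Just p
            , foldPairProductS f [x] ys
            , foldPairProductS f xs [y]
            , foldPairProductS f xs ys ]
  where
    p = x * y

main :: IO ()
main = print (foldPairProductS strictPlus [1, 2, 3] [4, 5 :: Int])
-- prints Just 54
```

This only addresses the thunks under Just; the recursion itself still goes as deep as the input lists are long.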
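Swapping the order of the generators in crossProduct, so that the recursive call crossProduct xss is the outer generator and xs the inner one, helps. Then crossProduct xss need not be stored in memory all at once, so it can be produced and consumed incrementally; only xs must be remembered, because it is used multiple times. For the recursive calls, the first of the remaining lists must be shared, so this gives an overall O(N1+…+Nk-1) space complexity.

I believe the point of Haskell's laziness is that even if something is expressed as a list, the list is not necessarily stored if it is not needed. Did you actually test the space consumption?
"by observing the bytes copied during GC": why focus so much on the number of bytes copied during GC? Maximum residency or total memory in use should be a more accurate measure of space usage. Your program works with immutable data, so a lot of allocation is fine. A lot of GC copying may be a problem, but it is unrelated to memory usage.
I found something related to list fusion; it may be of theoretical interest.
Bytes copied is not that interesting. Turn on more statistics and check whether the maximum heap in use grows. That is the number to worry about.
Specifically, it uses the second argument multiple times. For n = 10000000, foldPairProduct (+) [1..10] [1..n] uses 1362 MB of memory, while foldPairProduct (+) [1..n] [1..10] uses 1 MB.
@sabauma I wonder whether it is possible to prevent the sharing entirely.
I don't know of any simple way to do that. This question may be of interest. That said, doing so requires recomputing the list every time, so you may be trading a lot of time for space.
I am now starting to understand and appreciate lazy evaluation. Combined with strictness it is very powerful: high-level programming with the performance of a low-level language. Is GHC's behaviour on this code what the other answers call "stream/loop fusion" and "deforestation"?
I would like to know the rules that allow for sufficient strictness, because I will usually be using custom types and functions. For example, in foldCrossProduct, f is custom.

n = 100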
1,284,024 bytes allocated in the heap
15,440 bytes copied during GC
32,336 bytes maximum residency (1 sample(s))
19,920 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
120,207,224 bytes allocated in the heap
114,848 bytes copied during GC
68,336 bytes maximum residency (1 sample(s))
24,832 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
12,001,432,024 bytes allocated in the heap
5,708,472,592 bytes copied during GC
428,336 bytes maximum residency (5000 sample(s))
99,960 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 100000
1,200,013,672,824 bytes allocated in the heap
816,574,713,664 bytes copied during GC
4,028,336 bytes maximum residency (100002 sample(s))
770,264 bytes maximum slop
14 MB total memory in use (0 MB lost due to fragmentation)
n = 100
105,131,320 bytes allocated in the heap
38,697,432 bytes copied during GC
427,832 bytes maximum residency (34 sample(s))
209,312 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
1,041,254,480 bytes allocated in the heap
374,148,224 bytes copied during GC
427,832 bytes maximum residency (334 sample(s))
211,936 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
10,402,479,240 bytes allocated in the heap
3,728,429,728 bytes copied during GC
427,832 bytes maximum residency (3334 sample(s))
215,936 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 100
105,131,344 bytes allocated in the heap
38,686,648 bytes copied during GC
431,408 bytes maximum residency (34 sample(s))
205,456 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
1,050,614,504 bytes allocated in the heap
412,084,688 bytes copied during GC
4,031,456 bytes maximum residency (53 sample(s))
1,403,976 bytes maximum slop
15 MB total memory in use (0 MB lost due to fragmentation)
n = 10000
quit after over 1362 MB total memory in use (0 MB lost due to fragmentation)
n = 100
4,351,176 bytes allocated in the heap
59,432 bytes copied during GC
74,296 bytes maximum residency (1 sample(s))
21,320 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
n = 1000
527,009,960 bytes allocated in the heap
45,827,176 bytes copied during GC
211,680 bytes maximum residency (1 sample(s))
25,760 bytes maximum slop
2 MB total memory in use (0 MB lost due to fragmentation)
shum@shum-laptop:/tmp/shum$ cat test.hs
foldPairProduct f xs ys = foldl1 f [ x*y | x <- xs, y <- ys]
n :: Int
n = 10000
main = print $ foldPairProduct (+) [1..n] [1..n]
shum@shum-laptop:/tmp/shum$ ghc --make -fforce-recomp -O test.hs
[1 of 1] Compiling Main ( test.hs, test.o )
Linking test ...
shum@shum-laptop:/tmp/shum$ time ./test +RTS -s
2500500025000000
10,401,332,232 bytes allocated in the heap
3,717,333,376 bytes copied during GC
428,280 bytes maximum residency (3335 sample(s))
219,792 bytes maximum slop
3 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 16699 colls, 0 par 4.27s 4.40s 0.0003s 0.0009s
Gen 1 3335 colls, 0 par 1.52s 1.52s 0.0005s 0.0012s
INIT time 0.00s ( 0.00s elapsed)
MUT time 2.23s ( 2.17s elapsed)
GC time 5.79s ( 5.91s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 8.02s ( 8.08s elapsed)
%GC time 72.2% (73.2% elapsed)
Alloc rate 4,659,775,665 bytes per MUT second
Productivity 27.8% of total user, 27.6% of total elapsed
real 0m8.085s
user 0m8.025s
sys 0m0.040s
shum@shum-laptop:/tmp/shum$
Reordered variant of the last crossProduct clause, with the recursive call as the outer generator:
crossProduct (xs:xss) = [x * y | y <- crossProduct xss, x <- xs]
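To try the reordered clause in a complete program, a sketch along these lines could be used. The names crossProductSwapped and foldCrossProductSwapped are assumptions for this example, and the use of foldl1' instead of the post's foldl1 is also an assumption, chosen so the accumulator is forced as the list is consumed.

```haskell
import Data.List (foldl1')

-- Variant of crossProduct with the generators swapped: the recursive
-- product is the outer generator and xs the inner one, so only xs is reused.
crossProductSwapped :: Num a => [[a]] -> [a]
crossProductSwapped []         = []
crossProductSwapped [xs]       = xs
crossProductSwapped (xs : xss) = [x * y | y <- crossProductSwapped xss, x <- xs]

-- foldl1' is an assumption here; the original post uses foldl1.
foldCrossProductSwapped :: Num a => (a -> a -> a) -> [[a]] -> a
foldCrossProductSwapped f xss = foldl1' f (crossProductSwapped xss)

main :: IO ()
main = print (foldCrossProductSwapped max [[1 .. 10], [1 .. 20], [1 .. 30 :: Int]])
-- prints 6000
```

Running the result with +RTS -s and comparing maximum residency against the original generator order is the way to check whether the change helps for a given input.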