Haskell: improving high memory usage during serialization (Data.Binary)
haskell, memory, serialization, memory-leaks
I'm still new to Haskell and learn something new every day. My problem is excessive memory usage during serialization with the Data.Binary library. Maybe I'm just using the library wrong, but I can't figure it out. The actual idea is that I read binary data from disk, add new data, and then write everything back to disk. Here is the code:
module Main where

import Control.Monad (when)
import Data.Binary
import Data.List (foldl')
import System.Environment (getArgs)

data DualNo = DualNo Int Int deriving (Show)

instance Binary DualNo where
  put (DualNo a b) = do
    put a
    put b
  get = do
    a <- get
    b <- get
    return (DualNo a b)

-- Read the list of DualNo values from disk.
readData :: FilePath -> IO [DualNo]
readData filename = decodeFile filename

-- Write the list of DualNo values to disk.
writeData :: [DualNo] -> FilePath -> IO ()
writeData no filename = encodeFile filename no

writeEmptyDataToDisk :: FilePath -> IO ()
writeEmptyDataToDisk filename = writeData [] filename

-- Feed the list with a new data set (each new element is prepended).
feedWithInputData :: [DualNo] -> [(Int, Int)] -> [DualNo]
feedWithInputData existData newData = foldl' func existData newData
  where
    func dataset (a, b) = DualNo a b : dataset

main :: IO ()
main = do
  [newInputData, toPutIntoExistingData] <- getArgs
  -- Passing "empty" as the second argument resets the data file first.
  when (toPutIntoExistingData == "empty") $
    writeEmptyDataToDisk "myData.dat"
  loadedData <- readData "myData.dat"
  let newData = case newInputData of
        "dataset1" -> feedWithInputData loadedData dataset1
        "dataset2" -> feedWithInputData loadedData dataset2
        _          -> feedWithInputData loadedData dataset3
  writeData newData "myData.dat"

dataset1, dataset2, dataset3 :: [(Int, Int)]
dataset1 = zip [1 .. 100000] [2, 4 .. 200000]
dataset2 = zip [5, 10 .. 500000] [3, 6 .. 300000]
dataset3 = zip [4, 8 .. 400000] [6, 12 .. 600000]
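A side note on feedWithInputData: folding from the left with (:) prepends each new element, so the new data lands in reverse order in front of the existing data, and foldl' has to walk all of newData before the first element of the result is available. The sketch below contrasts this with an order-preserving variant. feedInOrder is a hypothetical name that is not part of the program above, and Eq is derived here only so the results can be compared. Because the list instance of Binary writes the list length before the elements, this change alone probably does not lower peak memory with encodeFile.

```haskell
module Main where

import Data.List (foldl')

data DualNo = DualNo Int Int deriving (Show, Eq)

-- Behaviour of the program above: every new pair is consed onto the
-- accumulator, so the new elements come out in reverse order.
feedWithInputData :: [DualNo] -> [(Int, Int)] -> [DualNo]
feedWithInputData = foldl' (\dataset (a, b) -> DualNo a b : dataset)

-- Hypothetical alternative: keeps the input order and is produced lazily,
-- so a consumer can start before all of newData has been traversed.
feedInOrder :: [DualNo] -> [(Int, Int)] -> [DualNo]
feedInOrder existData newData = map (uncurry DualNo) newData ++ existData

main :: IO ()
main = do
  let old = [DualNo 0 0]
      new = [(1, 2), (3, 4)]
  print (feedWithInputData old new) -- [DualNo 3 4,DualNo 1 2,DualNo 0 0]
  print (feedInOrder old new)       -- [DualNo 1 2,DualNo 3 4,DualNo 0 0]
```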
Looking at the prof file:
Tue Apr 12 18:11 2016 Time and Allocation Profiling Report (Final)
Main +RTS -p -sstderr -RTS dataset1 empty
total time = 0.06 secs (60 ticks @ 1000 us, 1 processor)
total alloc = 102,613,008 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
put Main 48.3 53.0
writeData Main 30.0 18.8
dataset1 Main 13.3 23.4
feedWithInputData Main 6.7 0.0
feedWithInputData.func Main 1.7 4.7
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 68 0 0.0 0.0 100.0 100.0
main Main 137 0 0.0 0.0 86.7 76.6
feedWithInputData Main 150 1 6.7 0.0 8.3 4.7
feedWithInputData.func Main 154 100000 1.7 4.7 1.7 4.7
writeData Main 148 1 30.0 18.8 78.3 71.8
put Main 155 100000 48.3 53.0 48.3 53.0
readData Main 147 0 0.0 0.1 0.0 0.1
writeEmptyDataToDisk Main 142 0 0.0 0.0 0.0 0.1
writeData Main 143 0 0.0 0.1 0.0 0.1
CAF:main1 Main 133 0 0.0 0.0 0.0 0.0
main Main 136 1 0.0 0.0 0.0 0.0
CAF:main2 Main 132 0 0.0 0.0 0.0 0.0
main Main 139 0 0.0 0.0 0.0 0.0
writeEmptyDataToDisk Main 140 1 0.0 0.0 0.0 0.0
writeData Main 141 1 0.0 0.0 0.0 0.0
CAF:main7 Main 131 0 0.0 0.0 0.0 0.0
main Main 145 0 0.0 0.0 0.0 0.0
readData Main 146 1 0.0 0.0 0.0 0.0
CAF:dataset1 Main 123 0 0.0 0.0 5.0 7.8
dataset1 Main 151 1 5.0 7.8 5.0 7.8
CAF:dataset4 Main 122 0 0.0 0.0 5.0 7.8
dataset1 Main 153 0 5.0 7.8 5.0 7.8
CAF:dataset5 Main 121 0 0.0 0.0 3.3 7.8
dataset1 Main 152 0 3.3 7.8 3.3 7.8
CAF:main4 Main 116 0 0.0 0.0 0.0 0.0
main Main 138 0 0.0 0.0 0.0 0.0
CAF:main6 Main 115 0 0.0 0.0 0.0 0.0
main Main 149 0 0.0 0.0 0.0 0.0
CAF:main3 Main 113 0 0.0 0.0 0.0 0.0
main Main 144 0 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal 107 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding 103 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv 101 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 94 0 0.0 0.0 0.0 0.0
CAF GHC.IO.FD 86 0 0.0 0.0 0.0 0.0
Now I add more data:
$ ./Main dataset2 myData.dat +RTS -p -sstderr
343,601,008 bytes allocated in the heap
175,650,728 bytes copied during GC
34,113,936 bytes maximum residency (8 sample(s))
971,896 bytes maximum slop
78 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 640 colls, 0 par 0.082s 0.083s 0.0001s 0.0017s
Gen 1 8 colls, 0 par 0.140s 0.141s 0.0176s 0.0484s
INIT time 0.001s ( 0.001s elapsed)
MUT time 0.138s ( 0.139s elapsed)
GC time 0.221s ( 0.224s elapsed)
RP time 0.000s ( 0.000s elapsed)
PROF time 0.000s ( 0.000s elapsed)
EXIT time 0.006s ( 0.006s elapsed)
Total time 0.370s ( 0.370s elapsed)
%GC time 59.8% (60.5% elapsed)
Alloc rate 2,485,518,518 bytes per MUT second
Productivity 39.9% of total user, 39.8% of total elapsed
Looking at the new prof file:
Tue Apr 12 18:15 2016 Time and Allocation Profiling Report (Final)
Main +RTS -p -sstderr -RTS dataset2 myData.dat
total time = 0.14 secs (139 ticks @ 1000 us, 1 processor)
total alloc = 213,866,232 bytes (excludes profiling overheads)
COST CENTRE MODULE %time %alloc
put Main 41.0 50.9
writeData Main 25.9 18.0
get Main 25.2 16.8
dataset2 Main 4.3 11.2
readData Main 1.4 0.8
feedWithInputData.func Main 1.4 2.2
individual inherited
COST CENTRE MODULE no. entries %time %alloc %time %alloc
MAIN MAIN 68 0 0.0 0.0 100.0 100.0
main Main 137 0 0.0 0.0 95.7 88.8
feedWithInputData Main 148 1 0.7 0.0 2.2 2.2
feedWithInputData.func Main 152 100000 1.4 2.2 1.4 2.2
writeData Main 145 1 25.9 18.0 66.9 68.9
put Main 153 200000 41.0 50.9 41.0 50.9
readData Main 141 0 1.4 0.8 26.6 17.6
get Main 144 0 25.2 16.8 25.2 16.8
CAF:main1 Main 133 0 0.0 0.0 0.0 0.0
main Main 136 1 0.0 0.0 0.0 0.0
CAF:main7 Main 131 0 0.0 0.0 0.0 0.0
main Main 139 0 0.0 0.0 0.0 0.0
readData Main 140 1 0.0 0.0 0.0 0.0
CAF:dataset2 Main 126 0 0.0 0.0 0.7 3.7
dataset2 Main 149 1 0.7 3.7 0.7 3.7
CAF:dataset6 Main 125 0 0.0 0.0 2.2 3.7
dataset2 Main 151 0 2.2 3.7 2.2 3.7
CAF:dataset7 Main 124 0 0.0 0.0 1.4 3.7
dataset2 Main 150 0 1.4 3.7 1.4 3.7
CAF:$fBinaryDualNo1 Main 120 0 0.0 0.0 0.0 0.0
get Main 143 1 0.0 0.0 0.0 0.0
CAF:main4 Main 116 0 0.0 0.0 0.0 0.0
main Main 138 0 0.0 0.0 0.0 0.0
CAF:main6 Main 115 0 0.0 0.0 0.0 0.0
main Main 146 0 0.0 0.0 0.0 0.0
CAF:main5 Main 114 0 0.0 0.0 0.0 0.0
main Main 147 0 0.0 0.0 0.0 0.0
CAF:main3 Main 113 0 0.0 0.0 0.0 0.0
main Main 142 0 0.0 0.0 0.0 0.0
CAF GHC.Conc.Signal 107 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding 103 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Encoding.Iconv 101 0 0.0 0.0 0.0 0.0
CAF GHC.IO.Handle.FD 94 0 0.0 0.0 0.0 0.0
CAF GHC.IO.FD 86 0 0.0 0.0 0.0 0.0
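The more often I add new data, the higher the memory usage gets. I mean, it's obvious that I need more memory for a larger data set. But isn't there a better solution to this problem (such as writing the data back to disk gradually)?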
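One way to write data back gradually is to append the new records to the end of the file instead of decoding and re-encoding the whole list. The following is a minimal sketch under stated assumptions, not a drop-in fix: it uses a different file format from encodeFile (no length prefix, just records back to back), and the names appendRecords, getAll, readRecords and the file name myData.stream are all hypothetical.

```haskell
module Main where

import Data.Binary (Binary (..))
import Data.Binary.Get (Get, isEmpty, runGet)
import Data.Binary.Put (runPut)
import qualified Data.ByteString.Lazy as BL

data DualNo = DualNo Int Int deriving (Show, Eq)

instance Binary DualNo where
  put (DualNo a b) = put a >> put b
  get = DualNo <$> get <*> get

-- Append records to the end of the file, without a length prefix.
-- The records already in the file are not read back into memory.
appendRecords :: FilePath -> [DualNo] -> IO ()
appendRecords path = BL.appendFile path . runPut . mapM_ put

-- Decode records until the input is exhausted. This builds the whole
-- list and is only meant to check what was written.
getAll :: Get [DualNo]
getAll = do
  done <- isEmpty
  if done then return [] else (:) <$> get <*> getAll

readRecords :: FilePath -> IO [DualNo]
readRecords path = runGet getAll <$> BL.readFile path

main :: IO ()
main = do
  let path = "myData.stream" -- hypothetical file name
  BL.writeFile path BL.empty -- start from an empty file
  appendRecords path [DualNo a (2 * a) | a <- [1 .. 100000]]
  appendRecords path [DualNo a (3 * a) | a <- [1 .. 100000]]
  records <- readRecords path
  print (length records)
```

Since runPut produces a lazy ByteString and the input list is generated on demand, each append should run in roughly constant memory. The trade-off is that the element count is no longer stored in the file, so a reader has to decode until the end of input.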
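Edit:
Actually, the most important thing that bothers me is the following observation:
Do I really need more memory to hold the data in my program than the same data takes up in the binary file on disk?

If you don't need to hold the whole file in memory at once, you could look at streaming. This may help:

Bigger data set => more memory? Not if you are streaming the data, and I think that is exactly why binary uses lazy ByteStrings. Check the laziness in your code (where you force or share values). Why does feedWithInputData reverse your input list? That will not stream.

Thanks for the streaming hint, but I don't think streaming is an option. The code here is actually a simplified example of the project I'm working on. Here I use a simple data structure like: data DualNo = DualNo Int.

102 MB is the memory allocated by the program over its whole run, not the memory it occupies at any one time (the heap peak). It is mainly during GC that some data may have to be copied and moved. According to the heap graph, the heap peak of the program (dataset1 empty) is about 10 MB. For how to generate a heap graph for a program, see .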
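On the question of memory versus file size, a rough back-of-the-envelope estimate may help. It assumes a 64-bit GHC, a non-profiled build, and the lazy (boxed) fields of DualNo as declared above. Data.Binary encodes an Int as 8 bytes and prefixes a list with an 8-byte length, whereas on the heap each list element costs a cons cell, a DualNo constructor and two boxed Ints. The function names below are hypothetical, and the word counts are approximations rather than measurements.

```haskell
module Main where

-- Assumption: 64-bit platform, so one heap word is 8 bytes.
wordSize :: Int
wordSize = 8

-- Bytes on disk: 8-byte length prefix plus two 8-byte Ints per element.
diskBytes :: Int -> Int
diskBytes n = 8 + 16 * n

-- Approximate heap bytes for a fully evaluated [DualNo] with boxed fields:
-- cons cell (3 words) + DualNo (3 words) + two boxed Ints (2 words each).
heapBytesBoxed :: Int -> Int
heapBytesBoxed n = n * wordSize * (3 + 3 + 2 * 2)

-- Approximate heap bytes if the fields were strict and unpacked:
-- cons cell (3 words) + DualNo (header plus two raw Ints, 3 words).
heapBytesUnpacked :: Int -> Int
heapBytesUnpacked n = n * wordSize * (3 + 3)

main :: IO ()
main = do
  let n = 200000
  print (diskBytes n)         -- 3200008
  print (heapBytesBoxed n)    -- 16000000
  print (heapBytesUnpacked n) -- 9600000
```

Under these assumptions the in-memory list is about five times larger than the file. Profiled builds add extra header words per heap object, and the default copying collector can need additional space on top of the live data during a major GC, so the figures reported by -sstderr are expected to be higher still. Declaring the type as data DualNo = DualNo {-# UNPACK #-} !Int {-# UNPACK #-} !Int is one possible way to get closer to the third figure.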