Haskell: problems when strictly decoding a binary file
I am trying to read and decode a binary file strictly, and this seems to work most of the time. Unfortunately, in a few cases my program fails with "too few bytes. Failed reading at byte position 1". I guess that Binary's decode function thinks no data is available, but I know there is, and simply rerunning the program works fine.
I have tried several solutions, but none of them solved my problem :(
- Using withBinaryFile:
decodeFile' path = withBinaryFile path ReadMode doDecode
  where
    doDecode h = do c <- LBS.hGetContents h
                    return $! decode c
safeEncodeFile path value = do
    -- trunc = True truncates the file as soon as it is opened, before the lock is taken
    fd <- openFd path WriteOnly (Just 0o600) (defaultFileFlags {trunc = True})
    -- block until a write lock on the whole file is granted
    waitToSetLock fd (WriteLock, AbsoluteSeek, 0, 0)
    let cs = encode value
    -- write the lazy ByteString chunk by chunk
    let outFn = LBS.foldrChunks (\c rest -> writeChunk fd c >> rest) (return ()) cs
    outFn
    closeFd fd
  where
    writeChunk fd bs = unsafeUseAsCString bs $ \ptr ->
        fdWriteBuf fd (castPtr ptr) (fromIntegral $ BS.length bs)
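It would be very helpful if you could produce a minimal code snippet that runs and demonstrates the problem. At the moment I am not convinced that this is not a problem with how your program keeps track of which handles are open or closed, with reads and writes getting in each other's way. Here is a sample test program I made that works fine: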
import Data.Trie as T
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import Data.Binary
import System.IO

tmp = "blah"

main = do
    let trie = T.fromList [(B.pack [p], p) | p <- [0..]]
    (file,hdl) <- openTempFile "/tmp" tmp
    B.hPutStr hdl (B.concat $ L.toChunks $ encode trie)
    hClose hdl
    putStrLn file
    t <- B.readFile file
    let trie' = decode (L.fromChunks [t])
    print (trie' == trie)
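This is not a strictness problem; for example, your second solution guarantees that all the data is read. Could there be a problem in your binary parser itself? Consider using cereal instead of binary.

So the get function of the Binary instance basically looks like this: "get = do trie ...

Thanks. It is more or less like your code, but reading and writing are swapped: 1. read the data, 2. modify the data, 3. write the data with Binary.encodeFile, which truncates the file before writing. So I think this is a race condition: a process reads the file while it is being overwritten (see the "EDIT" in my post).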
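The race described in the last comment (a reader opening the file just after the writer has truncated it) can be sidestepped by never truncating the target in place. The following is only a minimal sketch of that idea, not code from the thread: `strictDecodeFile`, `atomicEncodeFile` and `modifyFile` are hypothetical names, and it assumes a POSIX system where the temporary file and the target live on the same filesystem, so that `renameFile` replaces the target atomically.

```haskell
import Data.Binary (Binary, decode, encode)
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import System.Directory (renameFile)
import System.FilePath (takeDirectory, takeFileName)
import System.IO (hClose, openBinaryTempFile)

-- | Hypothetical helper: read the whole file strictly, then decode it.
-- B.readFile returns only after the entire file has been read.
strictDecodeFile :: Binary a => FilePath -> IO a
strictDecodeFile path = do
    bytes <- B.readFile path
    return $! decode (L.fromStrict bytes)

-- | Hypothetical helper: encode into a temporary file in the same
-- directory, then rename it over the target. The target itself is never
-- truncated, so a reader sees either the old or the new contents.
atomicEncodeFile :: Binary a => FilePath -> a -> IO ()
atomicEncodeFile path value = do
    (tmpPath, hdl) <- openBinaryTempFile (takeDirectory path) (takeFileName path)
    L.hPut hdl (encode value)
    hClose hdl
    renameFile tmpPath path

-- | Hypothetical read-modify-write step built from the two helpers above.
modifyFile :: Binary a => FilePath -> (a -> a) -> IO ()
modifyFile path f = do
    old <- strictDecodeFile path
    atomicEncodeFile path (f old)
```

Something like `modifyFile "state.bin" (+ (1 :: Int))` would then update a stored counter. Note that this only protects readers from seeing a half-written file; two processes running `modifyFile` at the same time can still overwrite each other's update, so a lock may still be needed for that. The sketch also leaves the temporary file behind if encoding throws.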
safeDecodeFile def path = do
    e <- doesFileExist path
    if e
      then do fd <- openFd path ReadOnly Nothing
                      (defaultFileFlags{nonBlock=True})
              waitToSetLock fd (ReadLock, AbsoluteSeek, 0, 0)
              c <- fdGetContents fd
              let !v = decode $! c
              return v
      else return def

fdGetContents fd = lazyRead
  where
    lazyRead = unsafeInterleaveIO loop
    loop = do blk <- readBlock fd
              case blk of
                Nothing -> return LBS.Empty
                Just c  -> do cs <- lazyRead
                              return (LBS.Chunk c cs)

readBlock fd = do buf <- mallocBytes 4096
                  readSize <- fdReadBuf fd buf 4096
                  if readSize == 0
                    then do free buf
                            closeFd fd
                            return Nothing
                    else do bs <- unsafePackCStringFinalizer buf
                                    (fromIntegral readSize)
                                    (free buf)
                            return $ Just bs

import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Internal as LBS
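To check whether the "too few bytes" failure really comes from an empty or truncated file, it may help to make the decoding failure visible instead of letting `decode` throw. The sketch below is an illustration under assumptions, not the poster's code: it assumes a version of the binary package that provides `decodeOrFail`, and `decodeBytes` and `decodeFileEither` are hypothetical names introduced here.

```haskell
import Data.Binary (Binary, decodeOrFail)
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L

-- | Hypothetical helper: decode a strict ByteString, returning the parser's
-- message (plus the offset and the input length) instead of throwing.
decodeBytes :: Binary a => B.ByteString -> Either String a
decodeBytes bytes =
    case decodeOrFail (L.fromStrict bytes) of
        Left (_, offset, msg) ->
            Left (msg ++ " at byte offset " ++ show offset
                      ++ " (input length " ++ show (B.length bytes) ++ ")")
        Right (_, _, value) -> Right value

-- | Hypothetical helper: read the whole file strictly, then decode it.
decodeFileEither :: Binary a => FilePath -> IO (Either String a)
decodeFileEither path = do
    bytes <- B.readFile path
    return (decodeBytes bytes)
```

If the `Left` case reports an input length of 0, the reader saw an empty file, which would be consistent with the file having been truncated by a concurrent writer rather than with a strictness problem.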