Haskell: having my cereal and eating it too

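I'm using Data.Serialize.Get and am trying to define the following combinator:

getConsumed :: Get a -> Get (ByteString, a)

It should behave like the Get action passed in, but also return the ByteString that the Get consumed. The use case is that I have a binary structure that I need to both parse and hash, and I don't know its length before parsing it.

Despite its simple semantics, this combinator has proved surprisingly tricky to implement.

Without delving into the internals of Get, my instinct was to use this monstrosity: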

getConsumed :: Get a -> Get (B.ByteString, a)
getConsumed g = do
  (len, r) <- lookAhead $ do
                before <- remaining
                res <- g
                after <- remaining
                return (before - after, res)
  bs <- getBytes len
  return (bs, r)

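So I must be misunderstanding something about cereal somewhere.

Does anyone see what's wrong with my definition of getConsumed, or have a better idea for how to implement it?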

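Edit: Dan Doel points out that remaining may return only the remaining length of the current chunk, which isn't very useful if you cross a chunk boundary. I'm not sure what the point of the operation is in that case, but it explains why my code wasn't working! Now I just need to find a viable alternative.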
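
To make the chunk-boundary problem concrete, here is a minimal sketch. It assumes a recent cereal in which the Fail constructor of Result carries the unconsumed input as a second field. The names feedChunks and probe are hypothetical helpers introduced only for this illustration; they are not part of cereal.

```haskell
import qualified Data.ByteString as B
import Data.Serialize.Get

-- Hypothetical helper: run a parser incrementally, one chunk at a time.
-- Once the list runs out, an empty chunk is fed to signal end of input.
feedChunks :: Get a -> [B.ByteString] -> Either String a
feedChunks g chunks = case chunks of
    []     -> go (runGetPartial g B.empty) []
    (c:cs) -> go (runGetPartial g c) cs
  where
    go (Fail msg _) _      = Left msg
    go (Done a _)   _      = Right a
    go (Partial k)  []     = go (k B.empty) []
    go (Partial k)  (c:cs) = go (k c) cs

-- Hypothetical probe: record what `remaining` reports before and after
-- reading four bytes.
probe :: Get (Int, Int)
probe = do
  before <- remaining
  _      <- getBytes 4
  after  <- remaining
  return (before, after)
```

With all eight bytes in one chunk, feedChunks probe [B.pack [1..8]] gives Right (8,4), so before - after is the four bytes consumed. Split the same input as [B.pack [1,2], B.pack [3..8]] and the result is Right (2,4): before only saw the first chunk, so before - after is negative, and that is the length the first getConsumed then hands to getBytes.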

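Edit 2: After thinking about it some more, the fact that remaining gives me the length of the current chunk can work to my advantage if I feed the Get manually with individual chunks (remaining >>= getBytes) in a loop and keep track of what it is eating as I go. I haven't managed to get this approach working yet either, but it seems more promising than the original one.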

Edit 3: If anyone's curious, here's the code from Edit 2 above:

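{-# LANGUAGE BangPatterns #-}  -- needed for the !n patterns in measure'
import qualified Data.ByteString as B
import Data.Serialize.Get

-- Grab whatever is left of the current chunk.
getChunk :: Get B.ByteString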
getChunk = remaining >>= getBytes

getConsumed :: Get a -> Get (B.ByteString, a)
getConsumed g = do
    (len, res) <- lookAhead $ measure g
    bs <- getBytes len
    return (bs, res)
  where
  measure :: Get a -> Get (Int ,a)
  measure g = do
    chunk <- getChunk
    measure' (B.length chunk) (runGetPartial g chunk)

  measure' :: Int -> Result a -> Get (Int, a)
  -- Newer cereal: Fail also carries the unconsumed input; older versions match (Fail e).
  measure' !n (Fail e _) = fail e
  measure' !n (Done r bs) = return (n - B.length bs, r)
  measure' !n (Partial f) = do
    chunk <- getChunk
    measure' (n + B.length chunk) (f chunk)

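The cereal package does not store enough information to simply implement what you want. I expect your idea of using chunks might work, or perhaps a special runGet. Forking cereal and using its internals is probably the easiest route.

Writing your own can work, which is what I did. My custom library does implement enough machinery to do what you want: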

import Text.ProtocolBuffers.Get
import Control.Applicative
import qualified Data.ByteString as B

getConsumed :: Get a -> Get (B.ByteString, a)
getConsumed thing = do
  start <- bytesRead
  (a,stop) <- lookAhead ((,) <$> thing <*> bytesRead)
  bs <- getByteString (fromIntegral (stop-start))
  return (bs,a)
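
As one possible reading of the "special runGet" idea, here is a hedged sketch for the case where the whole input is already available as a single strict ByteString. It relies on cereal's runGetState, which returns the unconsumed rest of the input; runGetConsumed and lengthPrefixed are hypothetical names made up for this illustration.

```haskell
import qualified Data.ByteString as B
import Data.Serialize.Get

-- Hypothetical wrapper (not part of cereal): run a parser over a complete
-- strict ByteString and also return the prefix of the input it consumed.
runGetConsumed :: Get a -> B.ByteString -> Either String (B.ByteString, a)
runGetConsumed g input = do
  (a, rest) <- runGetState g input 0
  let used = B.length input - B.length rest
  return (B.take used input, a)

-- Example parser: a one-byte length prefix followed by that many payload bytes.
lengthPrefixed :: Get B.ByteString
lengthPrefixed = do
  n <- getWord8
  getBytes (fromIntegral n)
```

For example, runGetConsumed lengthPrefixed (B.pack [2,10,20,99]) returns the consumed bytes [2,10,20] together with the payload [10,20]. The limitation is that this only works at the top level: it is a runner, not a combinator, so it cannot be nested inside a larger Get the way getConsumed can.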

Edit: Another solution, which does no extra computation:

getConsumed :: Get a -> Get (B.ByteString, a)
getConsumed g = do
  (len, r) <- lookAhead $ do
                (res,after) <- lookAhead $ liftM2 (,) g remaining
                total <- remaining
                return (total-after, res)
  bs <- getBytes len
  return (bs, r)

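FWIW, with the iteratee package this is enumWith parser stream2stream, which does basically exactly what you suggest in your second edit. You may find that definition useful, or possibly countConsumed, which works slightly differently but is simpler.

That's cool, but my main reason for using cereal is that it is readily supported by other libraries, such as conduit. :/

Your second option will do (at least a subset of) the work of g twice, right? The first approach seems to work, though. Thanks!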

getConsumed :: Get a -> Get (B.ByteString, a)
getConsumed g = do
  _ <- lookAhead g -- Make sure all necessary chunks are preloaded
  (len, r) <- lookAhead $ do
                before <- remaining
                res <- g
                after <- remaining
                return (before - after, res)
  bs <- getBytes len
  return (bs, r)
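
A small, hedged check of the preloading version across a chunk boundary. The definition is repeated only so the sketch is self-contained. runTwoChunks and lengthPrefixed are hypothetical helpers, and the two-field Fail pattern assumes a recent cereal.

```haskell
import qualified Data.ByteString as B
import Data.Serialize.Get

-- Same definition as above: the first lookAhead pulls in every chunk g needs,
-- so `remaining` sees all of them in the second pass.
getConsumed :: Get a -> Get (B.ByteString, a)
getConsumed g = do
  _ <- lookAhead g
  (len, r) <- lookAhead $ do
    before <- remaining
    res    <- g
    after  <- remaining
    return (before - after, res)
  bs <- getBytes len
  return (bs, r)

-- Hypothetical driver: start on the first chunk, supply the second if asked.
runTwoChunks :: Get a -> B.ByteString -> B.ByteString -> Either String a
runTwoChunks g c1 c2 = case runGetPartial g c1 of
    Partial k -> finish (k c2)
    r         -> finish r
  where
    finish (Done a _)   = Right a
    finish (Fail msg _) = Left msg
    finish (Partial _)  = Left "parser wants more input"

-- Example parser: a one-byte length prefix followed by that many payload bytes.
lengthPrefixed :: Get B.ByteString
lengthPrefixed = do
  n <- getWord8
  getBytes (fromIntegral n)
```

Running runTwoChunks (getConsumed lengthPrefixed) (B.pack [3,10]) (B.pack [20,30,99]) splits the record in the middle of its payload, yet the consumed bytes come back as [3,10,20,30] alongside the payload [10,20,30]. The price is that g runs twice, once to preload and once to measure.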