Haskell: decoding a JSON stream where some values are needed before others
Tags: haskell, aeson, haskell-pipes


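Suppose we have a JSON object with an id field and an image field, where image holds a base64-encoded ByteString.

Now we want to receive the image from a source and store it at a location determined by the information in the id tag. This means id must be parsed up front (to determine where the image goes), while image must be parsed in a streaming fashion. Is this straightforward?

I plan to use pipes-aeson, aws (for S3 storage) and pipes to do streaming decoding from a Websocket producer, with the S3 bucket as the consumer (the streaming decode can only be set up once id has been parsed to determine the S3 location). Looking at this approach, I cannot tell whether I can actually do what is described above. This is my first attempt at streaming with JSON and pipes, so any help would be much appreciated.

A simple file-system read/write example would also do, as a stand-in for the Websocket producer and the S3 consumer.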

Addendum

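Since the key-value pairs of a JSON object are unordered, for the data type defined above the image data might arrive before id. Changing the representation to a JSON array may therefore help (a tuple in Haskell; aeson's TH derivation seems to convert tuples to ordered arrays). Feel free to change the data type definition if that is needed to enforce the decoding order. For example, the data type might be changed to: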

data TaggedImage = TaggedImage (Text, ByteString)
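
To illustrate the point about tuples, here is a minimal sketch showing that aeson encodes a Haskell pair as a two-element JSON array, so the first component is always written first. The type alias TaggedImageWire and the sample values are hypothetical, and the image is carried as base64 text here because aeson has no ToJSON instance for ByteString.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson (decode, encode)
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Text (Text)

-- Hypothetical wire form: the id first, then the image as base64 text.
type TaggedImageWire = (Text, Text)

main :: IO ()
main = do
    let wire = ("foo", "aGVsbG8=") :: TaggedImageWire
        json = encode wire
    -- A pair is serialised as a JSON array, with the id in first position.
    BL.putStrLn json
    -- Decoding the array gives back the same pair.
    print (decode json :: Maybe TaggedImageWire)
```

This only fixes the order on the wire; aeson still parses the whole array before handing back a value, so it does not by itself make the image streamable.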

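I believe you will not be able to reuse the pipes-aeson library, because it provides no way to stream over a nested field of a decoded JSON record, nor does it support cursor-like navigation of the structure. This means you will need to parse the framing of the JSON record by hand.

In addition, some work is needed to wrap base64-bytestring in a pipes-like API with the following type: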
-- Convert a base64-encoded stream to a raw byte stream
decodeBase64
    :: Producer ByteString m r
    -- ^ Base64-encoded bytes
    -> Producer ByteString m (Either SomeException (Producer ByteString m r)) 
    -- ^ Raw bytes

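Note that if decoding completes successfully, the result is a Producer of the remaining bytes (i.e. everything after the base64-encoded bytes). This lets you resume parsing at the point where the image bytes end.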
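
One possible way to write a function of this type is sketched below. It is not part of the original answer and rests on several assumptions: the base64 text uses the standard padded alphabet, the first byte that is not a base64 character (for example the closing quote of the JSON string) marks the end of the image, and decoding errors from base64-bytestring are wrapped with userError. The names isBase64Char and example are hypothetical.

```haskell
import Control.Exception (SomeException, toException)
import Control.Monad (unless)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Base64 as B64
import qualified Data.ByteString.Char8 as BC
import Data.Char (isAlphaNum, isAscii)
import Data.Functor.Identity (runIdentity)
import Pipes
import qualified Pipes.Prelude as P

-- Characters that may appear in standard, padded base64 text.
isBase64Char :: Char -> Bool
isBase64Char c = (isAscii c && isAlphaNum c) || c `elem` "+/="

decodeBase64
    :: Monad m
    => Producer ByteString m r
    -> Producer ByteString m (Either SomeException (Producer ByteString m r))
decodeBase64 = go B.empty
  where
    -- 'carry' holds base64 characters that do not yet fill a 4-character group.
    go carry p = do
        x <- lift (next p)
        case x of
            Left r -> finish carry (return r)
            Right (chunk, p') -> do
                let (b64, rest) = BC.span isBase64Char chunk
                    buf         = carry <> b64
                if B.null rest
                    then do
                        -- Whole chunk is base64: decode the aligned prefix only.
                        let n = B.length buf - B.length buf `mod` 4
                            (aligned, carry') = B.splitAt n buf
                        e <- emit aligned
                        case e of
                            Left exc -> return (Left exc)
                            Right () -> go carry' p'
                    -- A non-base64 byte ends the image; hand back what follows.
                    else finish buf (yield rest >> p')

    finish buf leftovers = do
        e <- emit buf
        return (fmap (const leftovers) e)

    -- Decode one aligned block and yield the raw bytes, if any.
    emit bytes = case B64.decode bytes of
        Left err  -> return (Left (toException (userError err)))
        Right raw -> do
            unless (B.null raw) (yield raw)
            return (Right ())

-- Usage: two chunks whose boundary splits a 4-character group.
-- Returns the decoded chunks and whether decoding succeeded.
example :: ([ByteString], Bool)
example =
    let (out, result) = runIdentity (P.toListM' (decodeBase64 (each chunks)))
    in  (out, either (const False) (const True) result)
  where
    chunks = map BC.pack ["Zm9vYm", "Fy\" }"]
```

Carrying the incomplete group over to the next chunk is what keeps the decoder independent of how the upstream Producer happens to split the bytes.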
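However, assuming you have a decodeBase64 function, the rough outline of how the code would work is that you would have three parts:

1. Parse the prefix of the record before the image bytes using a binary parser adapted to pipes
2. Stream the decoded image bytes using the decodeBase64 function
3. Parse the suffix of the record after the image bytes, again using a binary parser adapted to pipes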


In other words, the types and implementation would look roughly like this:
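import Control.Exception (SomeException, toException)
import Control.Monad.Trans.State.Strict (runStateT)
import qualified Data.Binary
import Data.ByteString (ByteString)
import Pipes
import Pipes.Binary (DecodingError)
import qualified Pipes.Binary

-- This would match the "{ 'id' : 'foo', 'image' : '" prefix of the JSON record
-- (the parser body is left unspecified in this outline)
skipPrefix :: Data.Binary.Get ()

skipPrefix' :: Monad m => Producer ByteString m r -> m (Either DecodingError (Producer ByteString m r))
skipPrefix' p = do
    -- decodeGet reports failure in its result and leaves the unconsumed
    -- bytes in the state, so both halves of runStateT are needed
    (e, p') <- runStateT (Pipes.Binary.decodeGet skipPrefix) p
    return (fmap (const p') e)

-- This would match the "' }" suffix of the JSON record
-- (the parser body is left unspecified in this outline)
skipSuffix :: Data.Binary.Get ()

skipSuffix' :: Monad m => Producer ByteString m r -> m (Either DecodingError (Producer ByteString m r))
skipSuffix' p = do
    (e, p') <- runStateT (Pipes.Binary.decodeGet skipSuffix) p
    return (fmap (const p') e)

streamImage
    ::  Monad m
    =>  Producer ByteString m r
    ->  Producer ByteString m (Either SomeException (Producer ByteString m r))
streamImage p0 = do
    -- Part 1: consume the record prefix
    e0 <- lift (skipPrefix' p0)
    case e0 of
        Left exc -> return (Left (toException exc))
        Right p1 -> do
            -- Part 2: stream the decoded image bytes downstream
            e1 <- decodeBase64 p1
            case e1 of
                Left exc -> return (Left exc)
                Right p2 -> do
                    -- Part 3: consume the record suffix
                    e2 <- lift (skipSuffix' p2)
                    case e2 of
                        Left exc -> return (Left (toException exc))
                        Right p3 -> return (Right p3)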


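In other words, streamImage takes as input a Producer that begins at the first character of the JSON record, and streams the decoded image bytes extracted from that record. If decoding succeeds, it returns the remainder of the byte stream immediately following the JSON record.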
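
The outline above discards the id, while the question needs it to choose the destination. A minimal sketch of a prefix parser that returns the id instead is shown below. It is an illustration only and assumes a fixed layout: double-quoted keys, no whitespace, id before image, and no escaped quotes inside the id. The names expect, untilQuote and prefixWithId are hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Monad (unless)
import Data.Binary.Get (Get, getByteString, getWord8, runGetOrFail)
import Data.ByteString (ByteString)
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL

-- Consume exactly the given bytes, failing on any mismatch.
expect :: ByteString -> Get ()
expect s = do
    got <- getByteString (B.length s)
    unless (got == s) (fail ("expected " ++ show s))

-- Read bytes up to the next double quote, consuming the quote as well.
untilQuote :: Get ByteString
untilQuote = go []
  where
    go acc = do
        w <- getWord8
        if w == 34  -- ASCII code of the double quote
            then return (B.pack (reverse acc))
            else go (w : acc)

-- Hypothetical variant of skipPrefix that returns the id it parsed.
-- After it succeeds, the input is positioned at the first image byte.
prefixWithId :: Get ByteString
prefixWithId = do
    expect "{\"id\":\""
    i <- untilQuote
    expect ",\"image\":\""
    return i

-- Run the parser on a complete lazy ByteString (for experimentation).
parsePrefix :: BL.ByteString -> Either String (ByteString, BL.ByteString)
parsePrefix input = case runGetOrFail prefixWithId input of
    Left (_, _, err)    -> Left err
    Right (rest, _, i)  -> Right (i, rest)
```

Because prefixWithId is an ordinary Get parser, it could be passed to Pipes.Binary.decodeGet in the same way as skipPrefix, and the returned id used to open the output file or S3 object before the image bytes are streamed.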
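I assume you want to stream the image directly to disk?

@ErikR, yes, for the purposes of an example that would do. The disk location depends on the id.

I don't think the output of decoded can be streamed directly to disk. Each top-level JSON value is parsed in full before it is handed to you.

@ErikR, yes, the signature of decoded suggests the same. Could a hand-written parser do the job instead of decoded? For example, after manually parsing the id in an ordered JSON array, produce a stream from the bytestring and pass it directly to pipes? I am not sure whether aeson allows that.

You would have to write your own JSON parser. It would be easier to implement a simple line-oriented protocol with two commands: Id… and chunk…. When you see an id… line, it starts a new file. When you see a chunk… line, it means that the (base64-)encoded chunk should be written to the current file. Perhaps an empty chunk line means end of file.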
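
The line-oriented protocol suggested in the last comment could be parsed along these lines. This is a sketch under assumptions the comment does not spell out: the keywords are the lowercase words id and chunk, the argument is separated by spaces, and a chunk line with no payload marks end of file. The names Command and parseLine are hypothetical.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- One parsed protocol line.
data Command
    = NewFile ByteString  -- "id <name>": start a new file
    | Chunk ByteString    -- "chunk <base64>": payload, still base64-encoded
    | EndOfFile           -- "chunk" with an empty payload
    deriving (Eq, Show)

-- Split a line into keyword and argument; Nothing for unrecognised lines.
parseLine :: ByteString -> Maybe Command
parseLine line =
    let (cmd, rest) = BC.break (== ' ') line
        arg         = BC.dropWhile (== ' ') rest
    in  case BC.unpack cmd of
            "id" | not (BC.null arg) -> Just (NewFile arg)
            "chunk"
                | BC.null arg -> Just EndOfFile
                | otherwise   -> Just (Chunk arg)
            _ -> Nothing
```

Since each chunk line is decoded independently, the sender would need to split the image so that every chunk is valid base64 on its own, which removes the need for the carry-over logic a streaming JSON string decoder requires.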