Akka Streams: grouping only the Right elements of an Either (Scala)

Tags: scala, grouping, akka-stream, either, scala-2.11

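I have a source that emits Either[String, MyClass]. I want to call an external service with batches of MyClass and then continue downstream with Either[String, ExternalServiceResponse], which is why I need to group the elements of the stream. If the stream emitted only MyClass elements, this would be easy: just call grouped: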

val source: Source[MyClass, NotUsed] = <custom implementation>
source
  .grouped(10)                 // Seq[MyClass]
  .map(callExternalService(_)) // ExternalServiceResponse

However, in my scenario, how can I group only the Right elements of the Either?

val source: Source[Either[String, MyClass], NotUsed] = <custom implementation>
source
  .???                                                      // Either[String, Seq[MyClass]]
  .map {
    case Right(myClasses) => callExternalService(myClasses)
    case Left(string) => Left(string)
  }                                                         // Either[String, ExternalServiceResponse]

The following works, but is there a more idiomatic way?

val source: Source[Either[String, MyClass], NotUsed] = <custom implementation>
source
  .groupBy(2, either => either.isRight)
  .grouped(10)
  .map(input => input.headOption match {
    case Some(Right(_)) =>
      callExternalService(input.map(item => item.right.get))
    case _ =>
      input
  })
  .mapConcat(_.to[scala.collection.immutable.Iterable])
  .mergeSubstreams

This should turn a source of Either[L, R] into a source of Either[L, Seq[R]], with a configurable group size for the Rights:
import akka.NotUsed
import akka.stream.scaladsl.Source

def groupRights[L, R](groupSize: Int)(in: Source[Either[L, R], NotUsed]): Source[Either[L, Seq[R]], NotUsed] =
  in.map(Option(_))  // Yep, an Option[Either[L, R]]
    .concat(Source.single(None)) // to emit when `in` completes
    .statefulMapConcat { () =>
      // One buffer per materialization, holding the Rights seen so far
      val buffer = new scala.collection.mutable.ArrayBuffer[R](groupSize)

      def dumpBuffer(): List[Either[L, Seq[R]]] = {
        val out = List(Right(buffer.toList))
        buffer.clear()
        out
      }

      incoming: Option[Either[L,R]] => {
        incoming.map { _.fold(
            l => List(Left(l)),  // unfortunate that we have to re-wrap
            r => {
              buffer += r
              if (buffer.size == groupSize) {
                dumpBuffer()
              } else {
                Nil
              }
            }
          )
        }.getOrElse(if (buffer.nonEmpty) dumpBuffer() else Nil) // End of stream: flush any remainder
      }
    }
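
To make the emitted sequence concrete, here is a small pure-collections model of the same buffering idea, with no Akka involved. It is only a sketch: the object name GroupRightsModel and the sample input are made up for illustration, and this model flushes a remainder only when it is non-empty.

```scala
object GroupRightsModel {
  // Lefts pass through at once; Rights accumulate until `groupSize` is reached.
  def groupRights[L, R](groupSize: Int)(in: List[Either[L, R]]): List[Either[L, Seq[R]]] = {
    val (out, pending) =
      in.foldLeft((Vector.empty[Either[L, Seq[R]]], Vector.empty[R])) {
        case ((acc, buf), Left(l)) =>
          (acc :+ Left(l), buf)
        case ((acc, buf), Right(r)) =>
          val next = buf :+ r
          if (next.size == groupSize) (acc :+ Right(next), Vector.empty[R])
          else (acc, next)
      }
    // End of input: emit whatever is still buffered, if anything.
    (if (pending.isEmpty) out else out :+ Right(pending)).toList
  }

  def main(args: Array[String]): Unit = {
    val in: List[Either[String, Int]] = List(Right(1), Left("a"), Right(2), Right(3))
    println(groupRights(2)(in))
    // prints: List(Left(a), Right(Vector(1, 2)), Right(Vector(3)))
  }
}
```

Note that Left(a) comes out before the group containing 1, even though 1 arrived first: Lefts pass straight through while Rights wait in the buffer.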

Beyond that, I would also note that the downstream code that calls the external service can be rewritten as
.map(_.right.map(callExternalService))
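
A minimal illustration of what the right projection does here, using a plain List rather than a stream. callExternalService below is a hypothetical stand-in that just sums the batch. The .right projection is the Scala 2.11-era API; from 2.12 on, Either is right-biased, so a plain .map behaves the same way.

```scala
object RightMapSketch {
  // Hypothetical stand-in for the real service call.
  def callExternalService(batch: Seq[Int]): Int = batch.sum

  def process(groups: List[Either[String, Seq[Int]]]): List[Either[String, Int]] =
    groups.map(_.right.map(callExternalService)) // Lefts are left untouched

  def main(args: Array[String]): Unit = {
    println(process(List(Left("a"), Right(Seq(1, 2)))))
    // prints: List(Left(a), Right(3))
  }
}
```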

If the external service can be called reliably with parallelism n, it may also be worth using:
.mapAsync(n) { _.fold(
    l => Future.successful(Left(l)),
    r => Future { Right(callExternalService(r)) }
  )
}
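
For reference, a self-contained sketch of that mapAsync step. It assumes an Akka 2.5-style setup with an explicit ActorMaterializer (deprecated but still available on 2.6); the object name, the sample input, and the summing callExternalService are all made up for illustration.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object MapAsyncEitherSketch {
  // Hypothetical stand-in for the real service call.
  def callExternalService(batch: Seq[Int]): Int = batch.sum

  def run(in: List[Either[String, Seq[Int]]], parallelism: Int): Seq[Either[String, Int]] = {
    implicit val system: ActorSystem = ActorSystem("sketch")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    import system.dispatcher // ExecutionContext for the Futures below

    try {
      val done = Source(in)
        .mapAsync[Either[String, Int]](parallelism) {
          case Left(l)      => Future.successful(Left(l))                 // nothing to call
          case Right(batch) => Future(Right(callExternalService(batch))) // runs asynchronously
        }
        .runWith(Sink.seq)
      Await.result(done, 5.seconds)
    } finally {
      system.terminate()
    }
  }

  def main(args: Array[String]): Unit =
    println(run(List(Right(Seq(1, 2)), Left("oops"), Right(Seq(3))), parallelism = 2))
}
```

mapAsync emits results in the order of the upstream elements, so the Left stays between the two Rights regardless of which Future completes first.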

If you want to maximize throughput at the cost of preserving order, you could even replace mapAsync with mapAsyncUnordered.

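Alternatively, you can split the source of Eithers into two branches so that the Rights are processed in their own way, and then merge the two sub-streams back together: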
// case class MyClass(x: Int)
// case class ExternalServiceResponse(xs: Seq[MyClass])
// def callExternalService(xs: Seq[MyClass]): ExternalServiceResponse =
//    ExternalServiceResponse(xs)
// val source: Source[Either[String, MyClass], _] =
//   Source(List(Right(MyClass(1)), Left("2"), Right(MyClass(3)), Left("4"), Right(MyClass(5))))

val lefts: Source[Either[String, Nothing], _] =
  source
    .collect { case Left(l) => Left(l) }

val rights: Source[Either[Nothing, ExternalServiceResponse], _] =
  source
    .collect { case Right(x: MyClass) => x }
    .grouped(2)
    .map(callExternalService)
    .map(Right(_))

val out: Source[Either[String, ExternalServiceResponse], _] = rights.merge(lefts)

// out.runForeach(println)
// Left(2)
// Right(ExternalServiceResponse(Vector(MyClass(1), MyClass(3))))
// Left(4)
// Right(ExternalServiceResponse(Vector(MyClass(5))))
