Scala Akka Streams: grouping only the Rights of an Either
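I have a source that emits Either[String, MyClass]. I want to call an external service with batches of MyClass and continue downstream with Either[String, ExternalServiceResponse], which is why I need to group the elements of the stream.
If the stream emitted only MyClass elements, this would be easy: just call grouped: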
val source: Source[MyClass, NotUsed] = <custom implementation>
source
  .grouped(10)                 // Seq[MyClass]
  .map(callExternalService(_)) // ExternalServiceResponse
In my scenario, however, how do I group only the Right elements of the Either?
val source: Source[Either[String, MyClass], NotUsed] = <custom implementation>
source
  .??? // Either[String, Seq[MyClass]]
  .map {
    case Right(myClasses) => callExternalService(myClasses)
    case Left(string)     => Left(string)
  } // Either[String, ExternalServiceResponse]
The following works, but is there a more idiomatic way?
val source: Source[Either[String, MyClass], NotUsed] = <custom implementation>
source
  .groupBy(2, either => either.isRight)
  .grouped(10)
  .map(input => input.headOption match {
    case Some(Right(_)) =>
      callExternalService(input.map(item => item.right.get))
    case _ =>
      input
  })
  .mapConcat(_.to[scala.collection.immutable.Iterable])
  .mergeSubstreams
This should turn a Source of Either[L, R] into a Source of Either[L, Seq[R]], with a configurable group size for the Rights.
def groupRights[L, R](groupSize: Int)(in: Source[Either[L, R], NotUsed]): Source[Either[L, Seq[R]], NotUsed] =
  in.map(Option(_)) // Yep, an Option[Either[L, R]]
    .concat(Source.single(None)) // to emit when `in` completes
    .statefulMapConcat { () =>
      val buffer = new scala.collection.mutable.ArrayBuffer[R](groupSize)

      // Emit the buffered Rights as one batch and reset the buffer
      def dumpBuffer(): List[Either[L, Seq[R]]] = {
        val out = List(Right(buffer.toList))
        buffer.clear()
        out
      }

      incoming: Option[Either[L, R]] => {
        incoming.map { _.fold(
          l => List(Left(l)), // unfortunate that we have to re-wrap
          r => {
            buffer += r
            if (buffer.size == groupSize) {
              dumpBuffer()
            } else {
              Nil
            }
          }
        )
        }.getOrElse(if (buffer.isEmpty) Nil else dumpBuffer()) // End of stream: flush a partial batch, skip an empty one
      }
    }
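To make the batching rule concrete without running a stream, here is a minimal plain-collection sketch of what the stage is meant to emit. It is only a model, not Akka code: `GroupRightsModel` and its `List`-based signature are names made up for illustration, and it assumes an empty trailing batch is dropped rather than emitted.

```scala
object GroupRightsModel {
  // Plain-collection model of the batching rule: Lefts are emitted as soon as
  // they arrive, Rights are buffered and emitted in batches of `groupSize`.
  def groupRights[L, R](groupSize: Int)(in: List[Either[L, R]]): List[Either[L, Seq[R]]] = {
    require(groupSize > 0, "groupSize must be positive")

    val start = (Vector.empty[Either[L, Seq[R]]], Vector.empty[R])
    val (emitted, leftover) = in.foldLeft(start) {
      case ((out, buffer), Left(l)) =>
        // A Left passes straight through; the buffer is untouched.
        (out :+ Left[L, Seq[R]](l), buffer)
      case ((out, buffer), Right(r)) =>
        val filled = buffer :+ r
        if (filled.size == groupSize) (out :+ Right[L, Seq[R]](filled), Vector.empty[R])
        else (out, filled)
    }

    // End of input: flush a partial batch, but never an empty one (assumption).
    val flushed = if (leftover.isEmpty) emitted else emitted :+ Right[L, Seq[R]](leftover)
    flushed.toList
  }
}
```

For example, `GroupRightsModel.groupRights[String, Int](2)(List(Right(1), Left("a"), Right(2), Right(3), Left("b")))` yields `List(Left(a), Right(Vector(1, 2)), Left(b), Right(Vector(3)))`. Note that a Left overtakes any Rights that are still waiting in the buffer, so the relative order of Lefts and Rights is not preserved exactly.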
Beyond that, I'd note that the downstream code that calls the external service can be rewritten as
.map(_.right.map(callExternalService))
If you can reliably call the external service with parallelism n, it is also worth using:
.mapAsync(n) { _.fold(
    l => Future.successful(Left(l)),
    r => Future { Right(callExternalService(r)) } // needs an implicit ExecutionContext in scope
  )
}
If you want to maximize throughput at the cost of preserving order, you can even replace mapAsync with mapAsyncUnordered.
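As a rough end-to-end illustration of the mapAsync variant, the sketch below runs already-batched Eithers through a stubbed service. Several things here are assumptions rather than part of the answer: `callExternalService` is a stand-in that just sums a batch, the element types are `String` and `Int`, and the setup uses the Akka 2.5-style `ActorMaterializer` (deprecated but still available in 2.6).

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object MapAsyncEitherExample {
  // Hypothetical stand-in for the real external service call.
  def callExternalService(batch: Seq[Int]): Int = batch.sum

  def run(): Seq[Either[String, Int]] = {
    implicit val system: ActorSystem = ActorSystem("map-async-either")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    // Future { ... } needs an ExecutionContext; the system dispatcher is used here.
    implicit val ec: ExecutionContext = system.dispatcher

    val batches =
      Source(List[Either[String, Seq[Int]]](Right(Seq(1, 2)), Left("oops"), Right(Seq(3))))

    try {
      val done = batches
        .mapAsync(parallelism = 4) {
          // Lefts need no work, so they are wrapped in an already-completed Future.
          case Left(l)      => Future.successful[Either[String, Int]](Left(l))
          // Rights are sent to the (stubbed) service asynchronously.
          case Right(batch) => Future[Either[String, Int]](Right(callExternalService(batch)))
        }
        .runWith(Sink.seq)
      Await.result(done, 5.seconds)
    } finally system.terminate()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Because mapAsync emits results in upstream order, the Left stays between the two Rights here; with mapAsyncUnordered the results would be emitted in completion order instead.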
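You can split the source of Eithers into two branches, process the Rights in their own way, and then merge the two sub-streams back together: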
// case class MyClass(x: Int)
// case class ExternalServiceResponse(xs: Seq[MyClass])
// def callExternalService(xs: Seq[MyClass]): ExternalServiceResponse =
//   ExternalServiceResponse(xs)
// val source: Source[Either[String, MyClass], _] =
//   Source(List(Right(MyClass(1)), Left("2"), Right(MyClass(3)), Left("4"), Right(MyClass(5))))

val lefts: Source[Either[String, Nothing], _] =
  source
    .collect { case Left(l) => Left(l) }

val rights: Source[Either[Nothing, ExternalServiceResponse], _] =
  source
    .collect { case Right(x: MyClass) => x }
    .grouped(2)
    .map(callExternalService)
    .map(Right(_))

val out: Source[Either[String, ExternalServiceResponse], _] = rights.merge(lefts)

// out.runForeach(println)
// Left(2)
// Right(ExternalServiceResponse(Vector(MyClass(1), MyClass(3))))
// Left(4)
// Right(ExternalServiceResponse(Vector(MyClass(5))))
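One thing to keep in mind with the snippet above: `source` is used in both branches, so it is materialized twice when `out` runs, and `merge` gives no guarantee about how Lefts and Rights interleave. If the upstream must run only once, a possible variant is to fan out with `Broadcast` inside a `GraphDSL` graph. The sketch below is untested against any particular Akka version and makes assumptions: `SplitAndMergeOnce`, `batchRights` and `demo` are illustrative names, the service is replaced by a function argument, and the setup uses the Akka 2.5-style `ActorMaterializer`.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, SourceShape}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object SplitAndMergeOnce {
  // Broadcasts each element to two branches, batches only the Rights,
  // and merges both branches back into one stream.
  def batchRights[L, A, B](source: Source[Either[L, A], NotUsed], batchSize: Int)(
      call: Seq[A] => B): Source[Either[L, B], NotUsed] =
    Source.fromGraph(GraphDSL.create() { implicit builder =>
      import GraphDSL.Implicits._

      val bcast = builder.add(Broadcast[Either[L, A]](2))
      val merge = builder.add(Merge[Either[L, B]](2))

      // Branch 1: keep the Lefts untouched.
      val lefts = Flow[Either[L, A]].collect { case Left(l) => Left[L, B](l) }
      // Branch 2: unwrap the Rights, batch them, call the service, re-wrap.
      val rights = Flow[Either[L, A]]
        .collect { case Right(a) => a }
        .grouped(batchSize)
        .map(batch => Right[L, B](call(batch)))

      source ~> bcast
      bcast ~> lefts ~> merge
      bcast ~> rights ~> merge
      SourceShape(merge.out)
    })

  // Small demo with Int payloads and a stub service that sums each batch.
  def demo(): Seq[Either[String, Int]] = {
    implicit val system: ActorSystem = ActorSystem("split-and-merge")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      val source =
        Source(List[Either[String, Int]](Right(1), Left("2"), Right(3), Left("4"), Right(5)))
      val batched = batchRights[String, Int, Int](source, batchSize = 2)(_.sum)
      Await.result(batched.runWith(Sink.seq), 5.seconds)
    } finally system.terminate()
  }
}
```

Order is preserved within each branch (Lefts among themselves, batches among themselves), but the interleaving of the two branches after the merge is still not deterministic.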