Akka: How do I group the items of a sorted stream with substreams?


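Can you explain how to use the new groupBy in akka-streams? It seems rather useless. groupBy used to return (T, Source), but it no longer does. Here is my example (I modeled it on one from the docs):

It just hangs. Perhaps it hangs because the number of substreams is lower than the number of unique keys. But what should I do if I have an infinite stream? I want to group until the key changes.

In my real stream, the data is always sorted by the value I am grouping on. Maybe I don't need groupBy at all?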

If your stream data is always sorted, you can take advantage of that and group it like this:

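import akka.stream.scaladsl.{Sink, Source}

// running the stream needs an implicit Materializer in scope
// (on Akka 2.6+, an implicit ActorSystem is enough)
val source = Source(List(
  1 -> "1a", 1 -> "1b", 1 -> "1c",
  2 -> "2a", 2 -> "2b",
  3 -> "3a", 3 -> "3b", 3 -> "3c",
  4 -> "4a",
  5 -> "5a", 5 -> "5b", 5 -> "5c",
  6 -> "6a", 6 -> "6b",
  7 -> "7a",
  8 -> "8a", 8 -> "8b",
  9 -> "9a", 9 -> "9b"
))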

source
  // wrap each element in Some and append a None sentinel, so that the last
  // real element also shows up as the head of a pair and is not dropped
  .map(Option(_))
  .concat(Source.single(Option.empty[(Int, String)]))
  // group elements into overlapping pairs
  .sliding(2, 1)
  // when the two keys in a pair differ, close the current substream after it
  .splitAfter(pair => (pair.headOption, pair.lastOption) match {
    case (Some(Some((key1, _))), Some(Some((key2, _)))) => key1 != key2
    case _ => false // the pair contains the sentinel: nothing to compare
  })
  // keep only the first element of each pair
  // to reconstruct the original stream, now split by sorted key
  .mapConcat(_.head.toList)
  // fold each substream into a single element
  .fold(0 -> List.empty[String]) {
    case ((_, values), (key, value)) => key -> (value +: values)
  }
  // merge the substreams back and print the results
  .mergeSubstreams
  .runWith(Sink.foreach(println))
In the end you get the following result:

(1,List(1c, 1b, 1a))
(2,List(2b, 2a))
(3,List(3c, 3b, 3a))
(4,List(4a))
(5,List(5c, 5b, 5a))
(6,List(6b, 6a))
(7,List(7a))
(8,List(8b, 8a))
(9,List(9b, 9a))

Unlike groupBy, however, you are not limited by the number of distinct keys.
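For contrast, here is a minimal sketch of the same grouping done with groupBy on a small finite source. It assumes Akka 2.6+, where an implicit ActorSystem provides the materializer. The object name GroupByContrast, the value maxSubstreams = 8 and the sample data are hypothetical and chosen only for illustration. groupBy keeps one substream open per distinct key until upstream completes, so the bound has to cover every key, and the per-key fold only emits once the whole source has finished. This is why it is a poor fit for an infinite stream.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object GroupByContrast {
  def run(): Seq[(Int, List[String])] = {
    // assumption: Akka 2.6+, the implicit ActorSystem doubles as materializer
    implicit val system: ActorSystem = ActorSystem("group-by-contrast")

    val source = Source(List(1 -> "1a", 1 -> "1b", 2 -> "2a", 3 -> "3a", 3 -> "3b"))

    // hypothetical bound: must be at least the number of distinct keys
    val maxSubstreams = 8

    val grouped = source
      .groupBy(maxSubstreams, _._1)
      // each substream stays open until upstream completes,
      // so this fold emits only at the very end
      .fold(0 -> List.empty[String]) {
        case ((_, values), (key, value)) => key -> (value +: values)
      }
      .mergeSubstreams
      .runWith(Sink.seq)

    // all substreams finish together, so sort to get a stable order
    try Await.result(grouped, 5.seconds).sortBy(_._1)
    finally system.terminate()
  }

  def main(args: Array[String]): Unit = run().foreach(println)
}
```

The splitAfter version above closes each substream as soon as the key changes, so only one group is open at a time.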

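You can also implement it with statefulMapConcat, which is slightly cheaper because it does not materialize any substreams (but you have to live with the shame of using vars):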
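A minimal sketch of that approach, assuming Akka 2.6+ (an implicit ActorSystem acting as materializer) and input already sorted by key. It is an illustration and not the answer's original code. The names StatefulGrouping, groupSorted and run are hypothetical. A None sentinel is appended so that the final group is flushed when the stream ends.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object StatefulGrouping {
  // Groups consecutive elements that share a key (input assumed sorted by key).
  def groupSorted[K, V](source: Source[(K, V), NotUsed]): Source[(K, List[V]), NotUsed] =
    source
      .map(Option(_))
      // the None sentinel marks the end of the stream so the last group is emitted
      .concat(Source.single(Option.empty[(K, V)]))
      .statefulMapConcat[(K, List[V])] { () =>
        // mutable state, created once per materialization
        var currentKey: Option[K] = None
        var acc: List[V] = Nil

        {
          // same key as before (or first element): keep accumulating
          case Some((key, value)) if currentKey.forall(_ == key) =>
            currentKey = Some(key)
            acc = value :: acc
            Nil
          // key changed: emit the finished group and start a new one
          case Some((key, value)) =>
            val finished = currentKey.map(_ -> acc.reverse).toList
            currentKey = Some(key)
            acc = value :: Nil
            finished
          // sentinel: flush whatever is left
          case None =>
            currentKey.map(_ -> acc.reverse).toList
        }
      }

  def run(): Seq[(Int, List[String])] = {
    implicit val system: ActorSystem = ActorSystem("stateful-grouping")
    val source = Source(List(1 -> "1a", 1 -> "1b", 2 -> "2a", 3 -> "3a", 3 -> "3b"))
    try Await.result(groupSorted(source).runWith(Sink.seq), 5.seconds)
    finally system.terminate()
  }

  def main(args: Array[String]): Unit = run().foreach(println)
}
```

Unlike the fold-based version, this sketch reverses the accumulator before emitting, so values within a group keep their arrival order.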


I ended up implementing a custom stage:

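import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import scala.collection.mutable.ListBuffer

class GroupAfterKeyChangeStage[K, T](keyForItem: T => K, maxBufferSize: Int) extends GraphStage[FlowShape[T, List[T]]] {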

  private val in = Inlet[T]("GroupAfterKeyChangeStage.in")
  private val out = Outlet[List[T]]("GroupAfterKeyChangeStage.out")

  override val shape: FlowShape[T, List[T]] =
    FlowShape(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) with InHandler with OutHandler {

    private val buffer = new ListBuffer[T]
    private var currentKey: Option[K] = None

    // InHandler
    override def onPush(): Unit = {
      val nextItem = grab(in)
      val nextItemKey = keyForItem(nextItem)

      if (currentKey.forall(_ == nextItemKey)) {
        if (currentKey.isEmpty)
          currentKey = Some(nextItemKey)

        if (buffer.size == maxBufferSize)
          failStage(new RuntimeException(s"Maximum buffer size is exceeded on key $nextItemKey"))
        else {
          buffer += nextItem
          pull(in)
        }
      } else {
        val result = buffer.result()
        buffer.clear()
        buffer += nextItem
        currentKey = Some(nextItemKey)
        push(out, result)
      }
    }

    // OutHandler
    // emit in onUpstreamFinish already handles the not-yet-pulled case,
    // so there is no need to fail the stage here
    override def onPull(): Unit =
      pull(in)

    // InHandler
    override def onUpstreamFinish(): Unit = {
      val result = buffer.result()
      // flush the last group; emit waits for downstream demand if needed
      if (result.nonEmpty)
        emit(out, result)
      completeStage()
    }

    override def postStop(): Unit = {
      buffer.clear()
    }

    setHandlers(in, out, this)
  }
}

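If you don't want to copy-paste it, I've added it to a library that I maintain. To use it, add

Resolver.bintrayRepo("cppexpert", "maven")
to your resolvers, and add the following to your dependencies:

"com.walkmind" %% "scala-tricks" % "2.15"
It is implemented as a flow in com.walkmind.akkastream.FlowExt: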

def groupSortedByKey[K, T](keyForItem: T ⇒ K, maxBufferSize: Int): Flow[T, List[T], NotUsed]
For my example, that would be:

source
  .via(FlowExt.groupSortedByKey(_._1, 128))
A year later, there is a class that does exactly this:

libraryDependencies += "com.typesafe.akka" %% "akka-stream-contrib" % "0.9"
and:

import akka.stream.contrib.AccumulateWhileUnchanged
source.via(new AccumulateWhileUnchanged(_._1))

Nice idea! Yesterday I also implemented it using splitWhen, but I had to use a var holding the last ID.

@shutty The last item is lost.

Unfortunately, the last group of items is lost.

emit already handles the not-yet-pulled case by switching to a new behavior, so there is no need to fail the stage for that.

Awesome. Exactly what I needed, in one line. Thanks.