Scala: How to merge adjacent similar entries in a sorted 'Stream' or 'List'

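Given a

  • large (>1,000,000 entries; do not expect it to fit in memory),
  • sorted (with respect to the first value of the tuple),

Stream-like collection: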

val ss = List( (1, "2.5"), (1, "5.0"), (2, "3.0"), (2, "4.0"), (2, "6.0"), (3, "1.0")).toStream
// just for demo
val xs = List( (1, "2.5"), (1, "5.0"), (2, "3.0"), (2, "4.0"), (2, "6.0"), (3, "1.0"))
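I would like to join adjacent entries so that the transformed output becomes

List( (1, "2.5 5.0"), (2, "3.0 4.0 6.0"), (3, "1.0") )
The second tuple value is to be merged by some monoid function (here, string concatenation).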
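
To make "merged by some monoid function" concrete, here is a minimal sketch of string concatenation as a monoid. The names ConcatMonoid, empty, combine and mergeRun are made up for this illustration and do not come from the question.

```scala
object ConcatMonoid {
  // Identity element: combining it with any value leaves that value unchanged.
  val empty: String = ""

  // Associative combine: joins two non-empty values with a single space.
  def combine(a: String, b: String): String =
    if (a.isEmpty) b else if (b.isEmpty) a else a + " " + b

  // Merging one run of values that share a key is a fold with the monoid.
  def mergeRun(values: Seq[String]): String =
    values.foldLeft(empty)(combine)
}
```

For the run with key 2 in the example, ConcatMonoid.mergeRun(List("3.0", "4.0", "6.0")) yields "3.0 4.0 6.0".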

Ideas / attempts

groupBy

groupBy does not seem to be a viable option, because the entries are collected into a Map in memory.
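
A small sketch of what the groupBy route looks like on the in-memory demo list, to show where the memory concern comes from. The names GroupByAttempt and mergeWithGroupBy are hypothetical.

```scala
object GroupByAttempt {
  def mergeWithGroupBy(entries: Seq[(Int, String)]): Map[Int, String] =
    entries
      .groupBy(_._1) // builds a Map holding *all* entries before anything is merged
      .map { case (key, group) => key -> group.map(_._2).mkString(" ") }
}
```

The result is correct for a small list, but the intermediate Map is fully materialized and the sorted order of the input is not carried over to the Map.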

scanLeft

The result is

List(Joiner(0,a), Joiner(1,2.5), Joiner(1,2.5 5.0), Joiner(2,3.0))
(please ignore the Joiner wrapper)

But I did not find a way to eliminate the "incomplete" entries.
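
The scanLeft code itself is not shown in this thread. The following is only a guess at what such an attempt might look like, written so that it reproduces the result listed above; Joiner's fields and the helper name running are assumptions.

```scala
object ScanLeftAttempt {
  // Hypothetical stand-in for the question's wrapper class
  final case class Joiner(key: Int, value: String)

  // Emits one Joiner per input entry, plus the seed Joiner(0, "a")
  def running(entries: List[(Int, String)]): List[Joiner] =
    entries.scanLeft(Joiner(0, "a")) {
      // same key as the running entry: extend it
      case (Joiner(k, v), (key, value)) if k == key => Joiner(k, v + " " + value)
      // key switched: start over with the new entry
      case (_, (key, value)) => Joiner(key, value)
    }
}
```

Because scanLeft emits every intermediate state, a partial entry such as Joiner(1,2.5) shows up next to the finished Joiner(1,2.5 5.0), and nothing in the output marks which of the two is complete.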

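Emit true to mark the initial element (when the value switches) rather than the last one; that's easy, right? Then you can collect the entries that are followed by an initial entry. Maybe something like this: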

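   ss.scanLeft((0, "", true)) {
     // very first element, or same key as the running entry: keep concatenating
     // (the first value is taken as is, so the result has no leading space)
     case ((a, str, _), (b, c)) if str == "" || a == b =>
       (b, if (str.isEmpty) c else str + " " + c, false)
     // key switched: start a new entry and flag it as initial
     case (_, (b, c)) => (b, c, true)
   }.:+((0, "", true))
     .sliding(2)
     // keep an entry only if the entry right after it is flagged as initial
     .collect { case Seq(a, (_, _, true)) => (a._1, a._2) }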
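(Note the :+ thingy: it appends a "dummy" entry to the end of the stream, so that the last "real" element is also followed by a "true" entry and does not get filtered out.)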
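
To see what the dummy entry buys, here is a sketch that splits the pipeline into two hypothetical helpers, scan and keepCompleted, so that it can be run with and without the appended entry. The scan step follows the answer, except that the first value of a run is taken as is to avoid a leading space.

```scala
object SentinelDemo {
  type Flagged = (Int, String, Boolean)

  // Running concatenation; the flag is true on the first entry of each new key.
  def scan(ss: Stream[(Int, String)]): Stream[Flagged] =
    ss.scanLeft((0, "", true)) {
      case ((a, str, _), (b, c)) if str == "" || a == b =>
        (b, if (str.isEmpty) c else str + " " + c, false)
      case (_, (b, c)) => (b, c, true)
    }

  // Keeps an entry only when the entry right after it is flagged as initial.
  def keepCompleted(flagged: Stream[Flagged]): List[(Int, String)] =
    flagged.sliding(2).collect { case Seq(a, (_, _, true)) => (a._1, a._2) }.toList
}
```

With the demo stream ss, keepCompleted(scan(ss)) returns only the entries for keys 1 and 2, because nothing follows the last run. keepCompleted(scan(ss) :+ ((0, "", true))) also returns (3, "1.0").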

This seems to work:

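import scala.annotation.tailrec

def makeEm(s: Stream[(Int, String)]) = {

  import Stream._

  // curr = (key of the current run, values collected so far, newest first)
  // acc  = merged entries emitted so far
  // Note: curr._1 stays at the seed key 0 until a second element of a run
  // or a key switch updates it.
  @tailrec
  def z(source: Stream[(Int, String)], curr: (Int, List[String]), acc: Stream[(Int, String)]): Stream[(Int, String)] = source match {
    // only reached when the input itself is empty
    case Empty =>
      Empty
    // last element: close the current run and return everything
    case x #:: Empty =>
      acc :+ (curr._1 -> (x._2 :: curr._2).mkString(","))
    // the next key differs: close the current run and start a new one for y
    case x #:: y #:: etc if x._1 != y._1 =>
      val c = curr._1 -> (x._2 :: curr._2).mkString(",")
      z(y #:: etc, (y._1, List[String]()), acc :+ c)
    // same key as the next element: prepend the value (a run ends up in reverse input order)
    case x #:: etc =>
      z(etc, (x._1, x._2 :: curr._2), acc)
  }

  // seed: key 0 and no values
  z(s, (0, List()), Stream())
}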
Test:

val ss = List( (1, "2.5"), (1, "5.0"), (2, "3.0"), (2, "4.0"), (2, "6.0"), (3, "1.0")).toStream
makeEm(ss).toList.mkString(",")

val s = List().toStream
makeEm(s).toList.mkString(",")

val ss2 = List( (1, "2.5"), (1, "5.0")).toStream
makeEm(ss2).toList.mkString(",")

val s3 = List((1, "2.5"),(2, "4.0"),(3, "1.0")).toStream
makeEm(s3).toList.mkString(",")
Output:

ss: scala.collection.immutable.Stream[(Int, String)] = Stream((1,2.5), ?)
res0: String = (1,5.0,2.5),(2,6.0,4.0,3.0),(3,1.0)

s: scala.collection.immutable.Stream[Nothing] = Stream()
res1: String = 

ss2: scala.collection.immutable.Stream[(Int, String)] = Stream((1,2.5), ?)
res2: String = (1,5.0,2.5)

s3: scala.collection.immutable.Stream[(Int, String)] = Stream((1,2.5), ?)
res3: String = (0,2.5),(2,4.0),(3,1.0)
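
Two things stand out in the output above: res3 starts with (0,2.5) although the first key is 1, and the values within each group appear in reverse input order. The following is a sketch of an alternative, not the answerer's code. It assumes the data can be consumed once through an Iterator, and the names MergeAdjacent, mergeAdjacent and combine are made up here. It takes the key from the first element of each run and holds only one run's accumulator at a time.

```scala
object MergeAdjacent {
  /** Lazily merges runs of adjacent entries that share a key.
    * `combine` is assumed to be associative (the monoid operation from the question).
    */
  def mergeAdjacent[K, V](entries: Iterator[(K, V)])(combine: (V, V) => V): Iterator[(K, V)] =
    new Iterator[(K, V)] {
      // buffered lets us peek at the next key without consuming the element
      private val in = entries.buffered

      def hasNext: Boolean = in.hasNext

      def next(): (K, V) = {
        val (key, first) = in.next()
        var merged = first
        // consume the rest of the run, combining values in input order
        while (in.hasNext && in.head._1 == key) merged = combine(merged, in.next()._2)
        (key, merged)
      }
    }
}
```

Usage on the demo data would be MergeAdjacent.mergeAdjacent(ss.iterator)(_ + " " + _), which yields (1, "2.5 5.0"), (2, "3.0 4.0 6.0") and (3, "1.0") one entry at a time. Whether the source really fits the single-pass Iterator model depends on where the stream comes from.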

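Regarding the second approach: I want List(Joiner(1,2.5 5.0), Joiner(2,3.0)). The entry Joiner(1,2.5) is what I call incomplete, and Joiner(0,a) is just the starting point.

Consider returning a tuple, e.g. (Joiner, Boolean), where the second element indicates whether this is the "final" entry. Then .collect { case (j, true) => j }

@Dima: Nice try (I thought of this too, though I would rather use a second case class than a flag). Still, this approach fails because I did not find a way to tell whether an entry is complete. Could you please sketch the code....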