How to do a groupBy ranking and add it as a column to an existing dataframe in Spark Scala? (scala, apache-spark)


Currently what I'm doing is:

  import org.apache.spark.sql.expressions.Window
  import org.apache.spark.sql.functions.dense_rank

  val new_df = old_df.groupBy("column1").count().withColumnRenamed("count", "column1_count")

  val new_df_rankings = new_df.withColumn(
    "column1_count_rank",
    dense_rank()
      .over(
        Window.orderBy($"column1_count".desc)))
    .select("column1_count", "column1_count_rank")

But really, all I want to do is add a single column called "column1_count_rank" to the original dataframe (old_df), without going through all of these intermediate steps and merging back.

Is there a way to do this?


Thanks, and have a great day!
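One way to avoid the intermediate dataframe entirely is to compute the group count as a window aggregate directly on old_df, then rank that count. This is a sketch, not the answerer's code: the sample data and session setup are hypothetical stand-ins, and it assumes Spark 2.x+ with an `old_df` that has a `column1` column.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{count, dense_rank}

val spark = SparkSession.builder().master("local[*]").appName("rank-sketch").getOrCreate()
import spark.implicits._

// hypothetical sample data standing in for old_df
val old_df = Seq("a", "a", "a", "b", "b", "c").toDF("column1")

// per-group count as a window aggregate: no separate groupBy dataframe, no join back
val counted = old_df.withColumn(
  "column1_count",
  count("*").over(Window.partitionBy($"column1")))

// rank the counts; note an unpartitioned orderBy window pulls all rows into a
// single partition, which is fine for a sketch but a bottleneck on large data
val result = counted.withColumn(
  "column1_count_rank",
  dense_rank().over(Window.orderBy($"column1_count".desc)))

result.show()
```

Every original row is preserved, with both `column1_count` and `column1_count_rank` attached as new columns.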

When you apply an aggregation, the computed result creates a new dataframe. Could you give some sample input and expected output?

  old_df.groupBy("column1")
    .agg(count("*").alias("column1_count"))
    .withColumn("column1_count_rank", dense_rank().over(Window.orderBy($"column1_count".desc)))
    .select("column1_count", "column1_count_rank")
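If you do take the aggregate route, attaching the rank back onto the original rows is a plain equi-join on the grouping key. A minimal sketch, assuming hypothetical sample data; note the aggregate here keeps `column1` in scope so there is a key to join on:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{count, dense_rank}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// hypothetical stand-in for old_df
val old_df = Seq("a", "a", "b").toDF("column1")

// aggregate and rank, as in the answer, but retaining column1 as the join key
val ranked = old_df
  .groupBy("column1")
  .agg(count("*").alias("column1_count"))
  .withColumn("column1_count_rank",
    dense_rank().over(Window.orderBy($"column1_count".desc)))

// merge the rank back onto every original row
val merged = old_df.join(ranked.select("column1", "column1_count_rank"), Seq("column1"))
merged.show()
```

Since the join key is the groupBy key, each original row matches exactly one ranked row, so the row count of `merged` equals that of `old_df`.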