Apache Spark: How to apply multiple columns in Window.partitionBy in Spark Scala

val partitionsColumns = "idnum,monthnum"
val partitionsColumnsList = partitionsColumns.split(",").toList
val loc = "/data/omega/published/invoice"
val df = sqlContext.read.parquet(loc)
val windowFunction = Window.partitionBy(partitionsColumnsList: _*).orderBy(df("effective_date").desc)

:38: error: overloaded method value partitionBy with alternatives:
  (cols: org.apache.spark.sql.Column*)org.apache.spark.sql.expressions.WindowSpec
  (colName: String, colNames: String*)org.apache.spark.sql.expressions.WindowSpec
cannot be applied to (String)
       val windowFunction = Window.partitionBy(partitionsColumnsList: _*).orderBy(df("effective_date").desc)


Is it possible to send a list of columns to partitionBy via a Spark/Scala method?

I have implemented passing a single column to partitionBy, and that works. What I do not know is how to pass multiple columns to partitionBy: basically, I want to pass a List of columns to the partitionBy method.

The Spark version is 1.6.

partitionBy has the following definitions:

static WindowSpec partitionBy(scala.collection.Seq<Column> cols)
static WindowSpec partitionBy(String colName, scala.collection.Seq<String> colNames)
static WindowSpec partitionBy(String colName, String... colNames)

Each of these creates a WindowSpec with the partitioning defined.
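
To make the overload resolution concrete, here is a minimal sketch of the two call shapes that compile (assuming the usual Spark imports; the column names are taken from the question):

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.expressions.Window

// Matches partitionBy(cols: Column*): a Seq[Column] expanded with : _*
val byColumns = Window.partitionBy(Seq(col("idnum"), col("monthnum")): _*)

// Matches partitionBy(colName: String, colNames: String*): at least one
// String argument is required in the first position, which is why a bare
// Seq[String]: _* cannot be applied
val byNames = Window.partitionBy("idnum", "monthnum")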
In your case, given

val partitionsColumnsList = partitionsColumns.split(",").toList
you can use it like this:

Window.partitionBy(partitionsColumnsList.map(col(_)):_*).orderBy(df("effective_date").desc)
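
For context, a short sketch of how the resulting WindowSpec might be used; `row_number` and the deduplication step are illustrative assumptions, not part of the original question (`df` and `partitionsColumnsList` are from the question):

import org.apache.spark.sql.functions.{col, row_number}
import org.apache.spark.sql.expressions.Window

val w = Window.partitionBy(partitionsColumnsList.map(col(_)): _*)
              .orderBy(df("effective_date").desc)

// e.g. keep the most recent row per (idnum, monthnum) partition
val latest = df.withColumn("rn", row_number().over(w))
               .filter(col("rn") === 1)
               .drop("rn")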


The following code works for me:

Window.partitionBy(partitionsColumnsList.map(col(_)):_*).orderBy(df("effective_date").desc)

You can also apply multiple columns in partitionBy by assigning the column names to a variable as a list and using that list in the partitionBy argument, like this:

val partitioncolumns = List("idnum","monthnum")
// map the names to Columns first; a bare List[String]: _* hits the same
// overload error shown in the question
val w = Window.partitionBy(partitioncolumns.map(col(_)): _*).orderBy(df("effective_date").desc)
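
Equivalently (a small variation on the snippet above, under the same assumptions), you can build the list as Column objects up front, so the Column* overload applies directly with no map at the call site:

import org.apache.spark.sql.functions.col
import org.apache.spark.sql.expressions.Window

// List[Column] rather than List[String], so : _* matches partitionBy(cols: Column*)
val partitioncolumns = List(col("idnum"), col("monthnum"))
val w = Window.partitionBy(partitioncolumns: _*).orderBy(df("effective_date").desc)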