Scala - "Unresolved attributes found when constructing LocalRelation" - can someone explain?

Tags: scala, apache-spark, apache-spark-sql, spark-dataframe

I have a DataFrame:

+---+---+---+---+
|A  |B  |C  |D  |
+---+---+---+---+
|a  |b  |b  |c  |
+---+---+---+---+
I combine the columns into two struct columns by doing the following:

import org.apache.spark.sql.functions._

val df = myDF.withColumn("colA", struct($"A", $"B"))
  .withColumn("colB", struct($"C".as("A"), $"D".as("B")))
The resulting DataFrame and its schema are:
+-----+-----+
|colA |colB |
+-----+-----+
|[a,b]|[b,c]|
+-----+-----+

root
 |-- colA: struct (nullable = false)
 |    |-- A: string (nullable = true)
 |    |-- B: string (nullable = true)
 |-- colB: struct (nullable = false)
 |    |-- A: string (nullable = true)
 |    |-- B: string (nullable = true)
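Before combining them, it can help to confirm that the two struct columns really carry the same element type, since `array` generally expects its inputs to share one schema. A quick check (a sketch, using the `df` built above):

```scala
// df.schema("colA") returns the StructField for that column;
// its dataType is the struct<A:string,B:string> type shown in printSchema.
val typeA = df.schema("colA").dataType
val typeB = df.schema("colB").dataType

// If these differ, array(colA, colB) cannot produce a well-typed array column.
println(typeA == typeB)
```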
I want to merge the two struct columns into a single array column, so I did:

df.select(array(struct($"colA.A", $"colA.B"),struct($"colB.A", $"colB.B")).as("Result"))
It gives the correct DataFrame and schema:
+--------------+
|Result        |
+--------------+
|[[a,b], [b,c]]|
+--------------+

root
 |-- Result: array (nullable = false)
 |    |-- element: struct (containsNull = false)
 |    |    |-- A: string (nullable = true)
 |    |    |-- B: string (nullable = true)
I can get the same result by doing:

df.select(array(struct($"A", $"B"),struct($"C".as("A"), $"D".as("B"))).as("Result"))
Now, if we look at the whole process, we have:

$"colA" == struct($"A", $"B") == struct($"colA.A", $"colA.B")
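This claimed equivalence is about row values and can be sanity-checked directly (a sketch; it assumes `===` equality is supported for these struct types, which Spark SQL provides for structs with matching schemas):

```scala
// Each comparison column should be true on every row if the
// three expressions really produce the same struct values.
df.select(
  ($"colA" === struct($"A", $"B")).as("eq1"),
  ($"colA" === struct($"colA.A", $"colA.B")).as("eq2")
).show()
```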

But when I do:

df.select(array($"colA", $"colB").as("Result"))
I get the following error:

requirement failed: Unresolved attributes found when constructing LocalRelation.
java.lang.IllegalArgumentException: requirement failed: Unresolved attributes found when constructing LocalRelation.
  at scala.Predef$.require(Predef.scala:219)
  at org.apache.spark.sql.catalyst.plans.logical.LocalRelation.<init>(LocalRelation.scala:50)
  at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$33.applyOrElse(Optimizer.scala:1402)
  at org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$33.applyOrElse(Optimizer.scala:1398)
  at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:286)
  ...

What does this error mean, and how should I correct the following?

df.select(array($"colA", $"colB").as("Result"))
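The stack trace points at the `ConvertToLocalRelation` optimizer rule: while rewriting the plan over a local relation it appears to end up with attribute references it can no longer resolve, and the `require` inside `LocalRelation` fails. One workaround (a sketch, not a definitive fix) is to rebuild both structs with explicit field names, so `array` is given two freshly constructed expressions with an identical schema instead of the two pre-existing struct attributes:

```scala
// Mirrors the select that already works, with explicit field aliases,
// so the optimizer never has to reconcile colA and colB directly.
df.select(array(
  struct($"colA.A".as("A"), $"colA.B".as("B")),
  struct($"colB.A".as("A"), $"colB.B".as("B"))
).as("Result"))
```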