Apache Spark: DataFrame left outer join not working correctly


I have two DataFrames with the following schemas:

clusterDF schema
root
 |-- cluster_id: string (nullable = true)

df schema
root
 |-- cluster_id: string (nullable = true)
 |-- name: string (nullable = true)
I am trying to join them using

val nameDF  = clusterDF.join(df, col("clusterDF.cluster_id") === col("df.cluster_id"), "left" )
but the above code fails with:

org.apache.spark.sql.AnalysisException: cannot resolve '`clusterDF.cluster_id`' given input columns: [cluster_id, cluster_id, name];;
'Join LeftOuter, ('clusterDF.cluster_id = 'df.cluster_id)
:- Aggregate [cluster_id#0], [cluster_id#0]
:  +- Project [cluster_id#0]
:     +- Filter (name#18 = kroger)
:        +- Project [cluster_id#0, name#18]
:           +- Generate explode(influencers#1.screenName), true, false, [name#18]
:              +- Relation[cluster_id#0,influencers#1] json
+- Project [cluster_id#26, name#18]
   +- Generate explode(influencers#27.screenName), true, false, [name#18]
      +- Relation[cluster_id#26,influencers#27] json

This seems strange to me. Any suggestions?

The error message is quite clear:

org.apache.spark.sql.AnalysisException: cannot resolve '`clusterDF.cluster_id`' given input columns: [cluster_id, cluster_id, name]

You are referencing the columns incorrectly: `col("clusterDF.cluster_id")` looks for a column named after an alias that was never registered. Use one of the following instead:

val nameDF  = clusterDF.join(df, clusterDF("cluster_id") === df("cluster_id"), "left")

Or, more concisely, using symbol syntax:

val nameDF  = clusterDF.join(df, clusterDF('cluster_id) === df('cluster_id), "left")
