PySpark self-join fails with error "Resolved attribute(s) missing"

Tags: python, python-3.x, pyspark, apache-spark-2.3

When performing a self-join on a PySpark DataFrame, I get the following error message:

Py4JJavaError: An error occurred while calling o1595.join.
: org.apache.spark.sql.AnalysisException: Resolved attribute(s) un_val#5997 missing from day#290,item_listed#281,filename#286 in operator !Project [...]. Attribute(s) with the same name appear in the operation: un_val. Please check if the right attribute(s) are used.;;
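A simple DataFrame self-join like the one below works fine, but after some operations on the DataFrame (such as adding a column or joining with another DataFrame), it raises the error above: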

df.join(df,on='item_listed')
Using DataFrame aliases (as below) does not work either and raises the same error message:

df.alias('A').join(df.alias('B'), col('A.my_id') == col('B.my_id'))

I found a Java solution to this problem; the PySpark equivalent is as follows:

# Append an "_r" suffix to every column name
newcols = [c + '_r' for c in df.columns]

# Clone the DataFrame with the renamed columns
df2 = df.toDF(*newcols)

# Self-join: the left side keeps the original names, the right side uses the suffixed ones
df.join(df2, df.my_column == df2.my_column_r)
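
The same workaround can be wrapped in a small helper so it is easy to reuse. This is only a sketch, not part of the original answer: the function names `suffixed_columns` and `self_join_renamed` are hypothetical, and it assumes `df` is a `pyspark.sql.DataFrame` whose column names do not already end in the chosen suffix.

```python
def suffixed_columns(columns, suffix="_r"):
    """Return `columns` with `suffix` appended to each name.

    Raises ValueError if a suffixed name would collide with an
    existing column, since the join result would then be ambiguous.
    """
    renamed = [c + suffix for c in columns]
    clash = set(renamed) & set(columns)
    if clash:
        raise ValueError("suffixed names collide with existing columns: %s" % sorted(clash))
    return renamed


def self_join_renamed(df, key, suffix="_r", how="inner"):
    """Self-join `df` on `key` against a renamed copy of itself.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame.
    Only DataFrame.columns, DataFrame.toDF and DataFrame.join are used.
    """
    # Clone the DataFrame with every column renamed
    right = df.toDF(*suffixed_columns(df.columns, suffix))
    # Join the original columns against the suffixed copy
    return df.join(right, df[key] == right[key + suffix], how)
```

With the DataFrame from the question, a call would look like `self_join_renamed(df, 'my_id')`; every right-hand column in the result then carries the `_r` suffix, so both sides can be selected by name without ambiguity.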