How do I resolve a data type mismatch error in PySpark caused by the trim function?
When I run this code, it gives me an error:
df1 = (df.withColumn("columnName_{}".format(columnName), psf.lit(columnName))
         .withColumn("{}_not_null".format(columnName), psf.when((psf.col(columnName).isNotNull() & psf.trim(psf.col(columnName)) != psf.lit('')), 1)))
cannot resolve '((`Address` IS NOT NULL) AND trim(`Address`))' due to data type mismatch: differing types in '((`Address` IS NOT NULL) AND trim(`Address`))' (boolean and string)

Can someone help me resolve this error?

You need to wrap the second condition in parentheses, because & has higher evaluation precedence than !=:
df1 = df.withColumn(
"columnName_{}".format(columnName),
psf.lit(columnName)
).withColumn(
"{}_not_null".format(columnName),
psf.when(
psf.col(columnName).isNotNull() &
(psf.trim(psf.col(columnName)) != psf.lit(''))
, 1)
)
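The root cause is Python operator precedence rather than anything Spark-specific: PySpark Column objects overload & and !=, and & binds more tightly than the comparison operators, so `a & b != c` is parsed as `(a & b) != c`. A minimal sketch using only the standard ast module (no Spark needed; `a`, `b`, `c` are placeholder names) shows how each form is parsed:

```python
import ast

# Without parentheses, Python parses `a & b != c` as `(a & b) != c`:
# the top-level node is the comparison, and its left operand is the
# whole `a & b` expression -- which is why Spark ends up trying to AND
# a boolean with a string.
tree = ast.parse("a & b != c", mode="eval").body
print(type(tree).__name__)        # -> Compare
print(type(tree.left).__name__)   # -> BinOp (the `a & b` part)

# With parentheses, & becomes the top-level operation and each side of
# it is evaluated first, mirroring the fix in the answer above.
fixed = ast.parse("a & (b != c)", mode="eval").body
print(type(fixed).__name__)       # -> BinOp
```

This is why the corrected code wraps `psf.trim(psf.col(columnName)) != psf.lit('')` in its own parentheses before combining it with `&`.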