PySpark AttributeError: 'NoneType' object has no attribute 'setCallSite' on model.surrogateDF
Tags: pyspark, apache-spark-sql, apache-spark-mllib

When I try to convert the values in the surrogateDF attribute of a pyspark.ml.feature.ImputerModel to a Python list, I get this error:
  File "D:\repos\onnxmltools\onnxmltools\convert\sparkml\operator_converters\Imputer.py", line 21, in convert_imputer
    surrogates = op.surrogateDF.toPandas().values[0].tolist()
  File "C:\Users\jeff\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\site-packages\pyspark\sql\dataframe.py", line 1968, in toPandas
    pdf = pd.DataFrame.from_records(self.collect(), columns=self.columns)
  File "C:\Users\jeff\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\site-packages\pyspark\sql\dataframe.py", line 465, in collect
    with SCCallSiteSync(self._sc) as css:
  File "C:\Users\jeff\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\site-packages\pyspark\traceback_utils.py", line 72, in __enter__
    self._context._jsc.setCallSite(self._call_site)
AttributeError: 'NoneType' object has no attribute 'setCallSite'
The code is shown below. The strange thing is that executing model.surrogateDF.show() actually prints the correct values:
from pyspark.ml.feature import Imputer

data = self.spark.createDataFrame([
    (1.0, float("nan")),
    (2.0, float("nan")),
    (float("nan"), 3.0),
    (4.0, 4.0),
    (5.0, 5.0)
], ["a", "b"])
imputer = Imputer(inputCols=["a", "b"], outputCols=["out_a", "out_b"])
model = imputer.fit(data)
surrogates = model.surrogateDF.toPandas().values[0].tolist()
I also tried fetching the values in other ways, through the underlying RDD or with first(), but it made no difference. Here is what show() prints:
model.surrogateDF.show()
+---+---+
| a| b|
+---+---+
|3.0|4.0|
+---+---+