pyspark PandasUDFType.SCALAR error when converting an array column


I want to use PandasUDFType.SCALAR to operate on an array column, as shown below:

from pyspark.sql.functions import pandas_udf, PandasUDFType
from pyspark.sql.types import ArrayType, IntegerType

df = spark.createDataFrame([([1, 2, 3, 2],), ([4, 5, 5, 4],)], ['data'])

# Scalar Pandas UDF: x is a pandas Series whose elements are the row arrays;
# the intent is to double every value in each array
@pandas_udf(ArrayType(IntegerType()), PandasUDFType.SCALAR)
def s(x):
    z = x.apply(lambda xx: xx * 2)
    return z

df.select(s(df.data)).show()
But it fails with:

pyarrow.lib.ArrowInvalid: trying to convert NumPy type int32 but got int64
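If upgrading dependencies is not an option, one possible workaround is to cast the UDF's output to 32-bit integers explicitly (a minimal, untested sketch; it assumes the error comes from the UDF producing int64 arrays while the declared IntegerType maps to int32):

import numpy as np

# Same UDF, but each result array is cast to int32 to match IntegerType
@pandas_udf(ArrayType(IntegerType()), PandasUDFType.SCALAR)
def s(x):
    return x.apply(lambda xx: (np.asarray(xx) * 2).astype(np.int32))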

The same code works for me. Which versions of pandas, Spark, pyarrow, and numpy do you have? Mine are ('0.25.2', '2.4.4', '0.13.0', '1.16.3'), in the same order.

This was probably caused by pyarrow: I replaced pyarrow 0.8.0 with pyarrow 0.13.0, and it worked!
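For reference, a quick way to print those four versions in the same order, from the Python environment that launches Spark (a minimal sketch):

import numpy as np
import pandas as pd
import pyarrow
import pyspark

# pandas, Spark, pyarrow, numpy -- the same order as in the comment above
print(pd.__version__, pyspark.__version__, pyarrow.__version__, np.__version__)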