Apache Spark: converting a pickle (.pck) file to a Spark DataFrame using Python


Hello dear members, I want to train a model using BigDL. I have a set of medical image data stored as pickle object files (.pck); each pickle file is a 3D image (a 3D array).

I tried to convert it to a Spark DataFrame using the BigDL Python API:

    from pyspark.sql import SQLContext

    pickleRdd = sc.pickleFile("/home/student/BigDL-trainings/elephantscale/data/volumetric_data/329637-8.pck")
    sqlContext = SQLContext(sc)
    df = sqlContext.createDataFrame(pickleRdd)
It throws the following error:

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 2, localhost, executor driver)
: java.io.IOException: file:/home/student/BigDL-trainings/elephantscale/data/volumetric_data/329637-8.pck not a SequenceFile
I have run this code on both Python 3.5 and Python 2.7, and I get the same error in both cases.
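
From the "not a SequenceFile" message, my understanding is that sc.pickleFile only reads files written by RDD.saveAsPickleFile (which are stored as Hadoop SequenceFiles), not files produced directly with Python's pickle.dump. Below is a minimal sketch of what I believe should work instead, assuming the same `sc` and `sqlContext` as above, that the .pck file holds a single 3D NumPy array, and with illustrative column names (slice_index, pixels):

    import pickle

    # Assumption: `sc` and `sqlContext` are the SparkContext / SQLContext created above,
    # and the .pck file contains one 3D NumPy array.
    path = "/home/student/BigDL-trainings/elephantscale/data/volumetric_data/329637-8.pck"

    # Load the volume on the driver with plain Python pickle.
    with open(path, "rb") as f:
        volume = pickle.load(f)

    # Turn each 2D slice into a row of plain Python floats so that
    # createDataFrame can infer a simple schema.
    rows = [(i, [float(v) for v in slice_.ravel()]) for i, slice_ in enumerate(volume)]
    df = sqlContext.createDataFrame(rows, ["slice_index", "pixels"])
    df.printSchema()

Alternatively, if the data had been written from an RDD with rdd.saveAsPickleFile(...), then sc.pickleFile(...) should be able to read it back directly.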