How do I set KryoSerializer in PySpark?

I'm new to PySpark, please help me:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("FlightDelayRDD").master("local[*]").getOrCreate()
sc = spark.sparkContext
sc.setSystemProperty("spark.dynamicAllocation.enabled", "true")
sc.setSystemProperty("spark.dynamicAllocation.initialExecutors", "6")
sc.setSystemProperty("spark.dynamicAllocation.minExecutors", "6")
sc.setSystemProperty("spark.dynamicAllocation.schedulerBacklogTimeout", "0.5s")
sc.setSystemProperty("spark.speculation", "true")
I want to set KryoSerializer in PySpark in the same way as the configuration above.

Since Spark 2.0.0, Spark internally uses the Kryo serializer when shuffling RDDs of simple types, arrays of simple types, or string type.

To set the Kryo serializer:

sc.setSystemProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
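
Note that PySpark's setSystemProperty must be invoked before the SparkContext is instantiated, so calling it on an already-running context (as in the question) has no effect. A minimal sketch of setting the serializer at build time instead, reusing the app name from the question:

from pyspark.sql import SparkSession

# spark.serializer is read when the SparkContext starts, so pass it to
# the builder before getOrCreate() rather than to a live context.
spark = (
    SparkSession.builder
    .appName("FlightDelayRDD")
    .master("local[*]")
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .getOrCreate()
)

Keep in mind that getOrCreate() reuses any existing session, so this has to run before anything else in the application creates one.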

To verify:

spark.sparkContext.getConf().get("spark.serializer")

# u'org.apache.spark.serializer.KryoSerializer'

I have already tried that. Please tell me how to check whether PySpark uses only KryoSerializer and not the Java serializer.
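
One practical check, as a sketch: read the setting back with an explicit default, so an unset key becomes visible (Spark falls back to org.apache.spark.serializer.JavaSerializer when spark.serializer is not set), and cross-check the Environment tab of the Spark UI, which lists the effective value. Also note that on the JVM side Spark always serializes task closures with the Java serializer; spark.serializer only governs data serialization.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Both lookups fall back to Spark's built-in default when the key is
# unset, so the output shows which serializer is actually in effect.
default = "org.apache.spark.serializer.JavaSerializer"
print(spark.conf.get("spark.serializer", default))
print(spark.sparkContext.getConf().get("spark.serializer", default))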