How do I keep the columns' data types after running createOrReplaceTempView in PySpark?
I have a DataFrame whose data types are as follows:
orders.printSchema()
root
|-- order_id: long (nullable = true)
|-- user_id: long (nullable = true)
|-- eval_set: string (nullable = true)
|-- order_number: short (nullable = true)
|-- order_dow: short (nullable = true)
|-- order_hour_of_day: short (nullable = true)
|-- days_since_prior_order: short (nullable = true)
But when I register it as a table, all the data types become string:
orders.createOrReplaceTempView("orders")
spark.sql("describe orders").show()
+--------------------+---------+-------+
| col_name|data_type|comment|
+--------------------+---------+-------+
| order_id| string| |
| user_id| string| |
| eval_set| string| |
| order_number| string| |
| order_dow| string| |
| order_hour_of_day| string| |
|days_since_prior_...| string| |
+--------------------+---------+-------+
So how can I keep the original column types when going from a DataFrame to a table in PySpark?

No, createOrReplaceTempView does not change the schema. I tested this in Spark Scala and the schema was preserved there. The problem may be elsewhere in your PySpark code.