Unable to create a DataFrame in PySpark

I want to create a DataFrame in PySpark with the following code:
from pyspark.sql import *
from pyspark.sql.types import *
temp = Row("DESC", "ID")
temp1 = temp('Description1323', 123)
print(temp1)
schema = StructType([StructField("DESC", StringType(), False),
                     StructField("ID", IntegerType(), False)])
df = spark.createDataFrame(temp1, schema)
But I get the following error:

TypeError: StructType can not accept object 'Description1323' in type <class 'str'>
What is wrong with my code?

The problem is that you are passing a single Row where createDataFrame expects a list of Rows. Try this:
from pyspark.sql import *
from pyspark.sql.types import *
temp = Row("DESC", "ID")
temp1 = temp('Description1323', 123)
print(temp1)
schema = StructType([StructField("DESC", StringType(), False),
                     StructField("ID", IntegerType(), False)])
df = spark.createDataFrame([temp1], schema)
df.show()
The result is:
+---------------+---+
| DESC| ID|
+---------------+---+
|Description1323|123|
+---------------+---+
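To see why the original call failed: `createDataFrame` iterates over its first argument, treating each element as one row. A `Row` is itself tuple-like, so passing `temp1` directly makes Spark iterate over its *fields*, and the first element it sees is the plain string `'Description1323'` rather than a row, hence the `TypeError`. The sketch below illustrates this iteration behavior with a stdlib `namedtuple` as a stand-in for `pyspark.sql.Row` (an analogy, not the actual Spark code path):

```python
from collections import namedtuple

# A namedtuple behaves like Row here: iterating over it yields field values.
Temp = namedtuple("Temp", ["DESC", "ID"])
temp1 = Temp("Description1323", 123)

# Passing temp1 directly: Spark would iterate it and see a str where a
# row should be -- this is what triggers the StructType TypeError.
print(list(temp1))    # ['Description1323', 123]

# Wrapping it in a list: iteration yields one element, which is itself
# a row-like tuple of (DESC, ID) -- exactly what the schema expects.
print(list([temp1]))  # [Temp(DESC='Description1323', ID=123)]
```

The same reasoning explains why `spark.createDataFrame([temp1], schema)` works: the outer list supplies one element per row, and each element matches the two-field schema.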