Python Spark DataFrame: converting elements to strings
I have a DataFrame like this:
+---+--------------------+
|idn| recommendations|
+---+--------------------+
|463|[[10955,0.0086656...|
|496|[[12767,0.0209305...|
|148|[[9813,0.00673213...|
|471|[[8537,0.00546676...|
|243|[[10846,0.0044064...|
|623|[[10955,0.3857911...|
|540|[[11463,0.0250675...|
|392|[[7177,0.01615425...|
|737|[[7994,0.12720428...|
|516|[[10955,0.4047550...|
+---+--------------------+
The schema is as follows:
dataFrame.printSchema()
root
 |-- idn: long (nullable = true)
 |-- recommendations: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- id_usn: long (nullable = true)
 |    |    |-- rating: double (nullable = true)
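Now I want to convert the id_usn and rating fields in the recommendations column to strings.

You can cast the nested struct column as follows: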
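from pyspark.sql.types import ArrayType, StructType, StructField, StringType
# Same array-of-struct shape as the original column, but with both fields typed as strings
col_schema = ArrayType(StructType([StructField('id_usn', StringType(), True), StructField('rating', StringType(), True)]))
df = dataFrame.select('idn', dataFrame.recommendations.cast(col_schema))
df.printSchema()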
Please try this and let me know.