Python: extracting schema labels from a PySpark DataFrame
I want to create a Python list from a PySpark DataFrame that contains the schema labels at a specific "level" of the schema. The schema is:
root
|-- DISPLAY: struct (nullable = true)
| |-- 1WO: struct (nullable = true)
| | |-- JPY: struct (nullable = true)
| | | |-- CHANGE24HOUR: string (nullable = true)
| | | |-- CHANGEDAY: string (nullable = true)
| |-- AAVE: struct (nullable = true)
| | |-- JPY: struct (nullable = true)
| | | |-- CHANGE24HOUR: string (nullable = true)
| | | |-- CHANGEDAY: string (nullable = true)
The expected output is:
list = 1WO, AAVE
The following code prints everything in the schema:
df.schema.jsonValue()
Is there a simple way to extract just these labels?
Select the first level with the star notation, then list the resulting columns:
df.select('DISPLAY.*').columns
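If you would rather read the labels from the schema itself, the dictionary returned by df.schema.jsonValue() can be walked directly. The following is only a minimal sketch: the helper name labels_at and the hand-built schema_json dictionary are assumptions made for illustration (the dictionary stands in for df.schema.jsonValue() so the example runs without a Spark session), and it only handles nested struct fields.

```python
def labels_at(schema_json, path=()):
    """Return the field names found under `path` in a schema dictionary.

    `schema_json` is assumed to have the shape produced by
    df.schema.jsonValue(): {"type": "struct", "fields": [...]}.
    """
    node = schema_json
    for name in path:
        # Find the child field called `name` at the current level.
        match = next((f for f in node["fields"] if f["name"] == name), None)
        if match is None:
            raise KeyError(f"no field named {name!r} at this level")
        dtype = match["type"]
        # Leaf types are plain strings such as "string"; only structs nest.
        if not isinstance(dtype, dict) or dtype.get("type") != "struct":
            raise KeyError(f"{name!r} is not a struct field")
        node = dtype
    return [f["name"] for f in node["fields"]]


# Hypothetical helpers that build a stand-in for df.schema.jsonValue().
def _struct(*fields):
    return {"type": "struct", "fields": list(fields)}


def _field(name, dtype):
    return {"name": name, "type": dtype, "nullable": True, "metadata": {}}


leaf = _struct(_field("CHANGE24HOUR", "string"), _field("CHANGEDAY", "string"))
schema_json = _struct(
    _field("DISPLAY", _struct(
        _field("1WO", _struct(_field("JPY", leaf))),
        _field("AAVE", _struct(_field("JPY", leaf))),
    ))
)

labels = labels_at(schema_json, ["DISPLAY"])
print(labels)
```

With a real DataFrame you would call labels_at(df.schema.jsonValue(), ["DISPLAY"]) instead of using the stand-in dictionary. A shorter route that also avoids the select should be df.schema["DISPLAY"].dataType.fieldNames(), which reads the nested field names from the StructType.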