How to update nested JSON data in a Delta table with PySpark/Hive


I want to change the value of col4 to TT, so that the JSON contains "col4":"TT".

I tried the following code:

(Databricks)
%sql 
select * from df2

jsonData
--------
{"col1":"AA","col2":"BB","col3":"CC","col4":"DD"}

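I then ran the following update:

update df2 set jsonData = JSON_MODIFY(jsonData, '$.col4', 'TT')

It fails with the following error:

Error in SQL statement: AnalysisException: Undefined function: 'JSON_MODIFY'. 
This function is neither a registered temporary function nor a permanent function 
registered in the database 'default'.

JSON_MODIFY is a SQL Server function and is not available in Spark SQL. Instead, use the from_json function to flatten the JSON into columns, update col4, and then use the to_json function to rebuild the JSON object.

Example: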

df.show(10,False)
#+-------------------------------------------------+
#|jsonData                                         |
#+-------------------------------------------------+
#|{"col1":"AA","col2":"BB","col3":"CC","col4":"DD"}|
#+-------------------------------------------------+

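from pyspark.sql.functions import col, lit, struct, to_json

# Parse the JSON string into a struct with an explicit schema, expand it into columns,
# overwrite col4, then serialize the columns back into a single JSON string.
df.selectExpr("from_json(jsonData,'col1 string,col2 string,col3 string,col4 string') as jsn_str").\
select("jsn_str.*").\
withColumn("col4",lit("TT")).\
withColumn("jsonData",to_json(struct(col("col1"),col("col2"),col("col3"),col("col4")))).\
select("jsonData").\
show(10,False)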
#+-------------------------------------------------+
#|jsonData                                         |
#+-------------------------------------------------+
#|{"col1":"AA","col2":"BB","col3":"CC","col4":"TT"}|
#+-------------------------------------------------+
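
The from_json / to_json approach above needs the schema written out in advance. When the keys are not known up front, the same parse, modify, serialize round trip can be sketched per row in plain Python. This is only an illustration: the helper name set_json_field is hypothetical and not part of PySpark, and it assumes every value in the column is either null or a JSON object.

```python
import json

def set_json_field(raw, key, value):
    """Return the JSON object string `raw` with `key` set to `value`.

    Hypothetical helper for illustration; None is passed through unchanged
    so that null rows stay null.
    """
    if raw is None:
        return None
    obj = json.loads(raw)   # parse the string into a dict (key order is preserved)
    obj[key] = value        # overwrite or add the field
    # Compact separators give the same no-whitespace layout as the sample data.
    return json.dumps(obj, separators=(",", ":"))

updated = set_json_field('{"col1":"AA","col2":"BB","col3":"CC","col4":"DD"}', "col4", "TT")
print(updated)
```

In PySpark such a function could be wrapped with pyspark.sql.functions.udf and applied to the jsonData column. A Python UDF is generally slower than the built-in from_json and to_json functions, so the built-ins remain the better choice whenever the schema is known.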