Java: how to convert a JSON array to CSV in Spark SQL
I tried this query to get the required experience from LinkedIn data:
Dataset<Row> filteredData = spark
.sql("select full_name ,experience from (select *, explode(experience['title']) exp from tempTable )"
+ " a where lower(exp) like '%developer%'");
But this query throws an error.
Finally I tried this, but it returned multiple rows with the same name:
Dataset<Row> filteredData = spark
.sql("select full_name ,explode(experience) from (select *, explode(experience['title']) exp from tempTable )"
+ " a where lower(exp) like '%developer%'");
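The duplicate names come from explode itself: it emits one output row per array element, so a person with several matching titles appears once per title. A plain-Java sketch of that behavior (no Spark needed; the sample names are made up):

```java
import java.util.*;
import java.util.stream.*;

public class ExplodeDemo {
    // Mimic explode + filter: one output row per array element that matches.
    static List<String> explodeAndFilter(Map<String, List<String>> experience) {
        return experience.entrySet().stream()
            .flatMap(e -> e.getValue().stream()
                           .map(title -> e.getKey() + " | " + title))
            .filter(row -> row.toLowerCase().contains("developer"))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> data = Map.of(
            "Alice", List.of("Java Developer", "Senior Developer", "Manager"));
        // Alice is printed once per matching title, hence the duplicate names.
        explodeAndFilter(data).forEach(System.out::println);
    }
}
```

Keeping the array column intact and joining it into one string, as the answer below suggests, avoids this duplication.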
Please give me a hint: how can I convert a string array into a comma-separated string in the same column?

You can apply a UDF to produce the comma-separated string. Create a UDF like this:
import scala.collection.mutable.WrappedArray

def mkString(value: WrappedArray[String]): String = value.mkString(",")
Register the UDF with the Spark SQL context:
sqlContext.udf.register("mkstring", mkString _)
Then apply it in your Spark SQL query:
sqlContext.sql("select mkstring(columnName) from tableName")
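The answer's UDF is written in Scala; since the question uses Java, here is a sketch of the same joining logic in plain Java. The commented-out registration shows how it might look with Spark's Java UDF API (the `WrappedArray` signature is an assumption about how Spark hands the array column to a Java UDF):

```java
import java.util.Arrays;
import java.util.List;

public class MkStringDemo {
    // Same logic the UDF applies to each array cell: join with commas.
    static String mkString(List<String> value) {
        return String.join(",", value);
    }

    public static void main(String[] args) {
        System.out.println(mkString(Arrays.asList("Developer", "Senior Developer")));
        // Hypothetical Spark registration (assuming an active SparkSession `spark`):
        // spark.udf().register("mkstring",
        //     (UDF1<WrappedArray<String>, String>) arr -> arr.mkString(","),
        //     DataTypes.StringType);
    }
}
```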
It will return the comma-separated values of the array.

Can you share more of your code and some sample data? — Actually I am writing the code with Spark transformations, but can you share sample code for the JSON format?

If the column holds an array, use concat_ws(delimiter, array); otherwise use concat_ws(',', collect_set(cast(date as string))).

Thank you for guiding me; I did as you suggested. — If this worked for you, please accept the answer.
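The concat_ws suggestion in the last comment can replace the custom UDF entirely: in Spark SQL the query would look something like `select full_name, concat_ws(',', experience['title']) as titles from tempTable` (assuming `experience['title']` is an `array<string>`, as in the question). The joining behavior concat_ws applies is roughly this, sketched in plain Java:

```java
import java.util.*;

public class ConcatWsDemo {
    // concat_ws(sep, array): join elements, skipping nulls as Spark does.
    static String concatWs(String sep, List<String> arr) {
        StringBuilder sb = new StringBuilder();
        for (String s : arr) {
            if (s == null) continue;          // nulls are skipped, not printed
            if (sb.length() > 0) sb.append(sep);
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatWs(",", Arrays.asList("Developer", null, "Architect")));
    }
}
```

Because this stays inside Spark's built-in functions, it needs no UDF registration and keeps one row per person.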