Apache Spark: Spark SQL boolean evaluation on JSON
Tags: apache-spark, pyspark
I have a sample JSON schema (truncated due to size):
When I do the following:
results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings.Sports from tv")
It returns:
[Row(Sports=[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])]
When I try something more complex, such as:
results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings from tv where LinearScheduleResult.Schedule.Airings.Sports = 'False'")
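It never returns anything. I have tried 'False', False, 0, false, and many more combinations.
Any help would be greatly appreciated.

Airings is an array, so you need to explode the row first. Selecting Airings.Sports on the unexploded row yields an array of booleans, and an array never equals a single value such as false. Something like: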
select a from tv
lateral view explode(LinearScheduleResult.Schedule.Airings) a as a
where a.Sports = false
For this you must use a HiveContext rather than a plain SQLContext.
Alternatively, you can drop down to regular RDD computation.
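To make the explode-then-filter idea concrete, here is a minimal plain-Python sketch that needs no Spark installation. The sample rows and the Title field are hypothetical (the real schema is truncated in the question); only the LinearScheduleResult.Schedule.Airings.Sports path comes from the queries above.

```python
# Hypothetical rows mimicking the nested shape used in the queries above.
# "Title" is an invented field for illustration; the real schema is not shown.
rows = [
    {"LinearScheduleResult": {"Schedule": {"Airings": [
        {"Title": "News", "Sports": False},
        {"Title": "Football", "Sports": True},
    ]}}},
    {"LinearScheduleResult": {"Schedule": {"Airings": [
        {"Title": "Movie", "Sports": False},
    ]}}},
]


def airings_of(row):
    """Return the list of airing structs nested inside one row."""
    return row["LinearScheduleResult"]["Schedule"]["Airings"]


# Without exploding, Airings.Sports is a list of booleans per row,
# so comparing it with a single value such as False never matches.
sports_per_row = [[a["Sports"] for a in airings_of(r)] for r in rows]

# Explode first (one element per airing), then filter each airing on its own flag.
non_sports = [a for r in rows for a in airings_of(r) if a["Sports"] is False]

print(sports_per_row)
print([a["Title"] for a in non_sports])
```

On an RDD the same two steps would roughly correspond to a flatMap over the Airings array followed by a filter on each airing's Sports field.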