Apache Spark: Spark SQL boolean evaluation on JSON data


I have a sample JSON schema (truncated because of its size):

When I run the following:

results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings.Sports from tv")
it returns:

[Row(Sports=[False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False])]
When I try something more complex, such as:

results = sqlContext.sql("SELECT LinearScheduleResult.Schedule.Airings from tv where LinearScheduleResult.Schedule.Airings.Sports = 'False'")
it never returns anything. I have tried 'false', false, 0, and several other variations of the literal.

Any help would be greatly appreciated.

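Airings is an array, so LinearScheduleResult.Schedule.Airings.Sports is an array of booleans rather than a single boolean, and comparing it with a scalar never matches. You need to explode the array first so that each airing becomes its own row. For example: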

select a from tv 
  lateral view explode(LinearScheduleResult.Schedule.Airings) a as a 
  where a.Sports = false
For this you must use a HiveContext rather than a plain SQLContext.

Alternatively, you can drop down to regular RDD computation.
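The RDD route can be sketched as follows. This is a minimal illustration, not the original poster's code: it assumes each input line is one JSON document, that Schedule is a single object holding the Airings array, and it uses a hypothetical Title field and a hypothetical tv.json path purely for demonstration. The filtering logic is plain Python so it can be checked without a cluster.

```python
import json


def non_sports_airings(record):
    """Yield each airing in one parsed JSON record whose Sports flag is false."""
    airings = (
        record.get("LinearScheduleResult", {})
        .get("Schedule", {})
        .get("Airings", [])
    )
    for airing in airings:
        # JSON true/false parse to Python bool, so test against False directly
        # instead of comparing with the string 'False'.
        if airing.get("Sports") is False:
            yield airing


# Hypothetical sample record: only the nested field names come from the query
# in the question; "Title" and the values are made up for illustration.
sample_line = json.dumps({
    "LinearScheduleResult": {
        "Schedule": {
            "Airings": [
                {"Title": "News", "Sports": False},
                {"Title": "Football", "Sports": True},
            ]
        }
    }
})

matches = list(non_sports_airings(json.loads(sample_line)))

# With Spark, the same function can be handed to flatMap, which plays the role
# of explode (assuming an existing SparkContext `sc` and a hypothetical file):
#   sc.textFile("tv.json").map(json.loads).flatMap(non_sports_airings)
```

Because flatMap emits one element per airing, the result is an RDD of individual airings, which corresponds to the rows produced by the lateral view query above.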