Apache Spark PySpark DataFrame: why are null values identified differently in the following scenarios?


Why does isNull() behave differently in the following scenarios?

  • PySpark 1.6
  • Python 2.6.6
Definitions of the two DataFrames:

df_t1 = sqlContext.sql("select 1 id, 9 num union all select 1 id, 2 num union all select 2 id, 3 num")
df_t2 = sqlContext.sql("select 1 id, 1 start, 3 stop union all select 3 id, 1 start, 9 stop")
Scenario 1:

df_t1.join(df_t2, (df_t1.id == df_t2.id) & (df_t1.num >= df_t2.start) & (df_t1.num <= df_t2.stop), "left").select([df_t2.start, df_t2.start.isNull()]).show()
Scenario 2:

df_new=df_t1.join(df_t2, (df_t1.id == df_t2.id) & (df_t1.num >= df_t2.start) & (df_t1.num <= df_t2.stop), "left")
Scenario 3:

df_t1.join(df_t2, (df_t1.id == df_t2.id) & (df_t1.num >= df_t2.start) & (df_t1.num <= df_t2.stop), "left").filter("start is null").show()

Thanks.

Thanks to @zero323 for the edit.
+-----+-------------+
|start|isnull(start)|
+-----+-------------+
| null|         true|
|    1|        false|
| null|         true|
+-----+-------------+
df_t1.join(df_t2, (df_t1.id == df_t2.id) & (df_t1.num >= df_t2.start) & (df_t1.num <= df_t2.stop), "left").filter("start is null").show()
+---+---+----+-----+----+
| id|num|  id|start|stop|
+---+---+----+-----+----+
|  1|  9|null| null|null|
|  2|  3|null| null|null|
+---+---+----+-----+----+
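
To make the result expected from a left outer join concrete, here is a minimal plain-Python sketch of the join semantics. It is not Spark code and says nothing about how Spark evaluates the query internally. The helper name left_join and the tuple layout are assumptions made for illustration only. The data mirrors df_t1 and df_t2 from the question.

```python
# Plain-Python model of the left outer join from the question (not Spark code).
# Rows mirror df_t1 (id, num) and df_t2 (id, start, stop).
t1 = [(1, 9), (1, 2), (2, 3)]
t2 = [(1, 1, 3), (3, 1, 9)]


def left_join(left, right):
    """Hypothetical helper: keep every left row, padding unmatched rows with None."""
    joined = []
    for l_id, num in left:
        # Same condition as the question: ids match and start <= num <= stop.
        matches = [r for r in right if r[0] == l_id and r[1] <= num <= r[2]]
        if not matches:
            matches = [(None, None, None)]
        for r in matches:
            joined.append((l_id, num) + r)
    return joined


rows = left_join(t1, t2)

# Counterpart of select([start, start.isNull()]): column index 3 is "start".
start_flags = [(row[3], row[3] is None) for row in rows]

# Counterpart of filter("start is null").
null_rows = [row for row in rows if row[3] is None]

for row in rows:
    print(row)
```

Under this model, the rows (1, 9) and (2, 3) have no match in t2, so their start value is None in both the select-style and the filter-style view. That agrees with the two output tables shown above.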