
Apache Spark: what does select("device") do in a Spark query?


Here is an example:

df = ...  # streaming DataFrame with IOT device data with schema { device: string, deviceType: string, signal: double, time: DateType }

# Select the devices which have signal more than 10
df.select("device").where("signal > 10")
What does the select("device") part do? If the rows are selected by the value of the signal field, why mention the device field at all?

Why not just write

df.where("signal > 10")

instead?

This selects only the "device" column.

It selects just one column, "device", and drops all the other columns for the rows matching "signal > 10", whereas df.where("signal > 10") returns every column for the rows where signal > 10. Similarly,

df.select("time").where("signal > 10")

would return only the time column for those rows.
For example, with a toy DataFrame:

df.show()
+------+---+---+---+---+---+
|signal| B | C | D | E | F |
+------+---+---+---+---+---+
|10    | 4 | 1 | 0 | 3 | 1 |
|15    | 6 | 4 | 3 | 2 | 0 |
+------+---+---+---+---+---+

df.select("signal").show()
+------+
|signal|
+------+
|10    |
|15    |
+------+

select keeps all the rows but only the named column, while where keeps all the columns but only the matching rows.
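
For completeness, here is a hedged sketch of how a toy DataFrame like the one above could be built and queried with PySpark Column expressions rather than SQL-string snippets (the SparkSession and sample data are assumed for illustration):

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.master("local[*]").appName("select-demo").getOrCreate()

# Toy batch DataFrame matching the tables above
df = spark.createDataFrame(
    [(10, 4, 1, 0, 3, 1), (15, 6, 4, 3, 2, 0)],
    ["signal", "B", "C", "D", "E", "F"],
)

# Projection only: all rows, one column
df.select(col("signal")).show()

# Filter only: all columns, matching rows
df.where(col("signal") > 10).show()

# Projection plus filter, equivalent to the string form df.select("signal").where("signal > 10")
df.select(col("signal")).where(col("signal") > 10).show()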