
Apache Spark: how do I control a Spark stream with a dynamic query stream?

Tags: apache-spark, pyspark, apache-spark-sql, spark-streaming

I have a data stream coming from Kafka; call it SourceStream.
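Roughly, I read it like this (a sketch; the broker address, topic name, and the JSON value schema are my assumptions, matching the example table below):

  from pyspark.sql import SparkSession
  from pyspark.sql.functions import from_json, col
  from pyspark.sql.types import StructType, StringType, LongType, DoubleType

  spark = SparkSession.builder.appName("dynamic-query-stream").getOrCreate()

  # Schema of the JSON values on the topic (mirrors the example table below)
  source_schema = (StructType()
      .add("Id", StringType())
      .add("type", StringType())
      .add("timestamp", LongType())
      .add("user", StringType())
      .add("amount", DoubleType()))

  source_stream = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker address
      .option("subscribe", "source-events")                # assumed topic name
      .load()
      .select(from_json(col("value").cast("string"), source_schema).alias("e"))
      .select("e.*"))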

I have another stream of Spark SQL queries, where each record carries a Spark SQL query together with a window size.

I want these queries applied to the SourceStream data, and the query results delivered to a sink.

For example:

Source stream

  Id     type    timestamp     user     amount  
 ------- ------  ----------    ---------- -------- 
  uuid1   A      342342        ME           10.0  
  uuid2   B      234231        YOU        120.10  
  uuid3   A      234234        SOMEBODY    23.12  
  uuid4   A      234233        WHO         243.1  
  uuid5   C      124555        IT          35.12  
  ...
  ....
Query stream

  Id     window      query   
 -------  ------     ------ 
  uuid13  1 hour     select 'uuid13' as u, max(amount) as output from df where type = 'A' group by ..
  uuid21  5 minute   select 'uuid121' as u, count(1) as output  from df where amount > 100 group by ..
  uuid321 1 day      select 'uuid321' as u, sum(amount) as output from df where amount > 100 group by ..
  ...
  ....
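Reading the query stream itself is not the problem; I pick it up the same way (a sketch continuing the snippet above; the topic name and JSON layout are assumptions, the fields mirror the table above):

  # Continues from the SourceStream snippet above (spark, from_json, col)
  from pyspark.sql.types import StructType, StringType

  query_schema = (StructType()
      .add("Id", StringType())
      .add("window", StringType())   # e.g. "1 hour", "5 minute"
      .add("query", StringType()))   # Spark SQL text to run against SourceStream

  query_stream = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")   # assumed broker address
      .option("subscribe", "queries")                      # assumed topic name
      .load()
      .select(from_json(col("value").cast("string"), query_schema).alias("q"))
      .select("q.*"))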
Each query in the query stream should be applied to the incoming SourceStream data over the window specified alongside that query, and the output sent to the sink.

What approaches can I use to implement this?
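For context, this is roughly what I already know how to do for a single, hard-coded query (a sketch continuing the snippets above; casting the epoch-seconds timestamp to an event time and using the console sink are my own simplifications). What I can't work out is how to start or rewire such queries at runtime as new rows arrive on the query stream, each with its own window size.

  # One fixed query with a fixed 1-hour window -- this part I can do.
  from pyspark.sql.functions import col

  events = source_stream.withColumn("event_time", col("timestamp").cast("timestamp"))
  events.createOrReplaceTempView("df")

  one_result = spark.sql("""
      SELECT 'uuid13' AS u, max(amount) AS output
      FROM df
      WHERE type = 'A'
      GROUP BY window(event_time, '1 hour')
  """)

  (one_result.writeStream
      .outputMode("update")      # no watermark here, so append mode is not an option
      .format("console")         # stand-in for the real sink
      .start())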