Apache Spark "getting exception": "No output operations registered, so nothing to execute" from Spark Streaming
Tags: apache-spark, spark-streaming, rdd, spark-structured-streaming

I ran the code below, and it throws an exception:
package com.scala.sparkStreaming

import org.apache.spark._
import org.apache.spark.streaming._

object Demo1 {
  def main(assdf: Array[String]) {
    val sc = new SparkContext("local", "Stream")
    val stream = new StreamingContext(sc, Seconds(2))
    val rdd1 = stream.textFileStream("D:/My Documents/Desktop/inbound/sse/ssd/").cache()
    val mp1 = rdd1.flatMap(_.split(","))
    print(mp1.count())
    stream.start()
    stream.awaitTermination()
  }
}
The error message "No output operations registered, so nothing to execute" hints that something is missing: your DStreams rdd1 and mp1 do not have any output operation registered on them. A flatMap is only a transformation, which Spark evaluates lazily, and print(mp1.count()) merely prints the DStream reference itself rather than registering an output. That is why the stream.start() method throws this exception.
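A minimal fix is to register an output operation on the DStream before starting the context. The sketch below (assuming mp1 and stream are defined as in the question's code) uses DStream.print(), one of the built-in output operations, instead of Scala's Predef.print:

```scala
// Instead of print(mp1.count()), which only prints the DStream's
// toString, register an output operation on the DStream itself:
val counts = mp1.count()   // still a transformation: DStream[Long]
counts.print()             // output operation: prints each batch's count

stream.start()             // now the DStream graph has something to execute
stream.awaitTermination()
```

Other output operations such as foreachRDD or saveAsTextFiles would work just as well; the streaming context only requires that at least one is registered.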
According to the documentation, you can iterate over the RDDs of a DStream as shown further below; that code runs fine with Spark version 2.4.5.

The documentation of textFileStream says that it "monitors a Hadoop-compatible filesystem for new files and reads them as text files", so make sure you add or modify the files to be read while the job is running.

Also, although I am not too familiar with Spark on Windows, you may need to change the directory string accordingly.
Running the original code prints the DStream reference, and then the context fails to start:

org.apache.spark.streaming.dstream.MappedDStream@63429932
20/05/22 18:14:16 ERROR StreamingContext: Error starting the context, marking it as stopped
java.lang.IllegalArgumentException: requirement failed: No output operations registered, so nothing to execute
at scala.Predef$.require(Predef.scala:277)
at org.apache.spark.streaming.DStreamGraph.validate(DStreamGraph.scala:169)
at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:517)
at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:577)
at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:576)
at com.scala.sparkStreaming.Demo1$.main(Demo1.scala:18)
at com.scala.sparkStreaming.Demo1.main(Demo1.scala)
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed: No output operations registered, so nothing to execute
at scala.Predef$.require(Predef.scala:277)
at org.apache.spark.streaming.DStreamGraph.validate(DStreamGraph.scala:169)
at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:517)
at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:577)
at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:576)
at com.scala.sparkStreaming.Demo1$.main(Demo1.scala:18)
at com.scala.sparkStreaming.Demo1.main(Demo1.scala)
On Windows, the directory string may need to take the form

file://D:\\My Documents\\Desktop\\inbound\\sse\\ssd

Note that as of Spark 2.4.5, Spark Streaming (the DStream API) is a legacy API, and I assume you are already familiar with Spark Structured Streaming. Here is the complete Spark Streaming code example:
import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

object Main extends App {
  val sc = new SparkContext("local[1]", "Stream")
  val stream = new StreamingContext(sc, Seconds(2))
  val rdd1 = stream.textFileStream("file:///path/to/src/main/resources")
  val mp1 = rdd1.flatMap(_.split(" "))
  mp1.foreachRDD(rdd => rdd.collect().foreach(println(_)))
  stream.start()
  stream.awaitTermination()
}
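Since Spark Streaming is legacy, a rough Structured Streaming equivalent looks like the sketch below. This is only an illustration, not the one canonical form: the directory path is a placeholder, and lines are split on spaces as in the DStream version.

```scala
import org.apache.spark.sql.SparkSession

object StructuredMain extends App {
  val spark = SparkSession.builder()
    .master("local[1]")
    .appName("Stream")
    .getOrCreate()
  import spark.implicits._

  // readStream.text watches the directory for new text files and
  // produces one row per line, in a string column named "value"
  val lines = spark.readStream.text("file:///path/to/src/main/resources")

  val words = lines.as[String].flatMap(_.split(" "))

  // The console sink plays the role of the output operation here:
  // without a sink, start() would likewise complain that there is
  // nothing to execute.
  val query = words.writeStream
    .format("console")
    .start()

  query.awaitTermination()
}
```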
Comment: I tried running it. It succeeded, but nothing was printed in the console. Thanks @mike for the effort; I will check it.