Apache Spark Streaming application output not forwarded to host


I am trying to run the FlumeEventCount example, as shown below:

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming._
import org.apache.spark.streaming.flume._
import org.apache.spark.util.IntParam
import org.apache.spark.streaming.flume.FlumeUtils

object FlumeEventCount {
  def main(args: Array[String]) {

    val batchInterval = Milliseconds(2000)

    // Create the context and set the batch size
    val sparkConf = new SparkConf().setAppName("FlumeEventCount")
      .set("spark.cleaner.ttl", "3")

    val ssc = new StreamingContext(sparkConf, batchInterval)

    // Create a Flume stream
    val stream = FlumeUtils.createStream(ssc, "192.168.1.5", 3564, StorageLevel.MEMORY_ONLY_SER_2)

    // Print out the count of events received from this server in each batch
    stream.count().map(cnt => "Received " + cnt + " flume events.").print()
    stream.count.print()
    stream.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
My sbt file is as follows:

import AssemblyKeys._

assemblySettings

name := "flume-test"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.0.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-streaming-flume" % "1.0.0" exclude("org.apache.spark","spark-core") exclude("org.apache.spark", "spark-streaming_2.10")

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
I run the program with the following command:

/tmp/spark-1.0.0-bin-hadoop2/bin/spark-submit --class FlumeEventCount --master local --deploy-mode client /tmp/fooproj/target/scala-2.10/cert-log-manager-assembly-1.0.jar 
On the Flume side, the application is sending everything correctly, and I can see in its logs that the data is being received.

I have not changed anything in Spark's configuration and have not set any environment variables; I just downloaded and unpacked the distribution.

Can someone tell me what I am doing wrong?

// edit: when I run Spark's bundled FlumeEventCount example, it works.
// edit2: if I remove awaitTermination and add ssc.stop, it prints everything in one go; I guess this is because something is being flushed.
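For reference, a minimal sketch of what that second edit describes (the 10-second sleep is my own arbitrary choice, not from the original post):

// Sketch: replace awaitTermination() with a fixed run time and an explicit
// stop; at that point all of the buffered output appears at once.
ssc.start()
Thread.sleep(10000) // arbitrary: lets a few 2-second batches run
ssc.stop()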

... I should go and RTFM more carefully now.

Quoting from that page:

// Spark Streaming needs at least two working threads
val ssc = new StreamingContext("local[2]", "NetworkWordCount", Seconds(1))

I was launching Spark with only one worker thread. Also, the following works fine:

stream.map(event => "Event: header: " + event.event.get(0).toString + " body: " + new String(event.event.getBody.array)).print()
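A minimal sketch of applying that fix to the code above: either pass --master "local[2]" (or more threads) to spark-submit instead of --master local, or, when hard-coding local mode, set the master on the SparkConf. The setMaster call below is my addition, not part of the original post:

// Give the local master at least two threads, so the Flume receiver and the
// batch processing do not starve each other.
val sparkConf = new SparkConf()
  .setAppName("FlumeEventCount")
  .setMaster("local[2]") // assumption: local mode; on a cluster, leave the master to spark-submit
  .set("spark.cleaner.ttl", "3")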


I have exactly the same problem with Kafka: everything runs, but there are no results at the end of each batch. Did you find the problem? So far I have been getting by with something like stream.map(event => "event: header: " + event.event.get(0).toString + " body: " + event.event.get(1).asInstanceOf[ByteBuffer].asCharBuffer).print; for more complicated things I have a function that returns whatever I want to print as an RDD, which can then be printed via the stream. I will post an answer when I have something useful.
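A rough sketch of what that comment seems to describe; the helper name and the transform-based wiring below are my own guesses, not the commenter's actual code:

import java.nio.ByteBuffer
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.flume.SparkFlumeEvent

// Hypothetical helper: build the RDD of printable strings for one batch.
def toPrintable(rdd: RDD[SparkFlumeEvent]): RDD[String] =
  rdd.map { e =>
    "event: header: " + e.event.get(0).toString +
      " body: " + e.event.get(1).asInstanceOf[ByteBuffer].asCharBuffer
  }

// Apply it per batch with transform(), then print the resulting DStream.
stream.transform(toPrintable _).print()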