Scala: why doesn't Spark broadcast work when I use `extends App`?

Tags: scala, apache-spark, akka

The first snippet throws a NullPointerException:
import org.apache.spark.{SparkConf, SparkContext}

object TryBroadcast extends App {
  val conf = new SparkConf().setAppName("o_o")
  val sc = new SparkContext(conf)
  val sample = sc.parallelize(1 to 1024)
  val bro = sc.broadcast(6666)
  val broSample = sample.map(x => x.toString + bro.value)
  broSample.collect().foreach(println)
}
The second one works fine:
import org.apache.spark.{SparkConf, SparkContext}

object TryBroadcast {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("o_o")
    val sc = new SparkContext(conf)
    val sample = sc.parallelize(1 to 1024)
    val bro = sc.broadcast(6666)
    val broSample = sample.map(x => x.toString + bro.value)
    broSample.collect().foreach(println)
  }
}
Spark broadcast seems to conflict with scala.App.

Scala version: 2.10.5
Spark version: 1.4.0

Stack trace:
java.lang.NullPointerException
at TryBroadcast$$anonfun$1.apply(TryBroadcast.scala:11)
at TryBroadcast$$anonfun$1.apply(TryBroadcast.scala:11)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$12.apply(RDD.scala:885)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1765)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
`bro` is quite different in the two cases. In the first snippet it is a field on the singleton class instance (`TryBroadcast`). In the second it is a local variable. A local variable is captured, serialized, and sent to the executors. In the first case the reference points to a field, so the singleton itself is captured and sent. I'm not sure exactly how Scala singletons are constructed and how they get captured, but apparently in this case the singleton ends up uninitialized when it is accessed on the executor.
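The likely mechanism, sketched under the assumption of Scala 2.x semantics: `App` mixes in `DelayedInit`, so the `val`s written in the object body are only assigned when `main()` runs. On an executor the singleton class gets initialized, but its `App` body never executes, leaving fields like `bro` null. A minimal, non-Spark demonstration (`AppFields` and `Demo` are illustrative names):

```scala
// Sketch (Scala 2.x): `App` uses DelayedInit, so vals in the object body
// are assigned only when main() runs, not when the object is initialized.
object AppFields extends App {
  val answer: Integer = 42 // assigned inside the delayed-init body
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Touching the singleton initializes the class, but AppFields.main was
    // never called, so the App body has not run and the field is still null.
    println(AppFields.answer) // null under Scala 2.x
  }
}
```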
You can make `bro` a local variable like this:
import org.apache.spark.{SparkConf, SparkContext}

object TryBroadcast extends App {
  val conf = new SparkConf().setAppName("o_o")
  val sc = new SparkContext(conf)
  val sample = sc.parallelize(1 to 1024)
  val broSample = {
    val bro = sc.broadcast(6666)
    sample.map(x => x.toString + bro.value)
  }
  broSample.collect().foreach(println)
}
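This variant works because the lambda now closes over the local `bro`, a small serializable `Broadcast` handle, rather than referencing a field on the `TryBroadcast` singleton, so only the handle travels with the task. A minimal non-Spark sketch of local-variable capture surviving a serialization round trip, the way a Spark task closure does (`CaptureDemo` is an illustrative name):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Sketch (plain JVM serialization, not Spark): a closure over a local
// variable carries the captured value with it across serialization.
object CaptureDemo {
  def main(args: Array[String]): Unit = {
    val local = 6666
    val f: Int => String = x => x.toString + local // closes over `local` only

    // Round-trip the closure through Java serialization, as Spark does
    // with task closures sent to executors.
    val bytes = new ByteArrayOutputStream()
    new ObjectOutputStream(bytes).writeObject(f)
    val g = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
      .readObject().asInstanceOf[Int => String]
    println(g(1)) // the captured value travelled with the closure
  }
}
```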
Although it is not well documented, it is recommended to use

  def main(args: Array[String]): Unit = ???

instead of extends App.
See … and …

Is line 11 `sc.broadcast` or `sample.map`? It works fine when I run it on localhost. How do you run it?

Line 11 is `sample.map`. I assembled a fat jar and submitted it to the cluster.

I occasionally run into the same problem. I hope someone can give a more detailed explanation of how the singleton ends up uninitialized.