Scala: Can't initialize Spark context when testing with sbt


I have written unit test cases for Spark in Scala using the Specs2 framework. In some of the tests I create a SparkContext and pass it into functions:

    val conf = new SparkConf().setAppName("test").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val rdd = sc.parallelize(arr)
    val output = Util.getHistograms(rdd, header, skipCols, nBins)
These tests execute correctly under the Eclipse JUnit plugin, with no errors or failures, but when I run

sbt test

I get a strange exception and the tests return with errors:

[info] Case 8: getHistograms should
[error]   ! return with correct output
[error]    akka.actor.InvalidActorNameException: actor name [ExecutorEndpoint] is not unique! (ChildrenContainer.scala:192)
[error] akka.actor.dungeon.ChildrenContainer$TerminatingChildrenContainer.reserve(ChildrenContainer.scala:192)
[error] akka.actor.dungeon.Children$class.reserveChild(Children.scala:77)
[error] akka.actor.ActorCell.reserveChild(ActorCell.scala:369)
[error] akka.actor.dungeon.Children$class.makeChild(Children.scala:202)
[error] akka.actor.dungeon.Children$class.attachChild(Children.scala:42)
[error] akka.actor.ActorCell.attachChild(ActorCell.scala:369)
[error] akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:552)
[error] org.apache.spark.rpc.akka.AkkaRpcEnv.actorRef$lzycompute$1(AkkaRpcEnv.scala:92)
[error] org.apache.spark.rpc.akka.AkkaRpcEnv.org$apache$spark$rpc$akka$AkkaRpcEnv$$actorRef$1(AkkaRpcEnv.scala:92)
[error] org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$setupEndpoint$1.apply(AkkaRpcEnv.scala:148)
[error] org.apache.spark.rpc.akka.AkkaRpcEnv$$anonfun$setupEndpoint$1.apply(AkkaRpcEnv.scala:148)
[error] org.apache.spark.rpc.akka.AkkaRpcEndpointRef.actorRef$lzycompute(AkkaRpcEnv.scala:281)
[error] org.apache.spark.rpc.akka.AkkaRpcEndpointRef.actorRef(AkkaRpcEnv.scala:281)
[error] org.apache.spark.rpc.akka.AkkaRpcEndpointRef.hashCode(AkkaRpcEnv.scala:329)
[error] org.apache.spark.rpc.akka.AkkaRpcEnv.registerEndpoint(AkkaRpcEnv.scala:73)
[error] org.apache.spark.rpc.akka.AkkaRpcEnv.setupEndpoint(AkkaRpcEnv.scala:149)
[error] org.apache.spark.executor.Executor.<init>(Executor.scala:89)
[error] org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalBackend.scala:57)
[error] org.apache.spark.scheduler.local.LocalBackend.start(LocalBackend.scala:119)
[error] org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
[error] org.apache.spark.SparkContext.<init>(SparkContext.scala:514)
[error] UtilTest$$anonfun$8$$anonfun$apply$29.apply(UtilTest.scala:113)
[error] UtilTest$$anonfun$8$$anonfun$apply$29.apply(UtilTest.scala:111)
My guess is that the SparkContext (sc) is not being created and I am getting a null instead, but I don't understand what is causing this.
Thanks in advance.

This happens because sbt executes all the tests simultaneously, so multiple SparkContexts get created as the spec files run more than once.
To fix it, add a separate object and initialize the SparkContext inside it. Use that sc throughout your test code, so that it is never created more than once.
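The shared-object idea above can be sketched as follows; the object name `TestSparkContext` is illustrative, not from the question, and the `lazy val` ensures the context is built only on first access:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// A single shared SparkContext for all specs, so that running several
// spec files never constructs a second context in the same JVM.
object TestSparkContext {
  lazy val sc: SparkContext = {
    val conf = new SparkConf().setAppName("test").setMaster("local[2]")
    new SparkContext(conf)
  }
}

// In a spec, refer to the shared instance instead of building a new one:
//   val rdd = TestSparkContext.sc.parallelize(arr)
```

Because `lazy val` initialization is thread-safe, even specs running in parallel will all see the same context.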

Actually, the reason is even simpler: you cannot run two Spark contexts at the same time in the same JVM. sbt test executes tests in parallel, which means that if all of your tests spawn a Spark context, the tests will fail.

To prevent this, add the following to your build.sbt:

// super important with multiple tests running spark Contexts
parallelExecution in Test := false

This causes the tests to execute sequentially.

That makes no difference, so is your problem that you are creating multiple Spark contexts within a single test run? If so, consider using the before/after traits and using only one SparkContext across the tests, as you suggested.