Scalatest and Spark giving "java.io.NotSerializableException: org.scalatest.Assertions$AssertionsHelper"


I am testing a Spark Streaming application with the help of "com.holdenkarau.spark-testing-base" and scalatest:

import com.holdenkarau.spark.testing.StreamingSuiteBase
import org.apache.spark.rdd.RDD
import org.scalatest.{ BeforeAndAfter, FunSuite }

class Test extends FunSuite with BeforeAndAfter with StreamingSuiteBase {

  var delim: String = ","

  before {
    System.clearProperty("spark.driver.port")
   }

  test("This Fails") {

    val source = scala.io.Source.fromURL(getClass.getResource("/some_logs.csv"))
    val input = source.getLines.toList

    val rowRDDOut = Calculator.do(sc.parallelize(input))   //Returns DataFrame

    val report: RDD[String] = rowRDDOut.map(row => new String(row.getAs[String](0) + delim + row.getAs[String](1)))

    source.close
  }
}
I am getting a serialization exception for the field "delim":

org.apache.spark.SparkException: Task not serializable
[info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
[info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
[info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
[info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2055)
[info]   at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:324)
[info]   at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:323)
[info]   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
[info]   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
[info]   at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
[info]   at org.apache.spark.rdd.RDD.map(RDD.scala:323)
[info]   ...
[info]   Cause: java.io.NotSerializableException: org.scalatest.Assertions$AssertionsHelper
[info] Serialization stack:
[info]  - object not serializable (class: org.scalatest.Assertions$AssertionsHelper, value: org.scalatest.Assertions$AssertionsHelper@78b339fa)
[info]  - field (class: org.scalatest.FunSuite, name: assertionsHelper, type: class org.scalatest.Assertions$AssertionsHelper)
If I replace delim with a string literal, it works fine:

val report: RDD[String] = rowRDDOut.map(row => new String(row.getAs[String](0) + "," + row.getAs[String](1)))
What is the difference between the first and the second version?


Thanks in advance.

The problem is not the type of delim (a String), but delim itself.

Try not to define variables outside the test() method. If you define delim inside the test, it should work:

test("This Fails") {
   val delim = ","
   ...
}

Now, you may ask why. Well, when you reference delim from the enclosing scope, Scala has to serialize the enclosing object, your Test class, together with the closure. That object holds a reference to org.scalatest.Assertions$AssertionsHelper, which is not serializable (see the stack trace).
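If you do want a suite-level value, another common workaround is to copy it into a local val inside the test before the closure uses it; Spark then only has to serialize that local String, not the whole suite. A minimal sketch of that pattern, reusing the spark-testing-base setup from the question (the class name and sample input here are made up):

import com.holdenkarau.spark.testing.StreamingSuiteBase
import org.apache.spark.rdd.RDD
import org.scalatest.FunSuite

class LocalCopyTest extends FunSuite with StreamingSuiteBase {

  var delim: String = ","

  test("This works") {
    // Copy the field into a local val first: the closure below then captures
    // only this String, not the enclosing (non-serializable) test suite.
    val localDelim = delim

    val input = List("a b", "c d")
    val report: RDD[String] = sc.parallelize(input)
      .map(line => line.split(" ").mkString(localDelim))

    assert(report.collect().toList == List("a,b", "c,d"))
  }
}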

I ran into this problem today, and the error was still there even after I had moved all of my code inside the test (as described in the answer above).

In the end I found that I was using the wrong syntax in my code (which the compiler did not catch). In my case it looked like this:

// Wrong
df.filter(x => x.id === y)

// Right
df.filter(x => x.id == y)
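For what it's worth, the likely reason === triggers the same exception: inside a FunSuite, a === b compiles through ScalaTest's TripleEquals, an implicit conversion defined on the suite instance itself, so the lambda ends up capturing this (the suite) and Spark cannot serialize it. Plain == is ordinary value equality and captures nothing extra. A small sketch of the difference, assuming a made-up Record case class and the same spark-testing-base setup (the failing line is left commented out so the test passes):

import com.holdenkarau.spark.testing.StreamingSuiteBase
import org.scalatest.FunSuite

case class Record(id: Int)

class TripleEqualsCaptureTest extends FunSuite with StreamingSuiteBase {

  test("=== inside a closure drags the suite along") {
    val records = sc.parallelize(Seq(Record(1), Record(2)))
    val y = 1

    // This line compiles only because ScalaTest's TripleEquals (inherited via
    // FunSuite) supplies an implicit conversion on `this`, so the closure
    // captures the suite and fails at runtime with "Task not serializable":
    // records.filter(x => x.id === y).count()

    // Plain equality captures nothing from the suite and runs fine:
    assert(records.filter(x => x.id == y).count() == 1)
  }
}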

Wow, sir! I would never have thought of that! Thanks. I made the same mistake, and it never occurred to me that my test class was being closed over.