Scala: NullPointerException when trying to broadcast inside mapPartitionsWithIndex
Tags: scala, apache-spark, nullpointerexception, broadcast
I am trying to broadcast an array inside the mapPartitionsWithIndex function, but this throws a NullPointerException. Here is my code:
var bestSolutions = bcWrapper(sc, (Array(): Array[BAT1], 3: Int))
rdd.mapPartitionsWithIndex { (index, iterator) =>
  var li = iterator.toArray
  var arr1 = arr.sortWith(…
Definition of bcWrapper:
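case class bcWrapper[T: ClassTag](@transient sc: SparkContext,
                                  @transient _v: T)
  extends Serializable {
  var broadcasted: Broadcast[T] = sc.broadcast(_v)
  def update(v: T): Unit = {
    try {
      broadcasted.destroy()
    } catch {
      case e: Throwable =>
        println("broadcast can not be destroyed", e)
    }
    broadcasted = sc.broadcast(v)
  }
  def value: T = broadcasted.value
}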
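The code throws the exception when the update function of the bcWrapper class is called. How can I solve this?

Comments:

You should not update or destroy a broadcast object from inside map/mapPartitions. Only read it there. What problem are you trying to solve?

I want to select some N elements from each partition, zip them with the partition number, and then broadcast those elements.

Why broadcast the result? Would rdd.mapPartitionsWithIndex { (index, iterator) => val li = iterator.toArray; val arr1 = arr.sortWith(<).take(5); Iterator.single((index, arr1)) }.collect work for you?

collect would cause too much network traffic and hurt performance.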
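
A minimal sketch of what the comments above suggest, under some loud assumptions: the element type is Int rather than BAT1 (whose definition is not shown), "best" means smallest, N is 3, and Spark runs in local mode. The names TopNPerPartition, topN, best and bestBc are made up for illustration. The point it tries to show is that the closure passed to mapPartitionsWithIndex only computes and reads, while sc.broadcast and destroy are called on the driver. Because sc is marked @transient in bcWrapper, it is not serialized with the wrapper and is null inside a task, which is the likely source of the NullPointerException in update.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast

object TopNPerPartition {
  // Pure helper: keeps the n smallest elements of one partition,
  // tagged with the partition index. Int stands in for BAT1 (assumption).
  def topN(index: Int, it: Iterator[Int], n: Int): Iterator[(Int, Array[Int])] =
    Iterator.single((index, it.toArray.sorted.take(n)))

  def main(args: Array[String]): Unit = {
    // Assumption: local mode and a toy RDD, purely for illustration.
    val conf = new SparkConf().setAppName("top-n-per-partition").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(9, 4, 7, 1, 8, 2, 6, 3), numSlices = 2)
      val n = 3

      // Executors only compute here; the closure touches neither
      // the SparkContext nor any mutable broadcast wrapper.
      val best: Array[(Int, Array[Int])] =
        rdd.mapPartitionsWithIndex((index, it) => topN(index, it, n)).collect()

      // Creating the broadcast happens on the driver.
      val bestBc: Broadcast[Array[(Int, Array[Int])]] = sc.broadcast(best)

      // Inside tasks the broadcast is read-only: count, per partition,
      // the elements larger than the overall smallest collected value.
      val counts = rdd.mapPartitionsWithIndex { (index, it) =>
        val globalBest = bestBc.value.flatMap(_._2).min
        Iterator.single((index, it.count(_ > globalBest)))
      }.collect()

      counts.foreach { case (index, c) => println(s"partition $index: $c") }

      // Destroying the broadcast is also a driver-side call.
      bestBc.destroy()
    } finally {
      sc.stop()
    }
  }
}
```

On the network-traffic concern: in this sketch collect returns at most N elements per partition, so the driver receives roughly N times the number of partitions, not the whole RDD. A broadcast value has to exist on the driver before sc.broadcast can ship it anyway, so some driver round trip seems unavoidable with this API.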