Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError: Java heap space

Tags: java, apache-spark, google-cloud-platform, heap-memory

I am doing image processing with Spark (2.0.2) on Google Cloud Platform. When I run my code (Java), I get the following error:

[Stage 1:> (0 + 0) / 2]17/10/15 13:39:44 WARN org.apache.spark.scheduler.TaskSetManager: Stage 1 contains a task of very large size (165836 KB). The maximum recommended task size is 100 KB.

[Stage 1:> (0 + 1) / 2]Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError: Java heap space

at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1853)
at java.io.ObjectOutputStream.write(ObjectOutputStream.java:709)
at java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:458)
at org.apache.spark.util.SerializableBuffer$$anonfun$writeObject$1.apply(SerializableBuffer.scala:49)
at org.apache.spark.util.SerializableBuffer$$anonfun$writeObject$1.apply(SerializableBuffer.scala:47)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1276)
at org.apache.spark.util.SerializableBuffer.writeObject(SerializableBuffer.scala:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$launchTasks$1.apply(CoarseGrainedSchedulerBackend.scala:250)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$launchTasks$1.apply(CoarseGrainedSchedulerBackend.scala:249)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.launchTasks(CoarseGrainedSchedulerBackend.scala:249)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.org$apache$spark$scheduler$cluster$CoarseGrainedSchedulerBackend$DriverEndpoint$$makeOffers(CoarseGrainedSchedulerBackend.scala:220)

Where and how can I increase the Java heap space?

My program:

public static void main(String[] args) {
  try{

          // Spark configuration ...
          SparkSession spark = SparkSession
                .builder()
                .appName("Features")
                .getOrCreate();

          JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

          // HBase configuration ...
          String tableName = "Descripteurs";
          Configuration conf = HBaseConfiguration.create();
          conf.addResource(new Path("/home/ibtissam/hbase-1.2.5/conf/hbase-site.xml"));
          conf.addResource(new Path("/home/ibtissam/hbase-1.2.5/conf/core-site.xml"));
          conf.set(TableInputFormat.INPUT_TABLE, tableName);

          Connection connection = ConnectionFactory.createConnection(conf);
          Admin admin = connection.getAdmin(); 
          Table tab = connection.getTable(TableName.valueOf(tableName));

          for (int n=0; n<10; n++) {
              List<String> images =new ArrayList<>();
              String repertory_ = "/home/ibtissam/images-test-10000/images-"+n+"/"; 
              File repertory = new File(repertory_);
              String files[] = repertory.list(); 

              for(int k=0; k<10;k++){
                  ExecutorService executorService = Executors.newCachedThreadPool();
                  List<MyRunnable> runnableList = new ArrayList<>();

                  for(int i=k*100; i<(k+1)*100 ; i++){
                        MyRunnable runnable = new MyRunnable(repertory_+files[i]); 
                        runnableList.add(runnable);
                        executorService.execute(runnable);
                  }
                  executorService.shutdown();

                  while(!executorService.isTerminated()){}

                  for (int i=0; i<runnableList.size(); i++) {
                      images.add(runnableList.get(i).descripteurs_);
                  }
              }

          // Note: parallelize() serializes the contents of 'images' (the decoded pixel
          // strings built on the driver) into the tasks themselves, which is why the
          // tasks become very large.
          JavaRDD<String> rdd = jsc.parallelize(images, 2000);

          // Compute the descriptors
          JavaPairRDD<String,String> rdd_final = rdd.mapToPair(new PairFunction<String,String,String>() {
                @Override
                public Tuple2<String,String> call(String value) {

                  String strTab[] = value.split(","); 
                  int h = Integer.parseInt(strTab[1]);
                  int w = Integer.parseInt(strTab[2]);
                  String type = strTab[3]; 
                  String nom = strTab[0];
                  String key = nom+"-"+h+"-"+w+"-"+type;

                  // Convert String >> Mat
                  Mat image = new Mat(h, w, 16);
                  UByteRawIndexer idx = image.createIndexer();
                  int indice = 4;
                  for (int i =0;i<h;i++) {
                    for (int j=0;j<w;j++) {
                      idx.put(i, j, Integer.parseInt(strTab[indice]));
                      indice++;
                    }
                  }

                  // Compute the features
                  SIFT sift = new SIFT().create(); 
                  KeyPointVector keypoints = new KeyPointVector();
                  Mat descriptors = new Mat();

                  image.convertTo(image, CV_8UC3);

                  sift.detect(image, keypoints);

                  KeyPointVector keypoints_sorted = new KeyPointVector(); 
                  keypoints_sorted = sort(keypoints);
                  KeyPointVector  keypoints_2 = new KeyPointVector((keypoints_sorted.size())/4); 
                  for (int k = 0; k < (keypoints_sorted.size())/4; k++){
                      keypoints_2.put(k, keypoints_sorted.get(k));  
                  }

                  sift.compute(image,keypoints_2,descriptors);
                  image.release(); 

                  int hDes = descriptors.size().height();
                  int wDes = descriptors.size().width();
                  key = key +"-"+hDes+"-"+wDes+"-"+descriptors.type();

                  while(hDes ==0 | wDes==0){
                      SIFT sift_ = new SIFT().create(); 
                      KeyPointVector keypoints_ = new KeyPointVector();

                      sift.detect(image, keypoints_);

                      KeyPointVector keypoints_sorted_ = new KeyPointVector(); 
                      keypoints_sorted_ = sort(keypoints_);
                      KeyPointVector  keypoints_2_ = new KeyPointVector((keypoints_sorted_.size())/4); 
                      for (int k = 0; k < (keypoints_sorted_.size())/4; k++){
                          keypoints_2_.put(k, keypoints_sorted_.get(k));  
                      }

                      sift_.compute(image,keypoints_2_,descriptors);
                  }

                  // Convert the features => String
                  String featuresStr = new String("");
                  FloatRawIndexer idx_ = descriptors.createIndexer(); 
                  int position =0;

                  for (int i =0;i < descriptors.size().height();i++) {
                    for (int j =0;j < descriptors.size().width();j++) {

                      if (position == 0) {
                          featuresStr = String.valueOf(idx_.get(position))+",";
                      }
                      if (position == ((descriptors.size().height()*descriptors.size().width())-1) ){
                          featuresStr = featuresStr + String.valueOf(idx_.get(position));                  
                      }else{
                          featuresStr = featuresStr + String.valueOf(idx_.get(position))+","; 
                      }
                      position++;
                    }
                  }
                  descriptors.release(); 
                  Tuple2<String, String> tuple = new Tuple2<>(key, featuresStr);
                  return tuple;
                } 
              });

              System.out.println("Fin de calcul des descripteurs  .... ");

              List<Tuple2<String,String>> liste = rdd_final.collect();

              System.out.println("Insertion dans hbase .... \n");
              for (int b=0; b<liste.size(); b++) {

                    String metadata[] = liste.get(b)._1().split("-"); 
                    String data = liste.get(b)._2();
                    // Row 
                    byte [] row = Bytes.toBytes(liste.get(b)._1());

                    // Family
                    byte [] family1 = Bytes.toBytes("Metadata");
                    byte [] family2 = Bytes.toBytes("Data");

                    // Qualifiers
                    byte [] height = Bytes.toBytes("height");
                    byte [] width = Bytes.toBytes("width");
                    byte [] colorSpace = Bytes.toBytes("colorSpace");
                    byte [] name = Bytes.toBytes("name");

                    byte [] features = Bytes.toBytes("features");

                    // Create Put
                    Put put = new Put(row);
                    put.addColumn(family1, height, Bytes.toBytes(metadata[5]));
                    put.addColumn(family1, width, Bytes.toBytes(metadata[6]));
                    put.addColumn(family1, name, Bytes.toBytes(metadata[0]+"-"+metadata[1]+"-"+metadata[2]+"-"+metadata[3]));
                    put.addColumn(family1, colorSpace, Bytes.toBytes(metadata[4]));
                    put.addColumn(family2, features, Bytes.toBytes(liste.get(b)._2()));
                    tab.put(put);
              }
            }
            jsc.close();

      }catch(Exception e){

        System.out.println(e);
      }
    }

Try increasing the driver heap with "--driver-memory XXXXm".
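
For example, with spark-submit the driver heap is set on the command line, and on Google Cloud Dataproc the same Spark properties can be passed through gcloud dataproc jobs submit spark. The 4g values, class name, jar name, and cluster name below are placeholders, not taken from the question:

    # spark-submit: raise the driver (and, if needed, the executor) heap.
    # com.example.Features and features.jar are placeholder names.
    spark-submit \
      --class com.example.Features \
      --driver-memory 4g \
      --executor-memory 4g \
      features.jar

    # Google Cloud Dataproc: pass the same settings as Spark properties.
    # my-cluster is a placeholder cluster name.
    gcloud dataproc jobs submit spark \
      --cluster=my-cluster \
      --class=com.example.Features \
      --jars=features.jar \
      --properties=spark.driver.memory=4g,spark.executor.memory=4g

Note that in client mode spark.driver.memory must be set before the driver JVM starts, so setting it from inside the program (e.g. on the SparkSession builder) has no effect; it has to go on the submit command or in spark-defaults.conf.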

You really need to focus on reducing the task size, not on increasing the JVM size. Why does your task need ~160 MB of heap space? Please add your program to your question. It looks like broadcast might help here, since a lookup table (or something similar) is used inside.

@Progman I have added my program to the question.

@JoeC I have reduced my task size to 370 KB and I can't get it any lower. Can you tell me how to increase the heap size on Google Cloud?
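
To illustrate the "reduce the task size" and broadcast suggestions from the comments above, here is a minimal, hypothetical sketch, not the asker's actual pipeline: parallelize only the image file paths, load and process each image inside the map function on the executors, and put any shared read-only lookup table into a broadcast variable. It assumes the jsc JavaSparkContext and the directory layout from the question; lookup, bLookup, pathRdd and the placeholder result are illustrative names.

    // Fragment intended to sit inside main(), like the code in the question.
    // Extra imports needed: java.nio.file.Files, java.nio.file.Paths,
    // org.apache.spark.broadcast.Broadcast

    // Parallelize lightweight file paths instead of the decoded pixel strings,
    // so each task only carries a few path strings.
    List<String> paths = new ArrayList<>();
    for (File f : new File("/home/ibtissam/images-test-10000/images-0/").listFiles()) {
        paths.add(f.getAbsolutePath());
    }
    JavaRDD<String> pathRdd = jsc.parallelize(paths, 200);

    // Shared read-only data (e.g. a lookup table) goes into a broadcast variable,
    // so it is shipped to each executor once rather than once per task.
    Map<String, Integer> lookup = new HashMap<>();                  // hypothetical lookup table
    Broadcast<Map<String, Integer>> bLookup = jsc.broadcast(lookup);

    JavaPairRDD<String, String> features = pathRdd.mapToPair(path -> {
        byte[] bytes = Files.readAllBytes(Paths.get(path));         // read the image on the executor
        // ... decode 'bytes' into a Mat, run SIFT, consult bLookup.value() if needed ...
        String key = path + "-" + bytes.length;
        String descriptors = "";                                    // placeholder for the real result
        return new Tuple2<>(key, descriptors);
    });

With this shape, the image reading and the SIFT computation happen on the executors, so the driver no longer has to serialize ~160 MB of pixel data into every task, which is exactly what the TaskSetManager warning is complaining about.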