Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError: Java heap space

Tags: java, apache-spark, google-cloud-platform, heap-memory

I am doing image processing with Spark (2.0.2) on Google Cloud Platform. When I run my code (Java), I get the following error:

[Stage 1:> (0 + 0) / 2]17/10/15 13:39:44 WARN org.apache.spark.scheduler.TaskSetManager: Stage 1 contains a task of very large size (165836 KB). The maximum recommended task size is 100 KB.

[Stage 1:> (0 + 1) / 2]Exception in thread "dispatcher-event-loop-1" java.lang.OutOfMemoryError: Java heap space

at java.util.Arrays.copyOf(Arrays.java:3236)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
at org.apache.spark.util.ByteBufferOutputStream.write(ByteBufferOutputStream.scala:41)
at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1853)
at java.io.ObjectOutputStream.write(ObjectOutputStream.java:709)
at java.nio.channels.Channels$WritableByteChannelImpl.write(Channels.java:458)
at org.apache.spark.util.SerializableBuffer$$anonfun$writeObject$1.apply(SerializableBuffer.scala:49)
at org.apache.spark.util.SerializableBuffer$$anonfun$writeObject$1.apply(SerializableBuffer.scala:47)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1276)
at org.apache.spark.util.SerializableBuffer.writeObject(SerializableBuffer.scala:47)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$launchTasks$1.apply(CoarseGrainedSchedulerBackend.scala:250)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$launchTasks$1.apply(CoarseGrainedSchedulerBackend.scala:249)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.launchTasks(CoarseGrainedSchedulerBackend.scala:249)
at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.org$apache$spark$scheduler$cluster$CoarseGrainedSchedulerBackend$DriverEndpoint$$makeOffers(CoarseGrainedSchedulerBackend.scala:220)

Where and how can I increase the Java heap space?

My program:

public static void main(String[] args) {
  try{

          // Spark configuration ...
          SparkSession spark = SparkSession
                .builder()
                .appName("Features")
                .getOrCreate();

          JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

          // HBase configuration ...
          String tableName = "Descripteurs";
          Configuration conf = HBaseConfiguration.create();
          conf.addResource(new Path("/home/ibtissam/hbase-1.2.5/conf/hbase-site.xml"));
          conf.addResource(new Path("/home/ibtissam/hbase-1.2.5/conf/core-site.xml"));
          conf.set(TableInputFormat.INPUT_TABLE, tableName);

          Connection connection = ConnectionFactory.createConnection(conf);
          Admin admin = connection.getAdmin(); 
          Table tab = connection.getTable(TableName.valueOf(tableName));

          for (int n=0; n<10; n++) {
              List<String> images =new ArrayList<>();
              String repertory_ = "/home/ibtissam/images-test-10000/images-"+n+"/"; 
              File repertory = new File(repertory_);
              String files[] = repertory.list(); 

              for(int k=0; k<10;k++){
                  ExecutorService executorService = Executors.newCachedThreadPool();
                  List<MyRunnable> runnableList = new ArrayList<>();

                  for(int i=k*100; i<(k+1)*100 ; i++){
                        MyRunnable runnable = new MyRunnable(repertory_+files[i]); 
                        runnableList.add(runnable);
                        executorService.execute(runnable);
                  }
                  executorService.shutdown();

                  while(!executorService.isTerminated()){}

                  for (int i=0; i<runnableList.size(); i++) {
                      images.add(runnableList.get(i).descripteurs_);
                  }
              }

          // Note: parallelize() serializes the contents of 'images' (the decoded pixel
          // strings built on the driver) into the tasks themselves, which is why the
          // tasks become very large.
          JavaRDD<String> rdd = jsc.parallelize(images, 2000);

          // Compute the descriptors
          JavaPairRDD<String,String> rdd_final = rdd.mapToPair(new PairFunction<String,String,String>() {
                @Override
                public Tuple2<String,String> call(String value) {

                  String strTab[] = value.split(","); 
                  int h = Integer.parseInt(strTab[1]);
                  int w = Integer.parseInt(strTab[2]);
                  String type = strTab[3]; 
                  String nom = strTab[0];
                  String key = nom+"-"+h+"-"+w+"-"+type;

                  // Convert String >> Mat
                  Mat image = new Mat(h, w, 16);
                  UByteRawIndexer idx = image.createIndexer();
                  int indice = 4;
                  for (int i =0;i<h;i++) {
                    for (int j=0;j<w;j++) {
                      idx.put(i, j, Integer.parseInt(strTab[indice]));
                      indice++;
                    }
                  }

                  // Compute the features
                  SIFT sift = new SIFT().create(); 
                  KeyPointVector keypoints = new KeyPointVector();
                  Mat descriptors = new Mat();

                  image.convertTo(image, CV_8UC3);

                  sift.detect(image, keypoints);

                  KeyPointVector keypoints_sorted = new KeyPointVector(); 
                  keypoints_sorted = sort(keypoints);
                  KeyPointVector  keypoints_2 = new KeyPointVector((keypoints_sorted.size())/4); 
                  for (int k = 0; k < (keypoints_sorted.size())/4; k++){
                      keypoints_2.put(k, keypoints_sorted.get(k));  
                  }

                  sift.compute(image,keypoints_2,descriptors);
                  image.release(); 

                  int hDes = descriptors.size().height();
                  int wDes = descriptors.size().width();
                  key = key +"-"+hDes+"-"+wDes+"-"+descriptors.type();

                  while(hDes ==0 | wDes==0){
                      SIFT sift_ = new SIFT().create(); 
                      KeyPointVector keypoints_ = new KeyPointVector();

                      sift.detect(image, keypoints_);

                      KeyPointVector keypoints_sorted_ = new KeyPointVector(); 
                      keypoints_sorted_ = sort(keypoints_);
                      KeyPointVector  keypoints_2_ = new KeyPointVector((keypoints_sorted_.size())/4); 
                      for (int k = 0; k < (keypoints_sorted_.size())/4; k++){
                          keypoints_2_.put(k, keypoints_sorted_.get(k));  
                      }

                      sift_.compute(image,keypoints_2_,descriptors);
                  }

                  // Convert the features => String
                  String featuresStr = new String("");
                  FloatRawIndexer idx_ = descriptors.createIndexer(); 
                  int position =0;

                  for (int i =0;i < descriptors.size().height();i++) {
                    for (int j =0;j < descriptors.size().width();j++) {

                      if (position == 0) {
                          featuresStr = String.valueOf(idx_.get(position))+",";
                      }
                      if (position == ((descriptors.size().height()*descriptors.size().width())-1) ){
                          featuresStr = featuresStr + String.valueOf(idx_.get(position));                  
                      }else{
                          featuresStr = featuresStr + String.valueOf(idx_.get(position))+","; 
                      }
                      position++;
                    }
                  }
                  descriptors.release(); 
                  Tuple2<String, String> tuple = new Tuple2<>(key, featuresStr);
                  return tuple;
                } 
              });

              System.out.println("Fin de calcul des descripteurs  .... ");

              List<Tuple2<String,String>> liste = rdd_final.collect();

              System.out.println("Insertion dans hbase .... \n");
              for (int b=0; b<liste.size(); b++) {

                    String metadata[] = liste.get(b)._1().split("-"); 
                    String data = liste.get(b)._2();
                    // Row 
                    byte [] row = Bytes.toBytes(liste.get(b)._1());

                    // Family
                    byte [] family1 = Bytes.toBytes("Metadata");
                    byte [] family2 = Bytes.toBytes("Data");

                    // Qualifiers
                    byte [] height = Bytes.toBytes("height");
                    byte [] width = Bytes.toBytes("width");
                    byte [] colorSpace = Bytes.toBytes("colorSpace");
                    byte [] name = Bytes.toBytes("name");

                    byte [] features = Bytes.toBytes("features");

                    // Create Put
                    Put put = new Put(row);
                    put.addColumn(family1, height, Bytes.toBytes(metadata[5]));
                    put.addColumn(family1, width, Bytes.toBytes(metadata[6]));
                    put.addColumn(family1, name, Bytes.toBytes(metadata[0]+"-"+metadata[1]+"-"+metadata[2]+"-"+metadata[3]));
                    put.addColumn(family1, colorSpace, Bytes.toBytes(metadata[4]));
                    put.addColumn(family2, features, Bytes.toBytes(liste.get(b)._2()));
                    tab.put(put);
              }
            }
            jsc.close();

      }catch(Exception e){

        System.out.println(e);
      }
    }

Try increasing the driver heap with "--driver-memory XXXXm".
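
For example, with spark-submit the driver heap is set on the command line, and on Google Cloud Dataproc the same Spark properties can be passed through gcloud dataproc jobs submit spark. The 4g values, class name, jar name, and cluster name below are placeholders, not taken from the question:

    # spark-submit: raise the driver (and, if needed, the executor) heap.
    # com.example.Features and features.jar are placeholder names.
    spark-submit \
      --class com.example.Features \
      --driver-memory 4g \
      --executor-memory 4g \
      features.jar

    # Google Cloud Dataproc: pass the same settings as Spark properties.
    # my-cluster is a placeholder cluster name.
    gcloud dataproc jobs submit spark \
      --cluster=my-cluster \
      --class=com.example.Features \
      --jars=features.jar \
      --properties=spark.driver.memory=4g,spark.executor.memory=4g

Note that in client mode spark.driver.memory must be set before the driver JVM starts, so setting it from inside the program (e.g. on the SparkSession builder) has no effect; it has to go on the submit command or in spark-defaults.conf.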

You really need to focus on reducing the task size, not on increasing the JVM size. Why does your task need ~160 MB of heap space? Please add your program to your question. It looks like broadcast might help here, since a lookup table (or something similar) is used inside.

@Progman I have added my program to the question.

@JoeC I have reduced my task size to 370 KB and I can't get it any lower. Can you tell me how to increase the heap size on Google Cloud?
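
To illustrate the "reduce the task size" and broadcast suggestions from the comments above, here is a minimal, hypothetical sketch, not the asker's actual pipeline: parallelize only the image file paths, load and process each image inside the map function on the executors, and put any shared read-only lookup table into a broadcast variable. It assumes the jsc JavaSparkContext and the directory layout from the question; lookup, bLookup, pathRdd and the placeholder result are illustrative names.

    // Fragment intended to sit inside main(), like the code in the question.
    // Extra imports needed: java.nio.file.Files, java.nio.file.Paths,
    // org.apache.spark.broadcast.Broadcast

    // Parallelize lightweight file paths instead of the decoded pixel strings,
    // so each task only carries a few path strings.
    List<String> paths = new ArrayList<>();
    for (File f : new File("/home/ibtissam/images-test-10000/images-0/").listFiles()) {
        paths.add(f.getAbsolutePath());
    }
    JavaRDD<String> pathRdd = jsc.parallelize(paths, 200);

    // Shared read-only data (e.g. a lookup table) goes into a broadcast variable,
    // so it is shipped to each executor once rather than once per task.
    Map<String, Integer> lookup = new HashMap<>();                  // hypothetical lookup table
    Broadcast<Map<String, Integer>> bLookup = jsc.broadcast(lookup);

    JavaPairRDD<String, String> features = pathRdd.mapToPair(path -> {
        byte[] bytes = Files.readAllBytes(Paths.get(path));         // read the image on the executor
        // ... decode 'bytes' into a Mat, run SIFT, consult bLookup.value() if needed ...
        String key = path + "-" + bytes.length;
        String descriptors = "";                                    // placeholder for the real result
        return new Tuple2<>(key, descriptors);
    });

With this shape, the image reading and the SIFT computation happen on the executors, so the driver no longer has to serialize ~160 MB of pixel data into every task, which is exactly what the TaskSetManager warning is complaining about.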