Apache Spark: No space left on device on an EMR Spark cluster with 5000 partitions


When we run our Spark job, which processes a 2 GB file and a 10 GB file, we get the following error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 2003 in stage 32.0 failed 1 times, most recent failure: Lost task 2003.0 in stage 32.0 (TID 33046, localhost, executor driver): java.io.FileNotFoundException: /mnt/tmp/blockmgr-13430cd7-0455-4sfgs-a98f-7f96e0252471/13/temp_shuffle_e680c565-f17a-47cb-9ef9-29cdcf14e50f (No space left on device)
We do this by repartitioning both dataframes to 5000 partitions (we also tried 100 and 1000, but ran into the same problem).
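For reference, a minimal sketch of what that repartition step looks like in Java (the input paths and dataframe names below are placeholders, since the question does not show the job code):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical sketch only: input paths and variable names are assumed, not from the original job.
SparkSession spark = SparkSession.builder().appName("repartition-sketch").getOrCreate();
Dataset<Row> small = spark.read().parquet("s3://bucket/input-2gb/");   // ~2 GB input
Dataset<Row> large = spark.read().parquet("s3://bucket/input-10gb/");  // ~10 GB input

// Repartition both dataframes to 5000 partitions (100 and 1000 were also tried, with the same error).
Dataset<Row> smallRepart = small.repartition(5000);
Dataset<Row> largeRepart = large.repartition(5000);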

Below are some of the Spark settings we are using:

INFO MemoryStore: MemoryStore started with capacity 38.1 GB

spark.executor.memory = 12G
spark.driver.memory = 64G
spark.executor.cores = 4
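These could be passed via spark-submit --conf or set programmatically; below is a minimal sketch assuming the latter. Note that spark.driver.memory and spark.executor.memory normally have to be set before the driver JVM starts (e.g. at submit time or in the EMR step config), so this only illustrates the values:

import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

// Hypothetical sketch mirroring the settings listed above.
SparkConf conf = new SparkConf()
        .set("spark.executor.memory", "12G")
        .set("spark.driver.memory", "64G")   // usually passed at submit time; shown here for completeness
        .set("spark.executor.cores", "4");
SparkSession spark = SparkSession.builder().config(conf).getOrCreate();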
When we create the instances, we attach a 100 GB EBS volume (it is a 5-node cluster, created dynamically from the Java AWS SDK).

Not sure why we are running out of space while processing files that add up to no more than 20 GB.


Thanks

I was setting up the root volume.

I had to change the way I was creating the cluster. Now I assign an EBS volume to every instance in the cluster:

private InstanceFleetConfig getInstanceFleetConfig(String instanceType, InstanceFleetType instanceFleetType, int onDemandCapacity) {
        InstanceFleetConfig masterInstanceConfig = new InstanceFleetConfig();
        masterInstanceConfig.setInstanceFleetType(instanceFleetType);
        masterInstanceConfig.setTargetOnDemandCapacity(onDemandCapacity);

        InstanceTypeConfig instanceTypeConfig = new InstanceTypeConfig();
        instanceTypeConfig.setInstanceType(instanceType);

        EbsConfiguration ebsConfiguration = new EbsConfiguration();
        EbsBlockDeviceConfig ebsBDConfig = new EbsBlockDeviceConfig();
        // Setting the Volume Size to 100GB
        ebsBDConfig.setVolumeSpecification(new VolumeSpecification().withSizeInGB(100).withVolumeType(VolumeType.Gp2.toString()));

        List<EbsBlockDeviceConfig> ebsBlockDeviceConfigs = new ArrayList<EbsBlockDeviceConfig>();
        ebsBlockDeviceConfigs.add(ebsBDConfig);

        ebsConfiguration.setEbsBlockDeviceConfigs(ebsBlockDeviceConfigs);

        instanceTypeConfig.setEbsConfiguration(ebsConfiguration);

        List<InstanceTypeConfig> instanceTypeConfigs = new ArrayList<InstanceTypeConfig>();
        instanceTypeConfigs.add(instanceTypeConfig);

        masterInstanceConfig.setInstanceTypeConfigs(instanceTypeConfigs);

        return masterInstanceConfig;
    }
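
For context, this is roughly how the helper above gets wired into the cluster request. A sketch assuming the AWS SDK for Java v1 EMR client; the instance type, names, release label, and IAM roles are placeholders rather than values from the original code:

import java.util.Arrays;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
import com.amazonaws.services.elasticmapreduce.model.*;

// Hypothetical wiring: one master fleet plus a core fleet of 4 nodes (a 5-node cluster,
// as mentioned in the question), each instance getting its own 100 GB gp2 EBS volume.
JobFlowInstancesConfig instances = new JobFlowInstancesConfig()
        .withInstanceFleets(Arrays.asList(
                getInstanceFleetConfig("m5.2xlarge", InstanceFleetType.MASTER, 1),
                getInstanceFleetConfig("m5.2xlarge", InstanceFleetType.CORE, 4)))
        .withKeepJobFlowAliveWhenNoSteps(true);

RunJobFlowRequest request = new RunJobFlowRequest()
        .withName("spark-cluster")                        // placeholder name
        .withReleaseLabel("emr-5.29.0")                   // placeholder release label
        .withApplications(new Application().withName("Spark"))
        .withServiceRole("EMR_DefaultRole")               // default EMR roles, adjust as needed
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withInstances(instances);

AmazonElasticMapReduce emr = AmazonElasticMapReduceClientBuilder.defaultClient();
emr.runJobFlow(request);

The key change from the original setup is that the EBS volume specification lives inside each InstanceTypeConfig, so every node in the fleet gets the extra disk rather than only the root device.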