Amazon S3: Gzip file gets corrupted when uploaded to an S3 bucket from a Java application

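I am using a Java application to upload a gzip file to an S3 bucket, where the data will be queried with Athena. The gzip file gets corrupted during the upload. As a result, Athena cannot read the data in the gzip file, and when I download the file and try to decompress it manually, the tool reports "it is not a gzip file".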

private void getAndProcessFilesGenReports(String parUrl, String custCode, long size, String queryDate) {
    try (CloseableHttpClient httpclient = HttpClientBuilder.create().setDefaultCredentialsProvider(getCredentialsProvider()).build()) {
      HttpGet httpget = new HttpGet(BASE_URI.concat(parUrl));
      // Apply the request config before executing; setting it afterwards has no effect on this request.
      httpget.setConfig(config);
      CloseableHttpResponse response = httpclient.execute(httpget);

      log.info("getAndProcessFilesGenReports response status {} {}",
          response.getStatusLine().getStatusCode(), response.getStatusLine().getReasonPhrase());

      if (response.getStatusLine().getStatusCode() != 200) {
        log.error("getAndProcessFilesGenReports partUrl could not get response for custCode---> {}", custCode);
      }

      if (response.getStatusLine().getStatusCode() == 200) {
        GZIPInputStream gzis = new GZIPInputStream(response.getEntity().getContent());
        String bucketName = bucketForDetailedBilling(GEN_REPORT_TYPE, custCode, queryDate);
        uploadGzipFileToS3(gzis, size, bucketName);
      }
    } catch (Exception e) {
      log.error("error in getAndProcessFilesGenReports()--->", e);
    }
}

private void uploadGzipFileToS3(InputStream gzis, long size, String bucketName) {
    log.info("uploadGzipFileToS3 size{} --- bucketName {}--->", size, bucketName);
    ClientConfiguration clientConfiguration = new ClientConfiguration();
    clientConfiguration.setConnectionMaxIdleMillis(600000);
    clientConfiguration.setConnectionTimeout(600000);
    clientConfiguration.setClientExecutionTimeout(600000);
    clientConfiguration.setUseGzip(true);
    clientConfiguration.setConnectionTTL(1000 * 60 * 60);
    AmazonS3Client amazonS3Client = new AmazonS3Client(clientConfiguration);
    TransferManager transferManager = new TransferManager(amazonS3Client);
    try {
      ObjectMetadata objectMetadata = new ObjectMetadata();
      objectMetadata.setContentLength(size);

      transferManager.getConfiguration().setMultipartUploadThreshold(1024 * 5);

      PutObjectRequest request = new PutObjectRequest(bucketName, DBR_NAME + DBR_EXT, gzis, objectMetadata);
      request.getRequestClientOptions().setReadLimit(1024 * 5 + 1);
      request.setSdkClientExecutionTimeout(10000 * 60 * 60);

      Upload upload = transferManager.upload(request);

      upload.waitForCompletion();
    }

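Did you compare the original file with the uploaded one? Do both have the same number of bytes?

When uploading a zip file, why do you use new GZIPInputStream(response.getEntity().getContent())? The Apache HTTP client, for example, decompresses the stream by default.

Where does the "long size" come from? When you use GZIPInputStream, the upload content length may not equal the input stream length.

public void processQueryResponseGenReports(QueryResponse QRResponse, String custCode, String queryDate) {
    List reportList = QRResponse.getReport();
    reportList.stream().filter(x -> !isNull(x)).forEach(reportData -> {
      if (reportData.getUrl() != null) {
        getAndProcessFilesGenReports(reportData.getUrl(), custCode, reportData.getSize(), queryDate);
      } else {
        log.info("processQueryResponseGenReports reportData does not contain url-->");
      }
    });
}

This is where I get the long size.

You are right that the lengths could differ, but in fact the length was the same. The only problem was the contentType: I set objectMetadata.setContentType("application/x-gzip"); and now it is resolved.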
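
For anyone who wants to check the concern raised in the comments, here is a minimal sketch using only the Java standard library. It does not touch S3 or the HTTP client; the class name GzipStreamCheck and its helper methods are hypothetical and not part of the original code. It shows two things: a quick test for the gzip magic bytes (0x1f 0x8b), which is what a tool looks for before reporting that a file is not a gzip file, and the fact that reading through GZIPInputStream yields the decompressed bytes, whose length generally differs from the compressed size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
final class GzipStreamCheck {

    // A gzip stream starts with the two magic bytes 0x1f 0x8b.
    static boolean looksLikeGzip(byte[] data) {
        return data.length >= 2
            && (data[0] & 0xff) == 0x1f
            && (data[1] & 0xff) == 0x8b;
    }

    // Compress a byte array in memory so the demo needs no external file.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    // Drain a stream completely into a byte array.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Build some repetitive sample data (stands in for a report file).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("custCode,usage,amount\n");
        }
        byte[] plain = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(plain);

        // Passing the raw stream through keeps the gzip bytes intact.
        byte[] passedThrough = readAll(new ByteArrayInputStream(compressed));

        // Wrapping in GZIPInputStream hands the caller DEcompressed bytes.
        byte[] unwrapped = readAll(new GZIPInputStream(new ByteArrayInputStream(compressed)));

        System.out.println("compressed size:    " + compressed.length);
        System.out.println("passed-through:     " + passedThrough.length
            + ", gzip magic: " + looksLikeGzip(passedThrough));
        System.out.println("via GZIPInputStream: " + unwrapped.length
            + ", gzip magic: " + looksLikeGzip(unwrapped));
    }
}
```

Running looksLikeGzip on the first bytes of the object downloaded from S3 is a cheap way to tell whether the stored bytes are still gzip data. If they are not, one thing to check is whether the stream handed to the upload was already decompressed, either by a GZIPInputStream wrapper or by the HTTP client itself. In the thread above, the asker reports that the size matched and that setting the content type was what resolved the problem, so treat this only as a diagnostic aid.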