
Amazon S3: Handling streaming a TarArchiveEntry from a .tar.gz file to an S3 bucket


I am using an AWS Lambda function to decompress and iterate over a tar.gz file, then upload the entries back to S3, preserving the original directory structure.

I'm running into an issue streaming a TarArchiveEntry to an S3 bucket via a PutObjectRequest. The first entry streams successfully, but calling getNextTarEntry() on the TarArchiveInputStream then throws a NullPointerException because the underlying gzip inflater is null; it held a valid value right up until the s3.putObject(new PutObjectRequest(...)) call.

I haven't found any documentation on how or why the inflater property of the gzip input stream gets set to null after part of the stream has been sent to S3.

EDIT: Further investigation revealed that the AWS call appears to close the input stream once the upload of the specified content length has completed... I haven't found out how to prevent this behavior.

Below is the gist of my code. Thanks in advance for any help, comments, and suggestions.

public String handleRequest(S3Event s3Event, Context context) {
    TarArchiveInputStream tarInput = null; // declared outside try so it is visible in finally
    try {
        S3Event.S3EventNotificationRecord s3EventRecord = s3Event.getRecords().get(0);
        String bucketName = s3EventRecord.getS3().getBucket().getName();

        // Object key may have spaces or unicode non-ASCII characters.
        String srcKey = s3EventRecord.getS3().getObject().getKey();

        System.out.println("Received valid request from bucket: " + bucketName + " with srcKey: " + srcKey);

        String bucketFolder = srcKey.substring(0, srcKey.lastIndexOf('/') + 1);
        System.out.println("File parent directory: " + bucketFolder);

        final AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();

        tarInput = new TarArchiveInputStream(new GzipCompressorInputStream(getObjectContent(s3Client, bucketName, srcKey)));

        TarArchiveEntry currentEntry = tarInput.getNextTarEntry();

        while (currentEntry != null) {
            String fileName = currentEntry.getName();
            System.out.println("For path = " + fileName);

            // checking if looking at a file (vs a directory)
            if (currentEntry.isFile()) {

                System.out.println("Copying " + fileName + " to " + bucketFolder + fileName + " in bucket " + bucketName);
                ObjectMetadata metadata = new ObjectMetadata();
                metadata.setContentLength(currentEntry.getSize());

                s3Client.putObject(new PutObjectRequest(bucketName, bucketFolder + fileName, tarInput, metadata)); // contents are properly and successfully sent to s3
                System.out.println("Done!");
            }

            currentEntry = tarInput.getNextTarEntry(); // NPE here because the underlying gz inflater is null
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        IOUtils.closeQuietly(tarInput);
    }
    return "Complete";
}

That's right: AWS closes the InputStream supplied to the PutObjectRequest, and I don't know of any way to instruct AWS not to do so.

However, you can wrap the TarArchiveInputStream in a CloseShieldInputStream from Commons IO, like this:

InputStream shieldedInput = new CloseShieldInputStream(tarInput);

s3Client.putObject(new PutObjectRequest(bucketName, bucketFolder + fileName, shieldedInput, metadata));

When AWS closes the supplied CloseShieldInputStream, the underlying TarArchiveInputStream remains open.



Also, I don't know what ByteArrayInputStream(tarInput.getCurrentEntry()) is supposed to do, but it looks odd. I've ignored it for the purposes of this answer.

Further investigation revealed that the AWS call appears to close the input stream once the upload of the specified content length has completed... I haven't been able to find out how to prevent this behavior. How does getObjectContent() produce the InputStream?

Thanks for your time and suggestions! I ended up just extending InputStream and overriding the close call. I hadn't heard of CloseShieldInputStream... I'll give that a try as well.
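The "override the close call" approach mentioned above can be sketched as follows. This is a minimal, hypothetical illustration (the class name NonClosingInputStream and the demo strings are not from the original post): a FilterInputStream whose close() is a no-op, so the S3 client's close cannot reach the underlying tar stream between entries.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper: turns close() into a no-op so a consumer such as
// the S3 client cannot close the underlying stream between uploads.
class NonClosingInputStream extends FilterInputStream {
    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() throws IOException {
        // Intentionally do nothing; the caller owns the underlying stream.
    }
}

public class NonClosingDemo {
    public static void main(String[] args) throws IOException {
        // Stand-in for the TarArchiveInputStream in the question.
        InputStream source = new ByteArrayInputStream("entry-1 entry-2".getBytes("UTF-8"));
        InputStream shielded = new NonClosingInputStream(source);

        byte[] first = new byte[7];
        shielded.read(first);   // consume the "first entry"
        shielded.close();       // simulates the S3 client closing the stream after an upload

        // The underlying stream is still readable afterwards.
        int remaining = source.read(new byte[16]);
        System.out.println(remaining > 0 ? "still open" : "closed");
    }
}
```

This is essentially what Commons IO's CloseShieldInputStream does, so either approach should work; the hand-rolled version just avoids the extra dependency.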