How do I configure S3 access for org.apache.parquet.avro.AvroParquetReader? (Java, Amazon S3, Parquet)

I struggled with this for a while and wanted to share my solution. AvroParquetReader is a fine tool for reading Parquet files, but its defaults for S3 access are weak:

java.io.InterruptedIOException: doesBucketExist on MY_BUCKET: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.AmazonClientException: Unable to load credentials from service endpoint

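I want to use a credentials provider similar to com.amazonaws.auth.profile.ProfileCredentialsProvider, which works for accessing my S3 bucket, but it is not clear from AvroParquetReader's class definition or documentation how I would achieve this.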

This code worked for me. It allows AvroParquetReader to access S3 using ProfileCredentialsProvider:

import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.hadoop.fs.Path;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;

...

final String path = "s3a://"+bucketName+"/"+pathName;
final Configuration configuration = new Configuration();
configuration.setClass("fs.s3a.aws.credentials.provider", ProfileCredentialsProvider.class,
        AWSCredentialsProvider.class);
ParquetReader<GenericRecord> parquetReader =
        AvroParquetReader.<GenericRecord>builder(new Path(path)).withConf(configuration).build();

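For anyone else who runs into this: I found that @jd_free's answer did not work for me. The only thing I needed to change was the configuration setting, passed to AvroParquetReader, that names the type of AWSCredentialsProvider to use: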

Configuration configuration = new Configuration();
configuration.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider");
configuration.set("fs.s3a.access.key", "KEY");
configuration.set("fs.s3a.secret.key", "KEY");

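The issue was with the credentials being supplied and the way they were provided to the configuration. For more information on the different credentials providers, check out the linked documentation. It explains the different types available for different scenarios, including how to pick up credentials from environment variables.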
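Since the answer above hard-codes the keys, here is a minimal sketch of one way to take them from environment variables instead. The class S3aEnvSettings and its method fromEnv are hypothetical names made up for illustration and are not part of Hadoop or Parquet. The sketch assumes the keys are exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the fs.s3a.* property names are the ones used in the answer above. It only builds a plain map and does not talk to S3 itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Hadoop or Parquet class): builds fs.s3a.* settings from an environment map. */
final class S3aEnvSettings {

    private S3aEnvSettings() {}

    /**
     * Returns the S3A properties from the answer above, with the key values
     * read from the given environment map instead of being hard-coded.
     * Assumption: the keys live in AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
     */
    static Map<String, String> fromEnv(Map<String, String> env) {
        String accessKey = env.get("AWS_ACCESS_KEY_ID");
        String secretKey = env.get("AWS_SECRET_ACCESS_KEY");
        if (accessKey == null || accessKey.isEmpty()
                || secretKey == null || secretKey.isEmpty()) {
            throw new IllegalStateException(
                    "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be set");
        }
        // LinkedHashMap keeps the properties in insertion order.
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("fs.s3a.aws.credentials.provider",
                "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider");
        settings.put("fs.s3a.access.key", accessKey);
        settings.put("fs.s3a.secret.key", secretKey);
        return settings;
    }

    public static void main(String[] args) {
        // Demo with placeholder values; a real program would pass System.getenv().
        Map<String, String> demoEnv = new LinkedHashMap<>();
        demoEnv.put("AWS_ACCESS_KEY_ID", "example-access-key");
        demoEnv.put("AWS_SECRET_ACCESS_KEY", "example-secret-key");
        // Print only the property names so that secrets never reach the console.
        fromEnv(demoEnv).keySet().forEach(System.out::println);
    }
}
```

With a Hadoop Configuration at hand, the resulting entries could then be copied over with something like S3aEnvSettings.fromEnv(System.getenv()).forEach(configuration::set) before the configuration is passed to withConf(...).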
