
Hadoop: writing a file to S3 from a jar on EMR on AWS


Is there any way to write a file from a Java jar to the S3 folder where the reduce output files are written? I have tried something like this:

    // Gets the cluster's default FileSystem from the job configuration,
    // so it is not bound to the S3 output path
    FileSystem fs = FileSystem.get(conf);
    FSDataOutputStream FS = fs.create(new Path("S3 folder output path"+"//Result.txt"));

    PrintWriter writer  = new PrintWriter(FS);
    writer.write(averageDelay.toString());
    writer.close();
    FS.close();

Here, Result.txt is the new file I want to write.

Answering my own question:

I found the mistake. I should pass the URI of the S3 folder path to the FileSystem object, like this:

 // Bind the FileSystem to the S3 output URI (e.g. s3://bucket/folder) instead of the default filesystem
 FileSystem fileSystem = FileSystem.get(URI.create(otherArgs[1]), conf);
 FSDataOutputStream fsDataOutputStream = fileSystem.create(new Path(otherArgs[1]+"//Result.txt"));
 PrintWriter writer = new PrintWriter(fsDataOutputStream);
 writer.write("\n Average Delay:"+averageDelay);
 writer.close();
 fsDataOutputStream.close();
 FileSystem fileSystem = FileSystem.get(URI.create(otherArgs[1]), new JobConf(<your main class>.class));
 FSDataOutputStream fsDataOutputStream = fileSystem.create(new Path(otherArgs[1]+"//Result.txt"));
 PrintWriter writer = new PrintWriter(fsDataOutputStream);
 writer.write("\n Average Delay:"+averageDelay);
 writer.close();
 fsDataOutputStream.close();

That is how I handled the conf variable in the code block above, and it worked like a charm.
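
For context, here is a minimal driver sketch showing one way conf and otherArgs[1] might be produced; the class name, argument layout, and example paths are assumptions for illustration, not taken from the original post:

    // Hypothetical driver class; shows where conf and otherArgs[1] could come from.
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class AverageDelayDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumption: args[0] is the input path, args[1] is the S3 output folder,
            // e.g. s3://my-bucket/output
            String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();

            // The URI scheme (s3, s3n, s3a, hdfs, ...) decides which FileSystem
            // implementation is returned, which is why passing the S3 URI matters.
            FileSystem fileSystem = FileSystem.get(URI.create(otherArgs[1]), conf);
            System.out.println("Output FileSystem: " + fileSystem.getUri());
        }
    }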

Here is another way to do it in Java, using the AWS S3 client's putObject with a string buffer:

... AmazonS3 s3Client;

public void reduce(Text key, java.lang.Iterable<Text> values, Reducer<Text, Text, Text, Text>.Context context) throws IOException, InterruptedException {

    UUID fileUUID = UUID.randomUUID();
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
    sdf.setTimeZone(TimeZone.getTimeZone("UTC"));

    // One object per key per run: nightly-dump/<date>/<key>-<uuid>
    String fileName = String.format("nightly-dump/%s/%s-%s", sdf.format(new Date()), key, fileUUID);
    log.info("Filename = [{}]", fileName);

    // Buffer all values for this key into a single string
    String content = "";
    int count = 0;
    for (Text value : values) {
        count++;
        String s3Line = value.toString();
        content += s3Line + "\n";
    }
    log.info("Count = {}, S3Lines = \n{}", count, content);

    // Upload the buffered content as one S3 object
    PutObjectResult putObjectResult = s3Client.putObject(S3_BUCKETNAME, fileName, content);
    log.info("Put versionId = {}", putObjectResult.getVersionId());

    reduceWriteContext("1", "1");

    context.setStatus("COMPLETED");
}
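
The snippet above assumes an AmazonS3 field, an S3_BUCKETNAME constant, a log instance, and a reduceWriteContext helper defined elsewhere in the reducer. A minimal sketch of how the client and bucket name might be wired up in setup(), assuming the AWS SDK for Java v1 and EMR instance-profile credentials (the class and bucket names here are made up for illustration):

    import com.amazonaws.regions.Regions;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class NightlyDumpReducer extends Reducer<Text, Text, Text, Text> {

        // Hypothetical bucket name; the original post does not show it
        private static final String S3_BUCKETNAME = "my-nightly-dump-bucket";

        private AmazonS3 s3Client;

        @Override
        protected void setup(Context context) {
            // On EMR the default credentials provider chain picks up the
            // instance-profile credentials, so no keys are hard-coded here
            s3Client = AmazonS3ClientBuilder.standard()
                    .withRegion(Regions.US_EAST_1) // assumption: use your bucket's region
                    .build();
        }

        // reduce(...) as shown in the snippet above
    }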

btw, why not use ? It is just as portable as what you are doing, but it may be more useful for long-running jobs.

What is conf in your code? What is otherArgs[1]? This code will not help people who find the question later.