
Hadoop: Running an Elastic MapReduce streaming job on AMI 3.0.1


I am trying to run a streaming job on the newer AMI 3.0.1 and get errors like the following:

Error: java.lang.RuntimeException: Error in configuring object
...
Caused by: java.io.IOException: Cannot run program "s3://elasticmapreduce/samples/wordcount/wordSplitter.py": error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:219)
... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
... 24 more
--output s3://mybucket/output --reducer aggregate

Running the same job on AMI 2.4.2 works fine:

elastic-mapreduce --create --instance-type m1.large \
--log-uri s3n://mybucket/logs --stream \
--mapper s3://elasticmapreduce/samples/wordcount/wordSplitter.py \
--input s3://mybucket/input/alice.txt \
--output s3://mybucket/output --reducer aggregate

I need to use AMI 3.0.1 because my other custom JAR steps use Hadoop 2.2.0.
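For context, the sample mapper referenced above is a small Python streaming script. The sketch below is an approximation of what a word-count mapper like wordSplitter.py does, not the exact file from the elasticmapreduce samples bucket: it reads lines from stdin and emits one `LongValueSum:<word>\t1` record per word, which the built-in `aggregate` reducer sums per key.

```python
#!/usr/bin/env python
# Approximate sketch of a streaming word-count mapper in the style of
# wordSplitter.py (not the exact sample file): emit one
# "LongValueSum:<word>\t1" record per word for the "aggregate" reducer.
import re
import sys

WORD_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def map_line(line):
    """Return the key/value records emitted for one input line."""
    return ["LongValueSum:%s\t1" % w.lower() for w in WORD_RE.findall(line)]

if __name__ == "__main__":
    for line in sys.stdin:
        for record in map_line(line):
            print(record)
```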

While the following isn't exactly an answer to this specific question, it is what settled the matter for me when I ran into the same problem launching jobs directly on the EMR master node.

The fix for the problem I hit (with Hadoop 2.x, as shipped on AMI v3.x.x) is to use the -files option:

hadoop jar contrib/streaming/hadoop-streaming.jar \
    -files s3n://<my bucket>/mapper.py,s3n://<my bucket>/reducer.py \
    -input s3n://<my bucket>/location/* \
    -output s3n://<my bucket>/emr-output \
    -mapper mapper.py  \
    -reducer reducer.py
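The mapper.py and reducer.py above are placeholders for your own scripts. As an illustration only (assuming a plain word-count job, not any particular file of yours), a matching reducer could look like the sketch below. Streaming delivers mapper output sorted by key, so equal words arrive adjacent and can be summed in a single pass.

```python
#!/usr/bin/env python
# Hypothetical reducer.py for a plain word-count streaming job; your
# actual reducer will differ. Input records are "word\t1" lines, sorted
# by word, so a single pass with a running total per key suffices.
import sys

def reduce_stream(lines):
    """Sum "word\t<count>" records from sorted input; yield (word, total)."""
    current, total = None, 0
    for line in lines:
        word, _, count = line.rstrip("\n").partition("\t")
        if word != current:
            if current is not None:
                yield current, total
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield current, total

if __name__ == "__main__":
    for word, total in reduce_stream(sys.stdin):
        print("%s\t%d" % (word, total))
```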
On earlier Hadoop versions (Hadoop 1.x, as shipped on AMI v2.4.x), the following worked fine:

hadoop jar contrib/streaming/hadoop-streaming.jar \
    -input s3n://<my bucket>/location/* \
    -output s3n://<my bucket>/emr-output \
    -mapper s3n://<my bucket>/mapper.py  \
    -reducer s3n://<my bucket>/reducer.py

In case anyone else hits this problem: I asked on the AWS forums, where they are investigating and suggested this workaround:
hadoop jar contrib/streaming/hadoop-streaming.jar \
    -input s3n://<my bucket>/location/* \
    -output s3n://<my bucket>/emr-output \
    -mapper s3n://<my bucket>/mapper.py  \
    -reducer s3n://<my bucket>/reducer.py