Hadoop: all task attempts complete successfully, but the MapReduce job fails


I run a job with 8 map tasks and 1 reduce task. Although all map task attempts complete successfully, the MapReduce job fails. My sample code comes from Hadoop Beginner's Guide (Garry Turkington) and exercises skipping bad data. The point of the program is to test task failure in the map phase: even though the source file contains data that triggers a failure (the string "skiptext" in the example), MapReduce should still be able to complete the job successfully. However, my job does not complete; it fails. What should I do?

The complete source code is:

import java.io.IOException;

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.*;

public class SkipData
{
    public static class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable>
    {
        private final static LongWritable one = new LongWritable(1);
        private Text word = new Text("totalcount");

        public void map(LongWritable key, Text value,
                OutputCollector<Text, LongWritable> output,
                Reporter reporter) throws IOException
        {
            String line = value.toString();
            if (line.equals("skiptext"))
                throw new RuntimeException("Found skiptext");
            output.collect(word, one);
        }
    }

    public static void main(String[] args) throws Exception
    {
        Configuration config = new Configuration();
        JobConf conf = new JobConf(config, SkipData.class);
        conf.setJobName("SkipData");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(LongWritable.class);
        conf.setMapperClass(MapClass.class);
        conf.setCombinerClass(LongSumReducer.class);
        conf.setReducerClass(LongSumReducer.class);
        FileInputFormat.setInputPaths(conf, args[0]);
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
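For what it's worth, the book's skipping example only behaves as described when skip mode is actually enabled on the job. A minimal sketch of the relevant settings using the old mapred API shown above; the thresholds below are illustrative, not taken from the book:

```java
// Fragment for main(), before JobClient.runJob(conf).
// Requires: import org.apache.hadoop.mapred.SkipBadRecords;

// Start skipping records after this many failed attempts of the same task.
SkipBadRecords.setAttemptsToStartSkipping(conf, 2);
// Acceptable number of records that may be skipped around each bad record.
SkipBadRecords.setMapperMaxSkipRecords(conf, 1);
// Give the task enough attempts for skip mode to narrow down the bad record.
conf.setMaxMapAttempts(4);
```

Without these, a map task that throws a RuntimeException simply fails, is retried up to the attempt limit, and then fails the whole job.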

It looks like the code is working as designed. The skiptext line is found, and in that case the job is implemented to throw an exception that ends the task. This is a common coding technique for forcing logic to be implemented at a certain point: place a throw new RuntimeException() where the code needs to be modified, which forces the developer to look at that part of the code.


Look at the code and decide what you want to happen on the skiptext line. Do you need to implement other logic in place of the exception? If so, replace the thrown exception with the correct behavior.
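If the goal is simply to tolerate the bad line rather than exercise Hadoop's skip mode, one option (not from the book) is to filter it in the mapper instead of throwing. The filtering decision itself is plain Java and can be sketched independently of Hadoop:

```java
public class SkipFilter {
    // Returns true when the line should be counted; the sentinel
    // value "skiptext" is dropped instead of triggering a failure.
    static boolean shouldCount(String line) {
        return !"skiptext".equals(line);
    }

    public static void main(String[] args) {
        String[] lines = {"hello", "skiptext", "world"};
        long count = 0;
        for (String line : lines) {
            if (shouldCount(line)) {
                count++;
            }
        }
        System.out.println(count); // prints 2
    }
}
```

In the mapper, `if (shouldCount(line)) output.collect(word, one);` would replace the throw, at the cost of no longer demonstrating the skip-mode machinery the book example is about.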

Thank you for your answer. The author provides this as complete code. Although I run the same source code, I cannot get the job to complete. Can you help me? I have posted the complete source code again above. Thank you for your time.
18/02/28 21:12:58 INFO mapreduce.Job: Job job_local724352166_0001 failed with state FAILED due to: NA
18/02/28 21:12:58 WARN mapred.LocalJobRunner: job_local724352166_0001
java.lang.Exception: java.lang.RuntimeException: Found skiptext
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Found skiptext
at mapredpack.SkipTest$MapClass.map(SkipTest.java:23)
at mapredpack.SkipTest$MapClass.map(SkipTest.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/02/28 21:12:58 DEBUG security.UserGroupInformation: PrivilegedAction as:naychi (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:758)
18/02/28 21:12:59 DEBUG security.UserGroupInformation: PrivilegedAction as:naychi (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:331)
18/02/28 21:12:59 INFO mapreduce.Job: Counters: 23
File System Counters
    FILE: Number of bytes read=29905
    FILE: Number of bytes written=2020669
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=128005127
    HDFS: Number of bytes written=0
    HDFS: Number of read operations=80
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=7
Map-Reduce Framework
    Map input records=1542671
    Map output records=1542669
    Map output bytes=29310711
    Map output materialized bytes=135
    Input split bytes=686
    Combine input records=1161148
    Combine output records=5
    Spilled Records=5
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=8601
    Total committed heap usage (bytes)=3840933888
File Input Format Counters 
    Bytes Read=23163911
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
at mapredpack.SkipTest.main(SkipTest.java:58)
18/02/28 21:12:59 DEBUG ipc.Client: stopping client from cache:  org.apache.hadoop.ipc.Client@2e55dd0c
18/02/28 21:12:59 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@2e55dd0c
18/02/28 21:12:59 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@2e55dd0c
18/02/28 21:12:59 DEBUG ipc.Client: Stopping client
18/02/28 21:12:59 DEBUG ipc.Client: IPC Client (1313916817) connection to localhost/127.0.0.1:9000 from naychi: closed
18/02/28 21:12:59 DEBUG ipc.Client: IPC Client (1313916817) connection to localhost/127.0.0.1:9000 from naychi: stopped, remaining connections 0
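One detail worth checking in the log above: the job ID is job_local724352166_0001 and the failure surfaces in LocalJobRunner, meaning the job ran in local mode. Skip mode relies on a failed task attempt being re-executed so that Hadoop can narrow down the bad record, and the local runner does not retry failed tasks, so the first RuntimeException fails the whole job. A hedged sketch of steering the job to a real cluster instead (the property names are standard in Hadoop 2.x; the ResourceManager address is a placeholder for your own):

```java
// Fragment for the driver: submit to YARN rather than the local runner,
// so failed map attempts are retried and skip mode can engage.
Configuration config = new Configuration();
config.set("mapreduce.framework.name", "yarn");
// Placeholder address; use your cluster's actual ResourceManager host:port.
config.set("yarn.resourcemanager.address", "rmhost:8032");
```

Equivalently, these can be set cluster-wide in mapred-site.xml and yarn-site.xml rather than in code.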