“线程中的异常”;“主要”;java.io.IOException:作业失败!在mapreduce中
我是一个新的Hadoop用户。我在Hadoop初学者指南(Garry Turkington)中运行了这个示例代码,但遇到了作业失败的问题。我在输出文件夹中没有看到输出文件(部分文件)。我在mapred-site.xml文件中做了很多更改,但我无法解决作业失败的问题。我该怎么办“线程中的异常”;“主要”;java.io.IOException:作业失败!在mapreduce中,java,hadoop,mapreduce,Java,Hadoop,Mapreduce,我是一个新的Hadoop用户。我在Hadoop初学者指南(Garry Turkington)中运行了这个示例代码,但遇到了作业失败的问题。我在输出文件夹中没有看到输出文件(部分文件)。我在mapred-site.xml文件中做了很多更改,但我无法解决作业失败的问题。我该怎么办 import java.io.IOException; import org.apache.hadoop.conf.* ; import org.apache.hadoop.fs.Path; import org.apac
import java.io.IOException;
import org.apache.hadoop.conf.* ;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.* ;
import org.apache.hadoop.mapred.* ;
import org.apache.hadoop.mapred.lib.* ;
import org.apache.hadoop.mapred.SkipBadRecords;
public class SkipData
{
public static class MapClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, LongWritable>
{
private final static LongWritable one = new LongWritable(1);
private Text word = new Text("totalcount");
public void map(LongWritable key, Text value, OutputCollector<Text,LongWritable> output, Reporter reporter) throws IOException
{
String line = value.toString();
if (line.equals("skiptext"))
throw new RuntimeException("Found skiptext") ;
output.collect(word, one);
}
}
public static void main(String[] args) throws Exception
{
System.setProperty("hadoop.home.dir", "/home/saung/hadoop-2.7.4/");
Configuration config = new Configuration() ;
JobConf conf = new JobConf(config, SkipData.class);
conf.setJobName("SkipData");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(LongWritable.class);
conf.setMapperClass(MapClass.class);
conf.setCombinerClass(LongSumReducer.class);
conf.setReducerClass(LongSumReducer.class);
FileInputFormat.setInputPaths(conf,args[0]) ;
FileOutputFormat.setOutputPath(conf, new Path(args[1])) ;
JobClient.runJob(conf);
}
}
我使用了8个映射任务,错误如下所示:
18/02/27 14:21:50 INFO mapred.Task: Task 'attempt_local1745706455_0001_m_000007_0' done.
18/02/27 14:21:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local1745706455_0001_m_000007_0
18/02/27 14:21:50 INFO mapred.LocalJobRunner: map task executor complete.
18/02/27 14:21:50 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from saung sending #18
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:670)
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:50 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from naychi got value #18
18/02/27 14:21:50 DEBUG ipc.ProtobufRpcEngine: Call: delete took 69ms
18/02/27 14:21:50 WARN mapred.LocalJobRunner: job_local1745706455_0001
java.lang.Exception: java.lang.RuntimeException: Found skiptext
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Found skiptext
at mapredpack.SkipData$MapClass.map(SkipData.java:81)
at mapredpack.SkipData$MapClass.map(SkipData.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner .java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:3 31)
18/02/27 14:21:51 INFO mapred.LocalJobRunner: hdfs://localhost:9000 /taskfailure/skipdata.txt:4194304+4194304 > map
18/02/27 14:21:51 INFO mapreduce.Job: map 69% reduce 0%
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:670)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:670)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 INFO mapreduce.Job: Job job_local1745706455_0001 failed with state FAILED due to: NA
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:758)
18/02/27 14:21:51 INFO mapreduce.Job: Counters: 23
File System Counters
FILE: Number of bytes read=24806
FILE: Number of bytes written=1741540
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=107677710
HDFS: Number of bytes written=0
HDFS: Number of read operations=70
HDFS: Number of large read operations=0
HDFS: Number of write operations=7
Map-Reduce Framework
Map input records=1381528
Map output records=1381527
Map output bytes=26249013
Map output materialized bytes=135
Input split bytes=588
Combine input records=1161148
Combine output records=5
Spilled Records=5
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=2000
Total committed heap usage (bytes)=3535798272
File Input Format Counters
Bytes Read=20743175
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
at mapredpack.SkipData.main(SkipData.java:100)
18/02/27 14:21:51 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@2cd2a21f
18/02/27 14:21:51 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@2cd2a21f
18/02/27 14:21:51 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@2cd2a21f
18/02/27 14:21:51 DEBUG ipc.Client: Stopping client
18/02/27 14:21:51 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from saung: closed
18/02/27 14:21:51 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from saung: stopped, remaining connections 0
18/02/27 14:21:51 INFO mapred.LocalJobRunner: hdfs://localhost:9000/taskfailure/skipdata.txt:12582912+4194304 > map
在我看来,这似乎是预期的行为
- 您有一个
,该映射器设计用于在遇到记录(行)映射器
时引发异常“skiptext”
- 您的输入包含该表单的一些行
- 正在引发异常
- 。。。。这项工作正在失败
org.apache.hadoop.mapred.SkipBadRecords
。。。但是简单地在Java中导入一个类实际上并没有做任何事情
根据教程:
默认情况下,此功能处于禁用状态
要启用它,请参阅SkipBadRecords.setMapperMaxSkipRecords(配置,长)
和SkipBadRecords.setreducerMaxkipGroups(配置,长)
请提供所需数据的完整stacktraceI。我应该怎么做来防止失败的工作?谢谢。我试图在main()中添加这两行,但仍然存在此问题。你能再给我一次建议吗?你应该阅读这些方法的教程和javadocs,这样你才能理解你在做什么。(然后重读我写的内容。首先,我没有建议您添加两行,我引用的教程中的行也没有。)
File.open('skipdata.txt','w') do |file|
3.times do
500000.times{file.write("A valid record\n")}
5.times{file.write("skiptext\n")}
end
500000.times{file.write("A valid record\n")}
end
18/02/27 14:21:50 INFO mapred.Task: Task 'attempt_local1745706455_0001_m_000007_0' done.
18/02/27 14:21:50 INFO mapred.LocalJobRunner: Finishing task: attempt_local1745706455_0001_m_000007_0
18/02/27 14:21:50 INFO mapred.LocalJobRunner: map task executor complete.
18/02/27 14:21:50 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from saung sending #18
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:670)
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:50 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from naychi got value #18
18/02/27 14:21:50 DEBUG ipc.ProtobufRpcEngine: Call: delete took 69ms
18/02/27 14:21:50 WARN mapred.LocalJobRunner: job_local1745706455_0001
java.lang.Exception: java.lang.RuntimeException: Found skiptext
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: Found skiptext
at mapredpack.SkipData$MapClass.map(SkipData.java:81)
at mapredpack.SkipData$MapClass.map(SkipData.java:1)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner .java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18/02/27 14:21:50 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:3 31)
18/02/27 14:21:51 INFO mapred.LocalJobRunner: hdfs://localhost:9000 /taskfailure/skipdata.txt:4194304+4194304 > map
18/02/27 14:21:51 INFO mapreduce.Job: map 69% reduce 0%
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:670)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:670)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
18/02/27 14:21:51 INFO mapreduce.Job: Job job_local1745706455_0001 failed with state FAILED due to: NA
18/02/27 14:21:51 DEBUG security.UserGroupInformation: PrivilegedAction as:saung (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.getCounters(Job.java:758)
18/02/27 14:21:51 INFO mapreduce.Job: Counters: 23
File System Counters
FILE: Number of bytes read=24806
FILE: Number of bytes written=1741540
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=107677710
HDFS: Number of bytes written=0
HDFS: Number of read operations=70
HDFS: Number of large read operations=0
HDFS: Number of write operations=7
Map-Reduce Framework
Map input records=1381528
Map output records=1381527
Map output bytes=26249013
Map output materialized bytes=135
Input split bytes=588
Combine input records=1161148
Combine output records=5
Spilled Records=5
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=2000
Total committed heap usage (bytes)=3535798272
File Input Format Counters
Bytes Read=20743175
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:873)
at mapredpack.SkipData.main(SkipData.java:100)
18/02/27 14:21:51 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@2cd2a21f
18/02/27 14:21:51 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@2cd2a21f
18/02/27 14:21:51 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@2cd2a21f
18/02/27 14:21:51 DEBUG ipc.Client: Stopping client
18/02/27 14:21:51 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from saung: closed
18/02/27 14:21:51 DEBUG ipc.Client: IPC Client (1882349076) connection to localhost/127.0.0.1:9000 from saung: stopped, remaining connections 0
18/02/27 14:21:51 INFO mapred.LocalJobRunner: hdfs://localhost:9000/taskfailure/skipdata.txt:12582912+4194304 > map