Java: Error running a Hadoop (Cloudera-2.0.0-cdh4.4.0) job from Eclipse?

Hi, I am running the Hadoop WordCount example from Eclipse and I get the following error:

13/11/24 22:17:08 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder sending #12
13/11/24 22:17:08 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder got value #12
13/11/24 22:17:08 DEBUG ipc.ProtobufRpcEngine: Call: delete took 11ms
13/11/24 22:17:08 WARN mapred.LocalJobRunner: job_local1690217234_0001
java.lang.AbstractMethodError
    at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:96)
    at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:54)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:76)
    at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:91)
    at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:483)
    at org.hadoop.par.WordCount$Reduce.reduce(WordCount.java:34)
    at org.hadoop.par.WordCount$Reduce.reduce(WordCount.java:1)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
13/11/24 22:17:09 INFO mapred.JobClient:  map 100% reduce 0%
13/11/24 22:17:09 INFO mapred.JobClient: Job complete: job_local1690217234_0001
13/11/24 22:17:09 INFO mapred.JobClient: Counters: 26
13/11/24 22:17:09 INFO mapred.JobClient:   File System Counters
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of bytes read=172
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of bytes written=91974
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of large read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     FILE: Number of write operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of bytes read=91
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of bytes written=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of read operations=5
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of large read operations=0
13/11/24 22:17:09 INFO mapred.JobClient:     HDFS: Number of write operations=1
13/11/24 22:17:09 INFO mapred.JobClient:   Map-Reduce Framework
13/11/24 22:17:09 INFO mapred.JobClient:     Map input records=15
13/11/24 22:17:09 INFO mapred.JobClient:     Map output records=17
13/11/24 22:17:09 INFO mapred.JobClient:     Map output bytes=152
13/11/24 22:17:09 INFO mapred.JobClient:     Input split bytes=112
13/11/24 22:17:09 INFO mapred.JobClient:     Combine input records=17
13/11/24 22:17:09 INFO mapred.JobClient:     Combine output records=13
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce input groups=1
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce shuffle bytes=0
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce input records=1
13/11/24 22:17:09 INFO mapred.JobClient:     Reduce output records=0
13/11/24 22:17:09 INFO mapred.JobClient:     Spilled Records=13
13/11/24 22:17:09 INFO mapred.JobClient:     CPU time spent (ms)=0
13/11/24 22:17:09 INFO mapred.JobClient:     Physical memory (bytes) snapshot=0
13/11/24 22:17:09 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=0
13/11/24 22:17:09 INFO mapred.JobClient:     Total committed heap usage (bytes)=138477568
13/11/24 22:17:09 INFO mapred.JobClient:   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter
13/11/24 22:17:09 INFO mapred.JobClient:     BYTES_READ=91
13/11/24 22:17:09 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1372)
    at org.hadoop.par.WordCount.main(WordCount.java:62)
13/11/24 22:17:09 DEBUG hdfs.DFSClient: Waiting for ack for: -1
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder sending #13
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder got value #13
13/11/24 22:17:09 ERROR hdfs.DFSClient: Failed to close file /user/harinder/test_output/_temporary/_attempt_local1690217234_0001_r_000000_0/part-00000
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/harinder/test_output/_temporary/_attempt_local1690217234_0001_r_000000_0/part-00000: File does not exist. Holder DFSClient_NONMAPREDUCE_-559950586_1 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2445)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2437)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2503)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2480)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:556)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:337)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44958)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

    at org.apache.hadoop.ipc.Client.call(Client.java:1237)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy9.complete(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at com.sun.proxy.$Proxy9.complete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:329)
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1769)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1756)
    at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:696)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:713)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:559)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2399)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2415)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
13/11/24 22:17:09 DEBUG ipc.Client: Stopping client
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder: closed
13/11/24 22:17:09 DEBUG ipc.Client: IPC Client (2010005445) connection to localhost/127.0.0.1:8020 from harinder: stopped, remaining connections 0
I have added all the required JARs to my project. Here is the code I am running:

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

  public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
      }
    }
  }

  public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);

    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);


    FileInputFormat.setInputPaths(conf, new Path("/user/harinder/test_data/"));
    FileOutputFormat.setOutputPath(conf, new Path("/user/harinder/test_output"));
    //FileInputFormat.setInputPaths(conf, new Path(args[0]));
    //FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}
Here are the JAR files added to the project:

If you are using the old API (JobConf), switch to the new API (Job).
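
For reference, here is a minimal sketch (not from the question) of the same WordCount rewritten against the new org.apache.hadoop.mapreduce API, with job.setJarByClass(...) set so Hadoop knows which JAR to distribute. The class name WordCountNewApi is illustrative; the configuration files and HDFS paths are copied from the question.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountNewApi {

  // Mapper: emits (word, 1) for every token in the input line.
  public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer tokenizer = new StringTokenizer(value.toString());
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

    Job job = new Job(conf, "wordcount");      // Job.getInstance(conf, "wordcount") on newer releases
    job.setJarByClass(WordCountNewApi.class);  // tells Hadoop which JAR to ship with the job

    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    FileInputFormat.addInputPath(job, new Path("/user/harinder/test_data/"));
    FileOutputFormat.setOutputPath(job, new Path("/user/harinder/test_output"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Note that setJarByClass can only locate a JAR if the classes are actually packaged into one, which is the point of the next paragraph.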

Also, you need to compile the program into a JAR, or add the job.setJarByClass(YourMapReduce.class) call to the driver.


Otherwise the program will look for a JAR file to distribute that does not exist.

Build a JAR of your MapReduce program and add it to the project, then run it. That JAR is what gets submitted to the cluster.

Thanks for the reply. I changed the code to use setJarByClass and tried again, but I get the same error. I also don't understand what you mean by using Job; the blog you posted uses the same JobConf object, so where should I change it? Unfortunately there are not many examples of the new API.

Check this: when you set the JAR by class, the program looks for the JAR file to distribute (your MapReduce program). Since you have not built that JAR yet, you get the error. So you first need to compile the project into a JAR and run that JAR, or package the MapReduce program as a separate JAR and reference it from the launcher code, as in the sketch below.
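
A minimal sketch of that last suggestion, assuming the project has already been exported from Eclipse as a JAR. It reuses the Map and Reduce classes from the question's WordCount; the WordCountDriver class name and the /home/harinder/wordcount.jar path are placeholders, not from the question.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    // Option 1: let Hadoop locate the JAR that contains WordCount.class.
    // This only works once the class really lives inside a built JAR.
    conf.setJarByClass(WordCount.class);
    // Option 2: reference the exported JAR explicitly from the launcher code
    // (the path is a placeholder for wherever Eclipse exported the JAR).
    // conf.setJar("/home/harinder/wordcount.jar");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(WordCount.Map.class);
    conf.setCombinerClass(WordCount.Reduce.class);
    conf.setReducerClass(WordCount.Reduce.class);

    FileInputFormat.setInputPaths(conf, new Path("/user/harinder/test_data/"));
    FileOutputFormat.setOutputPath(conf, new Path("/user/harinder/test_output"));

    JobClient.runJob(conf);
  }
}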