
Hadoop: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text


My program looks like this:

public class TopKRecord extends Configured implements Tool {

    public static class MapClass extends Mapper<Text, Text, Text, Text> {

        public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
            // your map code goes here
            String[] fields = value.toString().split(",");
            String year = fields[1];
            String claims = fields[8];

            if (claims.length() > 0 && (!claims.startsWith("\""))) {
                context.write(new Text(year.toString()), new Text(claims.toString()));
            }
        }
    }
   public int run(String args[]) throws Exception {
        Job job = new Job();
        job.setJarByClass(TopKRecord.class);

        job.setMapperClass(MapClass.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setJobName("TopKRecord");
        job.setMapOutputValueClass(Text.class);
        job.setNumReduceTasks(0);
        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }

    public static void main(String args[]) throws Exception {
        int ret = ToolRunner.run(new TopKRecord(), args);
        System.exit(ret);
    }
}
When I run this program, I see the following on the console:

12/08/02 12:43:34 INFO mapred.JobClient: Task Id : attempt_201208021025_0007_m_000000_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text
    at com.hadoop.programs.TopKRecord$MapClass.map(TopKRecord.java:26)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
I believe the class types are mapped correctly.

Please let me know what I am doing wrong here.

When you read a file with an M/R program, the input key of your mapper is the index of the line within the file, and the input value is the whole line.

So what is happening here is that you are trying to receive the line index as a Text object, which is wrong. You need a LongWritable instead, so that Hadoop stops complaining about the types.

Try this instead:

public class TopKRecord extends Configured implements Tool {

    public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {

        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // your map code goes here
            String[] fields = value.toString().split(",");
            String year = fields[1];
            String claims = fields[8];

            if (claims.length() > 0 && (!claims.startsWith("\""))) {
                context.write(new Text(year.toString()), new Text(claims.toString()));
            }
        }
    }

    ...
}

One more thing in your code you may want to reconsider: you are creating two Text objects for every record you process. You should create those two objects once, up front, and then set their values inside the mapper with the set method. This will save you a fair amount of time if you are processing a reasonable volume of data.
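As a sketch, here is one way to apply that advice to the mapper above (the field indices come from the original code; everything else is illustrative):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {

    // Allocated once per mapper instance instead of once per record.
    private final Text outKey = new Text();
    private final Text outValue = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        String year = fields[1];
        String claims = fields[8];

        if (claims.length() > 0 && !claims.startsWith("\"")) {
            outKey.set(year);     // overwrite the existing buffer
            outValue.set(claims); // rather than allocating new Text objects
            context.write(outKey, outValue);
        }
    }
}
```

Reusing the writables this way is safe because the framework serializes the key and value at the moment context.write is called, so overwriting them on the next record does not corrupt earlier output.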

You need to set the input format class:

job.setInputFormatClass(KeyValueTextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

Good point! That was the mistake in my case; setting the InputFormatClass to SequenceFileInputFormat.class fixed it. That works when this job's input is the output of a previous job.
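A minimal sketch of that job-chaining setup (the "intermediate" path and job variable names are illustrative, not from the original post): the first job writes its Text/Text pairs as a SequenceFile, and the second job reads them back with the matching input format.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Job 1 writes its <Text, Text> output as a SequenceFile...
Job first = new Job();
first.setOutputFormatClass(SequenceFileOutputFormat.class);
first.setOutputKeyClass(Text.class);
first.setOutputValueClass(Text.class);
FileOutputFormat.setOutputPath(first, new Path("intermediate"));
first.waitForCompletion(true);

// ...and job 2 reads it back with the matching input format, so its
// mapper receives <Text, Text> rather than <LongWritable, Text>.
Job second = new Job();
second.setInputFormatClass(SequenceFileInputFormat.class);
FileInputFormat.setInputPaths(second, new Path("intermediate"));
second.waitForCompletion(true);
```

Because the SequenceFile records the key and value classes it was written with, the second job's mapper signature just has to match what the first job emitted.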