Runtime error in max-temperature MapReduce Java code
I am running the MapReduce code below, and the error I get is:
Error: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
at test.temp$Mymapper.map(temp.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Here is the code:
package test;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
//import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class temp {

    public static class Mymapper extends Mapper<Object, Text, IntWritable, Text> {
        public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
            // the month is taken from characters 17-18 of the record
            int month = Integer.parseInt(value.toString().substring(17, 19));
            IntWritable mon = new IntWritable(month);
            // characters 27-30 hold the temperature reading, read up to the next comma
            String temp = value.toString().substring(27, 31);
            String t = null;
            for (int i = 0; i < temp.length(); i++) {
                if (temp.charAt(i) == ',')
                    break;
                else
                    t = t + temp.charAt(i);
            }
            Text data = new Text(value.toString().substring(22, 26) + t);
            context.write(mon, data);
        }
    }

    public static class Myreducer extends Reducer<IntWritable, Text, IntWritable, IntWritable> {
        public void reduce(IntWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            String temp = "";
            int max = 0;
            for (Text t : values) {
                temp = t.toString();
                if (temp.substring(0, 4) == "TMAX") {
                    if (Integer.parseInt(temp.substring(4, temp.length())) > max) {
                        max = Integer.parseInt(temp.substring(4, temp.length()));
                    }
                }
            }
            context.write(key, new IntWritable(max));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "temp");
        job.setJarByClass(temp.class);
        job.setMapperClass(Mymapper.class);
        job.setCombinerClass(Myreducer.class);
        job.setReducerClass(Myreducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
The input file is:
USC00037919000101,TMAX,-78,,,6,
USC00037919000101,TMAX,-133,,,6,
USC00037919000101,TMAX,127,,,6
Please reply and help.

I think you are using TextInputFormat as the input format for your job. It produces LongWritable/Text records, and Hadoop derives the map output classes from them. Try setting the map output classes explicitly and removing the combiner:
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(Text.class);
// job.setCombinerClass(Myreducer.class);
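For reference, this is what the mapper's generic parameters look like when written out to match TextInputFormat's records (byte offset as a LongWritable key, line as a Text value). This is a sketch only; it assumes an additional import of org.apache.hadoop.io.LongWritable, and the body is elided:

public static class Mymapper extends Mapper<LongWritable, Text, IntWritable, Text> {
    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // ... same parsing logic as in the question ...
    }
}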
A combiner only works when the map and reduce outputs are compatible.

You have set the following in your driver:
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);
and you have not called setMapOutputKeyClass() or setMapOutputValueClass().
In that case the framework assumes that the output key class of both the mapper and the reducer is IntWritable, and that the output value class of both is also IntWritable.
Your reducer declaration is fine:
public static class Myreducer extends Reducer<IntWritable,Text,IntWritable,IntWritable>
Its output key class is IntWritable and its output value class is IntWritable. Your mapper, however, writes values of class Text, while the driver settings say the value class should be IntWritable.
Whenever the mapper's output key/value classes differ from the reducer's, you must declare the map output classes explicitly in the driver with setMapOutputKeyClass() and setMapOutputValueClass().
Make the following changes in your code:
- Set the map output key and value classes: in your case, since the mapper's and the reducer's output key/value classes differ, set the following:
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);
- Disable the combiner: because you use Myreducer as the combiner, the output of the combiner is (IntWritable, IntWritable). The reducer, however, expects its input values to be of class Text, not IntWritable, so once the map output classes are set you would get the following exception:
Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.IntWritable is not class org.apache.hadoop.io.Text
To get rid of it, disable the combiner by removing this line:
job.setCombinerClass(Myreducer.class);
- Do not use the reducer as the combiner: if you really need a combiner, write one whose output key/value classes are IntWritable and Text, matching the map output; see the sketch after this list.
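For illustration, a combiner along these lines would keep the map output types intact. This is only a sketch, assuming map output values look like "TMAX-78" (element name followed by the reading), which is the format the reducer expects; the class name Mycombiner is made up here:

public static class Mycombiner extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    public void reduce(IntWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Forward only the largest TMAX reading per key; the output value stays
        // Text, so the real reducer still receives the value class it declares.
        String best = null;
        int max = Integer.MIN_VALUE;
        for (Text t : values) {
            String s = t.toString();
            if (s.startsWith("TMAX")) {                   // assumed record format
                int v = Integer.parseInt(s.substring(4)); // e.g. "TMAX-78" -> -78
                if (v > max) {
                    max = v;
                    best = s;
                }
            } else {
                context.write(key, t); // pass any other record through unchanged
            }
        }
        if (best != null) {
            context.write(key, new Text(best));
        }
    }
}

You would register it with job.setCombinerClass(Mycombiner.class); the reduce-side logic then stays unchanged.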
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);
These two statements define the output classes for both the mapper and the reducer, not just the reducer. That means your mapper would have to call context.write(IntWritable, IntWritable), but you have written context.write(IntWritable, Text).
Fix: when the map output types differ from the reduce output types, you need to set the mapper's output types explicitly. So add the following to your driver code:
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(Text.class);
These are the changes I made:
public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "temp");
job.setJarByClass(temp.class);
job.setMapperClass(Mymapper.class);
job.setReducerClass(Myreducer.class);
job.setMapOutputKeyClass(IntWritable.class);
job.setMapOutputValueClass(Text.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setNumReduceTasks(1);
job.waitForCompletion(true);
}
Output:
10 0
For the explanation, see Manjunath Ballur's post.

OK, I updated my answer: you also need to set the map output value class and unset the combiner. With those changes I got your code to run successfully!

I made the changes and it still gives the same error. @oae, are you sure it ran successfully?

Yes, I copied your code, added the two lines that set the key and value classes, and removed the combiner class! Can you check carefully whether yours is exactly the same? Before I finished the fix I got all sorts of error messages that looked identical at first glance but turned out to differ slightly on a second look!

I added an answer; check whether it works.