Java MapReduce string index out of bounds error on string concatenation

I am trying to write a MapReduce job that takes a table stored in a text file. The table has two attributes: the first is an id and the second is a name. The code should take all the values that share the same id and concatenate them. For example, 1 xyz 2 xyz 1 abc should result in 1 xyzabc 2 xyz.
Below is my version of the code. As a beginner, I adapted the MaxTemperature example to learn how to do this.

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {

    public static class MaxTemperatureMapper
            extends Mapper<Text, Text, Text, Text> {

        @Override
        public void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {

            String line = value.toString();
            // last whitespace-separated token of the value
            String lastWord = line.substring(line.lastIndexOf(" ") + 1);
            Text valq = new Text();
            // first four characters of the value -- this is the call that
            // throws StringIndexOutOfBoundsException when the value is
            // shorter than 4 characters
            valq.set(line.substring(0, 4));
            context.write(new Text(lastWord), valq);
        }
    }

    public static class MaxTemperatureReducer
            extends Reducer<Text, Text, Text, Text> {

        @Override
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {

            // concatenate all values seen for this key
            String p = "";
            for (Text value : values) {
                p = p + value.toString();
            }
            Text aa = new Text();
            aa.set(p);
            context.write(key, new Text(aa));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }

        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setInputFormatClass(KeyValueTextInputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Expected output:

123456 name,name,age (all values with index 123456)
124589 hyderabad (all values with index 124589)
I am getting the following error:

  java.lang.StringIndexOutOfBoundsException: String index out of range: 4
    at java.lang.String.substring(String.java:1907)
    at MaxTemperature$MaxTemperatureMapper.map(MaxTemperature.java:39)
    at MaxTemperature$MaxTemperatureMapper.map(MaxTemperature.java:26)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Three things:

  • You haven't described the expected input very well, especially in the context of the code.
  • You haven't described what you are trying to do in your map/reduce methods, even though I can guess what you are trying to do.
  • You should check the Javadoc for String.substring(int, int).

  • For substring() the doc says: "IndexOutOfBoundsException - if beginIndex is negative, or endIndex is larger than the length of this String object, or beginIndex is larger than endIndex." Make sure your end index of 4 is adequate; a value shorter than 4 characters makes substring(0, 4) throw exactly the exception you are seeing. A sketch of a safer mapper and reducer follows after the stack trace below.

    Yes, I changed it to 15456654, i.e. a huge number, but it still shows the same:
      java.lang.StringIndexOutOfBoundsException: String index out of range: 4
        at java.lang.String.substring(String.java:1907)
        at MaxTemperature$MaxTemperatureMapper.map(MaxTemperature.java:39)
        at MaxTemperature$MaxTemperatureMapper.map(MaxTemperature.java:26)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
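
Here is a minimal sketch of how the whole job could look if the mapper splits each line on whitespace instead of calling substring() with fixed indexes, and the reducer joins the names with a comma so the output matches the expected "123456 name,name,age" format. It assumes each input line is "<id> <name>" with the id first, as in the example "1 xyz", and it uses the default TextInputFormat instead of KeyValueTextInputFormat (which splits each line at the first tab by default, so a space-separated line can leave the mapper's value empty and make substring(0, 4) fail). The class names ConcatById, ConcatMapper and ConcatReducer are only illustrative.

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ConcatById {

    // Sketch only: assumes each input line is "<id> <name>" separated by
    // whitespace. With the default TextInputFormat the mapper receives the
    // whole line as its value, so it parses the line itself and never needs
    // a fixed-length substring() call.
    public static class ConcatMapper
            extends Mapper<LongWritable, Text, Text, Text> {

        private final Text outKey = new Text();
        private final Text outValue = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString().trim();
            if (line.isEmpty()) {
                return; // skip blank lines instead of throwing
            }
            String[] tokens = line.split("\\s+");
            if (tokens.length < 2) {
                return; // malformed record: no name after the id
            }
            outKey.set(tokens[0]);   // id
            outValue.set(tokens[1]); // name
            context.write(outKey, outValue);
        }
    }

    // Joins all names seen for one id with a comma,
    // e.g. key "123456" -> value "name,name,age".
    public static class ConcatReducer
            extends Reducer<Text, Text, Text, Text> {

        private final Text result = new Text();

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            StringBuilder joined = new StringBuilder();
            for (Text value : values) {
                if (joined.length() > 0) {
                    joined.append(',');
                }
                joined.append(value.toString());
            }
            result.set(joined.toString());
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: ConcatById <input path> <output path>");
            System.exit(-1);
        }

        Job job = new Job();
        job.setJarByClass(ConcatById.class);
        job.setJobName("Concatenate names by id");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(ConcatMapper.class);
        job.setReducerClass(ConcatReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If you do need to keep KeyValueTextInputFormat, check which separator your input really uses (the default is a tab) and guard any substring() calls against values shorter than the indexes you pass.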