Java Hadoop mapper input: handling hex values

I have a list of tweets stored in HDFS as input, and I am trying to run a MapReduce job over it. Here is my mapper implementation:

@Override
public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
  try {
    String[] fields = value.toString().split("\t");
    StringBuilder sb = new StringBuilder();
    for (int i = 1; i < fields.length; i++) {
      if (i > 1) {
        sb.append("\t");
      }
      sb.append(fields[i]);
    }
    tid.set(fields[0]);
    content.set(sb.toString());
    context.write(tid, content);
  } catch(DecoderException e) {
    e.printStackTrace();
  }
}
Here is an example of an input line:

2014\x0934447260\x09RBEKP\x090\x090\x09\xE2\x80\x9C@LENEsipper: Wild lmfaooo RT @Yerrp08: L**o some
 n***a nutt up while gettin twerked
I noticed that \x09 should be a tab character (ASCII 09 is the tab), so I tried using the Hex class from Apache Commons Codec:

    String tmp = value.toString();
    byte[] bytes = Hex.decodeHex(tmp.toCharArray());
However, the decodeHex function returns null.

This is odd, because some characters are hex-encoded while others are not. How can I decode them?

Edit: Also note that, besides tabs, emojis are encoded as hex values too.
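
To make the problem concrete: Hex.decodeHex expects a string made up only of hex digit pairs, whereas the sample line mixes plain text with literal \xHH escapes. One possible approach is sketched below, assuming every \xHH sequence in the input stands for one raw byte and that the resulting bytes are UTF-8 (this would explain why the quotation mark appears as \xE2\x80\x9C). The class name HexEscapeDecoder and the method decode are hypothetical names chosen for illustration; they are not part of Hadoop or Commons Codec.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop or Commons Codec.
final class HexEscapeDecoder {
    private HexEscapeDecoder() {}

    // Returns the value of a hex digit byte, or -1 if it is not a hex digit.
    private static int hexValue(byte b) {
        return Character.digit((char) (b & 0xFF), 16);
    }

    /**
     * Replaces each literal backslash-x-HH escape with the byte it names and
     * decodes the resulting byte sequence as UTF-8. Everything else is copied
     * through unchanged. Assumption: the data contains no genuine backslash
     * followed by 'x' and two hex digits that should be kept literally.
     */
    static String decode(String raw) {
        byte[] in = raw.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        int i = 0;
        while (i < in.length) {
            // An escape needs four bytes: '\', 'x', and two hex digits.
            if (in[i] == '\\' && i + 3 < in.length && in[i + 1] == 'x') {
                int hi = hexValue(in[i + 2]);
                int lo = hexValue(in[i + 3]);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 4;
                    continue;
                }
            }
            out.write(in[i]);
            i++;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Under these assumptions, the mapper could call HexEscapeDecoder.decode(value.toString()) before split("\t"), so that the \x09 separators become real tab characters and multi-byte sequences such as emojis are reassembled into single characters.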
