Hadoop: java.lang.Exception: java.lang.RuntimeException: Error in configuring object
First of all, thanks for your help. In my map class I instantiate another class, WebPageToText. My first question: when the code runs on Hadoop, will the print statements in the map class show up anywhere? My second question: please help me fix this error. I keep running into the following:
14/04/02 20:39:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/04/02 20:39:36 WARN snappy.LoadSnappy: Snappy native library is available
14/04/02 20:39:36 INFO snappy.LoadSnappy: Snappy native library loaded
14/04/02 20:39:36 INFO mapred.FileInputFormat: Total input paths to process : 1
14/04/02 20:39:36 INFO mapred.JobClient: Running job: job_local1947041074_0001
14/04/02 20:39:36 INFO mapred.LocalJobRunner: Waiting for map tasks
14/04/02 20:39:36 INFO mapred.LocalJobRunner: Starting task: attempt_local1947041074_0001_m_000000_0
14/04/02 20:39:36 INFO util.ProcessTree: setsid exited with exit code 0
14/04/02 20:39:36 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1cf7491
14/04/02 20:39:36 INFO mapred.MapTask: Processing split: file:/usr/local/hadoop/project/input1/url.txt:0+68
14/04/02 20:39:36 INFO mapred.MapTask: numReduceTasks: 1
14/04/02 20:39:36 INFO mapred.MapTask: io.sort.mb = 100
14/04/02 20:39:36 INFO mapred.MapTask: data buffer = 79691776/99614720
14/04/02 20:39:36 INFO mapred.MapTask: record buffer = 262144/327680
14/04/02 20:39:36 INFO mapred.LocalJobRunner: Map task executor complete.
14/04/02 20:39:36 WARN mapred.LocalJobRunner: job_local1947041074_0001
java.lang.Exception: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.RuntimeException: Error in configuring object
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:701)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
... 11 more
Caused by: java.lang.NoClassDefFoundError: de/l3s/boilerpipe/BoilerpipeProcessingException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:881)
at org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:968)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
... 16 more
Caused by: java.lang.ClassNotFoundException: de.l3s.boilerpipe.BoilerpipeProcessingException
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
... 23 more
14/04/02 20:39:37 INFO mapred.JobClient: map 0% reduce 0%
14/04/02 20:39:37 INFO mapred.JobClient: Job complete: job_local1947041074_0001
14/04/02 20:39:37 INFO mapred.JobClient: Counters: 0
14/04/02 20:39:37 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
at webPageToTxt.ConfMain.run(ConfMain.java:33)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at webPageToTxt.ConfMain.main(ConfMain.java:40)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
The conf class:
package webPageToTxt;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class ConfMain extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        // creating a JobConf object and assigning a job name for identification purposes
        JobConf conf = new JobConf(getConf(), ConfMain.class);
        conf.setJobName("webpage to txt");
        // Setting configuration object with the Data Type of output Key and Value
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        // Providing the mapper and reducer class names
        conf.setMapperClass(WebPageToTxtMapper.class);
        conf.setReducerClass(WebPageToTxtReducer.class);
        // the hdfs input and output directory to be fetched from the command line
        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
        System.out.println("configuration is done");
        return 0;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new ConfMain(), args);
        System.exit(res);
    }
}
The map class:
package webPageToTxt;
import java.io.IOException;
import java.util.Scanner;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import de.l3s.boilerpipe.BoilerpipeProcessingException;
public class WebPageToTxtMapper extends MapReduceBase implements Mapper<Text, Text, Text, Text> {
    private Text url = new Text();
    private Text wordList = new Text();

    // map method that performs the tokenizer job and framing the initial key value pairs
    public void map(Text key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        try {
            System.out.println("Prepare to get into webpage");
            // String val = WebPageToTxt.webPageToTxt("http://en.wikipedia.org/wiki/Sun\nhttp://en.wikipedia.org/wiki/Earth");
            String val = WebPageToTxt.webPageToTxt(value.toString());
            System.out.println("Webpage main function implemented");
            Scanner scanner = new Scanner(val);
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                // process the line
                String[] arr = line.split("`", 2);
                url.set(arr[0]);
                wordList.set(line);
                output.collect(url, wordList);
            }
            scanner.close();
        } catch (BoilerpipeProcessingException e) {
            e.printStackTrace();
        }
    }
}
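As a quick standalone sanity check of the backtick-split parsing the mapper relies on, the snippet below (class name and sample line are hypothetical, not from the original code) shows that `split("\u0060", 2)` yields the URL portion before the first backtick:

```java
public class SplitCheck {
    // Mirrors the mapper's parsing: everything before the first backtick is the URL.
    static String urlOf(String line) {
        // limit = 2 splits at most once, so backticks later in the line are preserved
        return line.split("`", 2)[0];
    }

    public static void main(String[] args) {
        String line = "http://en.wikipedia.org/wiki/Sun`sun star solar";
        // Prints the URL part that the mapper would emit as the key
        System.out.println(urlOf(line));
    }
}
```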
Print statements in the mapper or reducer methods will not appear in the main job output. To see them, use the JobTracker web UI:
select the correct job id in the Jobid column
-> click Map/Reduce in the Kind column
-> click the corresponding map/reduce task -> click All in the
Task Logs column.

Since your mapper uses the class de.l3s.boilerpipe.BoilerpipeProcessingException, you need to make that class available in a distributed fashion. If you launch the application with the hadoop command, either pass the -libjars generic option
(this requires the ToolRunner class, which you already implement), or simply package the jar containing
de.l3s.boilerpipe.BoilerpipeProcessingException into the main jar itself.

Did you find a solution?
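A sketch of the -libjars invocation, assuming the boilerpipe jar is at /path/to/boilerpipe.jar and the job jar is named webpagetotxt.jar (both paths are placeholders, substitute your own):

```shell
# Ship the third-party jar to the tasks alongside the job jar.
# This works because ConfMain runs through ToolRunner, which parses
# generic options like -libjars before the program arguments.
hadoop jar webpagetotxt.jar webPageToTxt.ConfMain \
  -libjars /path/to/boilerpipe.jar \
  /usr/local/hadoop/project/input1 /usr/local/hadoop/project/output1
```

Note that -libjars must come before the input/output arguments, since GenericOptionsParser only consumes options that precede the remaining program arguments.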