
Hadoop: Error importing data from MongoDB into HDFS


I get an error when importing data from MongoDB into HDFS. I am using the following:

  • Ambari Sandbox [Hortonworks], Hadoop 2.7
  • MongoDB version 3.0

Here are the jar files I am including:

  • mongo-java-driver-2.11.4.jar
  • mongo-hadoop-core-1.3.0.jar

Here is the code I am using:

    package com.mongo.test;

    import java.io.*;
    import org.apache.commons.logging.*;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.*;
    import org.apache.hadoop.mapreduce.*;
    import org.bson.*;
    import com.mongodb.MongoClient;
    import com.mongodb.hadoop.*;
    import com.mongodb.hadoop.util.*;

    public class ImportFromMongoToHdfs {

        private static final Log log = LogFactory.getLog(ImportFromMongoToHdfs.class);

        // Mapper: reads each MongoDB document as a BSONObject and writes
        // a tab-separated record keyed by the "md5" field.
        public static class ReadEmpDataFromMongo extends Mapper<Object, BSONObject, Text, Text> {
            public void map(Object key, BSONObject value, Context context)
                    throws IOException, InterruptedException {
                System.out.println("Key: " + key);
                System.out.println("Value: " + value);
                String md5 = value.get("md5").toString();
                String name = value.get("name").toString();
                String dev = value.get("dev").toString();
                String salary = value.get("salary").toString();
                String location = value.get("location").toString();
                String output = "\t" + name + "\t" + dev + "\t" + salary + "\t" + location;
                context.write(new Text(md5), new Text(output));
            }
        }

        public static void main(String[] args) throws Exception {
            final Configuration conf = new Configuration();
            MongoConfigUtil.setInputURI(conf, "mongodb://10.25.3.196:27017/admin.emp");
            MongoConfigUtil.setCreateInputSplits(conf, false);
            System.out.println("Configuration: " + conf);
            final Job job = new Job(conf, "ReadWeblogsFromMongo");
            Path out = new Path("/mongodb3");
            FileOutputFormat.setOutputPath(job, out);
            job.setJarByClass(ImportFromMongoToHdfs.class);
            job.setMapperClass(ReadEmpDataFromMongo.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            job.setInputFormatClass(com.mongodb.hadoop.MongoInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
            job.setNumReduceTasks(0);  // map-only job, no reduce phase
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Does anyone know what is going wrong?

Make sure the mongo-hadoop jars are on the Hadoop classpath, then restart Hadoop. That should resolve the error:

java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat
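As a sketch of this suggestion (jar locations are illustrative; adjust them to wherever your jars actually live), the jars can be appended to the Hadoop classpath via hadoop-env.sh before restarting:

```shell
# Add to $HADOOP_HOME/etc/hadoop/hadoop-env.sh (illustrative paths):
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/local/lib/mongo-hadoop-core-1.3.0.jar"
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/local/lib/mongo-java-driver-2.11.4.jar"

# Then restart the Hadoop services (on the Hortonworks sandbox this is
# typically done through Ambari).
```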

You are getting the ClassNotFoundException because "mongo-hadoop-core*.jar" is not accessible to your code. You must make "mongo-hadoop-core*.jar" available to it.

There are several ways to fix this error:

  • Build a fat jar for your program. A fat jar bundles all the necessary dependency jars, and most IDEs can produce one easily.

  • Use the "-libjars" argument when submitting the job.

  • Copy the mongo jars to the HADOOP_CLASSPATH location.
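For the "-libjars" option, the submission would look roughly like the following (jar paths are illustrative). Note that -libjars is parsed by Hadoop's GenericOptionsParser, so it only takes effect if the driver implements the Tool interface and is run through ToolRunner; the job log below in fact warns about exactly this.

```shell
# Ship the dependency jars with the job via -libjars (paths illustrative).
hadoop jar /mongoinput/mongdbconnect.jar com.mongo.test.ImportFromMongoToHdfs \
    -libjars /path/to/mongo-hadoop-core-1.3.0.jar,/path/to/mongo-java-driver-2.11.4.jar
```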


  • I just solved exactly this kind of problem. It is really a runtime error: setting HADOOP_CLASSPATH to point at the required external jar files is not enough, because at runtime Hadoop looks for jars inside the folders where Hadoop itself is installed. I realized that the necessary external jars have to be copied into the Hadoop installation. So: first check what is on the HADOOP_CLASSPATH by running `hadoop classpath`, then copy the necessary external jar files into one of the HADOOP_CLASSPATH locations. For example, I copied mongo-hadoop-1.5.1.jar and a few other jar files into the folder /usr/local/hadoop/share/hadoop/mapreduce.

    That worked for me.
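The steps in that answer can be sketched as follows (the destination directory is the example given above; the jar names and locations are assumptions to adapt to your setup):

```shell
# Inspect where Hadoop actually looks for jars at runtime:
hadoop classpath

# Copy the external jars into one of the directories it reports,
# e.g. the mapreduce share folder used in the answer above:
cp mongo-hadoop-core-1.3.0.jar mongo-java-driver-2.11.4.jar \
   /usr/local/hadoop/share/hadoop/mapreduce/
```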

    Could you format your post? As it stands it is very hard to read. — Thanks sras for the valuable answer; I pasted the mongo-hadoop jars into the hadoop/lib folder and restarted Hadoop, but it still doesn't work. — Thanks pradeep for the valuable answer; I pasted the mongo-hadoop jars into the hadoop/lib folder and restarted Hadoop, and it still doesn't work. How do I find the HADOOP_CLASSPATH location?
     [root@sandbox ~]# hadoop jar /mongoinput/mongdbconnect.jar com.mongo.test.ImportFromMongoToHdfs
    
    WARNING: Use "yarn jar" to launch YARN applications.
    Configuration: Configuration: core-default.xml, core-site.xml
    15/09/09 09:22:51 INFO impl.TimelineClientImpl: Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
    15/09/09 09:22:53 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/10.25.3.209:8050
    15/09/09 09:22:53 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
    15/09/09 09:22:54 INFO splitter.SingleMongoSplitter: SingleMongoSplitter calculating splits for mongodb://10.25.3.196:27017/admin.emp
    15/09/09 09:22:54 INFO mapreduce.JobSubmitter: number of splits:1
    15/09/09 09:22:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1441784509780_0003
    15/09/09 09:22:55 INFO impl.YarnClientImpl: Submitted application application_1441784509780_0003
    15/09/09 09:22:55 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1441784509780_0003/
    15/09/09 09:22:55 INFO mapreduce.Job: Running job: job_1441784509780_0003
    15/09/09 09:23:05 INFO mapreduce.Job: Job job_1441784509780_0003 running in uber mode : false
    15/09/09 09:23:05 INFO mapreduce.Job:  map 0% reduce 0%
    15/09/09 09:23:12 INFO mapreduce.Job: Task Id : attempt_1441784509780_0003_m_000000_0, Status : FAILED
    Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat not found
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
            at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat not found
            at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
            ... 8 more
    15/09/09 09:23:18 INFO mapreduce.Job: Task Id : attempt_1441784509780_0003_m_000000_1, Status : FAILED
    Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat not found
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
            at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat not found
            at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
            ... 8 more
    15/09/09 09:23:24 INFO mapreduce.Job: Task Id : attempt_1441784509780_0003_m_000000_2, Status : FAILED
    Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat not found
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
            at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:749)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:415)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
    Caused by: java.lang.ClassNotFoundException: Class com.mongodb.hadoop.MongoInputFormat not found
            at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
            at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
            ... 8 more
    
    15/09/09 09:23:32 INFO mapreduce.Job:  map 100% reduce 0%
    15/09/09 09:23:32 INFO mapreduce.Job: Job job_1441784509780_0003 failed with state FAILED due to: Task failed task_1441784509780_0003_m_000000
    Job failed as tasks failed. failedMaps:1 failedReduces:0
    15/09/09 09:23:32 INFO mapreduce.Job: Counters: 9
            Job Counters
                    Failed map tasks=4
                    Launched map tasks=4
                    Other local map tasks=3
                    Rack-local map tasks=1
                    Total time spent by all maps in occupied slots (ms)=16996
                    Total time spent by all reduces in occupied slots (ms)=0
                    Total time spent by all map tasks (ms)=16996
                    Total vcore-seconds taken by all map tasks=16996
                    Total megabyte-seconds taken by all map tasks=4249000
    [root@sandbox ~]#