MongoDB: how do I run the Mongo-Hadoop connector's sensor log example?


I want to combine MongoDB with Hadoop, and I found the sensors example. However, I cannot find any complete documentation for this example.

There are four entries in mongo-hadoop/examples/sensors: build, run_job.sh, src, and testdata_generator.js. I used testdata_generator.js to import data into MongoDB; the database is demo. When I try to run run_job.sh, I get an exception:

MongoDB shell version: 2.6.1
connecting to: demo
false
Exception in thread "main" java.lang.ClassNotFoundException: -D
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:249)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
Here is run_job.sh:

#!/bin/sh


mongo demo --eval "db.logs_aggregate.drop()"
#Set your HADOOP_HOME directory here.
#export HADOOP_HOME="/Users/mike/hadoop/hadoop-2.0.0-cdh4.3.0" 
export HADOOP_HOME="/home/hduser/hadoop"

#FIRST PASS - map all the devices into an output collection
declare -a job1_args
job1_args=("jar" "`pwd`/build/libs/sensors-1.2.1-SNAPSHOT-hadoop_2.2.jar")
#job1_args=(${job1_args[@]} "com.mongodb.hadoop.examples.sensors.Devices")
job1_args=(${job1_args[@]} "-D" "mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat")
job1_args=(${job1_args[@]} "-D" "mongo.input.uri=mongodb://localhost:27017/demo.devices")
job1_args=(${job1_args[@]} "-D" "mongo.job.mapper=com.mongodb.hadoop.examples.sensors.DeviceMapper")
job1_args=(${job1_args[@]} "-D" "mongo.job.reducer=com.mongodb.hadoop.examples.sensors.DeviceReducer")

job1_args=(${job1_args[@]} "-D" "mongo.job.output.key=org.apache.hadoop.io.Text")
job1_args=(${job1_args[@]} "-D" "mongo.job.output.value=org.apache.hadoop.io.Text")

job1_args=(${job1_args[@]} "-D" "mongo.output.uri=mongodb://localhost:27017/demo.logs_aggregate")
job1_args=(${job1_args[@]} "-D" "mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat")
$HADOOP_HOME/bin/hadoop "${job1_args[@]}" "$1"
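The stack trace makes sense given the script above: when the jar's manifest does not name a Main-Class, Hadoop's RunJar treats the first argument after the jar path as the main class, and the line that would append com.mongodb.hadoop.examples.sensors.Devices is commented out, so "-D" lands in the class-name slot. A minimal sketch in plain bash, without invoking Hadoop at all (the jar name here is a placeholder):

```shell
# Rebuild the argument array the same way run_job.sh does, but inspect
# it instead of calling hadoop. "sensors.jar" is a placeholder name.
declare -a job1_args
job1_args=("jar" "sensors.jar")
# The line naming the main class is commented out in run_job.sh,
# so nothing is appended between the jar and the first -D option.
job1_args=(${job1_args[@]} "-D" "mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat")

# hadoop would see "-D" right after the jar path -- the slot where
# RunJar expects a main class, hence ClassNotFoundException: -D.
echo "class-name slot: ${job1_args[2]}"
# prints: class-name slot: -D
```

Uncommenting the line that appends com.mongodb.hadoop.examples.sensors.Devices before the -D options would presumably put a real class name in that slot.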
I can run the basic Map/Reduce examples on my machine, but this problem has stumped me for days.

Edit:

I was able to run this example with the following steps:

  • Compile Devices.java, DeviceMapper.java, DeviceReducer.java, and SensorDataGenerator.java to .class files; the command is javac -classpath [library files] -d [folder] Devices.java DeviceMapper.java DeviceReducer.java SensorDataGenerator.java
  • Package the .class files into a .jar; the command is jar -cvf [jar filename] -C [path]
  • Run Hadoop; the command is hadoop jar [jar filename] [classname]
  • But I still don't know why I can't run run_job.sh successfully

Devices.java is the main Java file in this example:

    public class Devices extends MongoTool {
    
        public Devices() throws UnknownHostException {
            Configuration conf = new Configuration();
            MongoConfig config = new MongoConfig(conf);
            setConf(conf);
    
            config.setInputFormat(MongoInputFormat.class);
            config.setInputURI("mongodb://localhost:27017/demo.devices");
            config.setOutputFormat(MongoOutputFormat.class);
            config.setOutputURI("mongodb://localhost:27017/demo.logs_aggregate");
    
            config.setMapper(DeviceMapper.class);
            config.setReducer(DeviceReducer.class);
            config.setMapperOutputKey(Text.class);
            config.setMapperOutputValue(Text.class);
            config.setOutputKey(IntWritable.class);
            config.setOutputValue(BSONWritable.class);
    
            new SensorDataGenerator().run();
        }
        public static void main(final String[] pArgs) throws Exception {
            System.exit(ToolRunner.run(new Devices(), pArgs));
        }
    }
    

    Run it with Gradle. Those bash scripts are somewhat outdated and should be removed:

    ./gradlew sensorData