
How to filter by _id in MongoDB using Pig


I have MongoDB documents like this:

db.activity_days.findOne()
{
    "_id" : ObjectId("54b4ee617acf9ce0440a3185"),
    "aca" : 0,
    "ca" : 0,
    "cbdw" : true,
    "day" : ISODate("2014-12-10T00:00:00Z"),
    "dm" : 0,
    "fbc" : 0,
    "go" : 2500,
    "gs" : [ ],
    "its" : [
        {
            "_id" : ObjectId("551ac8d44f9f322e2b055d3a"),
            "at" : 2000,
            "atn" : "Running",
            "cas" : 386.514909469507,
            "dis" : 2.788989730832084,
            "du" : 1472,
            "ibr" : false,
            "ide" : false,
            "lcs" : false,
            "pt" : 0,
            "rpt" : 0,
            "src" : 1001,
            "stp" : 0,
            "tcs" : [ ],
            "ts" : 1418257729,
            "u_at" : ISODate("2015-01-13T00:32:10.954Z")
        }
    ],
    "po" : 0,
    "se" : 0,
    "st" : 0,
    "tap3c" : [ ],
    "tzo" : -21600,
    "u_at" : ISODate("2015-01-13T00:32:10.952Z"),
    "uid" : ObjectId("545eb753ae9237b1df115649")
}
I want to use Pig to select a specific _id range. In the mongo shell I can write the query like this:

db.activity_days.find({_id: {$gt: ObjectId("54a48e000000000000000000"), $lt: ObjectId("54cd6c800000000000000000")}})
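
For context, the two bounds in that query look like they were derived from dates: the first four bytes of an ObjectId are a big-endian Unix timestamp in seconds, so a timestamp followed by sixteen zero hex digits is the smallest ObjectId for that second. A minimal sketch in plain Python (standard library only; the helper name objectid_floor is made up for illustration and is not part of any driver):

```python
from datetime import datetime, timezone


def objectid_floor(moment: datetime) -> str:
    """Return the smallest ObjectId hex string for the given second.

    Hypothetical helper: it only builds the 24-character hex form,
    it does not talk to MongoDB.
    """
    seconds = int(moment.timestamp())
    # 8 hex digits of timestamp, then 16 zero hex digits for the other 8 bytes
    return format(seconds, "08x") + "0" * 16


lower = objectid_floor(datetime(2015, 1, 1, tzinfo=timezone.utc))
upper = objectid_floor(datetime(2015, 2, 1, tzinfo=timezone.utc))
print(lower)  # 54a48e000000000000000000
print(upper)  # 54cd6c800000000000000000
```

Read this way, the range in the question covers documents whose _id was generated during January 2015 (UTC).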

But I don't know how to write this in Pig. Does anyone know how?

You can try the mongo-hadoop connector for Pig; see the connector's documentation on using it with Pig.

Once you have registered the JARs (core, pig, and the Java driver), for example with
REGISTER /path-to/mongo-hadoop-pig-<version>.jar, you can run:

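-- The _id range is passed to MongoDB as the input query; \$ stops Pig from reading $gt, $lt and $oid as parameters
SET mongo.input.query '{"_id":{"\$gt":{"\$oid":"54a48e000000000000000000"},"\$lt":{"\$oid":"54cd6c800000000000000000"}}}';
-- Replace database.collection with your own database and collection names
rangeActivityDay = LOAD 'mongodb://localhost:27017/database.collection' USING com.mongodb.hadoop.pig.MongoLoader();
DUMP rangeActivityDay;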
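
Hand-written JSON inside a Pig string is easy to get wrong (an unbalanced quote or an unescaped $ is enough). One option, offered only as a sketch, is to generate the SET line with a small script. The example below is plain Python using only the standard library; the function name build_set_statement is hypothetical, and it assumes, as in the snippet above, that every $ must be written as \$ in the Pig script.

```python
import json


def build_set_statement(lower_oid: str, upper_oid: str) -> str:
    """Build a Pig SET line for mongo.input.query (illustrative helper)."""
    # Extended-JSON form of the range filter on _id
    query = {"_id": {"$gt": {"$oid": lower_oid}, "$lt": {"$oid": upper_oid}}}
    # Compact separators keep the string free of stray spaces
    compact = json.dumps(query, separators=(",", ":"))
    # Assumption: Pig needs $ escaped as \$ to avoid parameter substitution
    escaped = compact.replace("$", "\\$")
    return "SET mongo.input.query '" + escaped + "';"


statement = build_set_statement(
    "54a48e000000000000000000", "54cd6c800000000000000000"
)
print(statement)
```

The printed line can be pasted into the Pig script in place of the hand-written SET statement.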
You may also want to limit the results before dumping the data.

The above was tested with:
mongo-java-driver-3.0.0-rc1.jar
mongo-hadoop-pig-1.4.0.jar
mongo-hadoop-core-1.4.0.jar