MongoDB: unable to use the mongo-hadoop connector in a Hive query on HDP
I am new to Hadoop. I have installed the Hortonworks Sandbox 2.1 and am trying to run a Hive script from the Hive UI. I want to access a MongoDB collection from within Hive. To do that, I used the following query:
CREATE TABLE individuals
(
id INT,
name STRING,
age INT,
city STRING,
hobby STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id"}')
TBLPROPERTIES('mongo.uri'='mongodb://<hostIP>:27017/test.test');
Can anyone tell me what I am missing? Thanks in advance.
You need to map every field in the MongoDB collection, not just "_id":
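CREATE TABLE individuals
(
id INT,
name STRING,
age INT,
city STRING,
hobby STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","name":"<corresponding name in your collection>", "age":"<same here>", etc...}')
TBLPROPERTIES('mongo.uri'='mongodb://<hostIP>:27017/test.test');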
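If it helps, the mapping value can be generated instead of typed by hand, which avoids quoting mistakes in the JSON. The following is only a sketch: the helper name build_columns_mapping is made up for illustration, and the MongoDB field names (everything except "_id") are assumptions that simply reuse the Hive column names, so replace them with the real field names from your collection.

```python
import json


def build_columns_mapping(hive_to_mongo):
    """Return the JSON string for the 'mongo.columns.mapping' SerDe property."""
    # Compact separators keep the value on a single line inside the DDL.
    return json.dumps(hive_to_mongo, separators=(",", ":"))


# Hypothetical mapping: apart from id -> _id, the MongoDB field names are
# ASSUMED to equal the Hive column names. Adjust to your collection.
mapping = build_columns_mapping({
    "id": "_id",
    "name": "name",
    "age": "age",
    "city": "city",
    "hobby": "hobby",
})

# Build the clause that goes into the CREATE TABLE statement.
serde_clause = "WITH SERDEPROPERTIES('mongo.columns.mapping'='{}')".format(mapping)
print(serde_clause)
```

The printed clause can then be pasted into the CREATE TABLE statement in place of the hand-written one.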
Thank you, I can create the table now. But when I try to fetch data with select * from table_name, it fails with: java.io.IOException: com.mongodb.MongoTimeoutException: Timed out while waiting to connect after 10000 ms.
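A connection timeout like this often means the machine running the query cannot reach the MongoDB host and port given in mongo.uri at all. As a rough first check (a sketch, not a diagnosis), you could test plain TCP reachability from the sandbox. The function name can_reach is made up for illustration, and "localhost" is only a placeholder for the host in your mongo.uri.

```python
import socket


def can_reach(host, port=27017, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the host from your mongo.uri.
    print(can_reach("localhost"))
```

If this prints False when run on the sandbox, the problem is at the network level (wrong host, firewall, or mongod not listening on that interface) rather than in the Hive table definition.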
Error log from the failing CREATE TABLE statement in the question:
15/03/11 04:38:24 INFO exec.DDLTask: Use StorageHandler-supplied com.mongodb.hadoop.hive.BSONSerDe for table individuals
15/03/11 04:38:24 ERROR exec.DDLTask: java.lang.NoClassDefFoundError: com/mongodb/util/JSON
at com.mongodb.hadoop.hive.BSONSerDe.initialize(BSONSerDe.java:107)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:339)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:283)
at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:276)
at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:626)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:593)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4194)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:281)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1504)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1271)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.execute(BeeswaxServiceImpl.java:349)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:614)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1$1.run(BeeswaxServiceImpl.java:603)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1537)
at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState$1.run(BeeswaxServiceImpl.java:603)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
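The NoClassDefFoundError for com/mongodb/util/JSON in the trace above means that class was not on Hive's classpath when the table was created. That class ships with the MongoDB Java driver, so one thing worth checking is whether the driver jar you added really contains it. A minimal sketch for that check follows; the function name jar_has_class is made up for illustration, and the jar path in the usage comment is a hypothetical placeholder.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar at `jar_path` contains the class `class_name`."""
    # A jar is a zip archive; com.mongodb.util.JSON is stored as
    # the entry com/mongodb/util/JSON.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example usage (the path is a hypothetical placeholder):
# jar_has_class("/path/to/mongo-java-driver.jar", "com.mongodb.util.JSON")
```

If the class is missing from every jar Hive can see, the driver jar needs to be made available to Hive, for example with Hive's ADD JAR statement, before running the CREATE TABLE again.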