Amazon EC2: Unable to connect from an EMR Hive cluster to a remote EC2 HBase cluster
Before I ask my question, let me explain the situation. I have set up an HBase cluster on EC2 with 3 instances:
i-xxxxxxx-- master, zookeeper1, regionserver1
i-xxxxxxx-- slave1, zookeeper2, regionserver2
i-xxxxxxx-- slave2, zookeeper3, regionserver3.
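It works perfectly well. Now I am trying to connect to the master of this cluster from a remote EMR instance that has Hive installed.
To do so, I followed this link from the Amazon AWS documentation: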
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hbase-access-hive.html
Before you create the table, let Hive know the public DNS of the remote HBase master:
set hbase.zookeeper.quorum=<public DNS name>;
I did exactly what is described there, but I cannot connect to the master. I tried to create a table with the following script:
CREATE TABLE hauto(cookie string, timespent string, pageviews string, visit string, logdate string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "m:timespent, m:pageviews, m:visit, m:logdate")
TBLPROPERTIES ("hbase.table.name" = "hauto");
It failed with this error:
FAILED: Error in metadata: MetaException(message:org.apache.hadoop.hbase.MasterNotRunningException: Retried 10 times
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:127)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:74)
at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:148)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:467)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:460)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
at com.sun.proxy.$Proxy14.createTable(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:600)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3791)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:258)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:310)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:231)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:466)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:819)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:674)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
I am using Hadoop 1.0.1, HBase 0.94.11, Hive 0.11.0 and zookeeper-3.4.3. ZooKeeper is managed externally.
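After that, I tried it the other way around. This time I connected to the EMR HBase from the Hive CLI installed on the EC2 HBase cluster, and I was able to create the same table on the EMR HBase.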
Updated question:
This seems to be an issue with Amazon EC2. I am providing snapshots of the problem, along with the following link:
http://hbase.apache.org/book.html#trouble.log.gc
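Comments:
Did you open all the necessary ports in the security group (firewall)?
Yes. I put my EC2 HBase cluster in the same security group in which EMR spins up the Hadoop/Hive instances by default. In that group I allow TCP connections on all ports within the group.
You should allow UDP connections as well.
Sorry, I forgot to mention that earlier. I also allow ICMP and UDP connections on all ports in that security group.
Try using telnet on that port.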
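The telnet suggestion above can also be scripted. Below is a minimal sketch in Python, not part of the original thread, that attempts a TCP connection to each service port. It assumes the cluster uses the stock defaults for HBase 0.94 and ZooKeeper (2181 for ZooKeeper, 60000 for the HBase master, 60020 for region servers); if the ports are overridden in hbase-site.xml or zoo.cfg, adjust the table. The function names are illustrative only.

```python
import socket

# Assumed default ports for ZooKeeper 3.4 and HBase 0.94.
# Change these if the cluster configuration overrides them.
DEFAULT_PORTS = {
    "zookeeper": 2181,
    "hbase-master": 60000,
    "hbase-regionserver": 60020,
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def check_hosts(hosts, ports=DEFAULT_PORTS, timeout=3.0):
    """Map each (host, service name) pair to its TCP reachability."""
    results = {}
    for host in hosts:
        for name, port in ports.items():
            results[(host, name)] = port_is_open(host, port, timeout)
    return results
```

Run from the EMR node, a call such as check_hosts(["<public DNS name>"]) with the real DNS names of the three HBase instances would show which ports are reachable. This only exercises TCP, as telnet does. A successful connection does not prove that the hostname the HBase master registers in ZooKeeper is resolvable from the EMR side.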