HBase master keeps dying, claiming hbase:namespace already exists


I've hit a dead end with HBase today. We ran into a problem where the HBase master dies shortly after it starts. My master log looks like this:

2014-06-20 12:52:40,469 FATAL [master:hdev01:60000] master.HMaster: Master server abort: loaded coprocessors are: []
2014-06-20 12:52:40,470 FATAL [master:hdev01:60000] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:120)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
        at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1062)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:926)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:615)
        at java.lang.Thread.run(Thread.java:662)
2014-06-20 12:52:40,473 INFO  [master:hdev01:60000] master.HMaster: Aborting
2014-06-20 12:52:40,473 DEBUG [master:hdev01:60000] master.HMaster: Stopping service threads
2014-06-20 12:52:40,473 INFO  [master:hdev01:60000] ipc.RpcServer: Stopping server on 60000
2014-06-20 12:52:40,473 INFO  [CatalogJanitor-hdev01:60000] master.CatalogJanitor: CatalogJanitor-hdev01:60000 exiting
2014-06-20 12:52:40,473 INFO  [hdev01,60000,1403283149823-BalancerChore] balancer.BalancerChore: hdev01,60000,1403283149823-BalancerChore exiting
2014-06-20 12:52:40,474 INFO  [RpcServer.listener,port=60000] ipc.RpcServer: RpcServer.listener,port=60000: stopping
2014-06-20 12:52:40,474 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2014-06-20 12:52:40,474 INFO  [master:hdev01:60000] master.HMaster: Stopping infoServer
2014-06-20 12:52:40,474 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2014-06-20 12:52:40,474 INFO  [master:hdev01:60000.oldLogCleaner] cleaner.LogCleaner: master:hdev01:60000.oldLogCleaner exiting
2014-06-20 12:52:40,475 INFO  [hdev01,60000,1403283149823-ClusterStatusChore] balancer.ClusterStatusChore: hdev01,60000,1403283149823-ClusterStatusChore exiting
2014-06-20 12:52:40,476 INFO  [master:hdev01:60000.oldLogCleaner] master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x246ba2ab1e4001c, quorum=hdev02:5181,hdev01:5181,hdev03:5181, baseZNode=/hbase
2014-06-20 12:52:40,479 INFO  [master:hdev01:60000] mortbay.log: Stopped SelectChannelConnector@0.0.0.0:16010
2014-06-20 12:52:40,478 INFO  [master:hdev01:60000.archivedHFileCleaner] cleaner.HFileCleaner: master:hdev01:60000.archivedHFileCleaner exiting
2014-06-20 12:52:40,483 INFO  [master:hdev01:60000.oldLogCleaner] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001c closed
2014-06-20 12:52:40,484 INFO  [master:hdev01:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,589 DEBUG [master:hdev01:60000] catalog.CatalogTracker: Stopping catalog tracker org.apache.hadoop.hbase.catalog.CatalogTracker@f3f348b
2014-06-20 12:52:40,591 INFO  [master:hdev01:60000] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x246ba2ab1e4001b
2014-06-20 12:52:40,592 INFO  [master:hdev01:60000] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001b closed
2014-06-20 12:52:40,592 INFO  [master:hdev01:60000-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,695 INFO  [hdev01,60000,1403283149823.splitLogManagerTimeoutMonitor] master.SplitLogManager$TimeoutMonitor: hdev01,60000,1403283149823.splitLogManagerTimeoutMonitor exiting
2014-06-20 12:52:40,696 INFO  [master:hdev01:60000] zookeeper.ZooKeeper: Session: 0x246ba2ab1e4001a closed
2014-06-20 12:52:40,696 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2014-06-20 12:52:40,696 INFO  [master:hdev01:60000] master.HMaster: HMaster main thread exiting
2014-06-20 12:52:40,697 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:194)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)

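I thought this might be leftovers from an old run, so I deleted the HBase data directory, the ZooKeeper data directories, and the files in HDFS. I still get the same error. The odd thing is that when I run stop-hbase.sh, my HMaster process temporarily comes back up, although I can't do anything with it.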

My HBase version is 0.98.3 and my Hadoop version is 2.2.0. My hbase-site.xml is:

<configuration>
<property>
  <name>hbase.master</name>
  <value>hdev01:60000</value>
  <description>The host and port that the HBase master runs at.
    A value of 'local' runs the master and a regionserver
    in a single process.
  </description>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://hdev01:9000/hbase</value>
  <description>The directory shared by region servers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>The mode the cluster will be in. Possible values are
    false: standalone and pseudo-distributed setups with managed Zookeeper
    true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>5181</value>
  <description>Property from ZooKeeper's config zoo.cfg.
    The port at which the clients will connect.
  </description>
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>10000</value>
  <description></description>
</property>
<property>
  <name>hbase.client.retries.number</name>
  <value>10</value>
  <description></description>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hdev01,hdev02,hdev03</value>
  <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com". By default this is
    set to localhost for local and pseudo-distributed modes of operation.
    For a fully-distributed setup, this should be set to a full list of
    ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
    this is the list of servers which we will start/stop ZooKeeper on.
  </description>
</property>
</configuration>

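EDIT: I tried hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair, and now my error is:
HBase file layout needs to be upgraded. You have version null and I want version 8. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'
This is not helpful, because hbck cannot actually run without a master.

EDIT: I wiped and restarted DFS, then tried the repair and the startup again, and now I am back where I started.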

The hbase namespace is an internal namespace that HBase uses for its own management tables. Try running the offline repair tool from $HBASE_HOME:

 ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

su - hdfs

hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair

(Restart the HBase master. If the problem persists, do the following.)

zookeeper-client (press Enter)

rmr /hbase

quit

Then restart the HBase master service.
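The steps above can be sketched as a small shell script. This is only a sketch under stated assumptions, not a tested recovery procedure: it assumes `hbase` and `hbase-daemon.sh` are on the PATH, that the master is restarted with `hbase-daemon.sh start master`, and that the ZooKeeper shell is launched with `zookeeper-client` (override the hypothetical `ZK_CLI` variable with `hbase zkcli` to use the client bundled with HBase). The `run`, `repair_meta`, and `reset_znode` names are made up for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the recovery sequence above. The restart command and the ZK_CLI
# default are assumptions; adjust them to match your installation.

# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
# ZooKeeper shell to launch: "zookeeper-client" as in the steps above, or
# "hbase zkcli" for the client bundled with HBase.
ZK_CLI="${ZK_CLI:-zookeeper-client}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: run the offline meta repair tool, then start the master again.
# The answer above runs this as the hdfs user (su - hdfs).
repair_meta() {
  run hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
  run hbase-daemon.sh start master
}

# Step 2 (only if the master still aborts): open the ZooKeeper shell so the
# /hbase znode can be removed by hand, then start the master again.
# Removing /hbase discards whatever state HBase keeps in ZooKeeper.
reset_znode() {
  echo "In the ZooKeeper shell, type: rmr /hbase   and then: quit"
  # ZK_CLI is left unquoted on purpose so "hbase zkcli" splits into two words.
  run $ZK_CLI
  run hbase-daemon.sh start master
}

# Usage: script.sh repair | script.sh reset
case "${1:-}" in
  repair) repair_meta ;;
  reset)  reset_znode ;;
esac
```

Running the script with `repair` prints the first step; `DRY_RUN=0` together with `reset` opens the ZooKeeper shell for the manual `rmr /hbase`. Keeping the znode removal interactive is deliberate, since it is the destructive part of the procedure.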

@shash: When HBase manages ZooKeeper (i.e. HBASE_MANAGES_ZK=true), the command to access and clean up the HBase data is hbase zkcli. Then use the command rmr /hbase to clean up the HBase data, and quit.

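This at least changed my error. Now it tells me: WARNING! HBase file layout needs to be upgraded. You have version null and I want version 8. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'. That is not helpful, since I have no master to run hbck against.

After wiping DFS and rerunning the offline repair tool, I am back where I started.

You can also log in to ZooKeeper and delete the /hbase dir (hbase zkcli, then rmr /hbase; be careful not to delete anything else).

So in the end I tried turning it off and on again in various ways. Thankfully, this time it seems to have worked. Thanks a lot for your help.

@arnon rotem gal oz - Does deleting the /hbase znode also cause all HBase data to be lost, including tables, region servers, etc.? Or is the HBase master able to rebuild that information without losing data?

What if ZooKeeper is managed internally by HBase, i.e. HBASE_MANAGES_ZK=true? How do you delete the data in ZooKeeper then?

I tried the repair command, but it did not help. Same error.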