Hive metastore read timed out

Tags: hive, hive-metastore

We are using Hive 2.3.3 with Hadoop 2.7.7 and Spark 2.4.4.

We use MariaDB as the backend database for the metastore.

The metastore service starts up fine, and I am able to access the Hive CLI and run queries. I am not starting HS2, in order to isolate the problem.
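For context, a minimal sketch of how such a setup is typically brought up (assuming the stock Hive 2.3 launcher scripts are on the PATH; the log file path below is purely illustrative and not taken from the question):

# Start only the standalone metastore service; HiveServer2 is intentionally not started.
nohup hive --service metastore > /tmp/metastore.out 2>&1 &

# Use the Hive CLI, which talks to the metastore directly over its thrift URI.
hive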

But after a while, the Hive metastore service suddenly stops working and no longer responds even to simple queries such as
show databases
show tables

All the tables we run our other queries against are empty (no partitions have been created so far), and this environment was set up only recently.

Error from hive.log:

2020-10-09T18:43:56,971 DEBUG [IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser] ipc.Client: IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser: closed
2020-10-09T18:43:56,971 DEBUG [IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser] ipc.Client: IPC Client (1223050066) connection to master/10.28.66.65:8020 from hadoopuser: stopped, remaining connections 0
2020-10-09T18:53:53,769  WARN [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] metastore.RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect (1 of 1) after 5s. getAllFunctions
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_all_functions(ThriftHiveMetastore.java:3812) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_all_functions(ThriftHiveMetastore.java:3800) ~[hive-exec-2.3.3.jar:2.3.3]
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllFunctions(HiveMetaStoreClient.java:2393) ~[hive-exec-2.3.3.jar:2.3.3]
....


2020-10-09T18:53:58,777  INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Closed a connection to metastore, current connections: 0
2020-10-09T18:53:58,777  INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Trying to connect to metastore with URI thrift://master:9083
2020-10-09T18:53:58,779  INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Opened a connection to metastore, current connections: 1
2020-10-09T18:53:58,805  INFO [9edaa669-999e-41a9-a7b3-bcee9d6198f4 main] hive.metastore: Connected to metastore.

Since you are also losing the connection to the namenode (connection to master/10.28.66.65:8020 from hadoopuser: closed), I would first look at your network setup. How many partitions do your tables have? With too many partitions the chance of a timeout is high; also, what is your
hive.metastore.client.socket.timeout
value?

Hello, and thanks for the reply. The connection to 8020 (the yarn master) is not failing; it is the connection from the metastore that fails, to the database I guess. Also, the problem occurs even when there is no data in Hadoop at all, so I have ruled out the partition issue. In the meantime, I have set hive.metastore.client.socket.timeout to 1800.
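For reference, raising that timeout permanently is usually done in hive-site.xml on the client side. A hedged sketch only: the 1800-second value mirrors what was tried above and the URI mirrors the thrift://master:9083 address visible in the log; this is not the poster's actual configuration file:

<!-- hive-site.xml: metastore client settings -->
<property>
  <name>hive.metastore.client.socket.timeout</name>
  <value>1800s</value> <!-- Hive 2.x default is 600s -->
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master:9083</value>
</property>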