Hadoop: connecting to Hive from a mapper with Kerberos security
My goal is to run a MapReduce job, scheduled by the Oozie workflow scheduler, that connects to Hive on a secured (Kerberos) HDP 2.3 cluster. I can connect to Hive via Beeline, or when I run it as a plain Java application (YARN jar) with the following connection string:
DriverManager.getConnection("jdbc:hive2://host:10000/;principal=hive/_HOST@REALM", "", "");
But when I run it inside a Mapper, it fails:
ERROR [main] org.apache.thrift.transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:190)
...
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
How can I make this work inside a Mapper?

The error "Failed to find any Kerberos tgt" is quite straightforward: you need to obtain a Kerberos ticket on the node executing the mapper before connecting to Hive. This works with a Hive delegation token:
- Add the properties:
  hive2.server.principal=hive/_HOST@REALM
  hive2.jdbc.url=jdbc:hive2://{host}:10000/default
- Set the credentials to hive2
- Example mapper:
public class HiveMapperExample extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        try {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            // Authenticate with the delegation token shipped in the job credentials
            Connection connect = DriverManager.getConnection("jdbc:hive2://{host}:10000/;auth=delegationToken", "", "");
            Statement state = connect.createStatement();
            ResultSet resultSet = state.executeQuery("select * from some_table");
            while (resultSet.next()) {
                ...
            }
        } catch (Exception e) {
            ...
        }
    }
}
public class HiveTestApplication extends Configured implements Tool {
public static void main(String[] args) throws Exception {
System.exit(ToolRunner.run(new HiveTestApplication(), args));
}
@Override
public int run(String[] args) throws Exception {
Configuration conf = new Configuration();
//set your conf
Job job = Job.getInstance(conf);
job.setMapperClass(HiveMapperExample.class);
addHiveDelegationToken(job.getCredentials(), "jdbc:hive2://{host}:10000/", "hive/_HOST@REALM");
job.waitForCompletion(true);
return 0;
}
public void addHiveDelegationToken(Credentials creds, String url, String principal) throws Exception {
Class.forName("org.apache.hive.jdbc.HiveDriver");
Connection con = DriverManager.getConnection(url + ";principal=" + principal);
// get delegation token for the given proxy user
String tokenStr = ((HiveConnection) con).getDelegationToken(UserGroupInformation.getCurrentUser().getShortUserName(), principal);
con.close();
Token<DelegationTokenIdentifier> hive2Token = new Token<>();
hive2Token.decodeFromUrlString(tokenStr);
creds.addToken(new Text("hive.server2.delegation.token"), hive2Token);
creds.addToken(new Text(HiveAuthFactory.HS2_CLIENT_TOKEN), hive2Token);
}
}
How do I do that? Can you give an example? I am currently looking into delegation tokens, and it is not easy if you try this yourself: code running in YARN has a security manager and many other settings that prevent "just getting a TGT", which is trivial outside a map-reduce job but blocked inside one.

What does "set the credentials to hive2" in the first step mean? How is that done, and where?

OK, I think I see what you mean: add the hive2 settings and token to the workflow, as described there. At that point it started working for me.
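For reference, "set the credentials to hive2" refers to Oozie's built-in `hive2` credential type, which fetches the delegation token for you and attaches it to the launched action. A minimal sketch of the relevant parts of a `workflow.xml`, assuming placeholder values for the host, REALM, action body, and the credential name `hive2_cred`:

```xml
<workflow-app name="hive-mr-example" xmlns="uri:oozie:workflow:0.5">
  <credentials>
    <!-- Oozie's hive2 credential type obtains a HiveServer2 delegation token -->
    <credential name="hive2_cred" type="hive2">
      <property>
        <name>hive2.server.principal</name>
        <value>hive/_HOST@REALM</value>
      </property>
      <property>
        <name>hive2.jdbc.url</name>
        <value>jdbc:hive2://{host}:10000/default</value>
      </property>
    </credential>
  </credentials>
  <start to="mr-node"/>
  <!-- cred= attaches the hive2 delegation token to this action's credentials -->
  <action name="mr-node" cred="hive2_cred">
    <map-reduce>
      <!-- job-tracker, name-node, and job configuration go here -->
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Job failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

With the token attached this way, the mapper can connect with `;auth=delegationToken` in the JDBC URL, as in the example above, without obtaining a TGT itself.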