Apache: Is there a way to get the database name in a Hive UDF?


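I am writing a Hive UDF.

I need to get the name of the database in which the function is deployed. Then I need to access some files on HDFS depending on the database environment. Could you tell me which function I can use to run an HQL query from within a Hive UDF?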

  • Write the UDF class and package it as a jar file.
  • Use the UDF in a Hive query, as shown below.

    hive> use mydb;
    OK
    Time taken: 0.454 seconds

    hive> ADD jar /root/MyUdf.jar;
    Added [/root/MyUdf.jar] to class path
    Added resources: [/root/MyUdf.jar]
    
    hive> create temporary function myUdfFunction as 'com.hiveudf.strmnp.MyHiveUdf';
    OK
    Time taken: 0.018 seconds
    
    hive> select myUdfFunction(username,current_database()) from users;
    Query ID = root_20170407151010_2ae29523-cd9f-4585-b334-e0b61db2c57b
    Total jobs = 1
    Launching Job 1 out of 1
    Number of reduce tasks is set to 0 since there's no reduce operator
    Starting Job = job_1491484583384_0004, Tracking URL = http://mac127:8088/proxy/application_1491484583384_0004/
    Kill Command = /opt/cloudera/parcels/CDH-5.9.0-1.cdh5.9.0.p0.23/lib/hadoop/bin/hadoop job  -kill job_1491484583384_0004
    Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
    2017-04-07 15:11:11,376 Stage-1 map = 0%,  reduce = 0%
    2017-04-07 15:11:19,766 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.12 sec
    MapReduce Total cumulative CPU time: 3 seconds 120 msec
    Ended Job = job_1491484583384_0004
    MapReduce Jobs Launched:
    Stage-Stage-1: Map: 1   Cumulative CPU: 3.12 sec   HDFS Read: 21659 HDFS Write: 381120 SUCCESS
    Total MapReduce CPU Time Spent: 3 seconds 120 msec
    OK
    
    mydb.user1
    mydb.user2
    mydb.user3
    
    Time taken: 2.137 seconds, Fetched: 3 row(s)
    hive>
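The answer does not show the UDF class itself. A minimal sketch that is consistent with the session output above (`mydb.user1` and so on), assuming the function simply joins its two arguments with a dot, might look like the following. In an actual Hive deployment the class would live in the package `com.hiveudf.strmnp` and extend `org.apache.hadoop.hive.ql.exec.UDF`; both are left out here so the snippet compiles without the Hive jars, and the body of `evaluate` is an assumption, not the answerer's code.

```java
// Sketch of the UDF registered above as 'com.hiveudf.strmnp.MyHiveUdf'.
// Assumption: for a real deployment, add "package com.hiveudf.strmnp;" and
// "extends org.apache.hadoop.hive.ql.exec.UDF"; Hive then locates the
// evaluate method by reflection.
public class MyHiveUdf {

    /**
     * Receives the column value and the database name that the query
     * supplies through current_database().
     *
     * @param username value of the users.username column
     * @param dbName   result of current_database() in the calling session
     * @return "dbName.username", or null if either argument is null
     */
    public String evaluate(String username, String dbName) {
        if (username == null || dbName == null) {
            return null; // propagate SQL NULL instead of failing the row
        }
        return dbName + "." + username;
    }
}
```

Called as `myUdfFunction(username, current_database())`, the second argument carries the database that is current in the calling session, so the UDF does not need to run a query of its own to learn it.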
    

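Thanks for your help. How can I execute SQL from within the UDF? Is there a Java function that can help me execute HQL?

You can pass the database name to the UDF as an argument and use it there. To pass it from the query, use the current_database() function, as in select myfunction(current_database()) from mytable;

This does not answer the question I specified in my comment.

No, but I am still trying to figure out how to get the name of the database in which the UDF is deployed.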
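Once the database name reaches the UDF as an argument, choosing HDFS files per environment can be reduced to a lookup. The sketch below is hypothetical throughout: the class name `DbEnvPaths`, the database name `mydb_test`, and the directory layout are invented for illustration and do not come from the question.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps a database name to an HDFS base directory.
public class DbEnvPaths {

    // Assumed layout; replace with the directories used in your cluster.
    private static final Map<String, String> BASE_DIRS = new HashMap<>();
    static {
        BASE_DIRS.put("mydb", "/data/mydb");
        BASE_DIRS.put("mydb_test", "/data/mydb_test");
    }

    /** Returns the base directory configured for the given database name. */
    public static String baseDirFor(String dbName) {
        if (dbName == null) {
            throw new IllegalArgumentException("database name is null");
        }
        // Normalize to lower case so that "MyDb" and "mydb" resolve alike.
        String dir = BASE_DIRS.get(dbName.toLowerCase(Locale.ROOT));
        if (dir == null) {
            throw new IllegalArgumentException(
                "no HDFS directory configured for database: " + dbName);
        }
        return dir;
    }
}
```

Inside `evaluate`, the returned string could then be wrapped in an `org.apache.hadoop.fs.Path` and opened through the Hadoop `FileSystem` API; that part is omitted here because it depends on the cluster configuration.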