Hive insert into a dynamically partitioned table runs forever / hangs


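Let's say we have two Hive tables, tableA and tableB. I am exploding tableA, joining it with a few other tables, and then inserting the result into tableB.

The insert works fine when tableB has no partitions, or when the insert uses static partitions.

However, with dynamic partitioning the MapReduce job does not even start. It appears to hang.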
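
For readers unfamiliar with the distinction, the two insert styles can be sketched roughly as follows. This is only an illustration: the columns `id`, `items` and `region` are hypothetical (the actual schema is not shown here), and the joins with the other tables are omitted.

```sql
-- Static partition: the partition value is fixed in the statement itself.
-- Columns id, items and region are hypothetical placeholders.
INSERT OVERWRITE TABLE tableB PARTITION (region = 'us')
SELECT a.id, t.item
FROM tableA a
LATERAL VIEW explode(a.items) t AS item
WHERE a.region = 'us';

-- Dynamic partition: the partition value comes from the last column of the SELECT.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE tableB PARTITION (region)
SELECT a.id, t.item, a.region
FROM tableA a
LATERAL VIEW explode(a.items) t AS item;
```

Only the second form exercises the dynamic partitioning code path in the query planner.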

To debug further, I set the following parameter when starting Hive:

    -hiveconf hive.root.logger=DEBUG,console

Now I can see that the job is not actually hung. It keeps printing log lines like these:

........

    16/02/11 09:25:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:25:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2139 and EX_2140 as parent of FS_68 and child of EX_2138
    16/02/11 09:25:55 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:25:55 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2141 and EX_2142 as parent of FS_68 and child of EX_2140
    16/02/11 09:25:59 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:25:59 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2143 and EX_2144 as parent of FS_68 and child of EX_2142
    16/02/11 09:26:03 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:03 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2145 and EX_2146 as parent of FS_68 and child of EX_2144
    16/02/11 09:26:08 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:08 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2147 and EX_2148 as parent of FS_68 and child of EX_2146
    16/02/11 09:26:12 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:12 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2149 and EX_2150 as parent of FS_68 and child of EX_2148
    16/02/11 09:26:17 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:17 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2151 and EX_2152 as parent of FS_68 and child of EX_2150
    16/02/11 09:26:19 [Thread-5]: INFO metrics.MetricsSaver: Saved 8:22 records to /mnt/var/em/raw/i-63eec5e6_20160211_RunJar_14276_raw.bin
    16/02/11 09:26:21 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:21 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2153 and EX_2154 as parent of FS_68 and child of EX_2152
    16/02/11 09:26:26 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:26 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2155 and EX_2156 as parent of FS_68 and child of EX_2154
    16/02/11 09:26:30 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:30 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2157 and EX_2158 as parent of FS_68 and child of EX_2156
    16/02/11 09:26:35 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:35 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2159 and EX_2160 as parent of FS_68 and child of EX_2158
    16/02/11 09:26:40 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:40 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2161 and EX_2162 as parent of FS_68 and child of EX_2160
    16/02/11 09:26:45 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:45 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2163 and EX_2164 as parent of FS_68 and child of EX_2162
    16/02/11 09:26:49 [Thread-5]: INFO metrics.MetricsSaver: Saved 8:22 records to /mnt/var/em/raw/i-63eec5e6_20160211_RunJar_14276_raw.bin
    16/02/11 09:26:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:50 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2165 and EX_2166 as parent of FS_68 and child of EX_2164
    16/02/11 09:26:56 [main]: INFO optimizer.SortedDynPartitionOptimizer: Sorted dynamic partitioning optimization kicked in..
    16/02/11 09:26:56 [main]: INFO optimizer.SortedDynPartitionOptimizer: Inserted RS_2167 and EX_2168 as parent of FS_68 and child of EX_2166

..............
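These log lines keep printing indefinitely. Without dynamic partitioning, however, the complete insert query finishes successfully in about 10 minutes.

Also, the dynamic partition column has only 3 distinct values across the whole table, so this is not a case of choosing an unsuitable column for dynamic partitioning.

So:

  • What do these log lines mean?

  • What kind of optimization or remedy does this situation need?

Thanks a lot in advance for your help.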

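Setting the following parameter worked:

    SET hive.optimize.sort.dynamic.partition=false;

My Hive version is 0.13.1. Quoting the Apache wiki on this parameter:

hive.optimize.sort.dynamic.partition

Default value: true in Hive 0.13.0 and 0.13.1; false in Hive 0.14.0 and later (HIVE-8151)
Added in: Hive 0.13.0 with HIVE-6455

When enabled, the dynamic partitioning column will be globally sorted. This way we can keep only one record writer open for each partition value in the reducer, thereby reducing the memory pressure on reducers.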
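
As a rough sketch, a session applying this workaround might look like the following. The insert statement and its columns (`id`, `payload`, `region`) are hypothetical stand-ins for the real query, which is not shown in the question.

```sql
-- Print the current value of the setting (a SET with no value just displays it).
SET hive.optimize.sort.dynamic.partition;

-- Disable the sorted dynamic partition optimization for this session only.
SET hive.optimize.sort.dynamic.partition=false;

-- Allow all partition columns to be dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical dynamic-partition insert; replace with the real query.
INSERT OVERWRITE TABLE tableB PARTITION (region)
SELECT a.id, a.payload, a.region
FROM tableA a;
```

The same property could presumably also be passed at startup with `-hiveconf hive.optimize.sort.dynamic.partition=false`, in the same way as the logger setting in the question. Since the wiki text says the optimization exists to reduce memory pressure on reducers, disabling it may cost more reducer memory when there are many partition values; with only 3 distinct values, as here, that seems unlikely to matter.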

Thanks