Hive partitioning and bucketing

I am new to Hive and want to load a bucketed table from a flat table. My flat table is as follows:

create table data(auth string, file string, documents string)
row format delimited
fields terminated by '\t' ;
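The question does not show how the flat table is populated; presumably with a LOAD DATA statement along these lines (the path is hypothetical):

load data local inpath '/tmp/data.tsv' into table data;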
My bucketed table is as follows:

create table test(auth string, documents string)
partitioned by (file string)
clustered by(auth) into 2 buckets ;
I have two authors, A and B, with 10 documents each. When I try to insert data into the bucketed table, the insert succeeds, but the problem is that I want all 10 documents of each author in the same partition; instead I get a single file containing the content of all 10 documents.
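The insert statement itself is not shown in the question; presumably it was a dynamic-partition insert along these lines (the dynamic partition column must come last in the SELECT, and the dynamic-partition settings shown in the answer below must be enabled):

insert into table test
partition (file)
select auth, documents, file
from data;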

I assume the following table structure. Flat table:

CREATE TABLE flattable (id INT, author STRING, book STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
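For illustration, the comma-delimited file behind this flat table might contain rows like these (hypothetical sample data matching the two-author example):

1,A,book1
2,A,book2
3,B,book3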
Bucketed table:

CREATE TABLE bucketedtable (id INT, book STRING)
PARTITIONED BY (author STRING)
CLUSTERED BY (book) INTO 10 BUCKETS;
Set these properties in Hive:

-- write one output file per bucket (reducer count = bucket count)
set hive.enforce.bucketing = true;
-- allow partition values to come from the query instead of a static spec
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
Insert into the bucketed table from the flat table:

INSERT INTO TABLE bucketedtable
PARTITION (author)
SELECT id, book, author
FROM flattable;
You just need to swap the PARTITIONED BY and CLUSTERED BY columns.
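
To verify the result, the partitions can be listed with Hive's SHOW PARTITIONS command (the expected output below assumes the two authors from the question):

SHOW PARTITIONS bucketedtable;
-- expected output:
-- author=A
-- author=B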


That said, I get two different files for A and B, each containing all 10 of that author's documents. I want 10 files for A and 10 files for B inside their author partitions.
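
For what it's worth, with hive.enforce.bucketing = true the insert runs one reducer per bucket, so each author partition should end up with up to 10 bucket files (books that hash to the same bucket share a file). A quick way to inspect the layout from the Hive CLI, assuming the default warehouse location (the path may differ on your cluster):

dfs -ls /user/hive/warehouse/bucketedtable/author=A;
-- expect up to 10 bucket files, named 000000_0 through 000009_0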