SQL conditional aggregation efficiency
We have two tables:
A(id int primary key, groupby int, fkb int, search int, padding varchar(1000))
B(id int primary key, groupby int, search int)
They were created with the following scripts. The first table is large (1M rows) and the second is small (10k rows).
CREATE TABLE A(
id int not null primary key,
groupby int null,
fkb int null,
search int null,
padding varchar(1000) null
) AS
WITH x AS
(
SELECT 0 n FROM dual
union all
SELECT 1 FROM dual
union all
SELECT 2 FROM dual
union all
SELECT 3 FROM dual
union all
SELECT 4 FROM dual
union all
SELECT 5 FROM dual
union all
SELECT 6 FROM dual
union all
SELECT 7 FROM dual
union all
SELECT 8 FROM dual
union all
SELECT 9 FROM dual
), t1 AS
(
SELECT ones.n + 10 * tens.n + 100 * hundreds.n + 1000 * thousands.n + 10000 * tenthousands.n + 100000 * hundredthousands.n as id
FROM x ones, x tens, x hundreds, x thousands, x tenthousands, x hundredthousands
), t2 AS
(
SELECT id,
mod(id, 100) groupby
FROM t1
)
SELECT cast(id as int) id,
cast(groupby as int) groupby,
cast(mod(orderby, 9173) as int) fkb,
cast(mod(id, 911) as int) search
FROM t2;
CREATE TABLE B(
id int not null primary key,
groupby int null,
search int null
) AS
WITH x AS
(
SELECT 0 n FROM dual
union all
SELECT 1 FROM dual
union all
SELECT 2 FROM dual
union all
SELECT 3 FROM dual
union all
SELECT 4 FROM dual
union all
SELECT 5 FROM dual
union all
SELECT 6 FROM dual
union all
SELECT 7 FROM dual
union all
SELECT 8 FROM dual
union all
SELECT 9 FROM dual
), t1 AS
(
SELECT ones.n + 10 * tens.n + 100 * hundreds.n + 1000 * thousands.n as id
FROM x ones, x tens, x hundreds, x thousands
)
SELECT cast(id as int) id,
cast(mod(id + floor(100000 / (id+1)) , 100) as int) groupby,
cast(mod(id, 901) as int) search,
rpad(concat('Value ', id), 1000, '*') as padding
FROM t1;
I would like the following conditional aggregation query to run as fast as possible in H2, without adding any other indexes:
SELECT B.groupby,
count(CASE WHEN A.search = 1 THEN 1 END) as search1,
count(CASE WHEN A.search = 900 THEN 1 END) as search2
FROM B
LEFT JOIN A ON A.fkb = B.id
WHERE B.search < 10
GROUP BY B.groupby
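Is it possible to rewrite the query so that it runs in at most 2 minutes? I have tried many different rewrites, but every one of them kept running for several minutes. I set the Java virtual machine memory to 4 GB (-Xmx4G).
If I run the same test in MySQL, the query is processed in less than 10 seconds.
Your initialization scripts have syntax errors (the SELECT for A references an undefined column orderby and does not produce padding, while the SELECT for B produces a padding column that B does not declare), so I modified them as follows: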
CREATE TABLE A(
id int not null primary key,
groupby int null,
fkb int null,
search int null,
padding varchar(1000) null
) AS
SELECT cast(x as int) id,
cast(mod(x, 100) as int) groupby,
cast(mod(mod(x, 100), 9173) as int) fkb,
cast(mod(x, 911) as int) search,
rpad(concat('Value ', x), 1000, '*') as padding
FROM SYSTEM_RANGE(0, 999999);
CREATE TABLE B(
id int not null primary key,
groupby int null,
search int null
) AS
SELECT cast(x as int) id,
cast(mod(x + floor(100000 / (x+1)), 100) as int) groupby,
cast(mod(x, 901) as int) search
FROM SYSTEM_RANGE(0, 9999);
For simplicity, I also used the H2-specific SYSTEM_RANGE table function.
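As a small illustration of why SYSTEM_RANGE can replace the cross-joined digit tables from the question, the sketch below shows the generated column and the inclusive bounds. The alias remainder is made up for this example only.

```sql
-- SYSTEM_RANGE(start, end) yields one row per integer in the inclusive
-- range and exposes the value as a column named X.
SELECT x, mod(x, 3) AS remainder
FROM SYSTEM_RANGE(0, 4);

-- Both bounds are inclusive, so this range holds 1,000,000 values,
-- the same number of rows that the script above inserts into A.
SELECT count(*) AS row_count
FROM SYSTEM_RANGE(0, 999999);
```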
Running the EXPLAIN command with the query shows the following execution plan:
SELECT
"B"."GROUPBY",
COUNT(CASE WHEN ("A"."SEARCH" = 1) THEN 1 END) AS "SEARCH1",
COUNT(CASE WHEN ("A"."SEARCH" = 900) THEN 1 END) AS "SEARCH2"
FROM "PUBLIC"."B"
/* PUBLIC.B.tableScan */
/* WHERE B.SEARCH < 10
*/
LEFT OUTER JOIN "PUBLIC"."A"
/* PUBLIC.A.tableScan */
ON "A"."FKB" = "B"."ID"
WHERE "B"."SEARCH" < 10
GROUP BY "B"."GROUPBY"
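The plan above reads A with a table scan inside the join, so A is scanned again for every qualifying row of B. One way to give H2 an index on A.fkb without creating one explicitly is a foreign key constraint, for which H2 creates a backing index automatically. A minimal sketch, assuming the constraint name A_FKB_FK (a hypothetical name, chosen here because it is consistent with the index name that appears in the improved plan):

```sql
-- Declare that every A.fkb value refers to an existing B.id.
-- H2 builds an index on A(fkb) to enforce the constraint, and the
-- optimizer can then use that index for the join condition A.fkb = B.id.
-- The constraint name is an assumption made for this example.
ALTER TABLE A
    ADD CONSTRAINT A_FKB_FK
    FOREIGN KEY (fkb) REFERENCES B(id);
```

With the data generated by the scripts above, every fkb value falls between 0 and 99, so each one has a matching B.id and the constraint can be added to the populated table.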
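With a referential (foreign key) constraint from A.fkb to B.id in place, H2 has an index on A.fkb to use for the join, and the execution plan is better: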
SELECT
"B"."GROUPBY",
COUNT(CASE WHEN ("A"."SEARCH" = 1) THEN 1 END) AS "SEARCH1",
COUNT(CASE WHEN ("A"."SEARCH" = 900) THEN 1 END) AS "SEARCH2"
FROM "PUBLIC"."B"
/* PUBLIC.B.tableScan */
/* WHERE B.SEARCH < 10
*/
LEFT OUTER JOIN "PUBLIC"."A"
/* PUBLIC.A_FKB_FK_INDEX_4: FKB = B.ID */
ON "A"."FKB" = "B"."ID"
WHERE "B"."SEARCH" < 10
GROUP BY "B"."GROUPBY"
On my old computer, your query then takes about 11 seconds.
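To reproduce the plans and the timing on another machine, the query can be prefixed with EXPLAIN or EXPLAIN ANALYZE. This is only a sketch of how to check which access path H2 picks for A. The output depends on the H2 version and on whether the constraint exists.

```sql
-- EXPLAIN prints the chosen plan without executing the query.
EXPLAIN
SELECT B.groupby,
       count(CASE WHEN A.search = 1 THEN 1 END) AS search1,
       count(CASE WHEN A.search = 900 THEN 1 END) AS search2
FROM B
LEFT JOIN A ON A.fkb = B.id
WHERE B.search < 10
GROUP BY B.groupby;

-- EXPLAIN ANALYZE executes the query and adds the number of rows
-- scanned per table to the plan, which makes a repeated table scan
-- of A easy to spot.
EXPLAIN ANALYZE
SELECT B.groupby,
       count(CASE WHEN A.search = 1 THEN 1 END) AS search1,
       count(CASE WHEN A.search = 900 THEN 1 END) AS search2
FROM B
LEFT JOIN A ON A.fkb = B.id
WHERE B.search < 10
GROUP BY B.groupby;
```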
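In H2 you can also write COUNT(*) FILTER (WHERE A.search = 1) in the query, but such a query will not be compatible with MySQL, which does not yet support the standard SQL:2003 FILTER clause. The FILTER clause also does not really improve the performance of this query; it only makes it more readable.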
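For comparison, here is a sketch of the same query written with the FILTER clause. It is meant to return the same counts as the CASE version, because rows of B without a match in A carry NULL in A.search and therefore satisfy neither filter.

```sql
-- FILTER restricts which rows each aggregate sees.
-- Works in H2; MySQL rejects this syntax.
SELECT B.groupby,
       count(*) FILTER (WHERE A.search = 1)   AS search1,
       count(*) FILTER (WHERE A.search = 900) AS search2
FROM B
LEFT JOIN A ON A.fkb = B.id
WHERE B.search < 10
GROUP BY B.groupby;
```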