Sql: UPDATE works despite ORA-904 in its subquery (but is very, very slow)
I have an UPDATE statement with a subquery that finds duplicates. When I run the subquery on its own it raises an error, but when it runs inside the UPDATE statement no error is raised and the DML works, only very slowly. Here is the table setup:
CREATE TABLE RAW_table
(
ERROR_LEVEL NUMBER(3),
RAW_DATA_ROW_ID INTEGER,
ATTRIBUTE_1 VARCHAR2(4000 BYTE)
)
;
INSERT INTO RAW_table VALUES (0, 2, '509NTQD9Q868');
INSERT INTO RAW_table VALUES (0, 2, '509NTQD9Q868');
INSERT INTO RAW_table VALUES (0, 2, '509NTQD9Q868');
INSERT INTO RAW_table VALUES (0, 3, '509NTVS9Q863');
INSERT INTO RAW_table VALUES (0, 3, '509NTVS9Q863');
INSERT INTO RAW_table VALUES (0, 3, '509NTVS9Q863');
COMMIT;
The query that raises the error is:
SELECT UPPER(ATTRIBUTE_1), rid
FROM ( SELECT UPPER(ATTRIBUTE_1)
, ROWID AS rid
, ROW_NUMBER() OVER ( PARTITION BY UPPER (ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1;
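When run, it gives ORA-00904: "ATTRIBUTE_1": invalid identifier.
However, the following DML uses the query above, starting on line 4, inside its WHERE clause: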
set timing on
UPDATE RAW_table
SET ERROR_LEVEL = 4
WHERE (UPPER (ATTRIBUTE_1), ROWID)
IN (SELECT UPPER (ATTRIBUTE_1), rid
FROM (SELECT UPPER (ATTRIBUTE_1), ROWID AS rid
, ROW_NUMBER() OVER ( PARTITION BY UPPER (ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1
)
;
4 rows updated.
Elapsed: 00:00:00.36
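Why?? Why? Why?
I expected the UPDATE to fail with ORA-00904: "ATTRIBUTE_1": invalid identifier as well. Why does it not fail?
The real problem, however, is not that the UPDATE works, but that it is extremely slow.
When I correct the subquery so that it no longer raises ORA-00904: "ATTRIBUTE_1": invalid identifier, like this: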
UPDATE RAW_table
SET ERROR_LEVEL = 4
WHERE (UPPER (ATTRIBUTE_1), ROWID)
IN (SELECT checked_column, rid
FROM (SELECT UPPER (ATTRIBUTE_1) AS checked_column, ROWID AS rid
, ROW_NUMBER() OVER ( PARTITION BY UPPER (ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1
)
;
on a test data set of about 11,000 rows the query becomes almost 400 times faster:
SELECT COUNT(*) FROM RAW_table;
COUNT(*)
----------
11004
1 row selected.
Corrected query (timing, followed by its explain plans):
1005 rows updated.
Elapsed: 00:00:00.28
UPDATE STATEMENT ALL_ROWS Cost: **36 637** Bytes: 3 374 235 Cardinality: 835
7 UPDATE RAW_TABLE
6 HASH JOIN RIGHT SEMI Cost: 36 637 Bytes: 3 374 235 Cardinality: 835
4 VIEW VIEW SYS.VW_NSO_1 Cost: 30 486 Bytes: 168 197 196 Cardinality: 83 514
3 VIEW Cost: 30 486 Bytes: 169 282 878 Cardinality: 83 514
2 WINDOW SORT Cost: 30 486 Bytes: 169 282 878 Cardinality: 83 514
1 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 169 282 878 Cardinality: 83 514
5 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 169 282 878 Cardinality: 83 514
UPDATE STATEMENT ALL_ROWS Cost: **3 123** Bytes: 1 453 595 Cardinality: 715
7 UPDATE RAW_TABLE
6 HASH JOIN SEMI Cost: 3 123 Bytes: 1 453 595 Cardinality: 715
5 VIEW VIEW SYS.VW_NSO_1 Cost: 427 Bytes: 143 950 650 Cardinality: 71 475
4 VIEW Cost: 427 Bytes: 144 879 825 Cardinality: 71 475
3 WINDOW SORT Cost: 427 Bytes: 1 358 025 Cardinality: 71 475
2 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 1 358 025 Cardinality: 71 475
1 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 1 358 025 Cardinality: 71 475
Query with ORA-904 (timing, followed by its explain plans):
1005 rows updated.
Elapsed: 00:01:48.40
UPDATE STATEMENT ALL_ROWS Cost: **2 544 985 615** Bytes: 8 464 752 Cardinality: 4 176
7 UPDATE RAW_TABLE
6 FILTER
1 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 169 282 878 Cardinality: 83 514
5 VIEW Cost: 30 486 Bytes: 2 087 850 Cardinality: 83 514
4 WINDOW SORT Cost: 30 486 Bytes: 169 282 878 Cardinality: 83 514
3 FILTER
2 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 169 282 878 Cardinality: 83 514
UPDATE STATEMENT ALL_ROWS Cost: **29 381 690** Bytes: 38 Cardinality: 2
7 UPDATE RAW_TABLE
6 FILTER
1 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 1 358 025 Cardinality: 71 475
5 VIEW Cost: 427 Bytes: 1 786 875 Cardinality: 71 475
4 WINDOW SORT Cost: 427 Bytes: 1 358 025 Cardinality: 71 475
3 FILTER
2 TABLE ACCESS FULL TABLE RAW_TABLE Cost: 54 Bytes: 1 358 025 Cardinality: 71 475
I did not have the patience to wait for the 71,000-row test to finish:
SELECT COUNT(*) FROM RAW_table;
COUNT(*)
----------
71475
1 row selected.
Corrected query
11004 rows updated.
Elapsed: 00:00:00.60
Query with ORA-904
Cancelled after 30 minutes
After analyzing the table, the costs and plans are the same.
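The explain plan cost says it all, but why is it so different?
After computing statistics on the table I started the 71,000-row test again, and it has already been running for several minutes.
All of this is on Oracle Database 12c Enterprise Edition 12.1.0.2.0, 64-bit.
This is why aliases are very, very useful.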
SELECT UPPER(ATTRIBUTE_1), rid
FROM ( SELECT UPPER(ATTRIBUTE_1) ATTRIBUTE_1
, ROWID AS rid
, ROW_NUMBER() OVER ( PARTITION BY UPPER (ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1
In the query
UPDATE RAW_table
SET ERROR_LEVEL = 4
WHERE (UPPER (ATTRIBUTE_1), ROWID)
IN (SELECT UPPER (ATTRIBUTE_1), rid
FROM (SELECT UPPER (ATTRIBUTE_1), ROWID AS rid
, ROW_NUMBER() OVER ( PARTITION BY UPPER (ATTRIBUTE_1)
ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1
)
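the SELECT UPPER (ATTRIBUTE_1) is valid because it resolves as a reference to the table being updated, not to the inline view in the FROM clause. With aliases, that query is equivalent to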
UPDATE RAW_table dest
SET dest.ERROR_LEVEL = 4
WHERE (UPPER (dest.ATTRIBUTE_1), ROWID)
IN (SELECT UPPER (dest.ATTRIBUTE_1), src.rid
FROM (SELECT UPPER (rt.ATTRIBUTE_1), rt.ROWID AS rid
, ROW_NUMBER() OVER ( PARTITION BY UPPER (rt.ATTRIBUTE_1)
ORDER BY rt.RAW_DATA_ROW_ID) AS RN
FROM RAW_table rt
) src
WHERE src.rid > 1
)
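Of course, if you had written it that way, it would be obvious that you are referencing dest.attribute_1 rather than src.attribute_1. This, among many other reasons, is why aliasing your columns is a good idea: it makes clear which object you mean to reference, and it raises an error when the intended reference is invalid instead of silently resolving it to something you did not want.
Your select fails because the subquery has no column named ATTRIBUTE_1. You need to assign the name: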
SELECT UPPER(ATTRIBUTE_1), rid
FROM ( SELECT UPPER(ATTRIBUTE_1) as ATTRIBUTE_1,
ROWID AS rid,
ROW_NUMBER() OVER (PARTITION BY UPPER(ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1;
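The same fix can also be written with explicit aliases, so that every column reference names its source. This is only a sketch: the alias names src and rt and the column alias checked_column are illustrative choices (checked_column is borrowed from the corrected UPDATE in the question), not anything Oracle requires.

```sql
-- Sketch: every reference is qualified, so nothing can resolve
-- to a different table than the one intended.
SELECT src.checked_column, src.rid
  FROM (SELECT UPPER(rt.ATTRIBUTE_1) AS checked_column,   -- named expression
               rt.ROWID              AS rid,
               ROW_NUMBER() OVER (PARTITION BY UPPER(rt.ATTRIBUTE_1)
                                  ORDER BY rt.RAW_DATA_ROW_ID) AS RN
          FROM RAW_table rt
       ) src
 WHERE src.RN > 1;   -- keep only the 2nd, 3rd, ... row of each duplicate group
```

The outer SELECT can only see what the inline view exposes (checked_column, rid and RN), which is why the unaliased UPPER(ATTRIBUTE_1) in the original query had nothing to refer to.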
The update does not raise an error because it picks up the value from the outer query:
UPDATE RAW_table
-------^
| SET ERROR_LEVEL = 4
| WHERE (UPPER (ATTRIBUTE_1), ROWID) IN
| (SELECT checked_column, rid
| FROM (SELECT UPPER(ATTRIBUTE_1) AS checked_column, ROWID AS rid,
------------------------------^ This is interpreted as RAW_table.ATTRIBUTE_1
ROW_NUMBER() OVER (PARTITION BY UPPER(ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) AS RN
FROM RAW_table
)
WHERE RN > 1
)
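This correlation is probably not what you want, and it is one reason I recommend always qualifying column names, that is, including the table alias.
Perhaps these versions are faster; at least they are more compact: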
UPDATE RAW_table
SET ERROR_LEVEL = 4
WHERE ROWID <>ALL (SELECT MIN(ROWID) FROM RAW_table GROUP BY UPPER(ATTRIBUTE_1));
UPDATE RAW_table
SET ERROR_LEVEL = 4
WHERE ROWID <>ALL (SELECT FIRST_VALUE(ROWID) OVER (PARTITION BY UPPER(ATTRIBUTE_1) ORDER BY RAW_DATA_ROW_ID) FROM RAW_table);
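Note that <> ALL is equivalent to NOT IN; I personally prefer <> ALL.
Ugh... yes, it is obvious once you point it out. And it explains the correlated explain plan... By the way, I think you meant src.RN > 1, not src.rid > 1.
I think you meant to annotate the query that raises the error, the one without the checked_column alias, and to put the ASCII art one line lower. Anyway, thanks a lot, because I finally get it!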
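Putting the advice from both answers together, a fully qualified version of the corrected UPDATE might look like the sketch below. The aliases dest, src and rt follow the first answer and checked_column follows the question; all of them are illustrative names. It has not been timed against the test data in the question.

```sql
-- Sketch: corrected UPDATE with every column qualified.
UPDATE RAW_table dest
   SET dest.ERROR_LEVEL = 4
 WHERE (UPPER(dest.ATTRIBUTE_1), dest.ROWID)
    IN (SELECT src.checked_column, src.rid
          FROM (SELECT UPPER(rt.ATTRIBUTE_1) AS checked_column,
                       rt.ROWID              AS rid,
                       ROW_NUMBER() OVER (PARTITION BY UPPER(rt.ATTRIBUTE_1)
                                          ORDER BY rt.RAW_DATA_ROW_ID) AS RN
                  FROM RAW_table rt
               ) src
         -- No reference to dest inside the subquery, so it is not correlated.
         WHERE src.RN > 1
       );
```

Because src exposes no column called ATTRIBUTE_1, writing UPPER(src.ATTRIBUTE_1) in the subquery's select list would fail with ORA-00904 instead of quietly resolving to dest.ATTRIBUTE_1, which is the early warning the unqualified version lacked.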