SQL: GROUP BY on a large CASE expression is slow

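I need to produce a report based on LC call numbers, grouped by number range. The call numbers are stored in the format below, and I need to extract the second field (the number before the next punctuation mark or blank) to do the grouping: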

CALL_NO_ID1
--------------
a!3243 .m43 12
a#435 234 1999
cs"345 1973.
...

Here is my SQL:

select count("CALL_NO_ID1") "No_of_Items",
case
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 0 AND 999) AND ("CALL_NO_DESC1" LIKE 'KG %') THEN 'KG 0-999  - Federal law Common and collective state law Individual states US - Latin AmericaGeneral'
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 0 AND 999) AND ("CALL_NO_DESC1" LIKE 'KH %') THEN 'KH 0-999 - Federal law Common and collective state law Individual states US -  South AmericaGeneral '
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 1 AND 100) AND ("CALL_NO_DESC1" LIKE 'DE %') THEN 'DE 1-100  - HistoryGeneral - The Mediterranean Region The Greco-Roman World'
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 1 AND 1050) AND ("CALL_NO_DESC1" LIKE 'TR %') THEN 'TR 1-1050  - Photography'
  ...
  ... (around 450 case conditions)
  ...
  else "CALL_NO_ID1"
end "Primary Call"
from DWH_FACT_ITEMS
group by
case
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 0 AND 999) AND ("CALL_NO_DESC1" LIKE 'KG %') THEN 'KG 0-999  - Federal law Common and collective state law Individual states US - Latin AmericaGeneral'
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 0 AND 999) AND ("CALL_NO_DESC1" LIKE 'KH %') THEN 'KH 0-999 - Federal law Common and collective state law Individual states US -  South AmericaGeneral '
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 1 AND 100) AND ("CALL_NO_DESC1" LIKE 'DE %') THEN 'DE 1-100  - HistoryGeneral - The Mediterranean Region The Greco-Roman World'
  WHEN (LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 1 AND 1050) AND ("CALL_NO_DESC1" LIKE 'TR %') THEN 'TR 1-1050  - Photography'
  ...
  ... (around 450 case conditions)
  ...

However, this takes a very long time to return a result (2-3 hours). Are there any suggestions for improving my SQL?

Thanks,
Morris

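I would add an extra column, CALL_NO_CLEARED, to hold the extracted number. Apply the expression to all existing values once to populate this artificial column.

You can add an INSERT/UPDATE trigger to fill the column automatically whenever a row is added or the call number changes.

Then you can create an index on CALL_NO_CLEARED and use the column in your SELECT, which should make it faster.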
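
The steps above can be sketched roughly as follows. This is only an illustration: the trigger name trg_call_no_cleared and the index name idx_call_no_cleared are hypothetical, and storing the value as a NUMBER without the LPAD padding is an assumption, made because the CASE branches compare it with numeric ranges anyway.

```sql
-- 1. Add the artificial column (NUMBER type is an assumption).
ALTER TABLE DWH_FACT_ITEMS ADD (CALL_NO_CLEARED NUMBER);

-- 2. Populate it once for the existing rows, using the expression from the question.
UPDATE DWH_FACT_ITEMS
   SET CALL_NO_CLEARED = CAST(REGEXP_REPLACE(
           REGEXP_SUBSTR(REGEXP_REPLACE(CALL_NO_ID1, '["]|[#]|[!]', ' '), '[^ ]+|["]|[#]', 1, 2),
           '[^0-9]+', '') AS NUMBER);

-- 3. Keep it current on insert and whenever CALL_NO_ID1 changes (hypothetical trigger name).
CREATE OR REPLACE TRIGGER trg_call_no_cleared
BEFORE INSERT OR UPDATE OF CALL_NO_ID1 ON DWH_FACT_ITEMS
FOR EACH ROW
BEGIN
  :NEW.CALL_NO_CLEARED := TO_NUMBER(REGEXP_REPLACE(
      REGEXP_SUBSTR(REGEXP_REPLACE(:NEW.CALL_NO_ID1, '["]|[#]|[!]', ' '), '[^ ]+|["]|[#]', 1, 2),
      '[^0-9]+', ''));
END;
/

-- 4. Index the precomputed value (hypothetical index name).
CREATE INDEX idx_call_no_cleared ON DWH_FACT_ITEMS (CALL_NO_CLEARED);
```

With the column in place, each WHEN branch could test CALL_NO_CLEARED directly instead of re-running the regular expressions.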

UPDATE:

I can suggest another approach. The most time-consuming part seems to be the call to

LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0')
So for each row we evaluate it up to 450 times (once for each WHEN branch of the CASE mentioned above).
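
To make the cost concrete, here is a small sketch that runs the expression step by step on two of the sample values from the question. The column aliases (replaced, second_field, digits_only, padded) are made up for illustration.

```sql
WITH samples AS (
  SELECT 'a!3243 .m43 12' AS call_no FROM dual
  UNION ALL
  SELECT 'cs"345 1973.' FROM dual
)
SELECT call_no,
       -- step 1: turn the separators ", # and ! into blanks
       REGEXP_REPLACE(call_no, '["]|[#]|[!]', ' ') AS replaced,
       -- step 2: take the second blank-delimited token
       REGEXP_SUBSTR(REGEXP_REPLACE(call_no, '["]|[#]|[!]', ' '), '[^ ]+|["]|[#]', 1, 2) AS second_field,
       -- step 3: strip everything that is not a digit
       REGEXP_REPLACE(REGEXP_SUBSTR(REGEXP_REPLACE(call_no, '["]|[#]|[!]', ' '), '[^ ]+|["]|[#]', 1, 2), '[^0-9]+', '') AS digits_only,
       -- step 4: cast to NUMBER, then left-pad to 7 characters
       LPAD(CAST(REGEXP_REPLACE(REGEXP_SUBSTR(REGEXP_REPLACE(call_no, '["]|[#]|[!]', ' '), '[^ ]+|["]|[#]', 1, 2), '[^0-9]+', '') AS NUMBER), 7, '0') AS padded
FROM samples;
-- 'a!3243 .m43 12' -> padded = '0003243'
-- 'cs"345 1973.'   -> padded = '0000345'
```

That is three regular-expression calls per evaluation. Note also that LPAD returns a string, which Oracle converts back to a number for the BETWEEN comparison, so the padding itself does not change which range a row falls into.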

Try moving the calculation into a subquery and applying the GROUP BY afterwards, e.g.:

select count(sub.CALL_NO_ID1) "No_of_Items" -- add the same CASE expression here as "Primary Call"
FROM (
    select
    CALL_NO_ID1,
    CALL_NO_DESC1, -- needed by the outer CASE
    LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') as sub_num
    from DWH_FACT_ITEMS) sub
group by
case
WHEN (sub.sub_num BETWEEN 0 AND 999) AND (sub.CALL_NO_DESC1 LIKE 'KG %') THEN 'KG 0-999  - Federal law Common and collective state law Individual states US - Latin AmericaGeneral'
...

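You have several options for improving the query. My tests show that eliminating the CASE from the GROUP BY (by using a subquery) improves the maintainability and size of the query, but the performance stays the same.

A noticeable improvement was observed after ordering the CASE branches so that the most frequent cases come first.

The idea is simple: if a match is found early in the CASE, the remaining conditions are skipped.

An even better improvement was achieved by reordering the predicates inside each WHEN clause. If the CALL_NO_DESC1 prefix does not match, the regexp processing is never invoked.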

WHEN ("CALL_NO_DESC1" LIKE 'TR %') and
(LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') BETWEEN 1 AND 1050)
THEN 'TR 1-1050  - Photography'

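The last step is to invoke the REGEXP processing only once, in a subquery.

In summary, I ended up with this query, which greatly reduced the run time (on my test data):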
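
As a rough sketch only, and not necessarily the exact query used in those tests, the three steps (frequent branches first, the cheap LIKE predicate first, the regexp evaluated in a subquery) could be combined like this. The alias sub_num and the choice of which branches are the most frequent are assumptions for illustration. Whether Oracle really evaluates the expression only once per row depends on the optimizer, which may merge the inline views.

```sql
SELECT COUNT(CALL_NO_ID1) "No_of_Items", "Primary Call"
FROM (
    SELECT CALL_NO_ID1,
           CASE
               -- most frequent branches first (this order is assumed, not measured);
               -- cheap LIKE test before the numeric range test
               WHEN CALL_NO_DESC1 LIKE 'TR %' AND sub_num BETWEEN 1 AND 1050
                   THEN 'TR 1-1050  - Photography'
               WHEN CALL_NO_DESC1 LIKE 'DE %' AND sub_num BETWEEN 1 AND 100
                   THEN 'DE 1-100  - HistoryGeneral - The Mediterranean Region The Greco-Roman World'
               -- ... remaining branches elided, as in the question
               ELSE CALL_NO_ID1
           END "Primary Call"
    FROM (
        -- the regexp expression from the question, written once
        SELECT CALL_NO_ID1,
               CALL_NO_DESC1,
               LPAD(CAST(REGEXP_REPLACE(REGEXP_SUBSTR(REGEXP_REPLACE(CALL_NO_ID1, '["]|[#]|[!]', ' '), '[^ ]+|["]|[#]', 1, 2), '[^0-9]+', '') AS NUMBER), 7, '0') sub_num
        FROM DWH_FACT_ITEMS
    )
)
GROUP BY "Primary Call";
```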


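Use a WITH clause together with the /*+ MATERIALIZE */ hint so that Oracle performs the expensive operation only once.

On 400,000 rows this should perform much better than 2-3 hours: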

WITH parsed_call_numbers AS (
SELECT /*+ MATERIALIZE */
       CALL_NO_ID1,
       -- the expensive regexp expression is evaluated here, once per row
       LPAD(CAST(regexp_replace(REGEXP_SUBSTR(REGEXP_REPLACE("CALL_NO_ID1",'["]|[#]|[!]', ' '),'[^ ]+|["]|[#]',1,2), '[^0-9]+', '') as number),7,'0') call_number_part,
       CALL_NO_DESC1
from DWH_FACT_ITEMS ),
primary_calls AS ( SELECT /*+ MATERIALIZE */
CALL_NO_ID1,
case
WHEN (call_number_part BETWEEN 0 AND 999) AND ("CALL_NO_DESC1" LIKE 'KG %') THEN 'KG 0-999  - Federal law Common and collective state law Individual states US - Latin AmericaGeneral'
WHEN (call_number_part BETWEEN 0 AND 999) AND ("CALL_NO_DESC1" LIKE 'KH %') THEN 'KH 0-999 - Federal law Common and collective state law Individual states US -  South AmericaGeneral '
WHEN (call_number_part BETWEEN 1 AND 100) AND ("CALL_NO_DESC1" LIKE 'DE %') THEN 'DE 1-100  - HistoryGeneral - The Mediterranean Region The Greco-Roman World'
WHEN (call_number_part BETWEEN 1 AND 1050) AND ("CALL_NO_DESC1" LIKE 'TR %') THEN 'TR 1-1050  - Photography'
--...
--... (around 450 case conditions)
--...
else "CALL_NO_ID1"
end "Primary Call"
from parsed_call_numbers )
select count("CALL_NO_ID1") "No_of_Items", "Primary Call"
FROM primary_calls
group by "Primary Call"

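Can you tell us the logic and the grouping process used in your CASE statement? If the grouping is based on the LC classification, where the first or the first and second characters must be letters, you need an algorithm to cut down those CASE branches.

Thanks for the reply. It is the LC classification: the first or the first and second characters must be letters, followed by a number, with a blank, !, # or " as the field separator:
a!3243 .m43 12 (record in db) --> a (first field) --> 3243 (second field) --> .m43 (third field)
a#435 234 1999 (record in db) --> a (first field) --> 234 (second field) --> 1999 (third field)
cs"345 1973. (record in db) --> cs (first field) --> 345 (second field) --> 1973. (third field)
The requirement is to count the records within given ranges, grouped by the first and second fields. For example, field 1 starting with a and field 2 in the range 1-400 is group A; field 1 starting with a and field 2 in the range 500-4000 is group B; field 1 starting with b and field 2 in the range 1-400 is group C; and so on. I think these cases are irregular.

Thanks for your suggestion. If it could be applied, it should give a good result. But the database is maintained by the vendor; modifying the table may violate the contract and could cause problems with future upgrades.

Thanks for your suggestion. I modified the SQL as you suggested, but there was no difference in execution time.

Hi StanislavL, even just changing the order of the CASE branches gives me a significant improvement. Thanks a lot, Morris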