Oracle: optimizing a query on a large database table with a CLOB field

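I've been racking my brain over this one, and admittedly I'm not very good with Oracle. We have a table with about 60 million records that stores values for buildings. I've added indexes where I thought they were appropriate, but performance is still poor. Here is the query, since it should help: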

  SELECT count(*)
    FROM viewBuildings
   INNER JOIN tblValues
           ON viewBuildings.bldg_id = tblValues.bldg_id
   WHERE bldg_deleted = 0
     AND (bldg_summary = 1
         OR (bldg_root = 0 AND bldg_def = 0)
         OR bldg_parent = 1)
     AND field_id IN (207)
     AND UPPER(dbms_lob.substr(v_value, 2000, 1)) = UPPER('2320')
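So, the above is just one example of the queries that can be built. It looks for a match on '2320' in the v_value CLOB field of tblValues. It uses UPPER because the search can be for numeric or text values. tblValues has 60 million records and is indexed on bldg_id and field_id.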

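I may need to provide more information, but as far as statistics go, the figure I have is consistent gets: consistent gets = 74069. Is that a large number?

Any suggestions would be great, mainly about dealing with CLOB fields on large database tables. I can't use a CONTEXT-type index because I need exact matches, and the data being searched for can be numeric or string.

EDIT (more information):
tblBuildings is part of viewBuildings (a view) and has 80,000 records.
tblValues holds the values for each building and has 68,000,000 records.
tblValues has about 550 fields (field_id) per building.

Desired result: the query returns in < 5 seconds. Is that unreasonable? Sometimes it runs indefinitely, and sometimes it takes maybe 80 seconds.

Explain plan results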

Plan hash value: 1480138519
-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                             | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------|
|   0 | SELECT STATEMENT                    |                                  |     1 |   192 |    32   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                     |                                  |     1 |   192 |            |          |
|   2 |   NESTED LOOPS                      |                                  |     1 |   192 |    15   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                     |                                  |     1 |   183 |    12   (0)| 00:00:01 |
|*  4 |     FILTER                          |                                  |       |       |            |          |
|   5 |      NESTED LOOPS OUTER             |                                  |     1 |    64 |    10   (0)| 00:00:01 |
|*  6 |       TABLE ACCESS BY INDEX ROWID   | TBLBUILDINGS                     |     1 |    60 |     9   (0)| 00:00:01 |
|*  7 |        INDEX RANGE SCAN             | SAA_4                            |    17 |       |     3   (0)| 00:00:01 |
|   8 |         NESTED LOOPS                |                                  |     1 |    21 |     3   (0)| 00:00:01 |
|   9 |          TABLE ACCESS BY INDEX ROWID| TBLBUILDINGSTATUSES              |     1 |    15 |     2   (0)| 00:00:01 |
|* 10 |           INDEX RANGE SCAN          | IDX_BUILDINGSTATUS_EXCLUDEQUERY  |     1 |       |     1   (0)| 00:00:01 |
|* 11 |          INDEX RANGE SCAN           | IDX_BUILDING_STATUS_ASID_DELETED |     1 |     6 |     1   (0)| 00:00:01 |
|  12 |       TABLE ACCESS BY INDEX ROWID   | TBLBUILDINGSTATUSES              |     1 |     4 |     1   (0)| 00:00:01 |
|* 13 |        INDEX UNIQUE SCAN            | PK_TBLBUILDINGSTATUS             |     1 |       |     0   (0)| 00:00:01 |
|* 14 |     TABLE ACCESS BY INDEX ROWID     | TBLVALUES                        |     1 |   119 |     2   (0)| 00:00:01 |
|* 15 |      INDEX UNIQUE SCAN              | PK_SAA_6                         |     1 |       |     1   (0)| 00:00:01 |
|  16 |    INLIST ITERATOR                  |                                  |       |       |            |          |
|* 17 |     INDEX RANGE SCAN                | SAA_7                            |     1 |     9 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

   4 - filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS" IS NULL OR
              "TBLBUILDINGSTATUSES"."BUILDING_STATUS_HIDE_REPORTS"=0)
   6 - filter("TBLBUILDINGS"."BLDG_SUMMARY"=1 OR "TBLBUILDINGS"."BLDG_SUB_BUILDING_PARENT"=1 OR
              "TBLBUILDINGS"."BLDG_BUILDING_DEF"=0 AND "TBLBUILDINGS"."BLDG_ROOT"=0)
   7 - access("TBLBUILDINGS"."BLDG_DELETED"=0)
       filter( NOT EXISTS (SELECT 0 FROM "TBLBUILDINGSTATUSES" "TBLBUILDINGSTATUSES","TBLBUILDINGS" "TBLBUILDINGS" WHERE
              "TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID" AND
              "TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1))
  10 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_EXCLUDE_QUERY"=1)
  11 - access("TBLBUILDINGS"."BLDG_ID"=:B1 AND "TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
       filter("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"="TBLBUILDINGS"."BUILDING_STATUS_ID")
  13 - access("TBLBUILDINGSTATUSES"."BUILDING_STATUS_ID"(+)="TBLBUILDINGS"."BUILDING_STATUS_ID")
  14 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLVALUES"."V_VALUE",2000,1))=U'2320')
  15 - access("TBLVALUES"."FE_ID"=207 AND "TBLBUILDINGS"."BLDG_ID"="TBLVALUES"."BLDG_ID")
  17 - access("TBLINSPECTORBUILDINGMAP"."IN_ID"=1 AND ("TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=0 OR
              "TBLINSPECTORBUILDINGMAP"."IAM_BUILDING_ACCESS_LEVEL"=1) AND "TBLBUILDINGS"."BLDG_ID"="TBLINSPECTORBUILDINGMAP"."BLDG_ID")

 44 rows selected

Plan hash value: 2137789089

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| DISPLAY |  8168 | 16336 |    29   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
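OK, I gathered statistics as you suggested, and here is the plan table output. Does it look like IDX_CURVAL_FE_ID is the problem here? That is the index on the values table for the field id.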

SQL_ID  d4aq8nsr1p6uw, child number 0
-------------------------------------
SELECT  /*+ gather_plan_statistics */ count(*)     FROM 
viewAssetsForUser1    INNER JOIN tblCurrentValues            ON 
viewAssetsForUser1.as_id = tblCurrentValues.as_id    WHERE as_deleted = 
:"SYS_B_0"      AND (as_summary = :"SYS_B_1"          OR (as_root = 
:"SYS_B_2" AND as_asset_def = :"SYS_B_3")          OR 
as_sub_asset_parent = :"SYS_B_4")      AND fe_id IN (:"SYS_B_5")      
AND UPPER(dbms_lob.substr(cv_value, :"SYS_B_6", :"SYS_B_7")) = 
UPPER(:"SYS_B_8")

Plan hash value: 4033422776

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                           |      1 |        |      1 |00:08:43.19 |   56589 |  56084 |       |       |          |
|   1 |  SORT AGGREGATE                       |                           |      1 |      1 |      1 |00:08:43.19 |   56589 |  56084 |       |       |          |
|*  2 |   FILTER                              |                           |      1 |        |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|   3 |    NESTED LOOPS                       |                           |      1 |        |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|   4 |     NESTED LOOPS                      |                           |      1 |    115 |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|*  5 |      FILTER                           |                           |      1 |        |      0 |00:08:43.19 |   56589 |  56084 |       |       |          |
|*  6 |       HASH JOIN RIGHT OUTER           |                           |      1 |     82 |      0 |00:08:43.19 |   56589 |  56084 |  1348K|  1348K|  742K (0)|
|   7 |        TABLE ACCESS FULL              | TBLASSETSTATUSES          |      1 |      4 |      4 |00:00:00.01 |       3 |      0 |       |       |          |
|   8 |        NESTED LOOPS                   |                           |      1 |        |      0 |00:08:43.19 |   56586 |  56084 |       |       |          |
|   9 |         NESTED LOOPS                  |                           |      1 |    163 |      0 |00:08:43.19 |   56586 |  56084 |       |       |          |
|* 10 |          TABLE ACCESS BY INDEX ROWID  | TBLCURRENTVALUES          |      1 |    163 |      0 |00:08:43.19 |   56586 |  56084 |       |       |          |
|* 11 |           INDEX RANGE SCAN            | IDX_CURVAL_FE_ID          |      1 |  16283 |  61357 |00:00:05.98 |     132 |    132 |       |       |          |
|* 12 |          INDEX RANGE SCAN             | SAA_1                     |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|* 13 |         TABLE ACCESS BY INDEX ROWID   | TBLASSETS                 |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|* 14 |      INDEX UNIQUE SCAN                | PK_TBLINSPECTORBRIDGEMAP2 |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
|* 15 |     TABLE ACCESS BY GLOBAL INDEX ROWID| TBLINSPECTORASSETMAP      |      0 |      1 |      0 |00:00:00.01 |       0 |      0 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(:SYS_B_0=0)
   5 - filter(("TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS" IS NULL OR "TBLASSETSTATUSES"."ASSET_STATUS_HIDE_REPORTS"=0))
   6 - access("TBLASSETSTATUSES"."ASSET_STATUS_ID"="TBLASSETS"."ASSET_STATUS_ID")
  10 - filter(UPPER("DBMS_LOB"."SUBSTR"("TBLCURRENTVALUES"."CV_VALUE",:SYS_B_6,:SYS_B_7))=SYS_OP_C2C(UPPER(:SYS_B_8)))
  11 - access("TBLCURRENTVALUES"."FE_ID"=:SYS_B_5)
  12 - access("TBLASSETS"."AS_DELETED"=:SYS_B_0 AND "TBLASSETS"."AS_ID"="TBLCURRENTVALUES"."AS_ID")
  13 - filter((("TBLASSETS"."AS_ROOT"=:SYS_B_2 AND "TBLASSETS"."AS_ASSET_DEF"=:SYS_B_3) OR "TBLASSETS"."AS_SUMMARY"=:SYS_B_1 OR 
              "TBLASSETS"."AS_SUB_ASSET_PARENT"=:SYS_B_4))
  14 - access("TBLASSETS"."AS_ID"="TBLINSPECTORASSETMAP"."AS_ID" AND "TBLINSPECTORASSETMAP"."IN_ID"=1)
  15 - filter(("TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=0 OR "TBLINSPECTORASSETMAP"."IAM_ASSET_ACCESS_LEVEL"=1))
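
For reference, a plan with Starts, E-Rows and A-Rows columns like the one above is usually obtained with the gather_plan_statistics hint and DBMS_XPLAN.DISPLAY_CURSOR. The following is only a sketch of that workflow, not the exact commands used for this post; the schema name MYSCHEMA is a placeholder, and the SELECT is a cut-down version of the real query.

```sql
-- Refresh optimizer statistics on the values table.
-- MYSCHEMA is a placeholder owner name (an assumption, not from the post).
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'MYSCHEMA',
    tabname => 'TBLCURRENTVALUES',
    cascade => TRUE);   -- also gathers statistics on the table's indexes
END;
/

-- Run the statement once with row-source statistics enabled.
SELECT /*+ gather_plan_statistics */ count(*)
  FROM tblCurrentValues
 WHERE fe_id = 207
   AND UPPER(dbms_lob.substr(cv_value, 2000, 1)) = UPPER('2320');

-- Show estimated (E-Rows) versus actual (A-Rows) row counts for the
-- last statement executed in this session.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Read this way, the plan above shows step 11 (the index range scan) finishing in about 6 seconds with 61357 rows against an estimate of 16283, while step 10 (visiting the table rows and evaluating the CLOB predicate) accounts for nearly all of the 8 minutes 43 seconds.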

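Bad index cost

If the statistics are fresh and the optimizer has a relatively good cardinality estimate, why would it pick a bad plan? Perhaps a parameter is making indexes look artificially cheap. Look at:

  select * from v$parameter where name in ('optimizer_index_cost_adj', 'optimizer_index_caching')

Are they significantly different from the defaults of 100 and 0?

Also, look at:

  select * from sys.aux_stats$

Perhaps your system statistics make full table scans look too expensive. Some versions of Oracle had bugs with workload statistics where the numbers were wrong by orders of magnitude.

Or perhaps your table is so large that 16K index reads really is the best access path. Look at DBA_SEGMENTS.BYTES to find the size of the table and LOB segments.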
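
As a rough sketch of that size check (assuming access to the DBA views; :owner is a placeholder bind for the schema name, and TBLCURRENTVALUES is the table name taken from the plan), the table segment and its LOB segments can be listed together:

```sql
-- Size of the table segment plus any LOB segments that belong to it.
-- :owner is a bind placeholder for the schema name (an assumption).
SELECT s.segment_name,
       s.segment_type,
       ROUND(SUM(s.bytes) / 1024 / 1024) AS size_mb
  FROM dba_segments s
 WHERE s.owner = :owner
   AND (   s.segment_name = 'TBLCURRENTVALUES'
        OR s.segment_name IN (SELECT l.segment_name
                                FROM dba_lobs l
                               WHERE l.owner = :owner
                                 AND l.table_name = 'TBLCURRENTVALUES'))
 GROUP BY s.segment_name, s.segment_type
 ORDER BY size_mb DESC;
```

LOB segments get system-generated names, which is why the lookup goes through DBA_LOBS instead of matching on the table name alone.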

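Even if the table is moderately sized and the plan changes to a full table scan, that might not bring the run time under 5 seconds. But combined with your partitioning idea, it might be enough.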

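LOB storage

For example, I assume most of the CLOBs are relatively small? Perhaps you have an unusual LOB setting that wastes a lot of space, such as DISABLE STORAGE IN ROW. You may want to check your table DDL, or post all of it here. Or, if you can replace the CLOB with a VARCHAR2, that would be even better.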
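
One possible way to check the LOB settings without reading through the whole DDL by hand is sketched below. It assumes access to the DBA views; :owner is a placeholder bind for the schema name.

```sql
-- IN_ROW = 'NO' corresponds to DISABLE STORAGE IN ROW.
SELECT column_name, segment_name, in_row, chunk, cache
  FROM dba_lobs
 WHERE owner = :owner
   AND table_name = 'TBLCURRENTVALUES';

-- Full table DDL, including the LOB storage clauses.
SELECT DBMS_METADATA.GET_DDL('TABLE', 'TBLCURRENTVALUES', :owner)
  FROM dual;
```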

FBI

A function-based index on the CLOB might speed things up significantly. But it could be a very large index:

  create index TBLCURRENTVALUES_FBI on TBLCURRENTVALUES(UPPER(dbms_lob.substr(cv_value, 2000, 1)));


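Cursor sharing

The query changes a bit between versions, which makes tuning difficult. It looks like this latest version is running with CURSOR_SHARING=FORCE, which is unusual. For an expensive query, using literals is probably a good thing: the extra time spent building a query plan is probably worth it. If the system parameter can't be changed, look into the hint /*+ CURSOR_SHARING_EXACT */.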
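
Applied to the example query from the question, the hint would look roughly like this (a sketch using the simplified names from the question, not the real view and table names):

```sql
-- Keep the literals in this one statement even when CURSOR_SHARING=FORCE
-- is set at the system or session level.
SELECT /*+ CURSOR_SHARING_EXACT */ count(*)
  FROM viewBuildings
 INNER JOIN tblValues
         ON viewBuildings.bldg_id = tblValues.bldg_id
 WHERE bldg_deleted = 0
   AND (bldg_summary = 1
        OR (bldg_root = 0 AND bldg_def = 0)
        OR bldg_parent = 1)
   AND field_id IN (207)
   AND UPPER(dbms_lob.substr(v_value, 2000, 1)) = UPPER('2320');
```

A likely side benefit, which is worth verifying on your version: with the literals 2000 and 1 left in place, the predicate matches the expression of a function-based index on UPPER(dbms_lob.substr(...)), whereas the system-generated :SYS_B_6 and :SYS_B_7 binds seen in the plan would probably prevent that match.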

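You can do any number of optimizations, but ultimately it is the sheer volume of data that is causing the problem. If you run the query and track it on the performance graph in OEM, you will see that most of the time is spent on I/O, that is, on reading the data into memory.

So what is the solution? Partition the table. Whenever the data volume is large, you should partition the table so that only the relevant data is processed. To partition a table you need some key by which to segregate the data; looking at your data, that could be the building id.

You can learn more at the following URL:

Partitioning also offers many other features, such as local indexes, which help optimize queries further.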
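
The partitioning and local-index idea can be sketched as follows. This is only an illustration under assumptions: the table and index names are hypothetical, the column types are guesses, the partition count is arbitrary, and the Oracle Partitioning option must be available.

```sql
-- Hypothetical partitioned copy of the values table. Only the column
-- names bldg_id, field_id and v_value come from the post.
CREATE TABLE tblValues_part (
  bldg_id  NUMBER NOT NULL,
  field_id NUMBER NOT NULL,
  v_value  CLOB
)
PARTITION BY HASH (bldg_id) PARTITIONS 16;

-- A LOCAL index is partitioned the same way as its table.
CREATE INDEX idx_values_part_field
    ON tblValues_part (field_id, bldg_id) LOCAL;

-- Partition pruning applies only when the predicate includes the
-- partition key (12345 is an arbitrary example value).
SELECT count(*)
  FROM tblValues_part
 WHERE bldg_id = 12345
   AND field_id = 207;
```

Note that the query in the question filters on field_id rather than on the building id, so which column makes the better partition key depends on which predicate the real queries always carry.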

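If you always process the entire data of the large table, partitioning will not be a solution, but that would put a question mark over the database schema.

So yes, query optimization will help, but since the data is large you should also evaluate table partitioning.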

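74069 consistent gets means the query probably read about 578 MB of data. But that doesn't tell us much; the number could be too high or too low. First, what do you expect from this query? Does it return a small number of rows that you'd expect to appear almost instantly, but takes X seconds? We also need to see the explain plan. Post the results of:

  explain plan for [your query]

and then:

  select * from table(dbms_xplan.display)

It can return anywhere between 0 and 17,000 rows, depending on how the user writes the query. I'd like us to get that result in under 5 seconds, but given the size I don't know whether that is realistic. There may be 68,000,000 rows, but the search is also restricted to a given field, so the results should be much narrower. I've updated the original post with the explain plan results. Let me know if you need more.

Thanks for updating the question. There are a lot of very low cardinality estimates, Rows = 1. Before spending too much time on the query, you may want to re-gather statistics to make sure the optimizer has up-to-date information.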
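
The 578 MB figure in the first comment follows from multiplying the consistent gets by the database block size. The sketch below assumes an 8 KB block, which is a common default; the actual value can be read from v$parameter.

```sql
-- Block size of the database, in bytes.
SELECT value AS block_size_bytes
  FROM v$parameter
 WHERE name = 'db_block_size';

-- 74069 blocks * 8192 bytes per block, expressed in MB.
SELECT ROUND(74069 * 8192 / 1024 / 1024, 1) AS approx_mb
  FROM dual;   -- about 578.7
```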