MySQL: What are the proper scopes/indexes to make scoped finds through Rails more efficient?
Tags: mysql, ruby-on-rails, database, optimization, scope


I have a relatively large, 4-tier relational data setup, as follows:

ClientApplication         has_many => ClientApplicationVersions
ClientApplicationVersions has_many => CloudLogs
CloudLogs                 has_many => Logs
ClientApplications: (potentially 1,000 records)
  - ...
  - account_id
  - public_key
  - deleted_at

ClientApplicationVersions: (potentially 10,000 records)
  - ...
  - client_application_id
  - public_key
  - deleted_at

CloudLogs: (potentially 1,000,000 records)
  - ...
  - client_application_version_id
  - public_key
  - deleted_at

Logs: (potentially 100,000,000 records)
  - ...
  - cloud_log_id
  - public_key
  - timestamp
  - deleted_at

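I'm still in development, so the structure and setup aren't set in stone, but I'd like to get them right. I'm using Rails 3.2.11 with InnoDB MySQL. The database is currently populated with a small dataset compared to its eventual size (logs has only 500,000 rows). I have 4 scoped queries for retrieving logs, 3 of which are problematic:

  • Grab the first page of logs, ordered by timestamp, limited by account_id, client_application.public_key, and client_application_version.public_key (more than 100 seconds)
  • Grab the first page of logs, ordered by timestamp, limited by account_id and client_application.public_key (more than 100 seconds)
  • Grab the first page of logs, ordered by timestamp, limited by account_id (more than 100 seconds)
  • Grab the first page of logs, ordered by timestamp (~2 seconds)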

    I'm using Rails scopes to help make these calls:

      scope :account_id, proc {|account_id| joins(:client_application).where("client_applications.account_id = ?", account_id) }
      scope :client_application_key, proc {|client_application_key| joins(:client_application).where("client_applications.public_key = ?", client_application_key) }
      scope :client_application_version_key, proc {|client_application_version_key| joins(:client_application_version).where("client_application_versions.public_key = ?", client_application_version_key) }
    
      default_scope order('logs.timestamp DESC')
    
    I have an index on public_key on every table. The logs table has several indexes, including the one the optimizer prefers to use (index_logs_on_cloud_log_id), but the queries still take ages to run.


    Here is how I call the method in the Rails console:

    Log.account_id(1).client_application_key('p0kZudG0').client_application_version_key('0HgoJRyE').page(1)
    
    ... and here is what Rails turns it into:

    SELECT `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `cloud_logs`.`id` = `logs`.`cloud_log_id` INNER JOIN `client_application_versions` ON `client_application_versions`.`id` = `cloud_logs`.`client_application_version_id` INNER JOIN `client_applications` ON `client_applications`.`id` = `client_application_versions`.`client_application_id` INNER JOIN `cloud_logs` `cloud_logs_logs_join` ON `cloud_logs_logs_join`.`id` = `logs`.`cloud_log_id` INNER JOIN `client_application_versions` `client_application_versions_logs` ON `client_application_versions_logs`.`id` = `cloud_logs_logs_join`.`client_application_version_id` WHERE (logs.deleted_at IS NULL) AND (client_applications.account_id = 1) AND (client_applications.public_key = 'p0kZudG0') AND (client_application_versions.public_key = '0HgoJRyE') ORDER BY logs.timestamp DESC LIMIT 100 OFFSET 0
    
    ... and here is the EXPLAIN output for that query:

    +----+-------------+----------------------------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+------------------------------------------------------------------------+------+----------------------------------------------+
    | id | select_type | table                            | type   | possible_keys                                                                                                                                         | key                                               | key_len | ref                                                                    | rows | Extra                                        |
    +----+-------------+----------------------------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+------------------------------------------------------------------------+------+----------------------------------------------+
    |  1 | SIMPLE      | client_application_versions      | ref    | PRIMARY,index_client_application_versions_on_client_application_id,index_client_application_versions_on_public_key                                    | index_client_application_versions_on_public_key   | 768     | const                                                                  |    1 | Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | client_applications              | eq_ref | PRIMARY,index_client_applications_on_account_id,index_client_applications_on_public_key                                                               | PRIMARY                                           | 4       | cloudlog_production.client_application_versions.client_application_id  |    1 | Using where                                  |
    |  1 | SIMPLE      | cloud_logs                       | ref    | PRIMARY,index_cloud_logs_on_client_application_version_id                                                                                             | index_cloud_logs_on_client_application_version_id | 5       | cloudlog_production.client_application_versions.id                     |  481 | Using where; Using index                     |
    |  1 | SIMPLE      | cloud_logs_logs_join             | eq_ref | PRIMARY,index_cloud_logs_on_client_application_version_id                                                                                             | PRIMARY                                           | 4       | cloudlog_production.cloud_logs.id                                      |    1 |                                              |
    |  1 | SIMPLE      | client_application_versions_logs | eq_ref | PRIMARY                                                                                                                                               | PRIMARY                                           | 4       | cloudlog_production.cloud_logs_logs_join.client_application_version_id |    1 | Using index                                  |
    |  1 | SIMPLE      | logs                             | ref    | index_logs_on_cloud_log_id_and_deleted_at_and_timestamp,index_logs_on_cloud_log_id_and_deleted_at,index_logs_on_cloud_log_id,index_logs_on_deleted_at | index_logs_on_cloud_log_id                        | 5       | cloudlog_production.cloud_logs.id                                      |    4 | Using where                                  |
    +----+-------------+----------------------------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+------------------------------------------------------------------------+------+----------------------------------------------+
    

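    This question has three parts:

  • Can I optimize my database with additional indexes so that these join-dependent, sorted queries run more efficiently?
  • Can I optimize the Rails code so that this kind of find runs more efficiently?
  • Am I simply approaching this scoped find the wrong way for large datasets?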



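    Update, January 24, 2012: As Geoff and J_MCCaffrey suggested in their answers, I split the query into 3 separate parts to try to isolate the problem. As expected, the problem lies in handling the largest table. The MySQL optimizer handles this version differently, using different indexes, but the delay is still there. Here is the EXPLAIN output for this approach: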

    ClientApplication.find_by_account_id_and_public_key(1, 'p0kZudG0').versions.select{|cav| cav.public_key = '0HgoJRyE'}.first.logs.page(2)
      ClientApplication Load (165.9ms)  SELECT `client_applications`.* FROM `client_applications` WHERE `client_applications`.`account_id` = 1 AND `client_applications`.`public_key` = 'p0kZudG0' AND (client_applications.deleted_at IS NULL) ORDER BY client_applications.id LIMIT 1
      ClientApplicationVersion Load (105.1ms)  SELECT `client_application_versions`.* FROM `client_application_versions` WHERE `client_application_versions`.`client_application_id` = 3 AND (client_application_versions.deleted_at IS NULL) ORDER BY client_application_versions.created_at DESC, client_application_versions.id DESC
      Log Load (57295.0ms)  SELECT `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `logs`.`cloud_log_id` = `cloud_logs`.`id` WHERE `cloud_logs`.`client_application_version_id` = 49 AND (logs.deleted_at IS NULL) AND (cloud_logs.deleted_at IS NULL) ORDER BY logs.timestamp DESC, cloud_logs.received_at DESC LIMIT 100 OFFSET 100
      EXPLAIN (214.5ms)  EXPLAIN SELECT `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `logs`.`cloud_log_id` = `cloud_logs`.`id` WHERE `cloud_logs`.`client_application_version_id` = 49 AND (logs.deleted_at IS NULL) AND (cloud_logs.deleted_at IS NULL) ORDER BY logs.timestamp DESC, cloud_logs.received_at DESC LIMIT 100 OFFSET 100
    EXPLAIN for: SELECT  `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `logs`.`cloud_log_id` = `cloud_logs`.`id` WHERE `cloud_logs`.`client_application_version_id` = 49 AND (logs.deleted_at IS NULL) AND (cloud_logs.deleted_at IS NULL) ORDER BY logs.timestamp DESC, cloud_logs.received_at DESC LIMIT 100 OFFSET 100
    +----+-------------+------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+---------+-----------------------------------+------+-------------------------------------------------------------------------------------------------------------------------------------------------+
    | id | select_type | table      | type        | possible_keys                                                                                                                                         | key                                                                              | key_len | ref                               | rows | Extra                                                                                                                                           |
    +----+-------------+------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+---------+-----------------------------------+------+-------------------------------------------------------------------------------------------------------------------------------------------------+
    |  1 | SIMPLE      | cloud_logs | index_merge | PRIMARY,index_cloud_logs_on_client_application_version_id,index_cloud_logs_on_deleted_at                                                              | index_cloud_logs_on_client_application_version_id,index_cloud_logs_on_deleted_at | 5,9     | NULL                              | 1874 | Using intersect(index_cloud_logs_on_client_application_version_id,index_cloud_logs_on_deleted_at); Using where; Using temporary; Using filesort |
    |  1 | SIMPLE      | logs       | ref         | index_logs_on_cloud_log_id_and_deleted_at_and_timestamp,index_logs_on_cloud_log_id_and_deleted_at,index_logs_on_cloud_log_id,index_logs_on_deleted_at | index_logs_on_cloud_log_id                                                       | 5       | cloudlog_production.cloud_logs.id |    4 | Using where                                                                                                                                     |
    +----+-------------+------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+---------+-----------------------------------+------+-------------------------------------------------------------------------------------------------------------------------------------------------+
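
    One detail in the Ruby call above is worth double-checking: the block passed to select uses = (assignment) rather than == (comparison). In Ruby an assignment evaluates to the assigned value, which is truthy here, so the block keeps every version and .first simply returns whichever version comes first. The sketch below reproduces the effect with a plain Struct standing in for the model. The struct, the ids, and the key "aaaaaaaa" are invented for illustration; this is not the asker's actual model code.

```ruby
# Hypothetical stand-in for ClientApplicationVersion; not an ActiveRecord model.
version = Struct.new(:id, :public_key)

# Build a fresh sample each time, because the assigning block mutates its input.
build_versions = lambda do
  [version.new(48, "aaaaaaaa"), version.new(49, "0HgoJRyE")]
end

# Assignment inside the block: every element is kept, and each
# public_key is overwritten as a side effect.
assigned = build_versions.call.select { |v| v.public_key = "0HgoJRyE" }

# Comparison inside the block: only the matching element is kept.
compared = build_versions.call.select { |v| v.public_key == "0HgoJRyE" }

puts assigned.map(&:id).inspect  # ids of both versions
puts compared.map(&:id).inspect  # id of the matching version only
```

    With ActiveRecord, filtering in SQL instead (for example versions.where(public_key: '0HgoJRyE').first, assuming the association is named versions as in the call above) would also avoid loading every version into memory.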
    



    Update, January 25, 2012: Here are the indexes on all the relevant tables:

    CLIENT_APPLICATIONS:
      PRIMARY KEY  (`id`),
      UNIQUE KEY `index_client_applications_on_key` (`key`),
      KEY `index_client_applications_on_account_id` (`account_id`),
      KEY `index_client_applications_on_deleted_at` (`deleted_at`),
      KEY `index_client_applications_on_public_key` (`public_key`)
    
    CLIENT_APPLICATION_VERSIONS:
      PRIMARY KEY  (`id`),
      KEY `index_client_application_versions_on_client_application_id` (`client_application_id`),
      KEY `index_client_application_versions_on_deleted_at` (`deleted_at`),
      KEY `index_client_application_versions_on_public_key` (`public_key`)
    
    CLOUD_LOGS:
      PRIMARY KEY  (`id`),
      KEY `index_cloud_logs_on_api_client_version_id` (`api_client_version_id`),
      KEY `index_cloud_logs_on_client_application_version_id` (`client_application_version_id`),
      KEY `index_cloud_logs_on_deleted_at` (`deleted_at`),
      KEY `index_cloud_logs_on_device_id` (`device_id`),
      KEY `index_cloud_logs_on_public_key` (`public_key`),
      KEY `index_cloud_logs_on_received_at` (`received_at`)
    
    LOGS:
      PRIMARY KEY  (`id`),
      KEY `index_logs_on_class_name` (`class_name`),
      KEY `index_logs_on_cloud_log_id_and_deleted_at_and_timestamp` (`cloud_log_id`,`deleted_at`,`timestamp`),
      KEY `index_logs_on_cloud_log_id_and_deleted_at` (`cloud_log_id`,`deleted_at`),
      KEY `index_logs_on_cloud_log_id` (`cloud_log_id`),
      KEY `index_logs_on_deleted_at` (`deleted_at`),
      KEY `index_logs_on_file_name` (`file_name`),
      KEY `index_logs_on_method_name` (`method_name`),
      KEY `index_logs_on_public_key` (`public_key`),
      KEY `index_logs_on_timestamp` USING BTREE (`timestamp`)
    

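    Unfortunately, all of my Rails optimization experience is with PostgreSQL, so much of it may not apply. Still, I have a few suggestions that might:

    Try using joins instead of includes in your scopes. includes is meant to trigger eager loading, so it is entirely possible that some of the slowdown you're seeing comes from loading models you don't need. Even if that isn't the case, joins should produce a more readable query; includes aliases every column as "t2_r8" and so on.

    You'll also want to make sure that every column you might filter on is indexed. In general, columns ending in _id are likely to be referenced this way and should probably be indexed, as should any columns filtered on specifically in a scope (such as client_application_version_key).

    I'm writing this as an answer to my own question in the hope of getting a better one. Currently, the database is set up strictly by the relational handbook: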

    ClientApplication         has_many => ClientApplicationVersions
    ClientApplicationVersions has_many => CloudLogs
    CloudLogs                 has_many => Logs
    
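    This means that when I need to find the logs that belong to a client application, I have to do 3 additional joins to get to them. By introducing some foreign key denormalization on the Logs table, I can skip all of the joins: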

    ClientApplication         has_many => ClientApplicationVersions
    ClientApplication         has_many => Logs
    ClientApplicationVersions has_many => CloudLogs
    ClientApplicationVersions has_many => Logs
    CloudLogs                 has_many => Logs
    
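    The end result is that my logs table gets a few extra columns: client_application_key, client_application_version_key, and cloud_log_key.

    Although I risk data inconsistency, I can avoid the 3 joins here, which are helping to kill the query's performance. Please, someone talk me out of this.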
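
    To make the trade-off concrete, here is a toy, in-memory Ruby sketch. It uses plain hashes rather than ActiveRecord, and all sample ids, the key "cl0ud007", and the helper names are invented for illustration. Parent keys are copied onto each log row at write time, so a lookup by client application key filters the logs directly instead of walking up three levels of parents. The three copied column names follow the ones proposed above.

```ruby
# Toy data: each level points at its parent, as in the normalized schema.
db = {
  client_applications: { 3 => { public_key: "p0kZudG0" } },
  client_application_versions: { 49 => { client_application_id: 3, public_key: "0HgoJRyE" } },
  cloud_logs: { 7 => { client_application_version_id: 49, public_key: "cl0ud007" } }
}

# Resolve the ancestors of a cloud log by walking up the parent ids.
def ancestors_for(db, cloud_log_id)
  cloud_log = db[:cloud_logs].fetch(cloud_log_id)
  version = db[:client_application_versions].fetch(cloud_log[:client_application_version_id])
  application = db[:client_applications].fetch(version[:client_application_id])
  [cloud_log, version, application]
end

# Build a log row, copying the ancestors' keys down (the denormalization).
# The copies go stale if a parent's public_key ever changes, which is the
# inconsistency risk mentioned above.
def build_log(db, id, cloud_log_id, timestamp)
  cloud_log, version, application = ancestors_for(db, cloud_log_id)
  { id: id, cloud_log_id: cloud_log_id, timestamp: timestamp,
    cloud_log_key: cloud_log[:public_key],
    client_application_version_key: version[:public_key],
    client_application_key: application[:public_key] }
end

logs = [build_log(db, 1, 7, 100), build_log(db, 2, 7, 200)]

# Normalized lookup: three hops per log row, like the three joins.
via_parents = logs.select do |log|
  ancestors_for(db, log[:cloud_log_id]).last[:public_key] == "p0kZudG0"
end

# Denormalized lookup: filter on the copied column only.
via_copied_key = logs.select { |log| log[:client_application_key] == "p0kZudG0" }

newest_first = via_copied_key.sort_by { |log| -log[:timestamp] }
puts newest_first.map { |log| log[:id] }.inspect
```

    Both lookups return the same rows here; the difference is only how much work each row costs. In a real schema the copied columns would also need an index that matches the filter and the sort for the database to benefit.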

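    Trying to answer each of your questions:

  • Absolutely! Anything you search on should be indexed; without an index, the database has to do a full table scan. On top of the association ID columns, which your initial migration created if you used the references helper in create_table, you are searching on at least the following:

    • logs.timestamp
    • client_application_versions.public_key
    • client_applications.public_key
    • logs.deleted_at

    These should probably all be indexed. And if you did not use references when defining the association foreign keys, add indexes for those as well. There are trade-offs with indexes, of course: they are like magic for reads, but they can slow your writes down significantly. How much they slow you down or speed you up probably also depends heavily on the database.

  • I don't think so. Your Rails code looks relatively rig…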
    ClientApplication.find_all_by_account_id(1).where(public_key: 'p0kZudG0').joins(:client_application_version).where("client_application_versions.public_key=?",'0HgoJRyE').logs.page(1)
    
    -- The generated query, reformatted. Note that cloud_logs and
    -- client_application_versions are each joined twice.
    SELECT
      l.*
    FROM
      `logs` as l
      INNER JOIN `cloud_logs` as cl1
        ON
          cl1.id = l.cloud_log_id
      INNER JOIN `cloud_logs` as cl2
        ON
          cl2.id = l.cloud_log_id
      INNER JOIN `client_application_versions` as cav1
        ON
          cav1.id = cl1.client_application_version_id
      INNER JOIN `client_application_versions` as cav2
        ON
          cav2.id = cl2.client_application_version_id
      INNER JOIN `client_applications` as ca
        ON
          ca.id = cav1.client_application_id
    WHERE
      (l.deleted_at IS NULL)
        AND
      (ca.account_id = 1)
        AND
      (ca.public_key = 'p0kZudG0')
        AND
      (cav1.public_key = '0HgoJRyE')
    ORDER BY
      l.timestamp DESC
    LIMIT
      0, 100
    
    -- The same query with the duplicate joins commented out, the version key
    -- filter moved into the join, and an index hint for a composite key.
    SELECT
      l.*
    FROM
      `logs` as l
      INNER JOIN `cloud_logs` as cl1
        ON
          cl1.id = l.cloud_log_id
    --  INNER JOIN `cloud_logs` as cl2
    --    ON
    --      cl2.id = l.cloud_log_id
      INNER JOIN `client_application_versions` as cav1 use index for join (`index_cavs_on_client_application_id_and_public_key`)
        ON
          cav1.id = cl1.client_application_version_id
            AND
          cav1.public_key = '0HgoJRyE'

    --  INNER JOIN `client_application_versions` as cav2
    --    ON
    --      cav2.id = cl2.client_application_version_id
      INNER JOIN `client_applications` as ca
        ON
          ca.id = cav1.client_application_id
    WHERE
      (l.deleted_at IS NULL)
        AND
      (ca.account_id = 1)
        AND
      (ca.public_key = 'p0kZudG0')
    ORDER BY
      l.timestamp DESC
    LIMIT
      0, 100
    
    # :deleted_at => nil yields "deleted_at IS NULL", matching the SQL above
    Log.where(:deleted_at => nil).order("logs.timestamp desc").joins(:cloud_logs) & \
    CloudLog.joins(:client_application_versions) & \
    ClientApplicationVersion.where(:public_key => '0HgoJRyE').joins(:client_applications) & \
    ClientApplication.where(:public_key => 'p0kZudG0', :account_id => 1)
    
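    -- Name the key so that it matches the index hint in the query above
    alter table client_application_versions add key `index_cavs_on_client_application_id_and_public_key` (`client_application_id`, `public_key`);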
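
    As background on why a composite key on (client_application_id, public_key) can help: a B-tree index keeps its entries sorted by the first column and then by the second, so a lookup on both values can seek straight to the matching entry. The Ruby sketch below imitates that ordering with a sorted array and a binary search. It is only an analogy for how such an index is ordered, not a model of InnoDB internals, and the sample rows and the seek helper are invented for illustration.

```ruby
# Rows of a hypothetical client_application_versions table.
rows = [
  { id: 47, client_application_id: 2, public_key: "zzzzzzzz" },
  { id: 48, client_application_id: 3, public_key: "aaaaaaaa" },
  { id: 49, client_application_id: 3, public_key: "0HgoJRyE" },
  { id: 50, client_application_id: 4, public_key: "0HgoJRyE" }
]

# "Index" entries: [client_application_id, public_key, row id], kept sorted
# by the first two elements the way a composite key orders its entries.
index = rows.map { |r| [r[:client_application_id], r[:public_key], r[:id]] }.sort

# Binary search for the first entry >= the wanted pair, then check equality.
# Returns the row id, or nil when no entry matches both values.
def seek(index, client_application_id, public_key)
  wanted = [client_application_id, public_key]
  entry = index.bsearch { |e| (e.first(2) <=> wanted) >= 0 }
  entry && entry.first(2) == wanted ? entry.last : nil
end

puts seek(index, 3, "0HgoJRyE").inspect
puts seek(index, 3, "missing").inspect
```

    The same ordering argument is why column order matters in a composite key: entries are grouped by the leading column first, so a search that supplies only the second column cannot seek in this way.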
    
    YourObject.joins(:client_application).
               where(ClientApplication.arel_table[:public_key].eq(client_application_key))