MySQL subquery vs. join: neither works well for me

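I'm using MySQL and have three tables:

  • The people table, consisting of two columns:

    • id - the table's primary key
    • name - the person's name
  • The income table, which holds the incomes of the people in the people table. Each record represents one income of one person. A person may have no income records or many. The table structure is:

    • person_id (foreign key to the people table)
    • amount (decimal - the amount of money)
    • number_of_hours_for_amount (integer - the number of hours worked to earn this income)
  • The expenses table, which holds the people's expenses. Each record represents one expense of one person: how much money was spent and how many items were bought. A person may have zero or more expense records. The table structure is:

    • person_id (foreign key to the people table)
    • amount (decimal - the amount of money)
    • number_of_items_bought (integer - the number of items bought in this expense)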
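
For reference, the description above corresponds roughly to the schema below. This is only a sketch: the table and column names are taken from the queries in this thread, while the column types, precisions, and nullability are assumptions.

```sql
-- Hypothetical DDL: names come from the queries in this thread,
-- but the types and constraints are assumed for illustration.
CREATE TABLE people (
    id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE income (
    person_id                  INT,            -- logical reference to people.id
    amount                     DECIMAL(10,2),  -- amount of money earned
    number_of_hours_for_amount INT             -- hours worked to earn this income
);

CREATE TABLE expenses (
    person_id              INT,            -- logical reference to people.id
    amount                 DECIMAL(10,2),  -- amount of money spent
    number_of_items_bought INT             -- items bought in this expense
);
```

No FOREIGN KEY constraint is declared in this sketch on purpose: with InnoDB, declaring one would implicitly create an index on person_id, which changes the query plans discussed below.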
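What I want is a query that returns a list of all people (one record per person), where each row contains:

  • the person's name
  • the sum of all their incomes
  • the total number of hours they worked
  • the sum of all their expenses
  • the total number of items they bought

The first, naive approach I tried is logically fine but performs poorly. It looks like this: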

    SELECT name, income_sum, work_hours_sum, expenses_sum, items_count
    FROM (people
          LEFT JOIN 
               (SELECT person_id, sum(amount) as income_sum, 
                       sum(number_of_hours_for_amount) as work_hours_sum
                FROM income
                GROUP BY person_id) as income_subquery
          ON people.id = income_subquery.person_id)
    
    LEFT JOIN
         (SELECT person_id, sum(amount) as expenses_sum, 
                 sum(number_of_items_bought) as items_count
          FROM expenses
          GROUP BY person_id) as expenses_subquery
    ON people.id = expenses_subquery.person_id
    
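As far as I understand, the problem with this query is that once the data comes out of the subqueries, the joins are very inefficient: the subquery results are temporary derived tables, so indexes are not used well on them.

The best way to make full use of the existing indexes would be to join the three tables directly rather than through subqueries. But that is not a correct solution, because it creates a Cartesian product: records are seen more times than they should be, so duplicated values get added to the aggregate sums.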
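
To make the double counting concrete, here is a sketch of the direct three-way join described above. It is shown only to illustrate the problem, not as a solution, and the row counts in the comments are a made-up example.

```sql
-- Incorrect: joining both child tables directly multiplies their rows per person.
SELECT p.name,
       SUM(i.amount)                     AS income_sum,      -- inflated by expense rows
       SUM(i.number_of_hours_for_amount) AS work_hours_sum,  -- inflated by expense rows
       SUM(e.amount)                     AS expenses_sum,    -- inflated by income rows
       SUM(e.number_of_items_bought)     AS items_count      -- inflated by income rows
FROM people p
LEFT JOIN income   i ON i.person_id = p.id
LEFT JOIN expenses e ON e.person_id = p.id
GROUP BY p.id, p.name;

-- A person with 2 income rows and 3 expense rows yields 2 * 3 = 6 joined rows,
-- so each income amount is summed 3 times and each expense amount 2 times.
```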

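(Another option I tried was to compute each person's income and expense values as a select_expr in the SELECT part, that is, as dependent subqueries. That was not fast enough either.)

I'm looking for an efficient query that gives these results.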

Try this. Both joins should use the index on people.id:

    SELECT name, income_sum, work_hours_sum, expenses_sum, items_count
    FROM people
    
    LEFT JOIN 
         (SELECT person_id, sum(amount) as income_sum, 
                 sum(number_of_hours_for_amount) as work_hours_sum
          FROM income
          GROUP BY person_id) as income_subquery
    ON people.id = income_subquery.person_id
    
    LEFT JOIN
         (SELECT person_id, sum(amount) as expenses_sum, 
                 sum(number_of_items_bought) as items_count
          FROM expenses
          GROUP BY person_id) as expenses_subquery
    ON people.id = expenses_subquery.person_id
    
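Ideally, a good query optimizer would recognize that your original SQL is equivalent to this. But you're using MySQL, so I wouldn't expect ideal optimization.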


Make sure you have indexes on income.person_id and expenses.person_id, so that the grouping in the subqueries is efficient.
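
A minimal sketch of those two indexes, assuming the table and column names used in the queries above. The index names are made up for illustration.

```sql
-- Index names are hypothetical; pick whatever fits your naming scheme.
CREATE INDEX idx_income_person_id   ON income   (person_id);
CREATE INDEX idx_expenses_person_id ON expenses (person_id);
```

If person_id is already declared as an InnoDB foreign key, an index on it exists already and these statements are unnecessary.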

Something like this should get you very close:

    select id, name, (select sum(amount) from income i where i.person_id = p.id) as 'total_income_amount',
                     (select sum(number_of_hours_for_amount) from income i where i.person_id = p.id) as 'total_number_of_hours_for_amount',
                     (select sum(amount) from expenses e where e.person_id = p.id) as 'total_expenses_amount',
                     (select sum(number_of_items_bought) from expenses e where e.person_id = p.id) as 'total_number_of_items_bought'
    from   people p;
    

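You're right, there is an unavoidable Cartesian product if you join all three tables directly. You can break the problem down into two queries:

One for income: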

    SELECT p.id, p.name, SUM(i.amount) AS income_sum, SUM(number_of_hours_for_amount) AS work_hours_sum
    FROM people p
    LEFT JOIN income i ON p.id = i.person_id
    GROUP BY p.id;
    
    +----+---------+------------+----------------+
    | id | name    | income_sum | work_hours_sum |
    +----+---------+------------+----------------+
    |  1 | Groucho |      20.00 |             20 |
    |  2 | Harpo   |      40.00 |             40 |
    |  3 | Chico   |      60.00 |             60 |
    +----+---------+------------+----------------+
    
Here is the EXPLAIN for that query:

    +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                              |
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
    |  1 | SIMPLE      | p     | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using temporary; Using filesort                    |
    |  1 | SIMPLE      | i     | ALL  | NULL          | NULL | NULL    | NULL |    6 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
    
One for expenses:

    SELECT p.id, SUM(e.amount) AS expenses_sum, SUM(number_of_items_bought) AS items_count
    FROM people p
    LEFT JOIN expenses e ON p.id = e.person_id
    GROUP BY p.id;
    
    +----+--------------+-------------+
    | id | expenses_sum | items_count |
    +----+--------------+-------------+
    |  1 |        30.00 |           4 |
    |  2 |        30.00 |           4 |
    |  3 |        30.00 |           4 |
    +----+--------------+-------------+
    
Here is the EXPLAIN:

    +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
    | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                              |
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
    |  1 | SIMPLE      | p     | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using temporary; Using filesort                    |
    |  1 | SIMPLE      | e     | ALL  | NULL          | NULL | NULL    | NULL |    6 | Using where; Using join buffer (Block Nested Loop) |
    +----+-------------+-------+------+---------------+------+---------+------+------+----------------------------------------------------+
    
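The EXPLAIN reports above show that the queries use table scans (type "ALL") on the income and expenses tables and perform the join without the benefit of an index ("Using join buffer"). The red flag is a join involving two tables where both use access type "ALL". Unless the tables contain only a trivial number of rows, that will be very costly. It is usually accompanied by "Using join buffer", which is another red flag for an expensive query.

Finally, the GROUP BY is executed inefficiently, using a temporary table and a filesort. That is another performance killer.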

The "Block Nested Loop" note is a MySQL 5.6 thing. If you use an earlier version of MySQL, you won't see it.
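
Reports like the ones above are produced by prefixing a query with EXPLAIN. As a sketch, the plan for the income query can be reproduced as follows. The output you get depends on the MySQL version, the table sizes, and the indexes present.

```sql
-- Ask MySQL for the execution plan instead of running the query.
EXPLAIN
SELECT p.id, p.name,
       SUM(i.amount)                     AS income_sum,
       SUM(i.number_of_hours_for_amount) AS work_hours_sum
FROM people p
LEFT JOIN income i ON p.id = i.person_id
GROUP BY p.id;
```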

The following indexes should help make these queries better:

    ALTER TABLE income ADD KEY (person_id, amount, number_of_hours_for_amount);
    ALTER TABLE expenses ADD KEY (person_id, amount, number_of_items_bought);
    
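Now the EXPLAIN reports no longer show the inefficient access. The joins are done using an index (type "ref"), and the temporary table and filesort are gone. "Using index" means the joined table is read from the columns of the index alone, without touching the table rows at all.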

    +----+-------------+-------+-------+---------------+-----------+---------+-----------+------+-------------+
    | id | select_type | table | type  | possible_keys | key       | key_len | ref       | rows | Extra       |
    +----+-------------+-------+-------+---------------+-----------+---------+-----------+------+-------------+
    |  1 | SIMPLE      | p     | index | PRIMARY       | PRIMARY   | 4       | NULL      |    3 | NULL        |
    |  1 | SIMPLE      | i     | ref   | person_id     | person_id | 5       | test.p.id |    1 | Using index |
    +----+-------------+-------+-------+---------------+-----------+---------+-----------+------+-------------+
    
    +----+-------------+-------+-------+---------------+-----------+---------+-----------+------+-------------+
    | id | select_type | table | type  | possible_keys | key       | key_len | ref       | rows | Extra       |
    +----+-------------+-------+-------+---------------+-----------+---------+-----------+------+-------------+
    |  1 | SIMPLE      | p     | index | PRIMARY       | PRIMARY   | 4       | NULL      |    3 | NULL        |
    |  1 | SIMPLE      | e     | ref   | person_id     | person_id | 5       | test.p.id |    1 | Using index |
    +----+-------------+-------+-------+---------------+-----------+---------+-----------+------+-------------+
    
You said you wanted to do this in one query, so here is how.

We can combine the two separate queries into a single query that returns one row per person:

    SELECT name, income_sum, work_hours_sum, expenses_sum, items_count
    FROM
    (SELECT p.id, p.name, SUM(i.amount) AS income_sum, SUM(number_of_hours_for_amount) AS work_hours_sum
     FROM people p
     LEFT OUTER JOIN income i ON p.id = i.person_id
     GROUP BY p.id) AS subq_i
    INNER JOIN
    (SELECT p.id, SUM(e.amount) AS expenses_sum, SUM(number_of_items_bought) AS items_count
     FROM people p
     LEFT OUTER JOIN expenses e ON p.id = e.person_id
     GROUP BY p.id) AS subq_e
    USING (id);
    
    +---------+------------+----------------+--------------+-------------+
    | name    | income_sum | work_hours_sum | expenses_sum | items_count |
    +---------+------------+----------------+--------------+-------------+
    | Groucho |      20.00 |             20 |        30.00 |           4 |
    | Harpo   |      40.00 |             40 |        30.00 |           4 |
    | Chico   |      60.00 |             60 |        30.00 |           4 |
    +---------+------------+----------------+--------------+-------------+
    
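Even for this combined query, the EXPLAIN doesn't look too bad. There are no temporary tables, filesorts, or join buffers, and it makes good use of the covering indexes.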

    +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+-------------+
    | id | select_type | table      | type  | possible_keys | key         | key_len | ref       | rows | Extra       |
    +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+-------------+
    |  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL        | NULL    | NULL      |    3 | NULL        |
    |  1 | PRIMARY     | <derived3> | ref   | <auto_key0>   | <auto_key0> | 4       | subq_i.id |    2 | NULL        |
    |  3 | DERIVED     | p          | index | PRIMARY       | PRIMARY     | 4       | NULL      |    3 | Using index |
    |  3 | DERIVED     | e          | ref   | person_id     | person_id   | 5       | test.p.id |    1 | Using index |
    |  2 | DERIVED     | p          | index | PRIMARY       | PRIMARY     | 4       | NULL      |    3 | NULL        |
    |  2 | DERIVED     | i          | ref   | person_id     | person_id   | 5       | test.p.id |    1 | Using index |
    +----+-------------+------------+-------+---------------+-------------+---------+-----------+------+-------------+
    
Perhaps you can skip the joins entirely:

    SELECT person_id
         , MIN(name) AS name
         , SUM(income_sum) AS income_sum
         , SUM(work_hours_sum) AS work_hours_sum
         , SUM(expenses_sum) AS expenses_sum
         , SUM(items_count) AS items_count
    FROM (
    SELECT id AS person_id
         , name
         , NULL AS income_sum
         , NULL AS work_hours_sum
         , NULL AS expenses_sum
         , NULL AS items_count
      FROM people
    UNION ALL
    SELECT person_id
         , NULL AS name
         , sum(amount) AS income_sum
         , sum(number_of_hours_for_amount) AS work_hours_sum
         , NULL AS expenses_sum
         , NULL AS items_count
      FROM income
     GROUP BY person_id
    UNION ALL
    SELECT person_id
         , NULL AS name
         , NULL AS income_sum
         , NULL AS work_hours_sum
         , sum(amount) AS expenses_sum
         , sum(number_of_items_bought) AS items_count
      FROM expenses
     GROUP BY person_id
    ) as d
    WHERE person_id IS NOT NULL -- my sql generates this row
     GROUP BY person_id
    

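How often do you update income and expenses? Could you aggregate their sums into another table? How many rows are in the tables? How slow is "not fast"? Can you post the EXPLAIN?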
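
If pre-aggregating is an option, one way to read that suggestion is a summary table that is refreshed periodically. The following is only a sketch under stated assumptions: the person_totals table, its column types, and the refresh strategy are hypothetical and were not spelled out in this thread.

```sql
-- Hypothetical summary table: one row per person with precomputed totals.
CREATE TABLE person_totals (
    person_id      INT NOT NULL PRIMARY KEY,
    income_sum     DECIMAL(12,2) NOT NULL DEFAULT 0,
    work_hours_sum INT NOT NULL DEFAULT 0,
    expenses_sum   DECIMAL(12,2) NOT NULL DEFAULT 0,
    items_count    INT NOT NULL DEFAULT 0
);

-- Refresh the income side. VALUES() in this position is deprecated
-- as of MySQL 8.0.20 but is still accepted.
INSERT INTO person_totals (person_id, income_sum, work_hours_sum)
SELECT person_id, SUM(amount), SUM(number_of_hours_for_amount)
FROM income
GROUP BY person_id
ON DUPLICATE KEY UPDATE
    income_sum     = VALUES(income_sum),
    work_hours_sum = VALUES(work_hours_sum);

-- Refresh the expenses side in the same way.
INSERT INTO person_totals (person_id, expenses_sum, items_count)
SELECT person_id, SUM(amount), SUM(number_of_items_bought)
FROM expenses
GROUP BY person_id
ON DUPLICATE KEY UPDATE
    expenses_sum = VALUES(expenses_sum),
    items_count  = VALUES(items_count);

-- Reading the report becomes a single primary-key join.
SELECT p.name, t.income_sum, t.work_hours_sum, t.expenses_sum, t.items_count
FROM people p
LEFT JOIN person_totals t ON t.person_id = p.id;
```

This trades write-time work and possible staleness for fast reads. For example, totals for a person whose income rows have all been deleted are not reset by the refresh above, so whether it is worthwhile depends on how often income and expenses change.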