MySQL multi-row SQL query
I have a table like this:
---------------------------
|housing_id | facility_id |
---------------------------
| 1 | 7 |
| 1 | 4 |
| 2 | 7 |
---------------------------
What I want to do is get every housing_id that has both facility_id 7 and facility_id 4.
So in this case the query should return only housing_id 1.
The database is MySQL.
One approach uses a subquery per facility-
SELECT housing_id
FROM mytable
WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=4)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=7)
Another approach is-
SELECT housing_id
FROM mytable
WHERE facility_id IN (4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 2
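To check both forms against the three sample rows from the question, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for MySQL (the SQL used here is common to both), and it adds DISTINCT to the subquery form, since without it that query returns housing_id 1 once per matching row.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (housing_id INTEGER, facility_id INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, 7), (1, 4), (2, 7)])

# One IN subquery per required facility; DISTINCT collapses repeated housing_ids.
subquery_sql = """
    SELECT DISTINCT housing_id
    FROM mytable
    WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id = 4)
      AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id = 7)
"""

# Keep only rows for the wanted facilities, then require both per housing_id.
group_by_sql = """
    SELECT housing_id
    FROM mytable
    WHERE facility_id IN (4, 7)
    GROUP BY housing_id
    HAVING COUNT(DISTINCT facility_id) = 2
"""

subquery_ids = [row[0] for row in conn.execute(subquery_sql)]
group_by_ids = [row[0] for row in conn.execute(group_by_sql)]
print(subquery_ids, group_by_ids)
```

Both lists contain only housing_id 1: housing_id 2 has facility 7 but not facility 4.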
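Update - inspired by Josvic's comment, I decided to do some more testing and thought I would include my findings.
One benefit of this query is that it is easy to modify to include more facility_ids. If you want to find all housing_ids that have facility_ids 1, 3, 4 and 7, you just do-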
SELECT housing_id
FROM mytable
WHERE facility_id IN (1,3,4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 4
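Since only the IN list and the count change, the query can be generated for any list of facility IDs. The helper below is a hypothetical sketch (the function name and the sample rows are mine, not from the thread), again run against SQLite for convenience. SQLite uses `?` placeholders; MySQL drivers typically use `%s` instead.

```python
import sqlite3

def housing_with_all_facilities(conn, facility_ids):
    """Return the housing_ids that have every facility in facility_ids."""
    wanted = sorted(set(facility_ids))  # a repeated ID would inflate the required count
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT housing_id FROM mytable "
        f"WHERE facility_id IN ({placeholders}) "
        "GROUP BY housing_id "
        "HAVING COUNT(DISTINCT facility_id) = ? "
        "ORDER BY housing_id"
    )
    # Bind the IDs first, then the number of distinct IDs required.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (housing_id INTEGER, facility_id INTEGER)")
rows = [(1, 1), (1, 3), (1, 4), (1, 7), (2, 4), (2, 7), (3, 7)]  # illustrative data
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

two = housing_with_all_facilities(conn, [4, 7])
four = housing_with_all_facilities(conn, [1, 3, 4, 7])
print(two, four)
```

Deduplicating the input matters: passing `[4, 4, 7]` unchanged would require a count of 3 that no housing_id could reach.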
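The performance of these three queries varies enormously depending on the indexing strategy. On my test dataset I could not get reasonable performance out of the dependent subquery version, whatever indexes I used.
The self-join solution provided by Tim performs very well given separate single-column indexes on the two columns, but it does not hold up quite as well as the number of criteria increases.
Here are some basic stats on my test table - 500k rows - 147963 housing_ids, with potential values for facility_id between 1 and 9.
Here are the indexes used for running all these tests-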
SHOW INDEXES FROM mytable;
+---------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type |
+---------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
| mytable | 0 | UQ_housing_facility | 1 | housing_id | A | 500537 | NULL | NULL | | BTREE |
| mytable | 0 | UQ_housing_facility | 2 | facility_id | A | 500537 | NULL | NULL | | BTREE |
| mytable | 0 | UQ_facility_housing | 1 | facility_id | A | 12 | NULL | NULL | | BTREE |
| mytable | 0 | UQ_facility_housing | 2 | housing_id | A | 500537 | NULL | NULL | | BTREE |
| mytable | 1 | IX_housing | 1 | housing_id | A | 500537 | NULL | NULL | | BTREE |
| mytable | 1 | IX_facility | 1 | facility_id | A | 12 | NULL | NULL | | BTREE |
+---------+------------+---------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
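The thread does not show the DDL behind this listing. The statements below are one plausible reconstruction from the key names and column order in the output above; the column types and NOT NULL constraints are my assumptions. They are run against SQLite so the sketch stays self-contained, and the same `CREATE [UNIQUE] INDEX ... ON ...` form is also accepted by MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column definitions: two non-nullable integer columns.
conn.execute(
    "CREATE TABLE mytable (housing_id INTEGER NOT NULL, facility_id INTEGER NOT NULL)"
)

index_ddl = [
    # Two composite unique indexes, one per column order.
    "CREATE UNIQUE INDEX UQ_housing_facility ON mytable (housing_id, facility_id)",
    "CREATE UNIQUE INDEX UQ_facility_housing ON mytable (facility_id, housing_id)",
    # Two single-column, non-unique indexes.
    "CREATE INDEX IX_housing ON mytable (housing_id)",
    "CREATE INDEX IX_facility ON mytable (facility_id)",
]
for statement in index_ddl:
    conn.execute(statement)

# PRAGMA index_list is SQLite-specific; in MySQL use SHOW INDEXES FROM mytable.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(mytable)"))
print(index_names)
```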
The first query tested is the dependent subquery-
SELECT SQL_NO_CACHE DISTINCT housing_id
FROM mytable
WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=4)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=7);
17321 rows in set (9.15 sec)
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
| 1 | PRIMARY | mytable | range | NULL | IX_housing | 4 | NULL | 500538 | Using where; Using index for group-by |
| 3 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | func,const | 1 | Using index; Using where |
| 2 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | func,const | 1 | Using index; Using where |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
SELECT SQL_NO_CACHE DISTINCT housing_id
FROM mytable
WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=1)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=3)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=4)
AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id=7);
567 rows in set (9.30 sec)
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
| 1 | PRIMARY | mytable | range | NULL | IX_housing | 4 | NULL | 500538 | Using where; Using index for group-by |
| 5 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | func,const | 1 | Using index; Using where |
| 4 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | func,const | 1 | Using index; Using where |
| 3 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | func,const | 1 | Using index; Using where |
| 2 | DEPENDENT SUBQUERY | mytable | unique_subquery | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | func,const | 1 | Using index; Using where |
+----+--------------------+---------+-----------------+----------------------------------------------------------------+---------------------+---------+------------+--------+---------------------------------------+
Next is my version using GROUP BY ... HAVING COUNT-
SELECT SQL_NO_CACHE housing_id
FROM mytable
WHERE facility_id IN (4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 2;
17321 rows in set (0.79 sec)
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
| 1 | SIMPLE | mytable | range | UQ_facility_housing,IX_facility | IX_facility | 4 | NULL | 198646 | Using where; Using index; Using filesort |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
SELECT SQL_NO_CACHE housing_id
FROM mytable
WHERE facility_id IN (1,3,4,7)
GROUP BY housing_id
HAVING COUNT(DISTINCT facility_id) = 4;
567 rows in set (1.25 sec)
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
| 1 | SIMPLE | mytable | range | UQ_facility_housing,IX_facility | IX_facility | 4 | NULL | 407160 | Using where; Using index; Using filesort |
+----+-------------+---------+-------+---------------------------------+-------------+---------+------+--------+------------------------------------------+
Last but not least is the self-join-
SELECT SQL_NO_CACHE a.housing_id
FROM mytable a
INNER JOIN mytable b
ON a.housing_id = b.housing_id
WHERE a.facility_id = 4 AND b.facility_id = 7;
17321 rows in set (1.37 sec)
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+-------------+
| 1 | SIMPLE | b | ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | IX_facility | 4 | const | 94598 | Using index |
| 1 | SIMPLE | a | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | test.b.housing_id,const | 1 | Using index |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+-------------+
SELECT SQL_NO_CACHE a.housing_id
FROM mytable a
INNER JOIN mytable b
ON a.housing_id = b.housing_id
INNER JOIN mytable c
ON a.housing_id = c.housing_id
INNER JOIN mytable d
ON a.housing_id = d.housing_id
WHERE a.facility_id = 1
AND b.facility_id = 3
AND c.facility_id = 4
AND d.facility_id = 7;
567 rows in set (1.64 sec)
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+--------------------------+
| 1 | SIMPLE | b | ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | IX_facility | 4 | const | 93782 | Using index |
| 1 | SIMPLE | d | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | test.b.housing_id,const | 1 | Using index |
| 1 | SIMPLE | c | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | test.b.housing_id,const | 1 | Using index |
| 1 | SIMPLE | a | eq_ref | UQ_housing_facility,UQ_facility_housing,IX_housing,IX_facility | UQ_housing_facility | 8 | test.d.housing_id,const | 1 | Using where; Using index |
+----+-------------+-------+--------+----------------------------------------------------------------+---------------------+---------+-------------------------+-------+--------------------------+
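Before comparing timings it is worth confirming that the three forms return the same rows. The sketch below does that on a small random dataset. The table size, the seed and the use of SQLite are my assumptions; it checks equivalence only and says nothing about MySQL performance.

```python
import random
import sqlite3

random.seed(0)  # fixed seed so the generated data is repeatable
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (housing_id INTEGER, facility_id INTEGER, "
    "UNIQUE (housing_id, facility_id))"
)
# Random (housing_id, facility_id) pairs; the set removes duplicates.
pairs = {(random.randint(1, 300), random.randint(1, 9)) for _ in range(1500)}
conn.executemany("INSERT INTO mytable VALUES (?, ?)", sorted(pairs))

queries = {
    "subquery": """
        SELECT DISTINCT housing_id FROM mytable
        WHERE housing_id IN (SELECT housing_id FROM mytable WHERE facility_id = 4)
          AND housing_id IN (SELECT housing_id FROM mytable WHERE facility_id = 7)
    """,
    "group_by": """
        SELECT housing_id FROM mytable
        WHERE facility_id IN (4, 7)
        GROUP BY housing_id
        HAVING COUNT(DISTINCT facility_id) = 2
    """,
    "self_join": """
        SELECT a.housing_id FROM mytable a
        INNER JOIN mytable b ON a.housing_id = b.housing_id
        WHERE a.facility_id = 4 AND b.facility_id = 7
    """,
}
results = {
    name: sorted(row[0] for row in conn.execute(sql)) for name, sql in queries.items()
}

# Reference answer computed directly in Python from the generated pairs.
expected = sorted(h for (h, f) in pairs if f == 4 and (h, 7) in pairs)
print({name: ids == expected for name, ids in results.items()})
```

The self-join returns no duplicates here only because (housing_id, facility_id) is unique, which matches the unique indexes listed earlier.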
You could also do a self-join; which approach is fastest will depend very much on the amount of data in the table.
SELECT a.housing_id
FROM mytable a
INNER JOIN mytable b
ON a.housing_id = b.housing_id AND a.facility_id <> b.facility_id
WHERE a.facility_id = 4 AND b.facility_id = 7
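The self-join needs one extra join per required facility, so the statement grows with the number of criteria. The hypothetical builder below (the function name and the `t0`, `t1`, ... aliases are mine) makes that growth explicit. It omits the `<>` condition, which is redundant once each alias is pinned to a different facility_id in the WHERE clause. SQLite is again assumed as a stand-in for MySQL.

```python
import sqlite3

def self_join_sql(n):
    """Build a self-join requiring n facility_ids (n >= 1), one table alias per ID."""
    aliases = [f"t{i}" for i in range(n)]
    # Every extra alias is joined back to the first one on housing_id.
    joins = "".join(
        f" INNER JOIN mytable {alias} ON {aliases[0]}.housing_id = {alias}.housing_id"
        for alias in aliases[1:]
    )
    # Each alias is pinned to one bound facility_id.
    where = " AND ".join(f"{alias}.facility_id = ?" for alias in aliases)
    return (
        f"SELECT {aliases[0]}.housing_id FROM mytable {aliases[0]}{joins} WHERE {where}"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (housing_id INTEGER, facility_id INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, 7), (1, 4), (2, 7)])

facility_ids = [4, 7]
sql = self_join_sql(len(facility_ids))
self_join_ids = [row[0] for row in conn.execute(sql, facility_ids)]
print(sql)
print(self_join_ids)
```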
This seems a bit overcomplicated.
@SinistraD What is in the clause leads to duplicates. Still simpler than doing a GROUP BY ... HAVING COUNT.
The overhead of GROUP BY ... HAVING COUNT will be lower than this. I just tried it on a test table with 500K records.
@nnichols "Lower overhead" depends on several factors, including indexes, the total number of rows in the table, and so on. If it is an indexed ID field, the approach I suggested should always beat an aggregate over the whole table, because the query optimizer applies the WHERE filter first. With a HAVING clause, the filter is applied after all the aggregate calculations (over the whole table) have been done. In theory, though, both approaches answer the question. It is now up to @Rene Koller to decide which one performs best in his particular scenario.
@JosvicZammit - fair comment. I was crudely assuming a unique index across the two fields. Both queries will filter using the index first.
Now I think this looks like the only good solution so far, haha... I deleted my answer.
Wow, that is a big speed difference. I think I will try your version.