SQL query on two tables returns an empty result


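I was asked this question in an interview: from the two tables below, write a query to pull the customers who have no sales orders. How many ways are there to write this query, and which one performs best?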

  • Table 1: Customer - CustomerID
  • Table 2: SalesOrder - OrderID, CustomerID, OrderDate

Query:

SELECT *
FROM Customer C
  RIGHT OUTER JOIN SalesOrder SO ON C.CustomerID = SO.CustomerID
WHERE SO.OrderID = NULL

Is my query correct? Are there other ways to write the query and get the same result?

I can think of two other ways to write this query:

SELECT C.*
FROM Customer C
LEFT OUTER JOIN SalesOrder SO ON C.CustomerID = SO.CustomerID
WHERE SO.CustomerID IS NULL

SELECT C.*
FROM Customer C
WHERE NOT C.CustomerID IN(SELECT CustomerID FROM SalesOrder)

The solution that uses an outer join will perform better than the one that uses NOT IN.
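Besides performance, the two forms are not always interchangeable. The sketch below, which assumes an in-memory SQLite database as a stand-in and made-up sample rows, shows what happens if SalesOrder.CustomerID is nullable and contains a NULL: the NOT IN version then returns no rows at all, while the outer join still finds the customers without orders. Whether the column is nullable in the interviewer's schema is not stated, so treat this as a caveat to check rather than a verdict.

```python
import sqlite3

# SQLite in memory is only a convenient stand-in; table and column names
# follow the question, the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE SalesOrder (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER,          -- nullable on purpose
        OrderDate  TEXT
    );
    INSERT INTO Customer VALUES (1), (2), (3);
    INSERT INTO SalesOrder (CustomerID, OrderDate) VALUES
        (1, '2024-01-01'),
        (NULL, '2024-01-02');        -- an order with no customer attached
""")

left_join = conn.execute("""
    SELECT C.CustomerID
    FROM Customer C
    LEFT OUTER JOIN SalesOrder SO ON C.CustomerID = SO.CustomerID
    WHERE SO.CustomerID IS NULL
    ORDER BY C.CustomerID
""").fetchall()

not_in = conn.execute("""
    SELECT C.CustomerID
    FROM Customer C
    WHERE C.CustomerID NOT IN (SELECT CustomerID FROM SalesOrder)
    ORDER BY C.CustomerID
""").fetchall()

print("left join:", left_join)  # customers 2 and 3
print("not in:   ", not_in)     # empty: comparing against NULL yields unknown
```

Adding WHERE CustomerID IS NOT NULL inside the subquery, or using NOT EXISTS, avoids the problem.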

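I'm answering for MySQL rather than SQL Server, because you only tagged the question with SQL Server later, so I figured that (this being an interview question) it wouldn't bother you which DBMS the answer is for. Note, though, that the queries I wrote are standard SQL and should run in every RDBMS. How each RDBMS handles those queries is another matter.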

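I wrote this little procedure for you as a test case. It creates the tables customers and orders as you specified, and I added primary and foreign keys, as one usually would. There are no other indexes, because every column worth indexing here is already covered by a key. It creates 250 customers, 100 of whom placed an order (for convenience, none of them ordered twice or more). A dump of the data follows; I posted the script in case you want to play around a little by increasing the numbers.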

delimiter $$
create procedure fill_table()
begin
  -- 250 customers with ids 1..250
  create table customers(customerId int primary key) engine=innodb;
  set @x = 1;
  while (@x <= 250) do
    insert into customers values(@x);
    set @x := @x + 1;
  end while;

  -- orders reference customers through a foreign key
  create table orders(orderId int auto_increment primary key,
    customerId int,
    orderDate timestamp,
    foreign key fk_customer (customerId) references customers(customerId)
  ) engine=innodb;

  -- one order each for 100 randomly chosen customers
  insert into orders(customerId, orderDate)
  select
    customerId,
    now() - interval customerId day
  from
    customers
  order by rand()
  limit 100;

end $$
delimiter ;

call fill_table();
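OK, now to the queries. Three approaches came to mind. I left out the RIGHT JOIN that MDisel used, because it is really just another way of writing a LEFT JOIN. It was invented for lazy SQL developers who don't want to swap table names and would rather rewrite a single word.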

Anyway, the first query:

select
c.*
from
customers c
left join orders o on c.customerId = o.customerId
where o.customerId is null;
results in the following execution plan:

+----+-------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key         | key_len | ref              | rows | Extra                    |
+----+-------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
|  1 | SIMPLE      | c     | index | NULL          | PRIMARY     | 4       | NULL             |  250 | Using index              |
|  1 | SIMPLE      | o     | ref   | fk_customer   | fk_customer | 5       | wtf.c.customerId |    1 | Using where; Using index |
+----+-------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+

The second query:

select
c.*
from
customers c
where c.customerId not in (select distinct customerId from orders);
results in the following execution plan:

+----+--------------------+--------+----------------+---------------+-------------+---------+------+------+--------------------------+
| id | select_type        | table  | type           | possible_keys | key         | key_len | ref  | rows | Extra                    |
+----+--------------------+--------+----------------+---------------+-------------+---------+------+------+--------------------------+
|  1 | PRIMARY            | c      | index          | NULL          | PRIMARY     | 4       | NULL |  250 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | orders | index_subquery | fk_customer   | fk_customer | 5       | func |    2 | Using index              |
+----+--------------------+--------+----------------+---------------+-------------+---------+------+------+--------------------------+

The third query:

select
c.*
from
customers c
where not exists (select 1 from orders o where o.customerId = c.customerId);
results in the following execution plan:

+----+--------------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
| id | select_type        | table | type  | possible_keys | key         | key_len | ref              | rows | Extra                    |
+----+--------------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+
|  1 | PRIMARY            | c     | index | NULL          | PRIMARY     | 4       | NULL             |  250 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | o     | ref   | fk_customer   | fk_customer | 5       | wtf.c.customerId |    1 | Using where; Using index |
+----+--------------------+-------+-------+---------------+-------------+---------+------------------+------+--------------------------+

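In all of the execution plans we can see that the customers table is read as a whole, but from the index (the implicit one, since the only column is the primary key). This may change when you select other columns from the table that are not in an index.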

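The first one seems to be the best. For each row in customers, only one row in orders is read. The id column shows that MySQL can do this in one step, since only indexes are involved.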

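The second query seems to be the worst (though none of the three queries should perform too badly). For each row in customers the subquery is executed (the select_type column tells you this).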

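The third query is not much different in that it also uses a dependent subquery, but it should perform better than the second query. Explaining these small differences would lead too far for now. If you're interested, the MySQL manual page on EXPLAIN output explains what each column and its values mean.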

Finally: I'd say the first query will perform best, but as always, in the end you have to measure, measure, and measure.
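As a rough illustration of that advice, here is a minimal sketch that rebuilds the same test data and times the three queries. It assumes an in-memory SQLite database rather than MySQL, gives orders to customers 1 to 100 instead of a random 100, and creates the fk_customer index by hand because SQLite does not index foreign keys automatically. The timings it prints say nothing about MySQL or SQL Server; the point is only the shape of the experiment.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customerId INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        orderId    INTEGER PRIMARY KEY AUTOINCREMENT,
        customerId INTEGER REFERENCES customers(customerId),
        orderDate  TEXT
    );
    -- MySQL indexes the foreign key automatically; SQLite does not
    CREATE INDEX fk_customer ON orders(customerId);
""")
conn.executemany("INSERT INTO customers VALUES (?)",
                 [(i,) for i in range(1, 251)])
# Simplification: customers 1..100 get one order each
# (the original procedure picks 100 customers at random).
conn.executemany(
    "INSERT INTO orders (customerId, orderDate) VALUES (?, date('now', ?))",
    [(i, f"-{i} days") for i in range(1, 101)],
)

queries = {
    "left join": """
        select c.* from customers c
        left join orders o on c.customerId = o.customerId
        where o.customerId is null""",
    "not in": """
        select c.* from customers c
        where c.customerId not in (select distinct customerId from orders)""",
    "not exists": """
        select c.* from customers c
        where not exists
            (select 1 from orders o where o.customerId = c.customerId)""",
}

def measure(sql, repeat=200):
    """Run the query `repeat` times; return its sorted rows and elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeat):
        rows = conn.execute(sql).fetchall()
    return sorted(rows), time.perf_counter() - start

results = {}
for name, sql in queries.items():
    rows, seconds = measure(sql)
    results[name] = rows
    print(f"{name:10s} {len(rows)} rows, {seconds:.4f}s for 200 runs")
```

All three queries should return the same 150 customers here. To compare them on a real system, run the original statements against realistic data volumes and look at the actual plans.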

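Which specific database is this for? SQL is just the query language used by many databases. Which one are you using? MySQL? Postgres? Oracle? SQL Server? IBM DB2? Something else entirely? Please update your tags accordingly!

As for performance I don't know, but another way in SQL Server could be Select CustomerID from Customer c where CustomerID not in (Select Distinct CustomerID from SalesOrder), though I assume you would have to look at the query plans SQL Server creates for the two queries. You may also find more answers on Stack Exchange's database site.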

SO.OrderID = NULL won't do what you think it does. The correct syntax is SO.OrderID IS NULL.
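
To see the difference in isolation, here is a small sketch that assumes an in-memory SQLite database and three invented customers, one of whom has an order. It uses a LEFT OUTER JOIN so that only the NULL comparison varies between the two runs. A comparison with = NULL evaluates to unknown rather than true, so the first query returns nothing.

```python
import sqlite3

# SQLite is a stand-in here; the sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY);
    CREATE TABLE SalesOrder (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER,
        OrderDate  TEXT
    );
    INSERT INTO Customer VALUES (1), (2), (3);
    INSERT INTO SalesOrder (CustomerID, OrderDate) VALUES (1, '2024-01-01');
""")

# Same query twice; only the predicate on SO.OrderID changes.
template = """
    SELECT C.CustomerID
    FROM Customer C
    LEFT OUTER JOIN SalesOrder SO ON C.CustomerID = SO.CustomerID
    WHERE SO.OrderID {predicate}
    ORDER BY C.CustomerID
"""

equals_null = conn.execute(template.format(predicate="= NULL")).fetchall()
is_null = conn.execute(template.format(predicate="IS NULL")).fetchall()

print("= NULL :", equals_null)  # no rows
print("IS NULL:", is_null)      # customers 2 and 3
```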
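
Performance questions can't be answered without knowing the whole schema and testing against it. The answer I would look for in an interview is "blah blah blah, but I would need to run some performance tests against your schema and data to be sure" or "it depends". I believe that in your example, because you are doing a RIGHT join, you need the WHERE clause to say WHERE C.CustomerID IS NULL.

I can't say with 100% certainty, but I think the best performance would come from starting with the Customer table, doing a LEFT join to SalesOrder on CustomerID (since it is the primary key in the Customer table and a foreign key in the SalesOrder table), and finally adding a WHERE clause for SO.CustomerID IS NULL.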