Sql Server 2008: Using DISTINCT with UNION/UNION ALL


I stumbled across a slow-running piece of code that looks a lot like this:

SELECT
  res.[X],
  res.[Y],
  SUM(res.[Z]) -- This is SUM so I have to remove duplicates
FROM (
  SELECT DISTINCT a.[X], a.[Y], b.[Z] FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
  UNION
  SELECT DISTINCT a.[X], a.[Y], c.[Z] FROM [A] a JOIN [C] c ON a.[ID] = c.[ID]
  UNION
  SELECT DISTINCT a.[X], a.[Y], d.[Z] FROM [A] a JOIN [D] d ON a.[ID] = d.[ID]
  UNION ALL -- This set won't have duplicates, hence the UNION ALL in this case
  SELECT a.[X], a.[Y], n.[Z] FROM [A] a JOIN [N] n ON a.[ID] = n.[ID]
) res
GROUP BY res.[X], res.[Y]
The joins are much more complicated and there are 12 UNION/UNION ALLs, but you get the idea. Each result set typically contains 1 to 15 million rows.

I'd like to know how others would write this query. I've also read some posts warning people against:

SELECT DISTINCT * FROM [A]
UNION
SELECT DISTINCT * FROM [B]
because DISTINCT ends up being applied three times (in this small example). So I took a shot at removing it, and the result was actually much slower. I don't understand how removing the extra filtering could make the query run slower.
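Concretely, the stripped-down version would look something like this, relying on UNION itself to deduplicate the first three branches (UNION, unlike UNION ALL, removes duplicates across its inputs, so the inner DISTINCTs don't change the final result):

SELECT
  res.[X],
  res.[Y],
  SUM(res.[Z])
FROM (
  SELECT a.[X], a.[Y], b.[Z] FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
  UNION -- UNION alone already removes duplicates across these branches
  SELECT a.[X], a.[Y], c.[Z] FROM [A] a JOIN [C] c ON a.[ID] = c.[ID]
  UNION
  SELECT a.[X], a.[Y], d.[Z] FROM [A] a JOIN [D] d ON a.[ID] = d.[ID]
  UNION ALL -- this branch has no duplicates, so it is appended as-is
  SELECT a.[X], a.[Y], n.[Z] FROM [A] a JOIN [N] n ON a.[ID] = n.[ID]
) res
GROUP BY res.[X], res.[Y]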


Does anyone have any ideas? I'm digging through the query plan, but it's too big to post, so I'm just looking for suggestions. Thanks.

You could try using row numbers and selecting the minimum value for each X, Y, and Z:

SELECT
    MIN(MyFilter),
    X,
    Y,
    Z
FROM (
    SELECT
        a.[X],
        a.[Y],
        b.[Z],
        ROW_NUMBER() OVER (ORDER BY a.[X], a.[Y], b.[Z]) AS MyFilter
    FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
) src -- SQL Server requires an alias on the derived table
GROUP BY X, Y, Z

Can you give me more details about your tables?
I don't quite follow what you mean. What happens to my SUM?
That should filter out the duplicate rows; your SUM needs to go around the selected values.
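To make that last remark concrete, a minimal sketch of wrapping the SUM around the deduplicated selection, shown for a single branch only (the [A]-[B] join from the question; the aliases src and dedup are illustrative, and the remaining branches would need the same treatment):

SELECT
    dedup.[X],
    dedup.[Y],
    SUM(dedup.[Z]) -- the SUM now runs over already-deduplicated rows
FROM (
    -- one branch, deduplicated with the MIN/GROUP BY pattern above
    SELECT MIN(MyFilter) AS MyFilter, X, Y, Z
    FROM (
        SELECT
            a.[X],
            a.[Y],
            b.[Z],
            ROW_NUMBER() OVER (ORDER BY a.[X], a.[Y], b.[Z]) AS MyFilter
        FROM [A] a JOIN [B] b ON a.[ID] = b.[ID]
    ) src
    GROUP BY X, Y, Z
) dedup
GROUP BY dedup.[X], dedup.[Y]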