Get the items of groups after GROUP BY in SQL Server

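I have a table with the following columns:

UserID1, UserID2, ProductID, PurchaseDate
The following query runs against the Purchases table and returns pairs of users who had multiple interactions with each other (more than @threshold) during the past 31 days, regardless of the order in which the two users appear: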

DECLARE @threshold AS INT
DECLARE @days AS INT

SET @threshold = 10
SET @days = 31

SELECT 
    UserID1, UserID2, COUNT(*) AS Counter
FROM 
    (SELECT
        -- order the two IDs so that (Col1, Col2) and (Col2, Col1) count as the same pair
        CASE 
           WHEN UserID1 < UserID2 
              THEN UserID1 
              ELSE UserID2 
        END AS UserID1,
        CASE 
           WHEN UserID1 < UserID2 
              THEN UserID2 
              ELSE UserID1 
        END AS UserID2
    FROM
        Purchases WITH(NOLOCK)
    WHERE 
        Deadline BETWEEN DATEADD(day, -@days, GETDATE()) AND GETDATE()) t
GROUP BY 
    UserID1, UserID2
HAVING 
    COUNT(*) > @threshold
However, what I want is to return a table with the ProductID and PurchaseDate on separate rows, like this:

UserID1  UserID2  ProductID  PurchaseDate
1        2        12345      2017-01-18 00:13:52
1        2        5425       2017-01-12 15:10:02
1        2        64362      2017-01-05 10:10:02
..... for the 10 interactions
3        2        25235      2017-01-18 00:13:52
3        2        436346     2017-01-14 00:13:52
..... for the 5 interactions
4        1        23523      2017-01-14 00:13:52
4        1        135135     2017-01-09 00:13:52
..... for the 8 interactions
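Is there a way to do this without putting the result of the first query into a temp table and then joining it back to the Purchases table to find all the purchases?

Disclaimer: I have not tested the code; it was written outside a T-SQL IDE. The code below assumes that UserID1 != UserID2.

1. I suggest using a MAX/MIN solution to treat [Col2, Col1] the same way as [Col1, Col2]. It may perform better, and it copes with NULL values, since MAX and MIN skip them. You need SQL Server 2008 or later for this to work: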

SELECT
    -- the VALUES list needs its own closing parenthesis; Users replaces the reserved word User
    (SELECT MAX(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS UserID1,
    (SELECT MIN(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS UserID2
FROM
    Purchases
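2. Now we need to count the interactions between them, which should be easy. To keep the code clean, we can wrap the previous statement in a CTE; I have also added the Deadline filter here: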

;WITH CTE_UserInteractions AS (
    SELECT
        (SELECT MAX(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS FirstUser,
        (SELECT MIN(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS SecondUser
    FROM
        Purchases
    WHERE
        Deadline BETWEEN DATEADD(day,-@days,GETDATE()) AND GETDATE()
)

SELECT
    FirstUser,
    SecondUser
FROM
    CTE_UserInteractions
GROUP BY
    FirstUser, SecondUser
HAVING
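    COUNT(*) > @threshold
A note here: you may find that computing the lower date boundary in advance has a positive effect on performance. For example, before running the batch we can do the following: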

DECLARE @StartDate DATETIME = DATEADD(DAY,-@days,GETDATE())
We can then use @StartDate in the WHERE clause.
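As a minimal sketch of that suggestion, the query from step 2 could use precomputed boundaries as shown below. It assumes the Purchases table and the Deadline column from the question's query; @EndDate is a name introduced here for illustration only and is not part of the original answer. Whether this actually helps performance would have to be checked against the real execution plan.

```sql
DECLARE @threshold INT = 10;
DECLARE @days INT = 31;

-- Compute both boundaries once, before the query runs.
-- @EndDate is a hypothetical helper variable, not from the original answer.
DECLARE @EndDate DATETIME = GETDATE();
DECLARE @StartDate DATETIME = DATEADD(DAY, -@days, @EndDate);

WITH CTE_UserInteractions AS (
    SELECT
        -- larger ID first, smaller ID second, so both orders map to one pair
        (SELECT MAX(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS FirstUser,
        (SELECT MIN(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS SecondUser
    FROM
        Purchases
    WHERE
        -- the variables replace the inline DATEADD/GETDATE expressions
        Deadline BETWEEN @StartDate AND @EndDate
)
SELECT
    FirstUser,
    SecondUser
FROM
    CTE_UserInteractions
GROUP BY
    FirstUser, SecondUser
HAVING
    COUNT(*) > @threshold;
```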

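3. Finally, we can use CROSS APPLY to get the list of products and purchase dates for the resulting user pairs. If performance suffers, we can select a sub-list in the inner query instead, or pre-populate a temp table with the result of step 2: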

;WITH CTE_UserInteractions AS (
    SELECT
        (SELECT MAX(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS FirstUser,
        (SELECT MIN(usr) FROM (VALUES (UserID1), (UserID2)) AS Users(usr)) AS SecondUser
    FROM
        Purchases AS p1
    WHERE
        Deadline BETWEEN DATEADD(day,-@days,GETDATE()) AND GETDATE()
)

SELECT
    groupedUsers.FirstUser as UserID1,
    groupedUsers.SecondUser as UserID2,
    products.ProductID,
    products.PurchaseDate
FROM (
    SELECT
        FirstUser,
        SecondUser
    FROM
        CTE_UserInteractions
    GROUP BY
        FirstUser, SecondUser
    HAVING
        COUNT(*) > @threshold
) groupedUsers
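CROSS APPLY (
    -- purchases stored as (FirstUser, SecondUser)
    SELECT
        ProductID, PurchaseDate
    FROM
        Purchases AS p1
    WHERE
        p1.UserID1 = groupedUsers.FirstUser AND p1.UserID2 = groupedUsers.SecondUser
        -- assumption: repeat the date filter so that only the counted purchases are listed
        AND p1.Deadline BETWEEN DATEADD(day,-@days,GETDATE()) AND GETDATE()
    UNION ALL
    -- purchases stored in the opposite order, (SecondUser, FirstUser)
    SELECT
        ProductID, PurchaseDate
    FROM
        Purchases AS p2
    WHERE
        p2.UserID2 = groupedUsers.FirstUser AND p2.UserID1 = groupedUsers.SecondUser
        AND p2.Deadline BETWEEN DATEADD(day,-@days,GETDATE()) AND GETDATE()
) products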

If I understood correctly, a simple windowed COUNT would help here.

The optimizer should be smart enough to do this with a single scan of the table.

DECLARE @threshold AS INT;
DECLARE @days AS INT;

SET @threshold = 10;
SET @days = 31;

WITH
CTE_Purchases
AS
(
    SELECT
        -- order the two IDs so that (Col1, Col2) and (Col2, Col1) count as the same pair
        CASE 
            WHEN UserID1 < UserID2 
            THEN UserID1 
            ELSE UserID2 
        END AS UserID1
        ,CASE 
            WHEN UserID1 < UserID2 
            THEN UserID2 
            ELSE UserID1 
        END AS UserID2
        ,ProductID
        ,PurchaseDate
    FROM
        Purchases
    WHERE 
        Deadline BETWEEN DATEADD(day, -@days, GETDATE()) AND GETDATE()
)
,CTE_Counts
AS
(
    SELECT
        UserID1
        ,UserID2
        ,ProductID
        ,PurchaseDate
        ,COUNT(*) OVER (PARTITION BY UserID1, UserID2) AS Counter
        -- calc COUNT for groups without explicit GROUP BY
    FROM CTE_Purchases
)
SELECT
    UserID1
    ,UserID2
    ,ProductID
    ,PurchaseDate
    ,Counter
FROM CTE_Counts
WHERE Counter > @threshold
-- this filter is instead of your HAVING
;
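
To try the windowed COUNT without touching real data, a small self-contained sketch can help. Everything in it is made up for illustration: the table variable @Purchases, its rows, and the lowered threshold of 2. The date filter is left out because the sample dates are fixed.

```sql
-- Hypothetical sample data; not taken from the question.
DECLARE @threshold INT = 2;

DECLARE @Purchases TABLE (
    UserID1      INT      NOT NULL,
    UserID2      INT      NOT NULL,
    ProductID    INT      NOT NULL,
    PurchaseDate DATETIME NOT NULL
);

INSERT INTO @Purchases (UserID1, UserID2, ProductID, PurchaseDate)
VALUES
    (1, 2, 101, '2017-01-18T00:13:52'),
    (2, 1, 102, '2017-01-12T15:10:02'),  -- same pair, stored in reverse order
    (1, 2, 103, '2017-01-05T10:10:02'),
    (3, 2, 201, '2017-01-18T00:13:52'),
    (2, 3, 202, '2017-01-14T00:13:52');

WITH CTE_Purchases AS (
    -- put the smaller ID first so that both orders form one pair
    SELECT
        CASE WHEN UserID1 < UserID2 THEN UserID1 ELSE UserID2 END AS UserID1,
        CASE WHEN UserID1 < UserID2 THEN UserID2 ELSE UserID1 END AS UserID2,
        ProductID,
        PurchaseDate
    FROM @Purchases
),
CTE_Counts AS (
    -- every row keeps its own columns and also carries the size of its pair
    SELECT
        UserID1,
        UserID2,
        ProductID,
        PurchaseDate,
        COUNT(*) OVER (PARTITION BY UserID1, UserID2) AS Counter
    FROM CTE_Purchases
)
SELECT
    UserID1, UserID2, ProductID, PurchaseDate, Counter
FROM CTE_Counts
WHERE Counter > @threshold
ORDER BY UserID1, UserID2, PurchaseDate DESC;
```

With this sample data, only the three rows of the pair (1, 2) are returned, each with Counter = 3. The pair (2, 3) has two rows, which does not exceed the threshold, so it is filtered out.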

Can you post your source table structure, some sample data, and then the complete desired output? It looks as if you are halfway through a solution and are asking us to finish it, when a completely different solution might be more appropriate.

Wow, it works. Honestly, I would not have thought of it. It seems I still have not dug deep enough into CTEs.