SQL: DISTINCT on a column and sum over the past months in Postgres
I have a table similar to this example:
purchase_datetime    customer_id  value  purchase_id
2013-01-08 17:13:29  45236        92     2526
2013-01-03 15:42:35  45236        16     2565
2013-01-03 15:42:35  45236        16     2565
2013-03-08 09:04:52  45236        636    2563
2013-12-08 12:12:24  45236        23     2505
2013-12-08 12:12:24  45236        23     2505
2013-12-08 12:12:24  45236        23     2505
2013-12-08 12:12:24  45236        23     2505
2013-07-08 22:35:53  35536        73     2576
2013-07-08 09:52:03  35536        4      5526
2013-10-08 16:23:29  52626        20     2226
...
2013-04-08 17:49:31  52626        27     4526
2013-12-09 20:40:53  52626        27     4626
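Now I need to find the total amount (value) each customer spent over the past months, counting each purchase (purchase_id) only once. The problem is that some purchase_ids appear more than once, so I need a DISTINCT on purchase_id.
This is what I have so far, without DISTINCT. I don't know how to approach the DISTINCT part: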
SELECT customer_id,
    -- aliases that start with a digit must be double-quoted
    sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 30) then value else 0 end) as "1month",
    sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 90) then value else 0 end) as "3month",
    sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 180) then value else 0 end) as "6month",
    sum(case when (date '2017-01-01' - purchase_datetime::DATE <= 360) then value else 0 end) as "12month"
FROM table_data
GROUP BY customer_id
ORDER BY "1month" DESC;
You can select from a subquery and use DISTINCT (or GROUP BY) inside that subquery. For example:
Test data:
create table table_data (purchase_datetime timestamp(0),customer_id int,"value" int,purchase_id int);
insert into table_data (purchase_datetime,customer_id,"value",purchase_id) values
(current_timestamp - interval '11 month',45236,92,2526),
(current_timestamp - interval '11 month',45236,16,2565),
(current_timestamp - interval '1 month',45236,16,2565),
(current_timestamp - interval '2 month',45236,636,2563),
(current_timestamp - interval '5 month',45236,23,2505),
(current_timestamp - interval '5 month',45236,23,2505),
(current_timestamp - interval '5 month',45236,23,2505),
(current_timestamp - interval '3 month',35536,73,2576),
(current_timestamp - interval '2 month',35536,4,5526),
(current_timestamp - interval '1 month',52626,20,2226),
(current_timestamp - interval '6 month',52626,27,4526),
(current_timestamp - interval '6 month',52626,27,4626);
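Before choosing between a plain DISTINCT and DISTINCT ON, it can help to see which purchase_ids are actually repeated and whether their rows are identical. The query below is a minimal sketch against the table_data test table above; the column aliases row_count and distinct_versions are names made up for this illustration.

```sql
-- Sketch: list purchase_ids that occur more than once.
-- row_count         = how many rows share the purchase_id
-- distinct_versions = how many different (customer_id, purchase_datetime, "value")
--                     combinations those rows contain
SELECT
    purchase_id,
    count(*) AS row_count,
    count(DISTINCT (customer_id, purchase_datetime, "value")) AS distinct_versions
FROM table_data
GROUP BY purchase_id
HAVING count(*) > 1
ORDER BY purchase_id;
```

A purchase_id with distinct_versions greater than 1 has rows that differ from each other, so a DISTINCT over all columns would still keep more than one row for it. In the test data this is the case for purchase_id 2565, which is inserted with two different timestamps.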
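Comments:
I don't see why you need DISTINCT here. Your GROUP BY will not return any duplicate rows.
@jarlh What if I remove the GROUP BY? I'm not sure, but when I do that I get a huge number, because several rows contain identical data, such as rows five through nine.
With GROUP BY you get one row per customer_id. Isn't that what you want?
@jarlh I want one row per purchase_id. For example, rows five through eight are 4 rows with identical data, and I need 3 of them to go away, removed from the data.
The desired output doesn't show a sum. It is simply select distinct * from table_data. Is that what you want?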
SELECT
    customer_id,
    sum(case when purchase_datetime::DATE between current_date - interval '1 month' and current_date then "value" else 0 end) as "1month",
    sum(case when purchase_datetime::DATE between current_date - interval '3 month' and current_date then "value" else 0 end) as "3month",
    sum(case when purchase_datetime::DATE between current_date - interval '6 month' and current_date then "value" else 0 end) as "6month",
    sum(case when purchase_datetime::DATE between current_date - interval '1 year' and current_date then "value" else 0 end) as "12month"
FROM (
    select
        distinct purchase_id, customer_id, purchase_datetime, "value"
        -- distinct on (purchase_id) customer_id, purchase_datetime, "value"
        -- Note: with this type of distinct you assume that for each purchase_id there is only 1 combination of the 3 other field values.
    from table_data
) p
GROUP BY customer_id
ORDER BY "1month" DESC;
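The answer mentions GROUP BY as an alternative to DISTINCT in the subquery but only shows the DISTINCT form. A possible GROUP BY variant is sketched below, assuming the same table_data test table. It also swaps the CASE expressions for the aggregate FILTER clause (available from PostgreSQL 9.4), which is a substitution made here for brevity and not part of the original answer.

```sql
-- Sketch: deduplicate with GROUP BY, then sum per time window with FILTER.
-- coalesce() turns the NULL that sum() returns for an empty window into 0,
-- so the result matches the "else 0" branches of the CASE version.
SELECT
    customer_id,
    coalesce(sum("value") FILTER (WHERE purchase_datetime::date >= current_date - interval '1 month'), 0) AS "1month",
    coalesce(sum("value") FILTER (WHERE purchase_datetime::date >= current_date - interval '3 month'), 0) AS "3month",
    coalesce(sum("value") FILTER (WHERE purchase_datetime::date >= current_date - interval '6 month'), 0) AS "6month",
    coalesce(sum("value") FILTER (WHERE purchase_datetime::date >= current_date - interval '1 year'), 0) AS "12month"
FROM (
    -- grouping by every selected column collapses fully identical rows,
    -- which is the same effect as SELECT DISTINCT on those columns
    SELECT purchase_id, customer_id, purchase_datetime, "value"
    FROM table_data
    GROUP BY purchase_id, customer_id, purchase_datetime, "value"
) p
-- rows dated in the future are dropped up front; a customer with only
-- such rows would not appear in the result at all
WHERE purchase_datetime::date <= current_date
GROUP BY customer_id
ORDER BY "1month" DESC;
```

Both forms remove only rows that are identical in all four columns. If the same purchase_id can appear with different values or timestamps, DISTINCT ON (purchase_id) with an explicit ORDER BY is the tool to pick a single row per purchase.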
select customer_id, sum(value)
from (
    -- keep one row per purchase_id
    select distinct on (purchase_id) *
    from table_data
) s
where purchase_datetime >= '2017-07-01'
group by 1
;
 customer_id | sum
-------------+-----
       35536 |  77
       52626 |  20
       45236 |  23
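One caveat with DISTINCT ON: without an ORDER BY, PostgreSQL does not guarantee which of the rows sharing a purchase_id is kept. When those rows differ, as with purchase_id 2565 in the test data, the date filter can give different results depending on which row survives. The sketch below makes the choice explicit. Keeping the most recent row per purchase and using a six-month window are assumptions made for illustration, not something the question specifies.

```sql
-- Sketch: deterministic DISTINCT ON.
-- The ORDER BY must start with the DISTINCT ON expression; the remaining
-- sort keys decide which row is kept for each purchase_id
-- (here: the one with the latest purchase_datetime).
SELECT customer_id, sum("value") AS total
FROM (
    SELECT DISTINCT ON (purchase_id) *
    FROM table_data
    ORDER BY purchase_id, purchase_datetime DESC
) s
WHERE purchase_datetime >= current_date - interval '6 month'
GROUP BY customer_id;
```

Sorting with purchase_datetime ASC instead would keep the earliest row for each purchase.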