How to get the average number of days between multiple dates in the same column in MySQL


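I want to find the average number of days between orders in my database, grouped by account id.

Suppose I have the following table named "orders" with this data:

I want a query that produces the following result:

I looked at the following questions for ideas, but still could not solve the problem:


Thanks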

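You need a query that, for each order, computes the difference in days from the previous purchase (null when there is no previous purchase) and then averages those values.

I would self-join the table above so that a subquery finds, for each order, the latest earlier order date of the same account. The avg function then averages the difference between that date and the current order date. Note that the join condition o1.id > o2.id assumes ids increase with order date.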

SELECT o3.account_id, o3.account_name, avg(diff) as average_days_between_orders
FROM
    (select o1.id,
            o1.account_id,
            o1.account_name,
            datediff(o1.order_date, max(o2.order_date)) as diff
     from orders o1
     left join orders o2 on o1.account_id=o2.account_id and o1.id>o2.id
     group by o1.id, o1.account_id, o1.account_name, o1.order_date) o3
GROUP BY o3.account_id, o3.account_name
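
As an alternative to the join, you can compute the difference with user-defined variables in a subquery, or with a correlated subquery in the select list. MySQL running-total solutions use the same techniques and are a good place to see how this works.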


If your orders table is large, the alternatives described in those running-total solutions may perform better.
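As a cross-check of the self-join query in this answer, here is a minimal sketch that runs the same logic against an in-memory SQLite database from Python. SQLite is swapped in for MySQL, so datediff is replaced by a julianday subtraction. The order ids are hypothetical, the dates for accounts 342 and 555 are inferred from the start/end table in the other answer, and the Plastic Inc. date is a made-up placeholder.

```python
import sqlite3

# Hypothetical sample rows: (id, account_id, account_name, order_date).
# Ids increase with order_date, which the o1.id > o2.id join relies on.
rows = [
    (1, 342, "Kent Brewery", "2015-09-12"),
    (2, 342, "Kent Brewery", "2015-10-12"),
    (3, 342, "Kent Brewery", "2015-11-12"),
    (4, 342, "Kent Brewery", "2015-12-12"),
    (5, 555, "Acme Fireworks", "2015-06-15"),
    (6, 555, "Acme Fireworks", "2015-09-15"),
    (7, 555, "Acme Fireworks", "2015-12-15"),
    (8, 900, "Plastic Inc.", "2015-01-01"),  # placeholder date, single order
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table orders (id integer primary key, account_id integer, "
    "account_name text, order_date text)"
)
conn.executemany("insert into orders values (?, ?, ?, ?)", rows)

# Same shape as the MySQL query; julianday(a) - julianday(b) stands in for
# datediff(a, b). The first order of each account has no earlier order, so
# its diff is NULL and avg() skips it.
query = """
select o3.account_id, o3.account_name, avg(diff) as average_days_between_orders
from (select o1.id, o1.account_id, o1.account_name,
             julianday(o1.order_date) - julianday(max(o2.order_date)) as diff
      from orders o1
      left join orders o2
        on o1.account_id = o2.account_id and o1.id > o2.id
      group by o1.id, o1.account_id, o1.account_name, o1.order_date) o3
group by o3.account_id, o3.account_name
order by o3.account_id
"""
result = conn.execute(query).fetchall()
for account_id, account_name, average_days in result:
    print(account_id, account_name, average_days)
```

With these rows, the single-order account comes back with a NULL average instead of 0, which is the behavior the answer describes.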

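Note: please test carefully and use at your own discretion. I could not find a simple query, and I cannot guarantee this works in all cases. If you only need the answer, the full query is shown at the end.

The goal is to build a table in which each row holds a start date and an end date, and then simply average the difference between the two dates. Something like this: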

id | account_id | account_name    | start_date | end_date
------------------------------------------------------------
1  | 342        | Kent Brewery    | 2015-09-12 | 2015-10-12
2  | 342        | Kent Brewery    | 2015-10-12 | 2015-11-12
3  | 342        | Kent Brewery    | 2015-11-12 | 2015-12-12
4  | 555        | Acme Fireworks  | 2015-06-15 | 2015-09-15
5  | 555        | Acme Fireworks  | 2015-09-15 | 2015-12-15
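
The pairing shown in the table above can be sketched in plain Python: within each account, every order date except the last is a start date, every order date except the first is an end date, and the two lists are matched by position. This is only an illustration of the idea, not the answer's SQL. The order dates are inferred from the table above, the Plastic Inc. date is a hypothetical placeholder, and start_end_pairs is a name introduced here.

```python
from datetime import date

# Order dates per (account_id, account_name); inferred from the table above.
orders = {
    (342, "Kent Brewery"): [
        date(2015, 9, 12), date(2015, 10, 12),
        date(2015, 11, 12), date(2015, 12, 12),
    ],
    (555, "Acme Fireworks"): [
        date(2015, 6, 15), date(2015, 9, 15), date(2015, 12, 15),
    ],
    (900, "Plastic Inc."): [date(2015, 1, 1)],  # placeholder, single order
}


def start_end_pairs(orders):
    """Return rows of (id, account_id, account_name, start_date, end_date)."""
    rows = []
    for (account_id, account_name), dates in sorted(orders.items()):
        uniq = sorted(set(dates))   # distinct dates, oldest first
        starts = uniq[:-1]          # dates that have a later order
        ends = uniq[1:]             # dates that have an earlier order
        for start, end in zip(starts, ends):
            rows.append((len(rows) + 1, account_id, account_name, start, end))
    return rows


pairs = start_end_pairs(orders)
for row in pairs:
    print(*row, sep=" | ")
```

An account with a single order produces no pairs at all, which is why it drops out of the averaged result unless it is joined back in afterwards.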

I will create a few temporary tables to make this clearer. The first query collects the start dates:

QUERY:

create temporary table uniq_start_dates
select (@sid := @sid + 1) id, tmp_uniq_start_dates.*
    from
        (select distinct o1.account_id, o1.account_name, o1.order_date start_date 
         from orders o1 
         join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date 
         order by o1.account_id, o1.order_date) tmp_uniq_start_dates 
         join (select @sid := 0) AS sid_generator

OUTPUT: temporary table - uniq_start_dates

id | account_id | account_name    | start_date
-----------------------------------------------
1  | 342        | Kent Brewery    | 2015-09-12 
2  | 342        | Kent Brewery    | 2015-10-12 
3  | 342        | Kent Brewery    | 2015-11-12 
4  | 555        | Acme Fireworks  | 2015-06-15 
5  | 555        | Acme Fireworks  | 2015-09-15 
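
The end dates are collected into uniq_end_dates in the same way, and the two tables are joined on id into uniq_start_end_dates. Now comes the easy part: just aggregate and take the average date difference.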

QUERY: 

select account_id, account_name, avg(timestampdiff(day, start_date, end_date)) average_days 
from uniq_start_end_dates
group by account_id, account_name

OUTPUT:

account_id | account_name   | average_days
--------------------------------------------
342        | Kent Brewery   | 30.3333
555        | Acme Fireworks | 91.5000

As you may have noticed, Plastic Inc. is not in the result. If you also want the accounts whose average is null, use this:

QUERY:

select all_accounts.account_id, all_accounts.account_name, accounts_with_average_days.average_days
from 
    (select distinct account_id, account_name from orders) all_accounts
left join
    (select account_id, account_name, avg(timestampdiff(day, start_date, end_date)) average_days
    from uniq_start_end_dates
    group by account_id, account_name) accounts_with_average_days
using (account_id, account_name)

OUTPUT:

account_id | account_name   | average_days
--------------------------------------------
342        | Kent Brewery   | 30.3333
555        | Acme Fireworks | 91.5000
900        | Plastic Inc.   | null

Here is the full, messy query:

select all_accounts.account_id, all_accounts.account_name, accounts_with_average_days.average_days
from 
    (select distinct account_id, account_name from orders) all_accounts
left join
    (select uniq_start_dates.account_id, uniq_start_dates.account_name, avg(timestampdiff(day, start_date, end_date)) average_days 
    from
        (select (@sid := @sid + 1) id, tmp_uniq_start_dates.*
        from
            (select distinct o1.account_id, o1.account_name, o1.order_date start_date from orders o1
            join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date order by o1.account_id, o1.order_date) tmp_uniq_start_dates join (select @sid := 0) AS sid_generator
            ) uniq_start_dates 
        join
        (select (@eid := @eid + 1) id, tmp_uniq_end_dates.*
        from
            (select distinct o2.account_id, o2.account_name, o2.order_date end_date from orders o1
            join orders o2 on o1.account_id=o2.account_id and o1.order_date < o2.order_date order by o2.account_id, o2.order_date) tmp_uniq_end_dates join (select @eid := 0) AS eid_generator
            ) uniq_end_dates 
        using (id)
    group by uniq_start_dates.account_id, uniq_start_dates.account_name) accounts_with_average_days
using (account_id, account_name)

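91.5 belongs to Acme Fireworks and 30.333 belongs to Kent Brewery; that confused me.

This seems too complicated. If you are introducing user variables anyway, you can compute the difference between the current record's order date and the previous order date in a single pass instead.

Close. When I run that query I get: Kent Brewery: 22.75, Acme Fireworks: 61, Plastic Inc.: 0. I ran a group_concat on o3.diff and it shows: Kent Brewery: 30,30,0,31, Acme Fireworks: 0,91,92, Plastic Inc.: 0. Somehow an extra difference of 0 is thrown into the mix, which skews the average.

In that case, remove the null-to-0 conversion in the subquery. For customers with only one order, avg will then return null. See the updated answer.

Works perfectly! The only change I had to make was that the MySQL function is datediff, not date_diff.
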
QUERY (the intermediate step that joins the start dates and end dates):

create temporary table uniq_start_end_dates
select uniq_start_dates.*, uniq_end_dates.end_date
from uniq_start_dates
join uniq_end_dates using (id)

OUTPUT: temporary table - uniq_start_end_dates

(the same as the first table above, with start_date and end_date columns)