Firebase: how to fill in missing dates in Google BigQuery
I want to build a chart that shows the active users in Firebase. I wrote this query:
SELECT event_date, COUNT(DISTINCT user_pseudo_id) AS user_count
FROM `mark-3314e.analytics_197261162.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
AND event_name = 'session_start'
GROUP BY event_date
ORDER BY event_date ASC
This is the result:
Row event_date user_count
1 20190617 1
2 20190621 3
Is there a way to fill the missing dates between the 17th and the 21st with the previous data? For example:
event_date user_count
20190617 1
20190618 1
20190619 1
20190620 1
20190621 3
You can left join your events onto a calendar table that contains the full date range:
WITH dates AS (
SELECT '20190617' AS dt UNION ALL
SELECT '20190618' UNION ALL
SELECT '20190619' UNION ALL
SELECT '20190620' UNION ALL
SELECT '20190621'
)
SELECT
t1.dt AS event_date,
COUNT(DISTINCT t2.user_pseudo_id) AS user_count
FROM dates t1
LEFT JOIN `mark-3314e.analytics_197261162.events_*` t2
ON t1.dt = t2.event_date AND
t2._TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
AND t2.event_name = 'session_start'
GROUP BY
t1.dt
ORDER BY
t1.dt;
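There are more general ways to generate a date range in BigQuery than a hand-written list of dates.
Here is a possible solution that uses BigQuery's GENERATE_DATE_ARRAY function: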
with data as (
SELECT parse_date('%Y%m%d', event_date) AS event_date, COUNT(DISTINCT user_pseudo_id) AS user_count
FROM `mark-3314e.analytics_197261162.events_*`
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
AND event_name = 'session_start'
GROUP BY event_date
ORDER BY event_date ASC
)
select dt as event_date, user_count
from (
  select
    user_count,
    -- Each observed date covers itself and every day up to the day before
    -- the next observed date; the last row (nextdate is null) covers only itself.
    generate_date_array(
      event_date,
      ifnull(date_sub(nextdate, interval 1 day), event_date),
      interval 1 day
    ) as dates
  from (
    select
      event_date,
      lead(event_date) over(order by event_date) as nextdate,
      user_count
    from data
  )
), unnest(dates) dt
order by event_date
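Comments:
I had to change t2 to t1 in the GROUP BY (t1.dt instead of t2.dt) because it did not work. After that, it still shows me the same results: (20190617 1) and (20190621 3).
@AymenFezai Try moving all the logic from the WHERE clause into the ON clause of the join. That was my mistake; I should have written it that way from the start.
It now fills the missing dates with zeros, but it should fill them with the previous result. I mean something like this: (20190617 1), (20190618 1), (20190619 1) and (20190621 3).
That requirement did not appear in your original question.
Yes, sorry.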
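As the comments point out, joining against a calendar leaves the missing days at zero rather than repeating the previous count. Below is a minimal sketch of one way to carry the last known value forward. It generates the calendar with GENERATE_DATE_ARRAY and fills the gaps with LAST_VALUE(... IGNORE NULLS). The `daily` CTE is a hypothetical stand-in holding the two rows from the question, so the query can be run on its own. In practice it would be replaced by the aggregated `events_*` query, with `event_date` parsed to a DATE.

```sql
-- Hypothetical stand-in for the aggregated Firebase query (not from the original post);
-- replace it with: SELECT PARSE_DATE('%Y%m%d', event_date) AS event_date, COUNT(DISTINCT ...) ...
WITH daily AS (
  SELECT DATE '2019-06-17' AS event_date, 1 AS user_count UNION ALL
  SELECT DATE '2019-06-21', 3
),
-- One row per calendar day between the first and the last observed date.
calendar AS (
  SELECT dt
  FROM UNNEST(GENERATE_DATE_ARRAY(
    (SELECT MIN(event_date) FROM daily),
    (SELECT MAX(event_date) FROM daily)
  )) AS dt
)
SELECT
  FORMAT_DATE('%Y%m%d', c.dt) AS event_date,
  -- Days without data get a NULL user_count from the LEFT JOIN;
  -- IGNORE NULLS makes LAST_VALUE return the most recent non-NULL count.
  LAST_VALUE(d.user_count IGNORE NULLS) OVER (
    ORDER BY c.dt
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS user_count
FROM calendar AS c
LEFT JOIN daily AS d
  ON d.event_date = c.dt
ORDER BY c.dt;
```

With the two sample rows this returns 20190617 through 20190620 with a count of 1 and 20190621 with a count of 3, which matches the table requested in the question. To extend the fill to the edges of the reporting window, the MIN/MAX bounds could be swapped for fixed dates such as DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) and CURRENT_DATE(). Days before the first observation would then stay NULL, since there is no earlier value to carry forward.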