Google BigQuery: How do I build a closed funnel of user steps in BigQuery?


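Please help me with a BigQuery query. I need to build a closed funnel of user step events in a mobile app, covering a period of one week.

The table looks like this:

Within that period I need to collect all unique users who went from step 1 to step 2 and so on, in order, up to step 6. Between these steps they may do other things and be distracted by other events. What matters is that each unique user passes through these steps within the given period.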


Could you tell me how to create such a funnel?

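There are several ways to do this. Below is one approach using the same kind of sample data. It is not the most efficient, but it is self-explanatory and explicit: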

with data as (
   select 'a' as user_id, cast('2020-01-01 04:45:00' as timestamp) as event_timestamp, '1' as step_name
   union all
   select 'b' as user_id, cast('2020-01-01 04:50:00' as timestamp) as event_timestamp, '1' as step_name
   union all
   select 'a' as user_id, cast('2020-01-01 05:00:00' as timestamp) as event_timestamp, '2' as step_name
   union all
   select 'a' as user_id, cast('2020-01-01 05:15:00' as timestamp) as event_timestamp, '3' as step_name
   union all
   select 'b' as user_id, cast('2020-01-01 04:55:00' as timestamp) as event_timestamp, '2' as step_name
   union all
   select 'c' as user_id, cast('2020-01-01 04:58:00' as timestamp) as event_timestamp, '1' as step_name
   union all
   select 'a' as user_id, cast('2020-01-01 05:16:00' as timestamp) as event_timestamp, '4' as step_name
   union all
   select 'b' as user_id, cast('2020-01-01 05:16:00' as timestamp) as event_timestamp, '3' as step_name
),
data2 as (
   -- start from every step-1 event, then left join each later step that the same
   -- user performed strictly after the previous one; unreached steps stay null
   select
      a.user_id,
      a.step_name as step_1,
      b.step_name as step_2,
      c.step_name as step_3,
      d.step_name as step_4
   from (
      select user_id, event_timestamp, step_name from data where step_name = '1'
   ) a
   left join data b on (a.user_id = b.user_id and a.event_timestamp < b.event_timestamp and b.step_name = '2')
   left join data c on (b.user_id = c.user_id and b.event_timestamp < c.event_timestamp and c.step_name = '3')
   left join data d on (c.user_id = d.user_id and c.event_timestamp < d.event_timestamp and d.step_name = '4')
)

-- one row per funnel step; no "group by" on the constant label, so a step that
-- nobody reached is still reported with n_users = 0 instead of being dropped
select * from (

   select 'step_1' as event_name, count(distinct user_id) as n_users from data2 where step_1 is not null
   union all
   select 'step_2' as event_name, count(distinct user_id) as n_users from data2 where (step_1 is not null and step_2 is not null)
   union all
   select 'step_3' as event_name, count(distinct user_id) as n_users from data2 where (step_1 is not null and step_2 is not null and step_3 is not null)
   union all
   select 'step_4' as event_name, count(distinct user_id) as n_users from data2 where (step_1 is not null and step_2 is not null and step_3 is not null and step_4 is not null)
)
order by 1

You can refine this further with specific filters, conditions, and so on.
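One such refinement is restricting the funnel to a one-week window. The sketch below is only an illustration under stated assumptions: the window start (2020-01-01), the trimmed-down two-step sample rows, and the CTE names in_window and per_user are all made up for the example and are not part of the answer above.

```sql
with data as (
   select 'a' as user_id, timestamp '2020-01-01 04:45:00' as event_timestamp, '1' as step_name
   union all select 'a', timestamp '2020-01-01 05:00:00', '2'
   union all select 'b', timestamp '2020-01-01 04:50:00', '1'
   union all select 'b', timestamp '2020-01-09 04:55:00', '2'  -- falls outside the week
),
in_window as (
   -- keep only events inside the (assumed) one-week window
   select *
   from data
   where event_timestamp >= timestamp '2020-01-01 00:00:00'
     and event_timestamp < timestamp_add(timestamp '2020-01-01 00:00:00', interval 7 day)
),
per_user as (
   -- one row per user who did step 1, flagged if a later step 2 exists
   select
      a.user_id,
      countif(b.user_id is not null) > 0 as reached_2
   from in_window a
   left join in_window b
      on a.user_id = b.user_id
     and b.step_name = '2'
     and a.event_timestamp < b.event_timestamp
   where a.step_name = '1'
   group by a.user_id
)
select
   count(*) as step_1_users,
   countif(reached_2) as step_2_users
from per_user
```

With these sample rows, user b's step 2 lies outside the window, so the query should report two users at step 1 and one at step 2. Collapsing to one row per user before counting also avoids the row multiplication that chained left joins can produce when a user repeats a step.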

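So, to rephrase: you need to extract from the data the users who went through the event_name values 'step 1', 'step 2', 'step 3', ... 'step 6' within one week? In your flow, if a user reaches step 6, does that mean he must have completed all the previous steps? Is it impossible to jump directly from 1 to 6? In that case the problem becomes filtering the users who reached step 6 less than a week after passing step 1.

Yes, I want all unique users who went through this path within a week, counted at each step, so that I get a table like step1 - 10000, step2 - 500, step3 - 20, and so on. A user has to complete all the previous steps. He cannot go directly to step 6 or step 4.

Thank you very much! It is just that I do not have an exact time for each step, so I cannot pin it down. I have a one-week period. Within that week users complete these steps in order, and the timing can be arbitrary for each user. How do I then extract three funnel steps from these?

s1 as select user_pseudo_id, user_id, event_timestamp, event_name from app.events where event_name = 'addToCart' and _TABLE_SUFFIX between '20210519' and '20210519',
s2 as select user_pseudo_id, user_id, event_timestamp, event_name from app.events where event_name = 'begin_1' and _TABLE_SUFFIX between '20210519' and '20210519',
s3 as select user_pseudo_id, user_id, event_timestamp, event_name from app.events where event_name = 'begin_2' and _TABLE_SUFFIX between '20210519' and '20210519',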
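For the three steps named in the last comment (addToCart, begin_1, begin_2), one possible shape of the query is sketched below. It rests on several assumptions that the thread does not confirm: that the events live in date-sharded tables reachable as `app.events_*` (a wildcard is required for _TABLE_SUFFIX to work), that users are identified by user_pseudo_id, and that the week runs from 20210517 to 20210523, which is a placeholder range chosen for illustration.

```sql
with events as (
   -- assumed: date-sharded export tables; the suffix range is a placeholder week
   select user_pseudo_id, event_timestamp, event_name
   from `app.events_*`
   where _TABLE_SUFFIX between '20210517' and '20210523'
     and event_name in ('addToCart', 'begin_1', 'begin_2')
),
s1 as (
   -- earliest step-1 event per user
   select user_pseudo_id, min(event_timestamp) as t1
   from events
   where event_name = 'addToCart'
   group by user_pseudo_id
),
s2 as (
   -- earliest step-2 event that happens strictly after the user's step 1
   select e.user_pseudo_id, min(e.event_timestamp) as t2
   from events e
   join s1 on e.user_pseudo_id = s1.user_pseudo_id
   where e.event_name = 'begin_1' and e.event_timestamp > s1.t1
   group by e.user_pseudo_id
),
s3 as (
   -- earliest step-3 event that happens strictly after that step 2
   select e.user_pseudo_id, min(e.event_timestamp) as t3
   from events e
   join s2 on e.user_pseudo_id = s2.user_pseudo_id
   where e.event_name = 'begin_2' and e.event_timestamp > s2.t2
   group by e.user_pseudo_id
)
select 'step_1' as event_name, count(*) as n_users from s1
union all
select 'step_2', count(*) from s2
union all
select 'step_3', count(*) from s3
order by event_name
```

Each CTE holds at most one row per user, so count(*) already counts unique users. Taking the earliest qualifying event at every step is enough to decide whether an ordered path exists, and other events between the steps are simply ignored, which matches the closed-funnel requirement described in the question.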