SQL: grouping rows into pairs

sql snowflake-cloud-data-platform

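I am trying to add some kind of unique identifier (uid) to pairs of rows within a partition. In other words, within each window partition on (identifier1, identifier2), every two rows should get the same uid/tag.

For example, the first two rows for ID X get uid A, the next two rows for the same ID get uid B, and if only a single row is left in the partition for ID X, it gets uid C.

Here is what I am trying to achieve. The picture shows the table structure, and I added the expectedIdentifier column by hand to illustrate the goal:

This is my current SQL. ntile does not solve it, because the partitions have different sizes: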

select
rowId
, ntile(2) over (partition by firstIdentifier, secondIdentifier order by timestamp asc) as ntile
, *
from log;

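I have already tried ntile((count(*) over partition…/2), but that does not work.

The UID itself can be generated with md5() or something similar. What I am stuck on is tagging the rows as shown above, so that the tag can then be fed into md5 to produce the UID.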

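Although count(*) is not supported in Snowflake window functions, count(1) is supported and can be used to create a unique identifier. Below is an example that produces an integer ID shared by each pair of rows and that also handles groups with an "odd" number of rows: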

select 
ntile(2) over (partition by firstIdentifier, secondIdentifier order by timestamp asc) as ntile
,ceil(count(1) over( partition by firstIdentifier, secondIdentifier order by timestamp asc) / 2) as id
, *
from log;
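
One caveat with the running count above: as far as I understand Snowflake's defaults, a window with ORDER BY and no explicit frame uses a range-based frame, so rows with identical timestamps are peers and all receive the same count. The comments below mention switching to row_number() for that reason. What follows is a minimal sketch of that variant, not the answerer's exact query. It assumes the log table and the rowId column from the question, and it assumes rowId is unique so it can serve as a tie-breaker. The names rn, pairNo and pairUid are hypothetical aliases, and the md5 input format is only an illustration.

```sql
-- Sketch: number the rows inside each (firstIdentifier, secondIdentifier) partition,
-- then map row numbers 1,2 -> pair 1; 3,4 -> pair 2; a trailing single row gets its own pair.
with numbered as (
    select
        l.*
      , row_number() over (
            partition by firstIdentifier, secondIdentifier
            order by timestamp asc, rowId asc  -- rowId breaks timestamp ties (assumed unique)
        ) as rn
    from log l
)
select
    n.*
  , ceil(rn / 2) as pairNo  -- integer pair number within the partition
    -- hypothetical uid: hash of the partition key plus the pair number
  , md5(firstIdentifier::varchar || '-' || secondIdentifier::varchar || '-' || ceil(rn / 2)::varchar) as pairUid
from numbered n;
```

Because row_number() never assigns the same value twice within a partition, each pair number covers at most two rows, whatever the timestamps are.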

Here is a SQL Server version:

with log (firstidentifier, secondidentifier, timestamp)
as (
    select 15396, 14460, 1 union all
    select 15396, 14460, 1 union all
    select 19744, 14451, 1 union all
    select 19744, 14451, 1 union all
    select 19744, 14451, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1
)
-- rows 1,2 -> 'A'; rows 3,4 -> 'B'; row 5 -> 'C' (integer division by 2)
select *,
       char(65 + (row_number() over (partition by firstidentifier, secondidentifier
                                     order by timestamp) - 1) / 2) as expectedidentifier
from log
order by firstidentifier, secondidentifier, timestamp
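
The letter labels above only cover 26 pairs per partition before char() runs past 'Z'. If partitions can be larger, a numeric pair number may be safer. Below is a small T-SQL sketch of that variation, using a single five-row partition as sample data. The alias pairno is hypothetical, and the brackets around [timestamp] are just a precaution because the name is also a data type.

```sql
-- T-SQL sketch: same pairing as above, but with a numeric pair number instead of a letter.
with log (firstidentifier, secondidentifier, [timestamp])
as (
    select 15590, 12404, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1 union all
    select 15590, 12404, 1
)
select *,
       -- integer division: row numbers 1,2 -> 1; 3,4 -> 2; 5 -> 3
       (row_number() over (partition by firstidentifier, secondidentifier
                           order by [timestamp]) - 1) / 2 + 1 as pairno
from log
order by firstidentifier, secondidentifier, [timestamp];
```

The pair number can then be combined with the two identifiers if a UID that is unique across the whole table is needed.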

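Thanks, but this does not produce the expected result.

Sorry, I missed the case where the timestamps are identical. I have updated the query to use row_number(); it should work now.

That almost solves it, but the highlighted row should be set to 1 rather than 2:

I cannot reproduce this: even with timestamps identical to those in your screenshot, I still get pairs 1, 2, 3 for that data. Could you post the full SQL you used to produce the result?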