SQL Server: test/control split when customers tied to several retailers are always in control
sql-server, tsql

Thanks @Gordon Linoff: I tried your code, but the final split is not 10-90 :(

@dnarb: Not quite, because customers who appear under multiple retailers are automatically in control and only the rest are split 90/10. Some customers forced into control plus 10% of the remaining customers adds up to more than 10% of all customers :-)

Should the factor in "0.1 * count(*)" still be 0.1? We have already classified the repeat customers as control, so applying 10% to the whole population increases the control count further. I think we need to change this. Any ideas?

@dnarb: Actually, for any given retailer this comes very close to the 10% threshold, so it does look like a concern.

Quick question: could you explain one part separately? How can you take min(RetailerCode) here, given that RetailerCode is a character column? I would like to understand how this works. Any insight would help me do better work in the future. Thanks.

@dnarb: MIN/MAX work on (almost) every data type. Here the result is similar to SELECT TOP 1 RetailerCode FROM tab ORDER BY RetailerCode (the same holds for INT/DATE/TIMESTAMP etc.).
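To make the MIN/MAX remark concrete, here is a minimal sketch. The inline VALUES list is made up for illustration (it reuses two retailer codes from the sample data), and the ordering of character values depends on the column's collation, so treat it as a demonstration rather than a statement about any particular database.

```sql
-- Hypothetical inline data: three rows with character retailer codes.
-- MIN/MAX compare strings using the collation's sort order.
SELECT MIN(RetailerCode) AS min_rc,
       MAX(RetailerCode) AS max_rc
FROM (VALUES ('A9006'), ('A6005'), ('A6005')) AS v(RetailerCode);

-- The TOP 1 ... ORDER BY form mentioned in the comment returns the same
-- value as MIN(RetailerCode).
SELECT TOP 1 RetailerCode
FROM (VALUES ('A9006'), ('A6005'), ('A6005')) AS v(RetailerCode)
ORDER BY RetailerCode;
```

When MIN and MAX over a customer's rows are equal, every row for that customer carries the same RetailerCode, which is how the queries in this thread detect customers tied to a single retailer.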
RetailerCode CID Segment
A6005 13SVC15 High
A6005 19VDE1F Low
A6005 1B3BD1F Medium
A6005 1B3HB48 Medium
A6005 1B3HB49 Low
A9006 1B3HB40 High
A9006 1B3HB41 High
A9006 1B3HB43 Low
A9006 1B3HB46 Medium
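For anyone who wants to try the queries in this thread, the sample rows can be loaded roughly as follows. The table name `tab` is borrowed from the last query in the thread; the VARCHAR lengths and NOT NULL constraints are assumptions, since the question does not show the table definition.

```sql
-- Assumed schema: column names come from the sample, types are guesses.
CREATE TABLE tab (
    RetailerCode VARCHAR(10) NOT NULL,
    CID          VARCHAR(10) NOT NULL,
    Segment      VARCHAR(10) NOT NULL
);

-- The nine sample rows from the question.
INSERT INTO tab (RetailerCode, CID, Segment) VALUES
    ('A6005', '13SVC15', 'High'),
    ('A6005', '19VDE1F', 'Low'),
    ('A6005', '1B3BD1F', 'Medium'),
    ('A6005', '1B3HB48', 'Medium'),
    ('A6005', '1B3HB49', 'Low'),
    ('A9006', '1B3HB40', 'High'),
    ('A9006', '1B3HB41', 'High'),
    ('A9006', '1B3HB43', 'Low'),
    ('A9006', '1B3HB46', 'Medium');
```

Note that every CID in this sample appears under exactly one retailer, so an extra row that repeats a CID under the other retailer would be needed to exercise the multi-retailer rule.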
- If a customer is tied to two or more retailers, that customer should be in the control group.
- Each retailer will be given a list of customers to target and will run the campaign itself. Here,
o The test/control split should be done at retailer level and then at segment level. For example, for each retailer:
10% of its High customers go to control and the remaining 90% of its High customers go to test.
10% of its Medium customers go to control and the remaining 90% of its Medium customers go to test.
10% of its Low customers go to control and the remaining 90% of its Low customers go to test.
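One way to see how the multi-retailer rule interacts with the 10% targets is to count, for each retailer and segment, how many customers are forced into control and compare that with 10% of the cell. The sketch below assumes a table named `tab` with the three columns shown in the sample; rounding the quota down with FLOOR is also an assumption, since the requirements do not say how fractions should be handled.

```sql
-- Per retailer/segment: cell size, customers forced into control,
-- and the size of a 10% control quota (rounded down, by assumption).
SELECT RetailerCode,
       Segment,
       COUNT(*)              AS customers,
       SUM(multi_retailer)   AS auto_control,
       FLOOR(0.1 * COUNT(*)) AS ten_pct_quota
FROM (
    SELECT t.RetailerCode,
           t.Segment,
           -- 1 when the customer appears under more than one retailer
           CASE WHEN MIN(t.RetailerCode) OVER (PARTITION BY t.CID)
                   = MAX(t.RetailerCode) OVER (PARTITION BY t.CID)
                THEN 0 ELSE 1
           END AS multi_retailer
    FROM tab t
) AS dt
GROUP BY RetailerCode, Segment
ORDER BY RetailerCode, Segment;
```

Wherever auto_control is larger than ten_pct_quota, a strict 10% control share for that cell cannot be met without moving some multi-retailer customers into test, so one of the two rules has to give way.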
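-- Variant 1: customers tied to several retailers always go to control;
-- the remaining customers of each retailer/segment are split 10/90 at random.
select RetailerCode, CID, Segment,
       (case when auto_in_control = 1 then 'control'
             when row_number() over (partition by RetailerCode, Segment, auto_in_control order by newid()) <=
                  0.1 * count(*) over (partition by RetailerCode, Segment, auto_in_control)
             then 'control'
             else 'test'
        end) as [group]  -- GROUP is a reserved word, so the alias is bracketed
from (select t.*,
             -- SQL Server cannot reference a column alias in the same select list,
             -- so the min/max comparison is written out in full
             (case when min(RetailerCode) over (partition by CID) =
                        max(RetailerCode) over (partition by CID)
                   then 0 else 1
              end) as auto_in_control
      from tab t
     ) t
order by RetailerCode;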

-- Variant 2: control is capped at 10% of each retailer/segment.
-- Multi-retailer customers are sorted first and fill the quota; any quota left over
-- is filled at random (multi-retailer customers beyond the quota fall into test).
select RetailerCode, CID, Segment,
       (case when row_number() over (partition by RetailerCode, Segment order by auto_in_control desc, newid()) <=
                  0.1 * count(*) over (partition by RetailerCode, Segment)
             then 'control'
             else 'test'
        end) as [group]
from (select t.*,
             (case when min(RetailerCode) over (partition by CID) =
                        max(RetailerCode) over (partition by CID)
                   then 0 else 1
              end) as auto_in_control
      from tab t
     ) t
order by RetailerCode;

SELECT RetailerCode, CID, Segment,
   CASE WHEN Percent_Rank()
             Over (PARTITION BY RetailerCode, Segment  -- for each retailer/segment
                   -- all customers with multiple retailers are sorted low,
                   -- i.e. will be in control group (if it's less than 10%)
                   ORDER BY ControlGroup, newid()
                  ) <= 0.1
        THEN 'control'
        ELSE 'test'
   END AS [group]  -- GROUP is a reserved word, so the alias is bracketed
FROM
 (
   SELECT t.*,
      -- flag customers to be put in control group
      CASE WHEN Min(RetailerCode) Over (PARTITION BY CID)
              = Max(RetailerCode) Over (PARTITION BY CID)
           THEN 1 -- only a single retailer
           ELSE 0 -- multiple retailers
      END AS ControlGroup
      -- if the RetailerCode/CID combination is unique:
      -- CASE WHEN Count(*) Over (PARTITION BY CID) = 1
      --      THEN 1 -- only a single retailer
      --      ELSE 0 -- multiple retailers
      -- END AS ControlGroup
   FROM tab t
 ) AS dt;
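
Since the comments report that the final split was not 10-90, it may help to measure the split directly. The sketch below assumes the output of one of the queries above has been saved into a hypothetical temp table `#split` (for example with SELECT ... INTO #split) that has a column named [group]; saving the result matters because newid() produces a different assignment on every run.

```sql
-- #split is a hypothetical temp table holding one saved assignment:
-- RetailerCode, CID, Segment, [group] ('control' or 'test').
SELECT RetailerCode,
       Segment,
       COUNT(*) AS customers,
       SUM(CASE WHEN [group] = 'control' THEN 1 ELSE 0 END) AS control_customers,
       -- 100.0 forces decimal division instead of integer division
       100.0 * SUM(CASE WHEN [group] = 'control' THEN 1 ELSE 0 END) / COUNT(*) AS control_pct
FROM #split
GROUP BY RetailerCode, Segment
ORDER BY RetailerCode, Segment;
```

With small cells the percentage will rarely be exactly 10, because the number of control customers has to be a whole number.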