SQL Server: test/control split per retailer and segment, with customers tied to several retailers placed in control
Tags: sql-server, tsql

Comments:

Thanks @Gordon Linoff: I tried your code, but the final split is not 10-90 :(

@dnarb ... It wouldn't be, because customers who are in multiple retailers are automatically in control. The rest are split 90/10. Some customers in control by default, plus 10% of the remaining customers, results in more than 10% of all customers :-)

Should it still be 0.1 in "0.1 * count(*)"? Since we have already classified the repeat customers as control, applying 10% overall further increases the control count. I think we need to change this. Any ideas?

@dnarb. Actually, this gets very close to the 10% threshold for any given retailer, so that does seem to be a concern.

Quick question. Can you explain one thing separately: how can you find min(RetailerCode) here? RetailerCode is a character data type. I am just wondering how this is possible. Can you share some insight? It will help me work better in future. Thanks

@dnarb: MIN/MAX work for (almost) every data type. Here it is similar to
SELECT TOP 1 RetailerCode FROM tab ORDER BY RetailerCode
(the same as for INT/DATE/TIMESTAMP, etc.)

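To illustrate the MIN/MAX remark, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of SQL Server, purely so that it runs anywhere; that choice, the three sample rows, and the LIMIT 1 syntax (SQLite's counterpart to TOP 1) are assumptions made for the illustration. It checks that MIN over a character column returns the same value as sorting and taking the first row. In SQL Server the string order also depends on the column's collation, which this sketch does not model.

```python
import sqlite3

# In-memory stand-in for the question's table. SQLite is used only because it
# ships with Python; this is not SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (RetailerCode TEXT, CID TEXT, Segment TEXT)")
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?)",
    [
        ("A9006", "1B3HB40", "High"),
        ("A6005", "13SVC15", "High"),
        ("A6005", "19VDE1F", "Low"),
    ],
)

# MIN/MAX on a character column compare the strings in sort order.
min_rc, max_rc = conn.execute(
    "SELECT MIN(RetailerCode), MAX(RetailerCode) FROM tab"
).fetchone()

# Same result as sorting and keeping the first row
# (T-SQL would write SELECT TOP 1 ... ORDER BY RetailerCode).
first_sorted = conn.execute(
    "SELECT RetailerCode FROM tab ORDER BY RetailerCode LIMIT 1"
).fetchone()[0]

print(min_rc, max_rc, first_sorted)
```

If MIN and MAX of RetailerCode differ for a given CID, that customer has rows under at least two retailers, which is how the queries in this thread detect them.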
RetailerCode    CID        Segment
A6005         13SVC15       High
A6005         19VDE1F       Low
A6005         1B3BD1F       Medium
A6005         1B3HB48       Medium
A6005         1B3HB49       Low
A9006         1B3HB40       High
A9006         1B3HB41       High
A9006         1B3HB43       Low
A9006         1B3HB46       Medium
-   If a customer is tied to two or more retailers, that customer should be in the control group.
-   Each retailer will be provided with a list of customers to target for the campaigns, and the retailer will run the campaign. Here:
    -   The test/control split should be done at retailer level and then at segment level. For example, for each retailer:
        -   10% of their High customers go to control and the remaining 90% of their High customers go to test.
        -   10% of their Medium customers go to control and the remaining 90% of their Medium customers go to test.
        -   10% of their Low customers go to control and the remaining 90% of their Low customers go to test.
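The two rules above can be sketched outside the database to make the inputs of the split explicit. This is only an illustration in Python, not the approach used in the answers: the row list is a subset of the sample data, and the last row is hypothetical, added so that one customer appears under two retailers (the sample data has no such customer).

```python
from collections import defaultdict

# (RetailerCode, CID, Segment) rows. The last row is hypothetical and was
# added so that one customer appears under two retailers.
rows = [
    ("A6005", "13SVC15", "High"),
    ("A6005", "19VDE1F", "Low"),
    ("A9006", "1B3HB40", "High"),
    ("A9006", "1B3HB43", "Low"),
    ("A9006", "19VDE1F", "Low"),  # hypothetical duplicate CID
]

# Collect the distinct retailers seen for each customer.
retailers_by_cid = defaultdict(set)
for retailer, cid, _segment in rows:
    retailers_by_cid[cid].add(retailer)

# Rule 1: two or more retailers means the customer goes to control.
auto_control = {cid for cid, seen in retailers_by_cid.items() if len(seen) >= 2}

# Rule 2 operates on these cells: one bucket per (retailer, segment).
cells = defaultdict(list)
for retailer, cid, segment in rows:
    cells[(retailer, segment)].append(cid)

print(sorted(auto_control))
print({cell: len(cids) for cell, cids in cells.items()})
```

The 10%/90% split is then applied inside each cell, which is what the PARTITION BY retailercode, segment clauses in the answers express.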
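-- Variant 1: customers tied to several retailers always go to control;
-- the remaining customers of each retailer/segment are split 10/90 at random.
select RetailerCode, CID, Segment,
       (case when auto_in_control = 1 then 'control'
             when row_number() over (partition by RetailerCode, Segment, auto_in_control order by newid()) <=
                  0.1 * count(*) over (partition by RetailerCode, Segment, auto_in_control)
             then 'control'
             else 'test'
        end) as [group]   -- GROUP is a reserved word, so the alias is bracketed
from (select t.*,
             -- A column alias cannot be referenced in the same SELECT list,
             -- so MIN and MAX are compared directly.
             (case when min(RetailerCode) over (partition by CID) = max(RetailerCode) over (partition by CID)
                   then 0 else 1 end) as auto_in_control
      from [Table] t      -- placeholder table name; TABLE is reserved, hence the brackets
     ) t
order by RetailerCode;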
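-- Variant 2: control is 10% of each retailer/segment overall;
-- customers tied to several retailers are sorted first, so they are picked before anyone else.
select RetailerCode, CID, Segment,
       (case when row_number() over (partition by RetailerCode, Segment order by auto_in_control desc, newid()) <=
                  0.1 * count(*) over (partition by RetailerCode, Segment)
             then 'control'
             else 'test'
        end) as [group]   -- GROUP is a reserved word, so the alias is bracketed
from (select t.*,
             -- A column alias cannot be referenced in the same SELECT list,
             -- so MIN and MAX are compared directly.
             (case when min(RetailerCode) over (partition by CID) = max(RetailerCode) over (partition by CID)
                   then 0 else 1 end) as auto_in_control
      from [Table] t      -- placeholder table name; TABLE is reserved, hence the brackets
     ) t
order by RetailerCode;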
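The difference between the two queries above comes down to simple arithmetic per retailer/segment cell, which may help with the "should it still be 0.1?" question from the comments. The following Python sketch is an illustration only: the function name control_sizes and the example numbers are made up, and integer division stands in for the row_number() <= 0.1 * count(*) test.

```python
def control_sizes(n, m):
    """Control-group sizes for one retailer/segment cell.

    n: customers in the cell; m: how many of them are tied to several retailers.
    Integer division stands in for "row_number <= 0.1 * count".
    """
    # First query: all m forced customers, plus 10% of the remaining n - m.
    forced_plus_sample = m + (n - m) // 10
    # Second query: 10% of the whole cell, forced customers picked first.
    capped = n // 10
    # Forced customers that the second query pushes into test once the cap is full.
    forced_in_test = max(0, m - capped)
    return forced_plus_sample, capped, forced_in_test


for n, m in [(100, 4), (100, 15)]:
    print(n, m, control_sizes(n, m))
```

With 100 customers and 4 multi-retailer customers, the first query yields 13 in control and the second yields exactly 10. With 15 multi-retailer customers the first query yields 23, while the second keeps 10 in control but leaves 5 multi-retailer customers in test. So when such customers exceed 10% of a cell, the two requirements cannot both hold, and one of them has to give.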
SELECT RetailerCode, CID, Segment,
   CASE WHEN Percent_Rank()
              Over (PARTITION BY retailercode, segment -- for each retailer/segment
                    ORDER BY ControlGroup, newid()     -- all customers with multiple retailers are sorted low, i.e. will be in control group (if it's less than 10%)
                   ) <= 0.1 
             THEN 'control'
             ELSE 'test'
   END AS [GROUP] -- GROUP is a reserved word in T-SQL, so the alias is bracketed
FROM
 (
   SELECT t.*,
      -- flag customers to be put in control group
      CASE WHEN Min(RetailerCode) Over (PARTITION BY CID)
              = Max(RetailerCode) Over (PARTITION BY CID)
           THEN 1 -- only a single retailer
           ELSE 0 -- multiple retailers 
      END AS ControlGroup
-- if the RetailerCode/CID combination is unique:
--      CASE WHEN Count(*) Over (PARTITION BY CID) = 1
--           THEN 1 -- only a single retailer
--           ELSE 0 -- multiple retailers 
--      END AS ControlGroup
   FROM tab t
 ) AS dt;
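PERCENT_RANK() and ROW_NUMBER() do not cut at exactly the same row, which matters for small segments. The sketch below is a Python illustration, not part of any answer: it assumes no ties in the ordering (reasonable with newid()), uses the documented definition PERCENT_RANK = (rank - 1) / (rows - 1) with 0 for a single-row partition, and compares with integers to sidestep floating-point rounding. Both function names are hypothetical.

```python
def control_by_row_number(n):
    """Rows with ROW_NUMBER() <= 0.1 * COUNT(*) in a partition of n rows."""
    return sum(1 for rn in range(1, n + 1) if 10 * rn <= n)


def control_by_percent_rank(n):
    """Rows with PERCENT_RANK() <= 0.1 in a partition of n rows, no ties.

    PERCENT_RANK is (rank - 1) / (n - 1), and 0 when the partition has one row.
    """
    if n == 1:
        return 1
    return sum(1 for rank in range(1, n + 1) if 10 * (rank - 1) <= n - 1)


for n in (1, 5, 11, 100):
    print(n, control_by_row_number(n), control_by_percent_rank(n))
```

Because the first row of every partition has PERCENT_RANK 0, the PERCENT_RANK version always puts at least one customer per retailer/segment into control, while the ROW_NUMBER version puts none there when a cell has fewer than 10 customers. For large cells the two counts are close or identical, so which behavior is preferable depends on how small segments should be treated.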