SQL Server: quartile range - lower, upper and median


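I'm trying to work out the quartile range from an array of numbers of arbitrary length:

1,  1,  5,  6,  7,  8,  2,  4,  7,  9,  9,  9,  9

The values I need from this quartile range are:

  • Upper quartile
  • Median
  • Lower quartile

If I dump the array above into Microsoft Excel (columns A:M), I can use the following formulas:

  • =QUARTILE.INC(A1:M1,1)
  • =QUARTILE.INC(A1:M1,2)
  • =QUARTILE.INC(A1:M1,3)

to get my answers:

  • 4,
  • 7,
  • 9

I now need to work out these 3 values in either SQL Server or VB.NET. I can get the array values in any format or object in either language, but I can't find any function that works like Excel's QUARTILE.INC function.

Does anyone know how this could be achieved in SQL Server or VB.NET?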

Apologies if I have misunderstood you, but this can be done using NTILE() followed by ROW_NUMBER().

SQL code:

;WITH FirstStep (NT, N)
AS (
    SELECT NTILE(3) OVER (ORDER BY T.column1), T.column1
    FROM dbo.GetTableFromList_Int('1,  1,  5,  6,  7,  8,  2,  4,  7,  9,  9,  9,  9', ',') AS T
),
SecondStep (RN, NT, N)
AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY T.NT ORDER BY T.N DESC), NT, T.N
    FROM FirstStep AS T
)
SELECT N
FROM SecondStep
WHERE RN = 1
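
Explanation:

  • The TVF splits my string into rows (distinct rows)
  • We use NTILE(3) to split the list into three groups, ordered by value (IIRC you need to order the list to get the correct values)
  • ROW_NUMBER() is then used to pick the right value within each group

In your scenario, it returns the expected results.

If this is not what you need, it can be modified to produce the correct output.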
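
For readers without the dbo.GetTableFromList_Int helper (its definition is not shown here), the same query can be tried with an inline VALUES list. This is only a minimal sketch: it assumes the TVF returns distinct values, as the explanation suggests, and the names Src and V are made up for illustration.

```sql
-- Sketch only: an inline VALUES list stands in for dbo.GetTableFromList_Int,
-- and DISTINCT mimics the assumed "distinct rows" behaviour of that TVF.
;WITH Src (N) AS (
    SELECT DISTINCT V.N
    FROM (VALUES (1),(1),(5),(6),(7),(8),(2),(4),(7),(9),(9),(9),(9)) AS V(N)
),
FirstStep (NT, N) AS (
    -- Split the 8 distinct values into 3 ordered groups (sizes 3, 3, 2).
    SELECT NTILE(3) OVER (ORDER BY S.N), S.N
    FROM Src AS S
),
SecondStep (RN, NT, N) AS (
    -- Number the rows in each group from the largest value down.
    SELECT ROW_NUMBER() OVER (PARTITION BY T.NT ORDER BY T.N DESC), T.NT, T.N
    FROM FirstStep AS T
)
SELECT N
FROM SecondStep
WHERE RN = 1
ORDER BY N;   -- 4, 7, 9 for this list
```

Without the DISTINCT, NTILE(3) would split the 13 rows into groups of 5, 4 and 4, and the query would return 5, 8 and 9 instead, so whether duplicates are kept matters here.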

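There may be an easier way, but to get quartiles you can use NTILE:

Distributes the rows in an ordered partition into a specified number of groups. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.

So, for your data: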

SELECT  1 Val
INTO    #temp
UNION ALL
SELECT  1
UNION ALL
SELECT  5
UNION ALL
SELECT  6
UNION ALL
SELECT  7
UNION ALL
SELECT  8
UNION ALL
SELECT  2
UNION ALL
SELECT  4
UNION ALL
SELECT  7
UNION ALL
SELECT  9
UNION ALL
SELECT  9
UNION ALL
SELECT  9
UNION ALL
SELECT  9

-- NTILE(4) specifies you require 4 partitions (quartiles)
SELECT  NTILE(4) OVER ( ORDER BY Val ) AS Quartile ,
        Val
INTO #tempQuartiles
FROM    #temp

SELECT * 
FROM #tempQuartiles

DROP TABLE #temp
DROP TABLE #tempQuartiles

This produces:

Quartile    Val
1           1
1           1
1           2
1           4
2           5
2           6
2           7
3           7
3           8
3           9
4           9
4           9
4           9

From this you can work out what you're after.

So, modifying the SELECT, you can do this:

SELECT Quartile, MAX(Val) MaxVal
FROM #tempQuartiles
WHERE Quartile <= 3
GROUP BY Quartile
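
As posted, the script drops #tempQuartiles before this last query would run, so one way to try the idea end to end is a single statement. The following is a minimal sketch that assumes an inline VALUES list in place of the #temp table; the CTE name Quartiles and the alias V are made up here.

```sql
-- Sketch only: same NTILE(4) approach as above, without temp tables.
;WITH Quartiles AS (
    SELECT NTILE(4) OVER (ORDER BY V.Val) AS Quartile, V.Val
    FROM (VALUES (1),(1),(5),(6),(7),(8),(2),(4),(7),(9),(9),(9),(9)) AS V(Val)
)
SELECT Quartile, MAX(Val) AS MaxVal   -- largest value in each of the first three groups
FROM Quartiles
WHERE Quartile <= 3
GROUP BY Quartile
ORDER BY Quartile;
-- For this list: 1 -> 4, 2 -> 7, 3 -> 9
```

Note that this picks the largest value in each NTILE group rather than interpolating, so it will not always agree with QUARTILE.INC on other inputs.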

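We created a user-defined type to use as the function parameter, and then use it this way.

Our implementation uses the same calculation as Excel's percentile function.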

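CREATE TYPE [dbo].[floatListType] AS TABLE (
    [value] FLOAT NOT NULL
);
GO

CREATE FUNCTION [dbo].[getPercentile]
(
    @data floatListType READONLY,
    @percentile FLOAT
)
RETURNS FLOAT
AS
BEGIN
    DECLARE @values TABLE
    (
        value FLOAT,
        idx INT
    );

    -- Copy the input, numbering the sorted values from 0
    INSERT INTO @values
    SELECT value, ROW_NUMBER() OVER (ORDER BY value) - 1 AS idx
    FROM @data;

    -- @n is the 1-based position; @k is its whole part, @d its fractional part
    DECLARE @cnt INT = (SELECT COUNT(*) FROM @values)
          , @n FLOAT = (@cnt - 1) * @percentile + 1
          , @k INT = FLOOR(@n)
          , @d FLOAT = @n - @k;

    IF (@k = 0)
        RETURN (SELECT value FROM @values WHERE idx = 0)

    IF (@k = @cnt)
        RETURN (SELECT value FROM @values WHERE idx = @cnt - 1)

    -- Interpolate linearly between the two neighbouring values
    IF (@k > 0 AND @k < @cnt)
        RETURN (SELECT value FROM @values WHERE idx = @k - 1)
             + @d * ((SELECT value FROM @values WHERE idx = @k)
                   - (SELECT value FROM @values WHERE idx = @k - 1))

    RETURN NULL;
END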
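
To make the interpolation concrete, the position arithmetic from getPercentile can be traced for the 13 sample values from the question. This is only an illustrative sketch; the derived table alias p is made up here.

```sql
-- Sketch only: reproduces the @n, @k and @d arithmetic for a list of 13 values.
DECLARE @cnt INT = 13;

SELECT p.percentile,
       (@cnt - 1) * p.percentile + 1                  AS n,  -- 1-based position: 4, 7, 10
       FLOOR((@cnt - 1) * p.percentile + 1)           AS k,  -- whole part: 4, 7, 10
       (@cnt - 1) * p.percentile + 1
           - FLOOR((@cnt - 1) * p.percentile + 1)     AS d   -- fractional part: 0, 0, 0
FROM (VALUES (CAST(0.25 AS FLOAT)),
             (CAST(0.5  AS FLOAT)),
             (CAST(0.75 AS FLOAT))) AS p(percentile);
```

With d = 0 in all three cases, the function returns the sorted value at zero-based index k - 1, which is 4, 7 and 9 for the sample list 1, 1, 2, 4, 5, 6, 7, 7, 8, 9, 9, 9, 9.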
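
You can use the function like this to get the median and the quartiles (since Q1 is simply the 0.25 percentile), for example:

DECLARE @values floatListType;

INSERT INTO @values
SELECT value FROM #mytable

-- Scalar user-defined functions must be called with their schema prefix
SELECT dbo.getPercentile(@values, 0.25) AS Q1,
       dbo.getPercentile(@values, 0.5) AS median,
       dbo.getPercentile(@values, 0.75) AS Q3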
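
If a user-defined function is not an option, the built-in PERCENTILE_CONT analytic function is worth considering. The sketch below assumes SQL Server 2012 or later and uses an inline VALUES list with the made-up alias V; as far as I know, PERCENTILE_CONT interpolates at position 1 + p * (n - 1), which is the same position the function above computes.

```sql
-- Sketch only: built-in alternative, assuming SQL Server 2012 or later.
-- PERCENTILE_CONT returns one value per row, so DISTINCT collapses the
-- result to a single row.
SELECT DISTINCT
       PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY V.Val) OVER () AS Q1,
       PERCENTILE_CONT(0.5)  WITHIN GROUP (ORDER BY V.Val) OVER () AS Median,
       PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY V.Val) OVER () AS Q3
FROM (VALUES (1),(1),(5),(6),(7),(8),(2),(4),(7),(9),(9),(9),(9)) AS V(Val);
-- For this list: Q1 = 4, Median = 7, Q3 = 9
```

Adding a PARTITION BY inside OVER () would give one set of quartiles per group instead of one for the whole list.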

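If you want a SQL Server solution, I wrote one a few years ago. It is based on dynamic SQL, so you can plug in any column you have access to. It isn't well tested, I was still learning at the time, and the code is a bit old now, but it may meet your needs or at least give you a starting point for writing your own solution. Here is the gist of the code; follow the link to my blog for an in-depth discussion.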

CREATE PROCEDURE [Calculations].[InterquartileRangeSP]
@DatabaseName as nvarchar(128) = NULL, @SchemaName as nvarchar(128), @TableName as nvarchar(128),@ColumnName AS nvarchar(128), @PrimaryKeyName as nvarchar(400), @OrderByCode as tinyint = 1, @DecimalPrecision AS nvarchar(50)
AS
SET @DatabaseName = @DatabaseName + '.'
DECLARE @SchemaAndTableName nvarchar(400)
SET @SchemaAndTableName = ISNull(@DatabaseName, '') + @SchemaName + '.' + @TableName
DECLARE @SQLString nvarchar(max)

SET @SQLString = 'DECLARE @OrderByCode tinyint,
@Count bigint,
@LowerPoint bigint,
@UpperPoint bigint,
@LowerRemainder decimal(38,37), -- use the maximum precision and scale for these two variables to make the
-- procedure flexible enough to handle large datasets; I suppose I could use a float
@UpperRemainder decimal(38,37),
@LowerQuartile decimal(' + @DecimalPrecision + '),
@UpperQuartile decimal(' + @DecimalPrecision + '),
@InterquartileRange decimal(' + @DecimalPrecision + '),
@LowerInnerFence decimal(' + @DecimalPrecision + '),
@UpperInnerFence decimal(' + @DecimalPrecision + '),
@LowerOuterFence decimal(' + @DecimalPrecision + '),
@UpperOuterFence decimal(' + @DecimalPrecision + ')

SET @OrderByCode = ' + CAST(@OrderByCode AS nvarchar(50)) + ' SELECT @Count=Count(' + @ColumnName + ')
FROM ' + @SchemaAndTableName +
' WHERE ' + @ColumnName + ' IS NOT NULL

SELECT @LowerPoint = (@Count + 1) / 4, @LowerRemainder =  ((CAST(@Count AS decimal(' + @DecimalPrecision + ')) + 1) % 4) /4,
@UpperPoint = ((@Count + 1) *3) / 4, @UpperRemainder =  (((CAST(@Count AS decimal(' + @DecimalPrecision + ')) + 1) *3) % 4) / 4; --multiply by 3 for the left side on the upper point to get 75 percent

WITH TempCTE
(' + @PrimaryKeyName + ', RN, ' + @ColumnName + ')
AS (SELECT ' + @PrimaryKeyName + ', ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY ' + @ColumnName + ' ASC) AS RN, ' + @ColumnName + '
FROM ' + @SchemaAndTableName + '
WHERE ' + @ColumnName + ' IS NOT NULL),
TempCTE2 (QuartileValue)
AS (SELECT TOP 1 ' + @ColumnName + ' + ((Lead(' + @ColumnName + ', 1) OVER (ORDER BY ' + @ColumnName + ') - ' + @ColumnName + ') * @LowerRemainder) AS QuartileValue
FROM TempCTE
WHERE RN BETWEEN @LowerPoint AND @LowerPoint + 1

UNION

SELECT TOP 1 ' + @ColumnName + ' + ((Lead(' + @ColumnName + ', 1) OVER (ORDER BY ' + @ColumnName + ') - ' + @ColumnName + ') * @UpperRemainder) AS QuartileValue
FROM TempCTE
WHERE RN BETWEEN @UpperPoint AND @UpperPoint + 1)

SELECT @LowerQuartile = (SELECT TOP 1 QuartileValue
 FROM TempCTE2 ORDER BY QuartileValue ASC), @UpperQuartile = (SELECT TOP 1 QuartileValue
 FROM TempCTE2 ORDER BY QuartileValue DESC)

SELECT @InterquartileRange = @UpperQuartile - @LowerQuartile
SELECT @LowerInnerFence = @LowerQuartile - (1.5 * @InterquartileRange), @UpperInnerFence = @UpperQuartile + (1.5 * @InterquartileRange), @LowerOuterFence = @LowerQuartile - (3 * @InterquartileRange), @UpperOuterFence = @UpperQuartile + (3 * @InterquartileRange)

--SELECT @LowerPoint AS LowerPoint, @LowerRemainder AS LowerRemainder, @UpperPoint AS UpperPoint, @UpperRemainder AS UpperRemainder
-- uncomment this line to debug the inner calculations

SELECT @LowerQuartile AS LowerQuartile, @UpperQuartile AS UpperQuartile, @InterquartileRange AS InterQuartileRange,@LowerInnerFence AS LowerInnerFence, @UpperInnerFence AS UpperInnerFence,@LowerOuterFence AS LowerOuterFence, @UpperOuterFence AS UpperOuterFence

SELECT ' + @PrimaryKeyName + ', ' + @ColumnName + ', OutlierDegree
FROM  (SELECT ' + @PrimaryKeyName + ', ' + @ColumnName + ',
       ''OutlierDegree'' =  CASE WHEN (' + @ColumnName + ' < @LowerInnerFence AND ' + @ColumnName + ' >= @LowerOuterFence) OR (' +
@ColumnName + ' > @UpperInnerFence
 AND ' + @ColumnName + ' <= @UpperOuterFence) THEN 1
       WHEN ' + @ColumnName + ' < @LowerOuterFence OR ' + @ColumnName + ' > @UpperOuterFence THEN 2
       ELSE 0 END
       FROM ' + @SchemaAndTableName + '
       WHERE ' + @ColumnName + ' IS NOT NULL) AS T1
      ORDER BY CASE WHEN @OrderByCode = 1 THEN ' + @PrimaryKeyName + ' END ASC,
CASE WHEN @OrderByCode = 2 THEN ' + @PrimaryKeyName + ' END DESC,
CASE WHEN @OrderByCode = 3 THEN ' + @ColumnName + ' END ASC,
CASE WHEN @OrderByCode = 4 THEN ' + @ColumnName + ' END DESC,
CASE WHEN @OrderByCode = 5 THEN OutlierDegree END ASC,
CASE WHEN @OrderByCode = 6 THEN OutlierDegree END DESC'

--SELECT @SQLString -- uncomment this to debug string errors
EXEC (@SQLString)
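
A call to the procedure might look like the following. This is a hedged sketch rather than anything from the original post: the database, table, column and key names are all hypothetical, and because @DatabaseName has a default while later parameters do not, passing the arguments by name is the simplest option.

```sql
-- Sketch only: every object name below is hypothetical.
EXEC [Calculations].[InterquartileRangeSP]
     @DatabaseName     = N'MyDatabase',    -- hypothetical database
     @SchemaName       = N'dbo',
     @TableName        = N'Measurements',  -- hypothetical table
     @ColumnName       = N'Val',           -- hypothetical numeric column
     @PrimaryKeyName   = N'ID',            -- hypothetical key column
     @OrderByCode      = 5,                -- sort by OutlierDegree ascending
     @DecimalPrecision = N'38,6';          -- becomes decimal(38,6) in the dynamic SQL
```

Because the procedure concatenates these names straight into the dynamic SQL string, it should only be given trusted identifiers. It also places the lower quartile at position (n + 1) / 4, so its figures may differ from those of QUARTILE.INC.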