Frequency distribution by column in SQL Server
I have a table with the following structure:
year age score weight
------------------------------
2008 16 100 3
2008 25 150 2
2009 40 210 2
2009 22 50 3
2009 65 90 3
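I need to find the frequency distribution of score by age.
The expected output is the distribution of scores by age group, in %:
(The output is not accurate; it is for illustration only. Each row totals 1.)
I came up with the following query for a single group: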
select
yearofsale,
sum(sum(case
when age between 16 and 35 AND score between 0 and 50
then 1
else 0
end) * weight) / sum(case
when age between 16 and 35 and score between 0 and 50
then weight
else 1 end)
from
table
group by
yearofsale
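But I am fairly sure there is a simpler way. Any ideas?
Thanks,
bee

You can first run a subquery that adds the age group and score group to each record. Then add the total weight per year and age group. After that, compute the percentage for each combination of age group and score group. Finally, pivot the result into a cross-tab: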
select yearofsale, age_group, [0-50], [51-100], [101-150], [151-200], [201-250]
from (
select yearofsale, age_group, score_group, 100.0*sum(weight) / min(total_weight) as pct
from (
select *,
sum(weight) over (partition by yearofsale, age_group) as total_weight
from (
select *,
case when age >= 66 then '66+'
when age >= 41 then '41-65'
when age >= 36 then '36-40'
when age >= 16 then '16-35'
end as age_group,
case when score < 51 then '0-50'
when score < 101 then '51-100'
when score < 151 then '101-150'
when score < 201 then '151-200'
when score < 251 then '201-250'
end as score_group
from
table) as base
) as base2
group by yearofsale, age_group, score_group) as base3
pivot ( sum(pct)
for score_group in ([0-50], [51-100], [101-150], [151-200], [201-250])
) as pivotTable
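To try this against the sample rows from the question, a minimal sketch follows. It assumes a hypothetical temp table named `#sales` (the question only says `table`) and swaps `pivot` for plain conditional aggregation, so empty cells come back as 0 rather than NULL. It is an illustration of the same weighted-percentage idea, not a replacement for the query above.

```sql
-- Hypothetical temp table holding the five sample rows from the question.
create table #sales (yearofsale int, age int, score int, weight int);

insert into #sales (yearofsale, age, score, weight) values
    (2008, 16, 100, 3),
    (2008, 25, 150, 2),
    (2009, 40, 210, 2),
    (2009, 22,  50, 3),
    (2009, 65,  90, 3);

-- Weighted share of each score band within a year and age group, in percent.
-- Each band's weight is divided by the total weight of its (year, age group) row.
select yearofsale,
       age_group,
       100.0 * sum(case when score between   0 and  50 then weight else 0 end) / sum(weight) as [0-50],
       100.0 * sum(case when score between  51 and 100 then weight else 0 end) / sum(weight) as [51-100],
       100.0 * sum(case when score between 101 and 150 then weight else 0 end) / sum(weight) as [101-150],
       100.0 * sum(case when score between 151 and 200 then weight else 0 end) / sum(weight) as [151-200],
       100.0 * sum(case when score between 201 and 250 then weight else 0 end) / sum(weight) as [201-250]
from (
    -- Same age banding as the query above.
    select yearofsale, score, weight,
           case when age >= 66 then '66+'
                when age >= 41 then '41-65'
                when age >= 36 then '36-40'
                when age >= 16 then '16-35'
           end as age_group
    from #sales
) as base
group by yearofsale, age_group
order by yearofsale, age_group;

drop table #sales;
```

For the sample data, the 2008 / 16-35 row has weights 3 and 2 out of a total of 5, so it splits 60% and 40% between the 51-100 and 101-150 bands.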
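Here is a dynamic approach that gives you the full matrix. The matrix tiers are stored in a table, and by adjusting the parameters you can change the axes and even the source. Consider the following: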
Declare @Source varchar(150)= 'YourTable'
Declare @KeyCol varchar(150)= 'Year'
Declare @YTier varchar(50) = 'Age'
Declare @YMeas varchar(50) = 'Age'
Declare @XTier varchar(50) = 'Score'
Declare @XMeas varchar(50) = 'Score'
Declare @SQL varchar(max) = '
;with cte1 as (
Select '+@KeyCol+'
,YSeq = max(Y.Seq)
,YTitle = max(Y.Title)
,XSeq = max(X.Seq)
,XTitle = max(X.Title)
,Value = sum(Weight)
From '+@Source+' A
Join Tier Y on (Y.Tier='''+@YTier+''' and A.'+@YMeas+' between Y.R1 and Y.R2)
Join Tier X on (X.Tier='''+@XTier+''' and A.'+@XMeas+' between X.R1 and X.R2)
Group By '+@KeyCol+',Y.Seq,X.Seq
Union All
Select '+@KeyCol+'
,YSeq = Y.Seq
,YTitle = Y.Title
,XSeq = X.Seq
,XTitle = X.Title
,Value = 0
From (Select Distinct '+@KeyCol+' from '+@Source+') A
Cross Join (Select Distinct Seq,Title From Tier where Tier='''+@YTier+''') Y
Cross Join (Select Distinct Seq,Title From Tier where Tier='''+@XTier+''') X )
, cte2 as (Select '+@KeyCol+',YSeq,RowTotal=sum(Value) from cte1 Group By '+@KeyCol+',YSeq)
, cte3 as (Select A.*
,PctRow = Format(case when B.RowTotal=0 then 0 else (A.Value*100.0)/B.RowTotal end,''#0.0'')
From cte1 A
Join cte2 B on A.'+@KeyCol+'=B.'+@KeyCol+' and A.YSeq=B.YSeq )
Select *
Into #Temp
From cte3
Declare @SQL2 varchar(max) = Stuff((Select '','' + QuoteName(Title) From Tier where Tier='''+@XTier+''' Order by Seq For XML Path('''')),1,1,'''')
Select @SQL2 = ''
Select ['+@KeyCol+'],[YTitle] as '+@YTier+','' + @SQL2 + ''
From (Select '+@KeyCol+',YSeq,YTitle,XTitle,PctRow=max(PctRow) from #Temp Group BY '+@KeyCol+',YSeq,YTitle,XTitle) A
Pivot (max(PctRow) For [XTitle] in ('' + @SQL2 + '') ) p''
Exec(@SQL2);
'
Exec(@SQL)
Returns
The tiers are stored in a generic structure, which allows multiple versions. The tier table looks like this:
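The table layout itself is not reproduced here. Judging only from the columns the dynamic query references (`Tier`, `Seq`, `Title`, `R1`, `R2`), it might look like the sketch below. The data types, the band boundaries (borrowed from the static query earlier in this thread) and the open-ended upper bound of 999 are all assumptions, not the author's actual table.

```sql
-- Assumed shape of the Tier table, inferred from the columns the dynamic query uses.
create table Tier (
    Tier  varchar(50),  -- name of the tier set, matched against @YTier / @XTier
    Seq   int,          -- display order of the band within its tier set
    Title varchar(50),  -- band label, used as row or column header
    R1    int,          -- lower bound (inclusive, since the join uses BETWEEN)
    R2    int           -- upper bound (inclusive)
);

-- Hypothetical bands; 999 stands in for "no upper limit" on the last age band.
insert into Tier (Tier, Seq, Title, R1, R2) values
    ('Age',   1, '16-35',    16,  35),
    ('Age',   2, '36-40',    36,  40),
    ('Age',   3, '41-65',    41,  65),
    ('Age',   4, '66+',      66, 999),
    ('Score', 1, '0-50',      0,  50),
    ('Score', 2, '51-100',   51, 100),
    ('Score', 3, '101-150', 101, 150),
    ('Score', 4, '151-200', 151, 200),
    ('Score', 5, '201-250', 201, 250);
```

Because the bands are rows rather than hard-coded `case` branches, changing the grouping would only mean editing this table, and a second tier set under a different `Tier` name could sit alongside the first.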
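Comments:
The SQL has errors and would not produce the table you show. What output do you expect for the sample data? What is the actual expected result?
Thanks for the response, @trincot. The SQL was of course truncated to show only one column. Basically, I am trying to generate a cumulative frequency distribution of score by age.
Please provide sample input and the expected result. Also, your SQL has a syntax error.
The output you just edited in bears no relation to the sample data.
What is this magic? Upvoted.

Result of the cross-tab (pivot) query on the sample data: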
yearofsale | age_group | 0-50 | 51-100 | 101-150 | 151-200 | 201-250
------------+-----------+--------+--------+---------+---------+---------
2008 | 16-35 | NULL | 60.00 | 40.00 | NULL | NULL
2009 | 16-35 | 100.00 | NULL | NULL | NULL | NULL
2009 | 36-40 | NULL | NULL | NULL | NULL | 100.00
2009 | 41-65 | NULL | 100.00 | NULL | NULL | NULL