Winsorize function: Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)): undefined columns selected
Tags: r, dataframe, desctools

I want to winsorize my data, shown below (134 observations in total):
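To use the Winsorize function from the DescTools package, I created a single numeric vector of the variable rev, simply by using the select function: rev_vector

You were almost there; you just chose overly restrictive probs for the quantiles. Your vector already has quite a few values equal to its edge values. Might it have been winsorized before?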
library(DescTools)
x <- c(0.66, 2.8, 87.75, 6.89, 134.73, 0.09, 22.78, 1.36,
5.48, 0.7, 0.79, 0.35, 31.37, 0.55, 0.94, 0.06, 12.36, 13.58,
7.95, 0.29, 7.8, 0.39, 73.55, 0.09, 23.07, 0.27, 0.32, 0.08,
0.05, 0.41, 29.47, 0.66, 20.91, 0.67, 0.05, 1.39, 0.17, 0.14,
1.79, 0.05, 2.52, 3.68, 0.24, 0.09, 109.65, 8.43, 0.2, 0.17,
35.93, 3.05, 0.07, 0.05, 0.82, 0.57, 26.21, 0.28, 0.05, 5.72,
6.12, 4.09, 0.05, 0.22, 134.73, 94.43, 41.35, 0.2, 17.32, 5.63,
3.25, 0.12, 0.05, 0.07, 10.89, 3.79, 1.89, 134.73, 9.98, 10.58,
54.98, 134.73, 15.55, 15.21, 5.93, 42.65, 1.59, 3, 11.19, 6.1,
0.08, 134.73, 31.37, 17.74, 20.92, 6.46, 3.18, 0.05, 0.81, 9.15,
29.47, 0.05, 1.34, 7.97, 109.65, 28.45, 35.93, 0.38, 0.65, 134.73,
9.44, 8.66, 5.3, 11.83, 20.06, 29.55, 1.15, 2.32, 46.14, 134.73,
9.98, 10.58, 11.05, 54.98, 134.73, 15.55, 15.21, 5.93, 1.59,
1.03, 3, 11.19, 6.1)
Using Desc():
Desc(Winsorize(x))
# -----------------------------------------------------
# Winsorize(x) (numeric)
#
#   length       n     NAs  unique      0s    mean  meanCI
#      131     131       0      95       0   19.73   13.53
#           100.0%    0.0%            0.0%           25.92
#
#      .05     .10     .25  median     .75     .90     .95
#     0.05    0.08    0.48    5.48   17.53   54.98  134.73
#
#    range      sd   vcoef     mad     IQR    skew    kurt
#   134.68   35.84    1.82    7.87   17.05    2.35    4.42
#
# lowest : 0.05 (9), 0.06, 0.07 (2), 0.08 (2), 0.09 (3)
# highest: 73.55, 87.75, 94.43, 109.65 (2), 134.73 (8)
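You see, the value 0.05 occurs 9 times and the value 134.73 occurs 8 times. The quantiles for probs 0.05 and 0.95 are therefore identical to the extreme values, so the winsorized vector is the same as the original one.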
quantile(x=x, probs=c(0.05, 0.95))
#     5%    95%
#   0.05 134.73
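The same check can be tried on a small made-up vector. The sketch below is only an illustration in base R: `x_toy` is hypothetical data, not the vector from the question. It counts how many observations sit on the minimum and maximum and compares the 5%/95% cut-offs with the range.

```r
# Toy data (hypothetical, not the question's vector): 2 ties at each extreme
x_toy <- c(1, 1, 2:18, 100, 100)

# Cut-offs that a 5%/95% winsorization would clamp to
cutoffs <- quantile(x_toy, probs = c(0.05, 0.95), names = FALSE)

# Number of observations sitting exactly on the minimum and the maximum
n_at_min <- sum(x_toy == min(x_toy))
n_at_max <- sum(x_toy == max(x_toy))

# TRUE means the cut-offs coincide with the range, so clamping changes nothing
no_op <- isTRUE(all.equal(cutoffs, range(x_toy)))

print(c(n_at_min = n_at_min, n_at_max = n_at_max))
print(no_op)
```

If `no_op` is TRUE, there is nothing outside the cut-offs to pull in, which is the situation described above.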
Just increase probs to c(0.1, 0.9) and you will see the effect.
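To make the effect of wider probs visible without depending on a particular DescTools version, here is a minimal base-R sketch. The helper `winsorize_sketch` and the vector `x_toy` are hypothetical names introduced for illustration. The helper clamps values to two quantile cut-offs with `pmin()`/`pmax()`, and it is not claimed to reproduce `Winsorize()` in every detail.

```r
# Toy data (hypothetical): 2 ties at each extreme, 21 values in total
x_toy <- c(1, 1, 2:18, 100, 100)

# Hypothetical helper, not part of DescTools: clamp x to two quantile cut-offs
winsorize_sketch <- function(x, probs = c(0.05, 0.95)) {
  cutoffs <- quantile(x, probs = probs, names = FALSE)
  pmin(pmax(x, cutoffs[1]), cutoffs[2])
}

narrow <- winsorize_sketch(x_toy)                       # 5% / 95% cut-offs
wide   <- winsorize_sketch(x_toy, probs = c(0.1, 0.9))  # 10% / 90% cut-offs

# Narrow cut-offs hit the tied extremes, so nothing moves;
# wide cut-offs pull both tails in
print(isTRUE(all.equal(narrow, x_toy)))
print(range(wide))
```

On this toy vector the narrow version returns the input unchanged, while the wide version has a visibly smaller range.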
PS: Winsorize() needs a vector as its argument and cannot handle data.frames. (This is also described in the help file…)
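The vector versus data.frame distinction can be sketched in base R. The data frame `dat` and its `id` column are hypothetical; only the column name `rev` comes from the question. Assuming the select function mentioned in the question is dplyr's `select()`, it returns a data frame, much like single-bracket indexing does here.

```r
# Hypothetical data frame; only the column name `rev` comes from the question
dat <- data.frame(id = 1:5, rev = c(0.66, 2.8, 87.75, 6.89, 134.73))

as_df  <- dat["rev"]    # still a one-column data.frame
as_vec <- dat[["rev"]]  # plain numeric vector (equivalent to dat$rev)

print(class(as_df))
print(class(as_vec))

# A vector-based function such as quantile() accepts the extracted column
print(quantile(as_vec, probs = c(0.05, 0.95)))
```

Passing `as_vec` rather than `as_df` to a vector-only function avoids data.frame indexing being applied where a plain vector is expected.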
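PPS: A reproducible example would help… ;-)
Thanks! I can now winsorize my vector! Following your suggestion: rev_wins1
If the answer satisfactorily covers all aspects of your question, you can accept it… ;-)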
summary(Winsorize(x))
#     Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
#     0.05    0.48    5.48   19.73   17.53  134.73