Winsorize function: Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)): undefined columns selected


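I want to winsorize my data, shown below (134 observations in total):

To use the Winsorize function from the DescTools package, I created a single numeric vector of the variable rev, simply using the select function:
rev_vector

You're almost there; you have only chosen probs for the quantiles that are too restrictive. Quite a few values in your vector are already equal to its edge values. Could it have been winsorized before?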

library(DescTools)

x <-  c(0.66, 2.8, 87.75, 6.89, 134.73, 0.09, 22.78, 1.36, 
        5.48, 0.7, 0.79, 0.35, 31.37, 0.55, 0.94, 0.06, 12.36, 13.58, 
        7.95, 0.29, 7.8, 0.39, 73.55, 0.09, 23.07, 0.27, 0.32, 0.08, 
        0.05, 0.41, 29.47, 0.66, 20.91, 0.67, 0.05, 1.39, 0.17, 0.14, 
        1.79, 0.05, 2.52, 3.68, 0.24, 0.09, 109.65, 8.43, 0.2, 0.17, 
        35.93, 3.05, 0.07, 0.05, 0.82, 0.57, 26.21, 0.28, 0.05, 5.72, 
        6.12, 4.09, 0.05, 0.22, 134.73, 94.43, 41.35, 0.2, 17.32, 5.63, 
        3.25, 0.12, 0.05, 0.07, 10.89, 3.79, 1.89, 134.73, 9.98, 10.58, 
        54.98, 134.73, 15.55, 15.21, 5.93, 42.65, 1.59, 3, 11.19, 6.1, 
        0.08, 134.73, 31.37, 17.74, 20.92, 6.46, 3.18, 0.05, 0.81, 9.15, 
        29.47, 0.05, 1.34, 7.97, 109.65, 28.45, 35.93, 0.38, 0.65, 134.73, 
        9.44, 8.66, 5.3, 11.83, 20.06, 29.55, 1.15, 2.32, 46.14, 134.73, 
        9.98, 10.58, 11.05, 54.98, 134.73, 15.55, 15.21, 5.93, 1.59, 
        1.03, 3, 11.19, 6.1)
Using Desc():

Desc(Winsorize(x))

# -----------------------------------------------------    
# Winsorize(x) (numeric)
#
#  length       n    NAs  unique     0s   mean  meanCI
#     131     131      0      95      0  19.73   13.53
#          100.0%   0.0%           0.0%          25.92
#                                                     
#     .05     .10    .25  median    .75    .90     .95
#    0.05    0.08   0.48    5.48  17.53  54.98  134.73
#                                                     
#   range      sd  vcoef     mad    IQR   skew    kurt
#  134.68   35.84   1.82    7.87  17.05   2.35    4.42
#                                                     
# lowest : 0.05 (9), 0.06, 0.07 (2), 0.08 (2), 0.09 (3)
# highest: 73.55, 87.75, 94.43, 109.65 (2), 134.73 (8)
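You can see that the value 0.05 occurs 9 times and the value 134.73 occurs 8 times. The quantiles for probs 0.05 and 0.95 are therefore identical to the extremes, and the winsorized vector is the same as the original vector.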

quantile(x=x, probs=c(0.05, 0.95))
#    5%    95% 
#  0.05 134.73 
Just change probs to, say, c(0.1, 0.9) and you will see the effect.
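To make the effect of the probs choice concrete, here is a minimal base R sketch. It does not call DescTools at all: winsorize_sketch is a hypothetical helper written for illustration, and x_demo is a made-up toy vector, not the data from the question. The arguments of DescTools' own Winsorize() may differ between package versions, so check ?Winsorize before adapting this.

```r
# Hypothetical toy vector (not the data from the question): three copies of
# the minimum and three copies of the maximum, so both edges are heavily tied.
x_demo <- c(rep(1, 3), 2:18, rep(100, 3))

# Illustrative helper only (an assumption, not the DescTools implementation):
# clamp every value to the lower and upper quantiles given by `probs`.
winsorize_sketch <- function(v, probs = c(0.05, 0.95)) {
  limits <- quantile(v, probs = probs, names = FALSE)
  pmin(pmax(v, limits[1]), limits[2])
}

w05 <- winsorize_sketch(x_demo, probs = c(0.05, 0.95))
w10 <- winsorize_sketch(x_demo, probs = c(0.10, 0.90))

# With 5%/95% limits the limits coincide with min and max, so nothing moves.
print(isTRUE(all.equal(w05, x_demo)))

# With 10%/90% limits the tied edge values are pulled inward.
print(range(x_demo))
print(range(w10))
```

The same pattern applies to the real vector: as long as the chosen quantiles fall inside the runs of tied edge values, clamping has nothing to change.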

PS: Winsorize() needs a vector as its argument and cannot handle data.frames. (This is also described in the help file...)
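The error in the title is consistent with this: when a data frame is indexed with the result of order(), the positions are treated as column numbers, and most of them do not exist. A small sketch of the difference between a one-column data frame and a plain vector follows. The data frame df and its values are made up for illustration; only the column name rev comes from the question.

```r
# Hypothetical data frame standing in for the asker's data; only the column
# name `rev` comes from the question, the values are invented.
df <- data.frame(id = 1:5, rev = c(0.66, 2.8, 87.75, 6.89, 134.73))

# Single-bracket selection by name (like dplyr::select) keeps the
# data.frame wrapper around the column.
rev_df <- df["rev"]
print(is.data.frame(rev_df))

# Double brackets (or df$rev) extract the column as a plain numeric vector,
# which is what vector-based functions expect.
rev_vec <- df[["rev"]]
print(is.numeric(rev_vec) && is.null(dim(rev_vec)))
```

Passing something like rev_vec rather than rev_df to a vector-based function avoids the data frame indexing path altogether.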


PPS: A reproducible example would have helped... ;-)

Thanks! I can winsorize my vector now! Following your suggestion: rev_wins1
If the answer satisfactorily covers all aspects of your question, you may accept it... ;-)
summary(Winsorize(x))
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 0.05    0.48    5.48   19.73   17.53  134.73 