R: Factor with 6 levels from a continuous variable
I have a continuous frequency variable ranging from 0 to 6.115053. I need to split it into 6 levels so that my analysis is more readable. I tried:
frequency.new <- hist(all$frequency, 6, plot = FALSE)
all$frequency <- as.factor(frequency.new)
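Can anyone help me?
Many thanks,
Katerina

You should look at the cut() function in base R. Before venturing any further, you should also take note of the last line of my answer (in bold).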
> set.seed(42)
> cut(runif(50), 6)
[1] (0.825,0.99] (0.825,0.99] (0.167,0.332] (0.825,0.99]
[5] (0.496,0.661] (0.496,0.661] (0.661,0.825] (0.00296,0.167]
[9] (0.496,0.661] (0.661,0.825] (0.332,0.496] (0.661,0.825]
[13] (0.825,0.99] (0.167,0.332] (0.332,0.496] (0.825,0.99]
[17] (0.825,0.99] (0.00296,0.167] (0.332,0.496] (0.496,0.661]
[21] (0.825,0.99] (0.00296,0.167] (0.825,0.99] (0.825,0.99]
[25] (0.00296,0.167] (0.496,0.661] (0.332,0.496] (0.825,0.99]
[29] (0.332,0.496] (0.825,0.99] (0.661,0.825] (0.661,0.825]
[33] (0.332,0.496] (0.661,0.825] (0.00296,0.167] (0.825,0.99]
[37] (0.00296,0.167] (0.167,0.332] (0.825,0.99] (0.496,0.661]
[41] (0.332,0.496] (0.332,0.496] (0.00296,0.167] (0.825,0.99]
[45] (0.332,0.496] (0.825,0.99] (0.825,0.99] (0.496,0.661]
[49] (0.825,0.99] (0.496,0.661]
6 Levels: (0.00296,0.167] (0.167,0.332] (0.332,0.496] ... (0.825,0.99]
Here cut() simply splits the range of the data into 6 equal-width intervals. Read ?cut for details of how the extreme intervals are handled.
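Applied to a data frame column, the same call might look like the sketch below. The data frame dat and its simulated values are hypothetical stand-ins for the asker's data (the name all is avoided here only because it masks a base R function). When breaks is a single number, cut() extends the observed range by 0.1% at the ends so that the minimum and maximum still land inside an interval.

```r
# Hypothetical data standing in for the asker's data frame
set.seed(1)
dat <- data.frame(frequency = runif(100, min = 0, max = 6.115053))

# Split the observed range into 6 equal-width intervals;
# the result is a factor whose levels are interval labels
dat$frequency_grp <- cut(dat$frequency, breaks = 6)

nlevels(dat$frequency_grp)   # number of levels requested
table(dat$frequency_grp)     # observations per interval
anyNA(dat$frequency_grp)     # checks that no value fell outside the intervals
```

Storing the result in a new column keeps the original continuous values available for later analysis.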
The reason your code fails is that hist() returns a list that contains much more than just the groupings:
> foo <- hist(runif(50), breaks = 6, plot = FALSE)
> str(foo)
List of 7
$ breaks : num [1:6] 0 0.2 0.4 0.6 0.8 1
$ counts : int [1:5] 12 13 7 13 5
$ intensities: num [1:5] 1.2 1.3 0.7 1.3 0.5
$ density : num [1:5] 1.2 1.3 0.7 1.3 0.5
$ mids : num [1:5] 0.1 0.3 0.5 0.7 0.9
$ xname : chr "runif(50)"
$ equidist : logi TRUE
- attr(*, "class")= chr "histogram"
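If the histogram bins are what you are after, one possible repair of the original attempt is to pull the $breaks component out of that list and hand it to cut(). This is only a sketch on simulated data. Note that breaks = 6 is a suggestion to hist(), so the number of bins it returns need not be 6.

```r
set.seed(42)
x <- runif(50)

# hist() returns a list; only the break points are needed here
h <- hist(x, breaks = 6, plot = FALSE)

# include.lowest = TRUE mirrors hist()'s default of closing the
# first interval on the left, so a value equal to the lowest break is kept
grp <- cut(x, breaks = h$breaks, include.lowest = TRUE)

nlevels(grp)   # equals the number of histogram bins
table(grp)     # per-level counts, comparable with h$counts
```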
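But you should ask yourself why you want to discretize your data, and whether doing so makes sense.

Thanks @Aaron - I didn't even know how to answer that. I think I'll emphasize the last point! :-)

Well, it may be a reasonable thing to do, especially as part of interpreting a more complex analysis, but since it's hard to tell what the OP has in mind, I think you were right to include the advice about whether it makes sense.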
To use the same breaks as hist(), compute them with pretty() and pass them to cut():
> set.seed(42)
> x <- runif(50)
> brks <- pretty(range(x), n = 6, min.n = 1)
> cut(x, breaks = brks)
[1] (0.8,1] (0.8,1] (0.2,0.4] (0.8,1] (0.6,0.8] (0.4,0.6] (0.6,0.8]
[8] (0,0.2] (0.6,0.8] (0.6,0.8] (0.4,0.6] (0.6,0.8] (0.8,1] (0.2,0.4]
[15] (0.4,0.6] (0.8,1] (0.8,1] (0,0.2] (0.4,0.6] (0.4,0.6] (0.8,1]
[22] (0,0.2] (0.8,1] (0.8,1] (0,0.2] (0.4,0.6] (0.2,0.4] (0.8,1]
[29] (0.4,0.6] (0.8,1] (0.6,0.8] (0.8,1] (0.2,0.4] (0.6,0.8] (0,0.2]
[36] (0.8,1] (0,0.2] (0.2,0.4] (0.8,1] (0.6,0.8] (0.2,0.4] (0.4,0.6]
[43] (0,0.2] (0.8,1] (0.4,0.6] (0.8,1] (0.8,1] (0.6,0.8] (0.8,1]
[50] (0.6,0.8]
Levels: (0,0.2] (0.2,0.4] (0.4,0.6] (0.6,0.8] (0.8,1]
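The pretty() breaks above produce 5 levels rather than the 6 asked for, because pretty() treats n as a target, not a guarantee. If exactly 6 equal-width levels are required, one option is to build the 7 break points explicitly. The following is a sketch on the same simulated data, and the labels L1 to L6 are made up for illustration.

```r
set.seed(42)
x <- runif(50)

# 7 break points spanning the observed range give exactly 6 intervals
brks <- seq(min(x), max(x), length.out = 7)

# include.lowest = TRUE keeps the minimum, which sits on the first break
grp <- cut(x, breaks = brks, include.lowest = TRUE,
           labels = paste0("L", 1:6))   # hypothetical level names

nlevels(grp)
table(grp)
```

Dropping the labels argument returns the interval notation used in the output above.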