R: How to create a new column based on specific intervals
I want to create a new column in my data frame based on intervals in another column, "dim". For example, my data set is:
df1
id dim
1 25
2 34
3 60
4 65
5 80
6 82
7 90
8 95
9 110
10 120
I would like the following data set, using intervals of 20 (my column begins at 25), with a new column x of factors: 25:44 = 1, 45:64 = 2, and so on...
df2
id dim x
1 25 1
2 34 1
3 60 2
4 65 3
5 80 3
6 82 3
7 90 4
8 95 4
9 110 5
10 120 5
Can anyone help me?
You can do this with floor and some arithmetic.
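The code for this answer is cut off in this copy of the thread. As a minimal base-R sketch of the floor approach (not the answerer's original code), assuming the data frame is named df, the bins are 20 wide, and they start at the first value of dim as described in the question:

```r
# Input taken from the question; the name df is an assumption
df <- data.frame(
  id  = 1:10,
  dim = c(25, 34, 60, 65, 80, 82, 90, 95, 110, 120)
)

width <- 20          # interval width stated in the question
start <- df$dim[1]   # bins start at the first value of dim (25)

# Distance from the start in whole intervals, plus 1 so bins count from 1
df$x <- floor((df$dim - start) / width) + 1

df$x
# 1 1 2 3 3 3 4 4 5 5
```

Using df$dim[1] as the start assumes dim is sorted in ascending order; min(df$dim) may be the safer choice otherwise.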
We can use %/% (integer division) on the difference between "dim" and the first value of "dim":
library(dplyr)
df %>%
  mutate(x = (dim - first(dim)) %/% 20 + 1)
# id dim x
#1 1 25 1
#2 2 34 1
#3 3 60 2
#4 4 65 3
#5 5 80 3
#6 6 82 3
#7 7 90 4
#8 8 95 4
#9 9 110 5
#10 10 120 5
Or use findInterval:
# Breaks start at the smallest dim (25, 45, 65, ...) so that 25:44 maps to 1, 45:64 to 2, and so on
df %>%
  mutate(x = findInterval(dim, seq(min(dim), max(dim), by = 20)))
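The data block is missing from this copy of the thread. A minimal sketch that rebuilds the input from the table in the question and checks a result against the requested column x; the names df and expected_x are assumptions made here, and the check uses a base-R form of the %/% expression so that no packages are needed:

```r
# Input reconstructed from the df1 table in the question
df <- data.frame(
  id  = 1:10,
  dim = c(25, 34, 60, 65, 80, 82, 90, 95, 110, 120)
)

# Column x of df2 as requested in the question
expected_x <- c(1, 1, 2, 3, 3, 3, 4, 4, 5, 5)

# Base-R equivalent of (dim - first(dim)) %/% 20 + 1
x <- (df$dim - df$dim[1]) %/% 20 + 1

all.equal(x, expected_x)
# TRUE
```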
Here is a tidyverse solution using cut:
library(tidyverse)
df %>%
  mutate(x = cut(dim,
                 # Extend the breaks one full interval past max(dim) so the largest values fall in a bin
                 breaks = seq(min(dim), max(dim) + 20, by = 20),
                 # Use left-closed bins [25, 45), [45, 65), ... so that 65 falls in bin 3
                 right = FALSE,
                 # Return integer bin codes instead of factor labels
                 labels = FALSE))
# id dim x
#1 1 25 1
#2 2 34 1
#3 3 60 2
#4 4 65 3
#5 5 80 3
#6 6 82 3
#7 7 90 4
#8 8 95 4
#9 9 110 5
#10 10 120 5
Why is 80 labeled 3 when the difference between 65 and 80 is 15?
Hey, thank you so much, my friend, it works really well!
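On the first comment: the label depends on which fixed 20-wide bin a value falls into, counted from 25, not on the gap between neighbouring rows. A small sketch of the bin boundaries (the names width, start, lower, and bins are illustrative and do not come from the thread):

```r
width <- 20
start <- 25

# Lower edges of the five bins needed to cover 25..120
lower <- start + width * (0:4)

# Each bin covers the integers from its lower edge up to lower edge + 19
bins <- data.frame(x = 1:5, from = lower, to = lower + width - 1)
bins

# 65 and 80 both lie in the third bin (65 to 84), so both get label 3
findInterval(c(65, 80), lower)
# 3 3
```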