R: How to create a new column based on specific intervals

I want to create a new column in my data frame based on intervals of another column, "dim".

For example:

My dataset is:

df1
id dim
1  25
2  34
3  60
4  65
5  80
6  82
7  90
8  95
9  110
10 120

I would like the following data set, using intervals of width 20 (my column begins at 25) to build a new column x
with the factors: 25:44 = 1, 45:64 = 2, and so on...
df2
id dim x
1  25  1
2  34  1
3  60  2
4  65  3
5  80  3
6  82  3
7  90  4
8  95  4
9  110 5
10 120 5

Can anyone help me?

You can do this with floor and some arithmetic.
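As a minimal sketch of the floor idea in base R, assuming the question's data is stored in a data frame named df (the name used by the other answers) and that the first interval starts at the smallest "dim" value, one possible form is:

```r
# Rebuild the question's data; the name df is assumed from the other answers.
df <- data.frame(id = 1:10,
                 dim = c(25, 34, 60, 65, 80, 82, 90, 95, 110, 120))

# Shift dim so the first interval starts at 0, divide by the interval width,
# round down, and add 1 so that the labels start at 1.
width <- 20
df$x <- floor((df$dim - min(df$dim)) / width) + 1

df$x
# 1 1 2 3 3 3 4 4 5 5
```

Using min(df$dim) rather than the first row means the result does not depend on the rows being sorted.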
We can use %/% (integer division) on the difference between "dim" and the first value of "dim":
library(dplyr)
df %>% 
   mutate(x = (dim - first(dim)) %/% 20 + 1)
#   id dim x
#1   1  25 1
#2   2  34 1
#3   3  60 2
#4   4  65 3
#5   5  80 3
#6   6  82 3
#7   7  90 4
#8   8  95 4
#9   9 110 5
#10 10 120 5


Or use findInterval, with breaks that start at the smallest "dim" value and step by 20:

df %>% 
   mutate(x = findInterval(dim, seq(min(dim), max(dim), by = 20)))

Here is a tidyverse solution using cut:
library(tidyverse)
df %>%
  mutate(x = cut(dim,
                 # Breaks every 20 from the minimum; going 20 past the maximum
                 # guarantees the largest value falls inside the last interval.
                 breaks = seq(min(dim), max(dim) + 20, by = 20),
                 # Left-closed intervals [25,45), [45,65), ... so that 65 starts group 3
                 right = FALSE,
                 # Return integer codes instead of factor labels
                 labels = FALSE))

#   id dim x
#1   1  25 1
#2   2  34 1
#3   3  60 2
#4   4  65 3
#5   5  80 3
#6   6  82 3
#7   7  90 4
#8   8  95 4
#9   9 110 5
#10 10 120 5

Why is 80 labeled 3, when the difference between 65 and 80 is 15?

Hey, thank you so much, my friend, this works great!
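On the comment about 80: the label depends on which fixed 20-wide bin a value falls into, not on the distance between neighbouring rows. The sketch below lists the bounds of each value's bin; the names dim_values, lower and upper are illustrative and not part of the original answers.

```r
# The "dim" values from the question.
dim_values <- c(25, 34, 60, 65, 80, 82, 90, 95, 110, 120)
width <- 20

# Bin index, counting 20-wide bins from the smallest value.
x <- (dim_values - min(dim_values)) %/% width + 1

# Inclusive bounds of the bin each value belongs to.
lower <- min(dim_values) + (x - 1) * width
upper <- lower + width - 1

bins <- data.frame(dim = dim_values, x, lower, upper)
bins
```

Both 65 and 80 lie in the bin that runs from 65 to 84, so both get the label 3.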