R: Is there a more efficient way to recode these ranges?


Hi, I have a file imported into R and I want to recode one of its columns:

Number of People
1 to 3
4 to 6 
7 to 10
.
.
.
.

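The "Number of People" column has more than 30 levels in total. What I want to do is convert them to numeric values (i.e. "1 to 3" becomes 2, "4 to 6" becomes 5).

Since I have a large amount of data to process, is there a more efficient way to do this recoding, or is recode() the only option?

Thanks

Sample data: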

df <- data.frame(
  Number_of_ppl = c("1 to 3", "40 to 45")
)

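If you want the average as a new column in the data frame, store the result in a new variable:

library(stringr)  # provides str_extract_all()

# Extract all digit runs from each label, convert them to numbers, and average them
df$Number_of_ppl_mean <- sapply(lapply(str_extract_all(df$Number_of_ppl, "\\d+"), as.numeric), mean)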
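
Since the question mentions only about 30 distinct levels but a large number of rows, one option is to parse each distinct label once and then map the result back to the rows. The following is a minimal base R sketch of that idea, not part of the original answer; the helper name range_midpoint is hypothetical, and it assumes every label contains the numbers to be averaged.

```r
# Hypothetical helper: midpoint of a range label such as "1 to 3".
range_midpoint <- function(x) {
  x <- as.character(x)                      # also accepts a factor column
  lev <- unique(x)                          # parse each distinct label only once
  nums <- regmatches(lev, gregexpr("[0-9]+", lev))
  mids <- vapply(nums, function(n) mean(as.numeric(n)), numeric(1))
  unname(mids[match(x, lev)])               # map every row back to its label's midpoint
}

df <- data.frame(Number_of_ppl = c("1 to 3", "40 to 45", "1 to 3"))
df$Number_of_ppl_mean <- range_midpoint(df$Number_of_ppl)
```

The regular expression work is done once per level rather than once per row, which is where the saving would come from on large data.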

Here is a dplyr-based solution whose basic structure is the same as Chris Ruehlemann's answer:

library(dplyr)
library(stringr)

df <- data.frame(Number_of_People = c("1 to 3",
                                       "4 to 6",
                                       "7 to 10"))

df %>%
  mutate(first_numb = as.numeric(str_extract(Number_of_People, "^\\d{1,}")),
         second_numb = as.numeric(str_extract(Number_of_People, "\\d{1,}$"))) %>%
  rowwise() %>%
  mutate(avg = mean(c(first_numb, second_numb)))
# A tibble: 3 x 4
#   Number_of_People first_numb second_numb   avg
#   <fct>                 <dbl>       <dbl> <dbl>
# 1 1 to 3                    1           3   2  
# 2 4 to 6                    4           6   5  
# 3 7 to 10                   7          10   8.5
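
The rowwise() step in the code above is only needed because mean() collapses its input to a single value. As a sketch (a variation on this answer, not the author's own code), the average of two columns can be written as plain vectorized arithmetic, which avoids row-by-row evaluation; the result name res is just for illustration.

```r
library(dplyr)
library(stringr)

df <- data.frame(Number_of_People = c("1 to 3", "4 to 6", "7 to 10"))

res <- df %>%
  mutate(first_numb  = as.numeric(str_extract(Number_of_People, "^\\d+")),
         second_numb = as.numeric(str_extract(Number_of_People, "\\d+$")),
         # vectorized midpoint, computed for all rows at once
         avg = (first_numb + second_numb) / 2)
```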


We can also use separate to split the column in two and then take the mean of the two resulting columns:

library(dplyr)
library(tidyr)
df %>% 
     separate(Number_of_People, into = c("first", "second"), sep="\\s*to\\s*",
           convert = TRUE, remove = FALSE) %>% 
     mutate(avg =  (first + second)/2)
#  Number_of_People first second avg
#1           1 to 3     1      3 2.0
#2           4 to 6     4      6 5.0
#3          7 to 10     7     10 8.5
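
With more than 30 levels, some labels may not follow the "a to b" pattern, in which case the split would not yield two numbers. The base R sketch below checks for such labels before converting; the label "more than 50" is a made-up example, not taken from the question's data.

```r
# The last label is a made-up example of a level that is not a range
x <- c("1 to 3", "4 to 6", "7 to 10", "more than 50")

# TRUE where the whole label has the form "<digits> to <digits>"
is_range <- grepl("^\\s*\\d+\\s*to\\s*\\d+\\s*$", x, perl = TRUE)

# Distinct labels that would need separate handling
bad_labels <- unique(x[!is_range])
```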

Comment: What is the value for "7 to 10", the average (8.5)? You did not mention in your post that you want the average of the two numbers.

Data

df <- data.frame(Number_of_People = c("1 to 3",
                                       "4 to 6",
                                       "7 to 10"))