Selecting columns from string variables using dplyr


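I am trying to select columns whose names are stored in string variables and perform some calculations on them.

Suppose I am analyzing the iris dataset and want to find the ratio between each length and its corresponding width.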

library(dplyr)

# Manual mutation (i.e., writing the column names explicitly in the mutate() call)
iris %>% 
  mutate(Sepal.ratio = Sepal.Length / Sepal.Width, 
         Petal.ratio = Petal.Length / Petal.Width)

# Output (first six rows):
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.ratio Petal.ratio
# 1          5.1         3.5          1.4         0.2  setosa    1.457143        7.00
# 2          4.9         3.0          1.4         0.2  setosa    1.633333        7.00
# 3          4.7         3.2          1.3         0.2  setosa    1.468750        6.50
# 4          4.6         3.1          1.5         0.2  setosa    1.483871        7.50
# 5          5.0         3.6          1.4         0.2  setosa    1.388889        7.00
# 6          5.4         3.9          1.7         0.4  setosa    1.384615        4.25

Question: Is there any way to specify the column names through a variable or a data frame, such as the ratioSets defined below?

# Predefined or preprocessed column name set: 
ratioSets = rbind(c(value = 'Sepal.ratio', numerator = 'Sepal.Length', denominator = 'Sepal.Width'), 
                 c(value = 'Petal.ratio', numerator = 'Petal.Length', denominator = 'Petal.Width'))

# Automated mutation:
iris %>% 
  mutate(
    # How can I use the ratioSets here?
    # Something like : ratioSets$value = ratioSets$numerator / ratioSets$denominator
  )


# Expected Output: 
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.ratio Petal.ratio
# 1          5.1         3.5          1.4         0.2  setosa    1.457143        7.00
# 2          4.9         3.0          1.4         0.2  setosa    1.633333        7.00
# 3          4.7         3.2          1.3         0.2  setosa    1.468750        6.50
# 4          4.6         3.1          1.5         0.2  setosa    1.483871        7.50
# 5          5.0         3.6          1.4         0.2  setosa    1.388889        7.00
# 6          5.4         3.9          1.7         0.4  setosa    1.384615        4.25

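I don't understand what you want. Can you include a few lines of your desired output?

@Maiasaura I have added further explanation to the question. Please let me know if it is still unclear.

Perfect, it makes sense now. This is a bit challenging in dplyr, but I am thinking it through.

One way, assuming that the numerator always comes before the denominator (i.e., Length before Width):

# Group the numeric columns by name prefix (Sepal, Petal) and divide within each group
sapply(unique(sub('\\..*', '', names(iris[,-ncol(iris)]))), function(i)
        Reduce('/', iris[,-ncol(iris)][,grepl(i, sub('\\..*', '', names(iris[,-ncol(iris)])))]))

or

# Same computation, bound onto the original data frame
head(cbind(iris, sapply(unique(sub('\\..*', '', names(iris[,-ncol(iris)]))), 
         function(i) Reduce('/', iris[,-ncol(iris)][,grepl(i, sub('\\..*', '', names(iris[,-ncol(iris)])))]))))

#  Sepal.Length Sepal.Width Petal.Length Petal.Width Species    Sepal Petal
#1          5.1         3.5          1.4         0.2  setosa 1.457143  7.00
#2          4.9         3.0          1.4         0.2  setosa 1.633333  7.00
#3          4.7         3.2          1.3         0.2  setosa 1.468750  6.50
#4          4.6         3.1          1.5         0.2  setosa 1.483871  7.50
#5          5.0         3.6          1.4         0.2  setosa 1.388889  7.00
#6          5.4         3.9          1.7         0.4  setosa 1.384615  4.25

Thanks @Sotos. Do you know if there is a way to do this through dplyr's mutate?

I'm sure there is. Translating the code to dplyr would be a good exercise.

Actually, the main challenge I am facing is passing variable column names through dplyr. As in the example, I have some preset values that I cannot get dplyr to index properly.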
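
On passing column names held in strings to mutate, here is a minimal sketch of one possible approach, not taken from the thread. It assumes a dplyr version with tidy evaluation support (the `.data` pronoun and `!!name :=` on the left-hand side). The helper name `add_ratios` is hypothetical, and ratioSets is rebuilt as a data frame here because rbind() on character vectors produces a matrix, which cannot be indexed with `$`.

```r
library(dplyr)

# Same lookup table as in the question, stored as a data frame so that
# its columns can be accessed by name.
ratioSets <- data.frame(
  value       = c("Sepal.ratio", "Petal.ratio"),
  numerator   = c("Sepal.Length", "Petal.Length"),
  denominator = c("Sepal.Width", "Petal.Width"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: adds one ratio column per row of `sets`.
add_ratios <- function(data, sets) {
  for (i in seq_len(nrow(sets))) {
    out <- sets$value[i]        # name of the new column (string)
    num <- sets$numerator[i]    # name of the numerator column (string)
    den <- sets$denominator[i]  # name of the denominator column (string)
    # `!!out :=` sets the output name from a string;
    # `.data[[...]]` looks up an existing column by its string name.
    data <- data %>%
      mutate(!!out := .data[[num]] / .data[[den]])
  }
  data
}

result <- add_ratios(iris, ratioSets)
head(result)
```

Looping over the rows of the lookup table keeps each mutate call simple. Unlike the prefix-matching approach, it does not depend on the numerator column appearing before the denominator column.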