
dplyr: mutate_each for paired sets of columns


Is there a way to use dplyr::mutate_each to achieve the following transformation?

data.frame(x1 = 1:5, x2 = 6:10, y1 = rnorm(5), y2 = rnorm(5)) %>%
  mutate(diff1 = x1 - y1, diff2 = x2 - y2) 

##   x1 x2          y1         y2       diff1     diff2
## 1  1  6  1.03645018 -0.8602099 -0.03645018  6.860210
## 2  2  7 -1.10790835  1.6912875  3.10790835  5.308712
## 3  3  8  0.95452119  2.7232657  2.04547881  5.276734
## 4  4  9  0.01370762  1.6385765  3.98629238  7.361424
## 5  5 10  0.19354354 -1.0464360  4.80645646 11.046436
I realize this is a simple example that is easy to do as written, but I am trying to do something similar with a much larger set of columns.


Thanks.

This does not use mutate_each, it is not very pretty, and I do not expect it to be fast, but:

#create data set
p<-data.frame(x1 = 1:5, x2 = 6:10,
          y1 = rnorm(5), y2 = rnorm(5),
          z1 = 11:15, z2 = rnorm(5),
          w1 = rchisq(5,2), w2 = rgamma(5, .2)) 

#subset the columns by their column number and subtract them
p[,ncol(p)+seq(1,ncol(p)/2, by = 1)]<-
p[,seq(1,ncol(p),by = 2)]-p[,seq(2,ncol(p), by = 2)]
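
The positional code above subtracts each even-numbered column from the odd-numbered column before it, so it depends on column order. As a minimal sketch of pairing by name instead, assuming the columns follow the question's `x<suffix>` / `y<suffix>` naming (the example data and the `diff` prefix are illustrative choices, not part of the original answer):

```r
# Hypothetical example data: columns named <prefix><suffix>, as in the question
df <- data.frame(x1 = 1:5, x2 = 6:10,
                 y1 = c(0.5, 1.5, 2.5, 3.5, 4.5),
                 y2 = c(1, 2, 3, 4, 5))

# Suffixes shared by the "x" and "y" columns (here "1" and "2")
suffixes <- sub("^x", "", grep("^x[0-9]+$", names(df), value = TRUE))

# Select the paired columns by name, so column order does not matter
x_cols <- paste0("x", suffixes)
y_cols <- paste0("y", suffixes)

# Subtract the two blocks element-wise and attach them as diff1, diff2, ...
df[paste0("diff", suffixes)] <- df[x_cols] - df[y_cols]
df
```

Because the columns are looked up by name, this keeps working when more pairs are added or the columns are reordered, provided every `x` column has a matching `y` column.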

As @Gregor mentioned in the comments, if you want to use dplyr, it is best to get the data into a tidy format first. Here is one idea:

library(dplyr)
library(tidyr)

df %>%
  add_rownames() %>%
  gather(key, val, -rowname) %>%
  separate(key, c("var", "num"), "(?<=[a-z]) ?(?=[0-9])") %>%
  spread(var, val) %>%
  mutate(diff = x - y) 

If for some reason you still need the data in wide format after the operation, you can add this to the pipeline:

  gather(key, value, -(rowname:num)) %>%
  unite(key_num, key, num, sep = "") %>%
  spread(key_num, value)

This gives:

#Source: local data frame [5 x 7]
#
#  rowname       diff1     diff2    x1    x2          y1         y2
#    (chr)       (dbl)     (dbl) (dbl) (dbl)       (dbl)      (dbl)
#1       1 -0.03645018  6.860210     1     6  1.03645018 -0.8602099
#2       2  3.10790835  5.308713     2     7 -1.10790835  1.6912875
#3       3  2.04547881  5.276734     3     8  0.95452119  2.7232657
#4       4  3.98629238  7.361423     4     9  0.01370762  1.6385765
#5       5  4.80645646 11.046436     5    10  0.19354354 -1.0464360
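
The same reshape, compute, reshape-back idea can also be sketched with the newer tidyr verbs `pivot_longer()` and `pivot_wider()`. This is an alternative to the `gather()`/`spread()` pipeline above, not the answer's own code. It assumes a reasonably recent tidyr and dplyr, and the example data and the `row` helper column are made up for illustration:

```r
library(dplyr)
library(tidyr)

# Hypothetical example data with the question's naming scheme
df <- data.frame(x1 = 1:5, x2 = 6:10,
                 y1 = c(0.5, 1.5, 2.5, 3.5, 4.5),
                 y2 = c(1, 2, 3, 4, 5))

# Long (tidy) form: one row per original row and suffix, with x and y as columns.
# ".value" tells pivot_longer to use the letter part of each name as a column name.
long <- df %>%
  mutate(row = row_number()) %>%
  pivot_longer(-row,
               names_to = c(".value", "num"),
               names_pattern = "([a-z]+)([0-9]+)") %>%
  mutate(diff = x - y)

# Back to wide form: x1, x2, y1, y2, diff1, diff2
wide <- long %>%
  pivot_wider(names_from = num,
              values_from = c(x, y, diff),
              names_sep = "")
wide
```

In the long form the difference is a single `mutate()` call regardless of how many pairs there are, which is the point of tidying first.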

Data

df <- structure(list(x1 = 1:5, x2 = 6:10, y1 = c(1.03645018, -1.10790835, 
0.95452119, 0.01370762, 0.19354354), y2 = c(-0.8602099, 1.6912875, 
2.7232657, 1.6385765, -1.046436)), .Names = c("x1", "x2", "y1", 
"y2"), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))

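
Comments:

I would be happy to hear other solutions, but I am specifically interested in whether this can be done with mutate_each.

`id` defined as the original row, then group by `id` and it becomes trivial.

Are you just looking for a faster solution?

That is a clever solution, @Gregor. I would never have thought of it. Honestly, that is probably how I will handle this from now on. I still think it would be nice to be able to apply the same function to sets of columns in a way similar to mutate_each, but that may be a use case peculiar to my problem.

I would argue that if you have pairs, triplets, or other tuples of related columns, your data is not tidy. You are encoding information in the column names, the `1` and `2`, that belongs in its own column. And `dplyr` is built to work with tidy data.

Using vector recycling, you can replace the `seq()` calls in the last line with `c(T,F)` and `c(F,T)`. For example, `mtcars[,c(T,F)]` gives all odd-numbered columns and `mtcars[,c(F,T)]` gives all even-numbered columns.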
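
The recycling trick from the last comment can be sketched on a small data frame. The data and the `_minus_` naming below are made up for illustration. `TRUE`/`FALSE` are spelled out because `T` and `F` are ordinary variables that can be reassigned:

```r
# Hypothetical data: adjacent columns form the pairs (a1, a2) and (b1, b2)
p <- data.frame(a1 = 1:3, a2 = c(10, 20, 30),
                b1 = 4:6, b2 = c(1, 1, 1))

# The length-2 logical vector is recycled across all columns
odd  <- p[, c(TRUE, FALSE)]   # columns 1, 3, ...: a1, b1
even <- p[, c(FALSE, TRUE)]   # columns 2, 4, ...: a2, b2

# Element-wise difference of the two blocks, with descriptive names
diffs <- odd - even
names(diffs) <- paste0(names(odd), "_minus_", names(even))
diffs
```

This is still positional, so it only makes sense when each pair sits in adjacent columns.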