R: Subtract the per-group minimum of each column and add the subtracted values to another column of the df

Tags: r, for-loop, dplyr, apply, mutate

I have the data frame below:

date    group    col1    col2    col3     col4     col5      
1234        1      -2       3       4       -5      100       
1235        1       4       5      -2       -7      200       
1234        1      -5       2       9        1      400       
1235        1       8       2      -4        7      900       
1235        2     -72      83     -54       98      800      
1233        2      32     -21      -1        4      900      
1342        2     -54       0     -10      -11      100      
1234        2      98      -8      -9      -10      100      
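Here is what I want to do:

For the columns from df[,3] through the second-to-last column, I want to do the following:

1) For each column, take, by group, the minimum of the positive values and the minimum of the negative values

2) Then replace the current value using the following logic:

a) If the value is positive, subtract the minimum found for the positive values in its group

b) If the value is negative, subtract the minimum found for the negative values in its group

c) If the value is 0, leave it unchanged

3) Then take the total of the values subtracted in that row and add it to the value in the last column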

Minimum for col1 neg, group 1 = -5
Minimum for col1 pos, group 1 = 4
Minimum for col1 neg, group 2 = -72
Minimum for col1 pos, group 2 = 32
Minimum for col2 neg, group 1 = NA
Minimum for col2 pos, group 1 = 2
etc.  
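
The minima listed above can be reproduced in base R. This is a minimal sketch, not part of the original question: the vectors simply copy `group`, `col1` and `col2` from the data frame above, and `sign_min` is a hypothetical helper name. It relies on `ave()`, which returns a vector of the same length as its input, with each element replaced by the summary of the group it belongs to.

```r
# Copies of three columns from the example data frame
group <- c(1, 1, 1, 1, 2, 2, 2, 2)
col1  <- c(-2, 4, -5, 8, -72, 32, -54, 98)
col2  <- c(3, 5, 2, 2, 83, -21, 0, -8)

# Hypothetical helper: for every element, the minimum among the values that
# share its group and its sign (negative, zero or positive)
sign_min <- function(x, g) ave(x, paste(g, sign(x)), FUN = min)

min_col1 <- sign_min(col1, group)
min_col2 <- sign_min(col2, group)
print(min_col1)
print(min_col2)
```

A zero forms its own sign class here, so its minimum is 0 and the value stays unchanged, which matches rule c).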
I would like my final output to be computed like this:

date    group         col1      col2      col3          col4            col5      
1234        1      -2-(-5)       3-2       4-4       -5-(-7)            100+(-5)+2+4+(-7)       
1235        1         4-4        5-2   -2-(-4)       -7-(-7)            200+4+2+(-4)+(-7)      
1234        1      -5-(-5)       2-2       9-4           1-1               400+(-5)+2+4+1       
1235        1         8-4        2-2   -4-(-4)           7-1               900+4+2+(-4)+1       
1235        2    -72-(-72)     83-83 -54-(-54)          98-4         800+(-72)+83+(-54)+4      
1233        2       32-32  -21-(-21)  -1-(-54)           4-4         900+32+(-21)+(-54)+4      
1342        2    -54-(-72)       0-0 -10-(-54)     -11-(-11)      100+(-72)+0+(-54)+(-11)      
1234        2       98-32   -8-(-21)  -9-(-54)     -10-(-11)     100+32+(-21)+(-54)+(-11) 
Expected output:

date    group         col1      col2      col3          col4            col5      
1234        1            3         1         0             2              94       
1235        1            0         3         2             0             195      
1234        1            0         0         5             0             402       
1235        1            4         0         0             6             903       
1235        2            0         0         0            94             761      
1233        2            0         0        53             0             861      
1342        2           18         0        44             0             -37      
1234        2           66        13        45             1              46

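After grouping by 'group', mutate columns 'col1' to 'col4' with the min values of the positive and the negative numbers, then add the row-wise sum of those minima to 'col5' and update 'col5'. Afterwards, update 'col1' to 'col4' by subtracting the minima from the corresponding columns of the initial dataset ('df1').

Alternatively, convert to 'long' format, do the calculation, and then change it back to 'wide':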

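library(tidyverse)
df1 %>% 
  rownames_to_column('rn') %>%                 # keep a row id so the rows can be rebuilt
  gather(key, val, col1:col4) %>%              # wide -> long: one row per (row, column)
  group_by(group, key, sn = sign(val)) %>%     # split by group, column and sign
  mutate(mnVal = min(val)) %>%                 # minimum within each sign class
  group_by(rn) %>% 
  mutate(col5 = col5 + sum(mnVal), val = val - mnVal) %>%   # add the subtracted total to col5
  select(-sn, -mnVal) %>%
  spread(key, val) %>%                         # long -> wide
  ungroup %>% 
  select(names(df1))                           # restore the original column order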
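Comments:

Apologies, yes, I meant to say min only.

What have you tried?

The question has already been answered. In my actual dataset there are NA values that may be causing this warning:
In min(col4[col4 < 0]) : no non-missing arguments to min; returning Inf
Going by @akrun's answer, I am not sure whether that is the cause.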
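
The warning quoted in the comment can be reproduced without any NA values: `min()` on an empty vector returns `Inf` and warns, which is what `min(col4[col4 < 0])` runs into when a group has no negative entries. Below is a minimal sketch of a guard, assuming it is acceptable to treat a missing sign class as "nothing to subtract". `safe_min` is a hypothetical helper, not something taken from the answer.

```r
# Hypothetical helper: minimum that ignores NA and returns NA (instead of
# Inf plus a warning) when there is nothing left to take a minimum of
safe_min <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else min(x)
}

col2_group1 <- c(3, 5, 2, 2)   # group 1 of col2: no negative values
neg_min <- safe_min(col2_group1[col2_group1 < 0])   # empty subset
pos_min <- safe_min(col2_group1[col2_group1 > 0])

with_na <- c(-5, NA, -7, 1)
neg_min_na <- safe_min(with_na[with_na < 0])   # the NA is dropped before min()
print(c(neg_min, pos_min, neg_min_na))
```

The NA returned for the empty subset corresponds to the "Minimum for col2 neg, group 1 = NA" entry in the question.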
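# Wide-format approach: per-group, per-sign minima via ave()
library(tidyverse)
library(rlang)
nm1 <- names(df1)[3:6]   # assumed definition ("col1" ... "col4"); nm1 is not defined in the snippet
expr <- paste(glue::glue('{nm1} - {nm1}_new'), collapse=";")   # "col1 - col1_new;col2 - col2_new;..."
df1 %>% 
   group_by(group) %>%
   # colN_new = minimum among values of the same sign within the group
   # (funs() is deprecated since dplyr 0.8.0)
   mutate_at(3:6, funs(new = ave(., sign(.), FUN = min))) %>%
   ungroup %>%
   mutate(col5 = col5 + select(., col1_new:col4_new)  %>% 
                    reduce(`+`)) %>%                        # add the row-wise sum of the minima
   transmute(date, group, !!! parse_exprs(expr), col5) %>%  # colN - colN_new
   rename_at(3:6, ~ nm1)                                    # restore the original names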
# A tibble: 8 x 7
#   date group  col1  col2  col3  col4  col5
#  <int> <int> <int> <int> <int> <int> <int>
#1  1234     1     3     1     0     2    94
#2  1235     1     0     3     2     0   195
#3  1234     1     0     0     5     0   402
#4  1235     1     4     0     0     6   903
#5  1235     2     0     0     0    94   761
#6  1233     2     0     0    53     0   861
#7  1342     2    18     0    44     0   -37
#8  1234     2    66    13    45     1    46
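
For comparison, the same steps can be written without dplyr. This is a sketch in base R under the assumption that 'col1' to 'col4' are the columns to adjust and that 'col5' receives the subtracted total, as in the example. `sign_min`, `value_cols`, `mins` and `res` are names introduced here for illustration.

```r
# Example data from the question
df1 <- data.frame(
  date  = c(1234, 1235, 1234, 1235, 1235, 1233, 1342, 1234),
  group = c(1, 1, 1, 1, 2, 2, 2, 2),
  col1  = c(-2, 4, -5, 8, -72, 32, -54, 98),
  col2  = c(3, 5, 2, 2, 83, -21, 0, -8),
  col3  = c(4, -2, 9, -4, -54, -1, -10, -9),
  col4  = c(-5, -7, 1, 7, 98, 4, -11, -10),
  col5  = c(100, 200, 400, 900, 800, 900, 100, 100)
)

value_cols <- c("col1", "col2", "col3", "col4")

# Minimum among the values that share the same group and the same sign
sign_min <- function(x, g) ave(x, paste(g, sign(x)), FUN = min)

# Matrix with one column of minima per value column, aligned row by row with df1
mins <- sapply(df1[value_cols], sign_min, g = df1$group)

res <- df1
for (nm in value_cols) {
  res[[nm]] <- df1[[nm]] - mins[, nm]     # step 2: subtract the minima
}
res$col5 <- df1$col5 + rowSums(mins)      # step 3: add the subtracted total
print(res)
```

Pasting the group and the sign into a single key keeps `ave()` from visiting empty group/sign combinations, so no "no non-missing arguments to min" warning is raised for columns such as 'col2', where group 1 has no negative values.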
Data:

df1 <- structure(list(date = c(1234L, 1235L, 1234L, 1235L, 1235L, 1233L, 
1342L, 1234L), group = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), col1 = c(-2L, 
4L, -5L, 8L, -72L, 32L, -54L, 98L), col2 = c(3L, 5L, 2L, 2L, 
83L, -21L, 0L, -8L), col3 = c(4L, -2L, 9L, -4L, -54L, -1L, -10L, 
-9L), col4 = c(-5L, -7L, 1L, 7L, 98L, 4L, -11L, -10L), col5 = c(100L, 
200L, 400L, 900L, 800L, 900L, 100L, 100L)), .Names = c("date", 
"group", "col1", "col2", "col3", "col4", "col5"), 
  class = "data.frame", row.names = c(NA, 
-8L))