dplyr: creating a new column with complex operations using mutate()

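I want to create a new data frame (new.df) from the original one (df), adding a new column (Age) that is computed through a complex operation with the mutate function from the dplyr package. My steps are: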

# Artificial dataframe
IDtest<-c(1,1,1,1,1,1,2,2,2,3,3,3,3)
Class<-c(1,1,2,2,2,3,1,1,2,1,2,2,3)
Day<-c(0,47,76,100,150,173,0,47,76,0,47,76,100)
Area<-c(0.45,0.85,1.50,1.53,1.98,5.2,
         0.36,0.58,1.2,
         0.85,1.36,2.26,3.59)
df<-data.frame(cbind(IDtest, Class, Day, Area))
str(df)

#Split each IDtest
df[df[,1]==1,]
#  IDtest Class Day Area
#1      1     1   0 0.45
#2      1     1  47 0.85
#3      1     2  76 1.50
#4      1     2 100 1.53
#5      1     2 150 1.98
#6      1     3 173 5.20

Any ideas?

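This is tricky, so I did every step separately to make it easier for you to spot any misunderstanding. Could there be a mistake in this line of yours?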

(1.98-1)/((1.98-0.85)/150) + (157 - 47) # 157? wouldn't it be 150?
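That said, my results for the first Class are the same as yours, but check the second and third Class carefully: I am not sure I understood steps two and three correctly, and I am also not sure how you use "last" (that is, "last" within the Class, or "last" of the previous Class).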

In step two I use "last" within the same Class, and in step three I use a for loop to get "last" of the previous Class. I think you can adapt this idea.

library(dplyr)

df2 <- df %>% 
  group_by(IDtest, Class) %>%
    mutate(
      DayOrder = row_number() 
    )

df2 <- df2 %>%
  mutate(step1a = Area[max(DayOrder)], # I divide step1 in several steps to make it clearer
     minus =  # what you want to subtract
       case_when(
         step1a < 1 ~ 0,
         step1a < 2.9 ~ 1,
         step1a < 8.9 ~ 3,
         step1a < 24.9 ~ 9,
         step1a > 25 ~ 25  # note: values from 24.9 up to 25 match no branch and give NA
       ),
     step1done = step1a - minus, 
     step2a = Area[max(DayOrder)] - Area[min(DayOrder)], # "Last" inside the same Class (as it is inside mutate, which is under group_by)
     step2b = Day[max(DayOrder)],
     step2done = step2a / step2b,
     step1by2 = step1done / step2done
     )


df2$step3 <- NA 
for (i in 1:max(df2$Class)){
  if(i == 1){
     # index with df2$Class, not the global Class vector from the question
     df2$step3[df2$Class == i] <- max(df2$Day[df2$Class == i]) - 0 # quite silly
     }else{
     df2$step3[df2$Class == i] <- max(df2$Day[df2$Class == i]) - max(df2$Day[df2$Class == i - 1]) # "Last" as the "previous" Class, not inside the same Class
 }}


df2 %>%
  mutate(
    step3done = step1by2 + step3,
    Age = step3done / 365 # Do you want "Age" as a single value rather than one value per person? In that case I would compute it outside mutate and add it as a new column
  )
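
To make the "last within the same Class" idea concrete, here is a minimal sketch. It assumes the rows are already sorted by Day within each group, and it uses only the IDtest 1 rows from the question. The names toy, toy_last, last_area, first_area and last_day are illustrative and not part of the original answer. Under group_by(IDtest, Class), Area[max(DayOrder)] picks the last row of each group, which should be the same value that dplyr's last() returns.

```r
library(dplyr)

# Rows for IDtest 1, copied from the question's data
toy <- data.frame(
  IDtest = c(1, 1, 1, 1, 1, 1),
  Class  = c(1, 1, 2, 2, 2, 3),
  Day    = c(0, 47, 76, 100, 150, 173),
  Area   = c(0.45, 0.85, 1.50, 1.53, 1.98, 5.20)
)

toy_last <- toy %>%
  group_by(IDtest, Class) %>%
  mutate(
    last_area  = last(Area),   # same value as Area[max(row_number())]
    first_area = first(Area),  # same value as Area[min(row_number())]
    last_day   = last(Day)     # last Day inside the same Class
  ) %>%
  ungroup()

print(toy_last)
```

Because the mutate runs per group, every row of a Class gets that Class's own last value, which is what step two of the answer relies on.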

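Miguel, your code helped a lot!!! I now have a small problem with max(df2$Day[df2$Class == i - 1]), because in my data set the classes do not necessarily increase consecutively over time: Class can go from 1 to 4. In that case your code gives me -Inf, because the last existing class value (1) should be used instead of the nonexistent 3. Do you see any simple modification here?

I don't think I fully understand you. So "Class" is in some sense a group of people, which could be "A", "B", "C"... but you also need to take the "last day" (of the previous class) into account. If you have only a few categories of "Class", you can use a specific if(){}else{}. For example, if the class preceding 4 is 2, you can do something like: if(i==4){… - max(df2$Day[df2$Class==2])}, or, without the for loop: df2$step3[Class==4]

Thanks Miguel!! I really appreciate your help.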
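
The -Inf mentioned in the comments comes from calling max() on an empty vector, which is what df2$Day[df2$Class == i - 1] becomes when class i - 1 does not exist. One possible way around it, sketched below under stated assumptions, is to look up the previous class that actually occurs instead of i - 1. The data frame toy and the names class_days, last_day and prev_last_day are hypothetical. Like the for loop in the answer, the sketch takes the maximum Day per Class across all IDs.

```r
library(dplyr)

# Hypothetical data: Class jumps from 1 to 4 (classes 2 and 3 do not occur)
toy <- data.frame(
  Class = c(1, 1, 4, 4),
  Day   = c(0, 47, 76, 100)
)

# Last Day of each class that actually occurs, ordered by Class
class_days <- toy %>%
  group_by(Class) %>%
  summarise(last_day = max(Day), .groups = "drop") %>%
  arrange(Class) %>%
  mutate(
    prev_last_day = lag(last_day, default = 0),  # 0 for the first class
    step3 = last_day - prev_last_day
  )

# Attach step3 to every row of its Class
toy <- toy %>%
  left_join(select(class_days, Class, step3), by = "Class")

print(toy)
```

Here lag() walks over the classes that exist, so class 4 is compared with class 1 and no empty subset is ever passed to max().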