How to create a new column based on a condition in R


I have students in three classes: A, B, and C. My reproducible dataset looks like this:

data <- data.frame(Student_ID =c(1,1,1,2,2,3,3,3,3,3,4,4,4,5,6,6,7,7,7,8,8),
                   Years_Attended = c(1991,1992,1995,1992,1993,1991,1992,1993,1994,1995,1993,1994,1995,1995,1993,1995,1990,1995,2000,1995,1996),
                   Class = c("A","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","C","C","C","C","C"))

Intended_output <- data.frame(Student_ID = c(1,1,1,2,2,3,3,3,3,3,4,4,4,5,6,6,7,7,7,8,8),
                              Years_Attended = c(1991,1992,1995,1992,1993,1991,1992,1993,1994,1995,1993,1994,1995,1995,1993,1995,1990,1995,2000,1995,1996),
                              Class = c("A","A","A","A","A","A","A","A","A","A","B","B","B","B","B","B","C","C","C","C","C"),
                              New_Student = c("No","No","No","Yes","Yes","No","No","No","No","No","No","No","No","Yes","No","No","No","No","No","Yes","Yes"))


For each Class, find the minimum year, add it to your dataset as a new column, and then check for each student whether they attended in that year.

library(dplyr)

data %>%
  group_by(Class) %>%
  summarise(min_year = min(Years_Attended)) %>%
  left_join(data, by = 'Class') %>%
  group_by(Class, Student_ID) %>%
  mutate(New_Student = if(any(Years_Attended == first(min_year)))'No' else 'Yes')

#  Class min_year Student_ID Years_Attended New_Student
#   <chr>    <dbl>      <dbl>          <dbl> <chr>      
# 1 A         1991          1           1991 No         
# 2 A         1991          1           1992 No         
# 3 A         1991          1           1995 No         
# 4 A         1991          2           1992 Yes        
# 5 A         1991          2           1993 Yes        
# 6 A         1991          3           1991 No         
# 7 A         1991          3           1992 No         
# 8 A         1991          3           1993 No         
# 9 A         1991          3           1994 No         
#10 A         1991          3           1995 No         
# … with 11 more rows
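
The same labels can probably be produced without the summarise/left_join round trip, by computing the class minimum and each student's first year with grouped mutate calls. This is a minimal sketch rather than the answerer's code; `class_min`, `first_year`, and `result` are names introduced here for illustration only.

```r
library(dplyr)

data <- data.frame(
  Student_ID = c(1,1,1,2,2,3,3,3,3,3,4,4,4,5,6,6,7,7,7,8,8),
  Years_Attended = c(1991,1992,1995,1992,1993,1991,1992,1993,1994,1995,
                     1993,1994,1995,1995,1993,1995,1990,1995,2000,1995,1996),
  Class = c("A","A","A","A","A","A","A","A","A","A",
            "B","B","B","B","B","B","C","C","C","C","C")
)

result <- data %>%
  # earliest year recorded for each class (hypothetical helper column)
  group_by(Class) %>%
  mutate(class_min = min(Years_Attended)) %>%
  # earliest year recorded for each student within a class
  group_by(Class, Student_ID) %>%
  mutate(first_year = min(Years_Attended),
         New_Student = if_else(first_year == class_min, "No", "Yes")) %>%
  ungroup() %>%
  # drop the helper columns so only the requested column is added
  select(-class_min, -first_year)
```

Because mutate keeps the original row order, `result$New_Student` can be compared directly against `Intended_output$New_Student`.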

Here is a data.table approach:

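library(data.table)

dt <- as.data.table(data)
dt[, New_Student := {
  # earliest year recorded for the current class
  class_min <- min(Years_Attended)
  # a student is new when their first year differs from the class minimum
  ave(Years_Attended, Student_ID,
      FUN = function(x) ifelse(class_min != min(x), 'Yes', 'No'))
},
by = Class]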

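This is actually a tricky problem. We need to find the minimum year for each class, and then, within each class, find the students whose first year is not equal to that minimum.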
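
The two steps described above can be sketched in base R with `ave`. This is an illustration under stated assumptions, not code from the original answers; `class_min`, `student_key`, and `first_year` are hypothetical names. A single pasted key is used for the per-student grouping so that `ave` never sees an empty Class/Student_ID combination.

```r
data <- data.frame(
  Student_ID = c(1,1,1,2,2,3,3,3,3,3,4,4,4,5,6,6,7,7,7,8,8),
  Years_Attended = c(1991,1992,1995,1992,1993,1991,1992,1993,1994,1995,
                     1993,1994,1995,1995,1993,1995,1990,1995,2000,1995,1996),
  Class = c("A","A","A","A","A","A","A","A","A","A",
            "B","B","B","B","B","B","C","C","C","C","C")
)

# Step 1: minimum year per class, repeated on every row of that class
class_min <- ave(data$Years_Attended, data$Class, FUN = min)

# Step 2: first year per student within a class
student_key <- paste(data$Class, data$Student_ID)
first_year <- ave(data$Years_Attended, student_key, FUN = min)

# A student is new when their first year is later than the class minimum
data$New_Student <- ifelse(first_year == class_min, "No", "Yes")
```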

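The error was caused by a misplaced parenthesis. It should be
…=min(data$Years_Attended), "Yes", "No")
with the closing parenthesis placed right after the min call.

Thank you all very much for helping me.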