R: How to change a column relative to another column and a group
I have the following data:
PERNO TPURP loop
1 Loop trip 1
1 Loop trip 2
1 home 2
1 shopping 2
2 work 1
2 Loop trip 2
2 school 2
3 Looptrip 1
4 work 1
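For each PERNO, if TPURP == "Loop trip", I want to add 1 to loop in the rows after that row.
For each PERNO, if a Loop trip is immediately followed by another Loop trip, we do not add 1 after the first Loop trip, only after the second one.
Output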
PERNO TPURP loop
1 Loop trip 1
1 Loop trip 2
1 home 3
1 shopping 3
2 work 1
2 Loop trip 2
2 school 3
3 Looptrip 1
4 work 1
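Using dplyr, we can group by PERNO and increase the value of loop for every row that comes after the last occurrence of "Loop trip" in its group (the data frame df is given further below).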
library(dplyr)
df %>%
  group_by(PERNO) %>%
  mutate(loop1 = ifelse(any(TPURP == "Loop trip") &
                          row_number() > max(which(TPURP == "Loop trip")),
                        loop + 1, loop))
# PERNO TPURP loop loop1
# <int> <fct> <int> <dbl>
#1 1 Loop trip 1 1
#2 1 Loop trip 2 2
#3 1 home 2 3
#4 1 shopping 2 3
#5 2 work 1 1
#6 2 Loop trip 2 2
#7 2 school 2 3
#8 3 Looptrip 1 1
#9 4 work 1 1
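The code above still calls max(which(...)) for groups that contain no "Loop trip", which triggers the "returning -Inf" warning even though the result is unaffected. A minimal sketch of a warning-free variant follows; the helper columns is_loop and last_loop are names introduced here for illustration, and the data frame is rebuilt inline so the block runs on its own.

```r
library(dplyr)

# Same rows as the example data, with TPURP as plain character
df <- data.frame(
  PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L),
  TPURP = c("Loop trip", "Loop trip", "home", "shopping", "work",
            "Loop trip", "school", "Looptrip", "work"),
  loop  = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(PERNO) %>%
  mutate(
    # hypothetical helper: TRUE on rows that are an exact "Loop trip"
    is_loop = TPURP == "Loop trip",
    # hypothetical helper: row index of the last match, Inf if the group has none,
    # so max() is never called on an empty vector
    last_loop = if (any(is_loop)) as.numeric(max(which(is_loop))) else Inf,
    # add 1 only to rows after the last match
    loop1 = loop + as.integer(row_number() > last_loop)
  ) %>%
  ungroup() %>%
  select(-is_loop, -last_loop)
```

Because groups without a match get last_loop = Inf, no row in them is ever "after" the match and loop is carried over unchanged.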
Alternatively, we can use grepl/grep for partial matching instead of exact matching, as @Sotos mentioned. On the updated dataset, we can do
df %>%
  group_by(PERNO) %>%
  dplyr::mutate(loop1 = ifelse(any(grepl('Loop', TPURP)) &
                                 row_number() > max(grep('Loop', TPURP)),
                               loop + 1, loop))
# PERNO TPURP loop loop1
# <dbl> <fct> <dbl> <dbl>
#1 1 (8) Dropped off passenger 1 1
#2 1 (1) Working at home (for pay) 1 1
#3 1 (24) Loop trip 2 2
#4 1 (24) Loop trip 2 2
#5 1 (9) Picked up passenger 2 3
#6 1 (2) All other home activities 2 3
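To see why partial matching matters here, the following small sketch compares == with grepl and grep on a few purpose labels taken from the examples above. The vector purposes and the result names are made up for the illustration.

```r
# Labels in the spellings that appear in the question and the updated data
purposes <- c("Loop trip", "Looptrip", "(24) Loop trip", "home")

# Exact comparison only matches the literal string "Loop trip"
exact <- purposes == "Loop trip"

# grepl() returns TRUE wherever the pattern occurs anywhere in the string
partial <- grepl("Loop", purposes)

# grep() returns the positions of the matches instead of a logical vector
positions <- grep("Loop", purposes)

print(exact)
print(partial)
print(positions)
```

Only the first label passes the exact test, while the first three all contain "Loop", which is why the grepl version also works on labels such as "(24) Loop trip".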
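Comments:
Why are work and school also increasing? I am really sorry, the Loop trip row itself should not increase, only the rows after it. Please see the edit.
It does not change loop1 in my data, and I get the following warning: In max(which(TPURP == "Loop trip")) : no non-missing arguments to max; returning -Inf. In my data loop1 == loop.
@hghg Use "Loop trip" exactly as it appears in your data. Is it spelled or capitalized differently there, or is it something else?
Just use grep. This should work: df %>% group_by(PERNO) %>% mutate(loop1 = ifelse(any(grepl('Loop', TPURP)) & row_number() > max(which(grepl('Loop', TPURP))), loop + 1, loop))
@RonakShah I used my own data, but the problem is the same.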
df <- structure(list(PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L),
TPURP = structure(c(2L, 2L, 1L, 5L, 6L, 2L, 4L, 3L, 6L), .Label = c("home",
"Loop trip", "Looptrip", "school", "shopping", "work"), class = "factor"),
loop = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L)), class = "data.frame",
row.names = c(NA, -9L))
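For readers who prefer to avoid a package dependency, here is a rough base R sketch of the same idea using ave() to work within each PERNO group. It assumes partial matching on "Loop" is wanted, as in the grepl version; the names is_loop and shift are introduced only for this example, and the data frame is rebuilt inline with character TPURP so the block runs on its own.

```r
# Same rows as the data above, with TPURP as plain character
df <- data.frame(
  PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L),
  TPURP = c("Loop trip", "Loop trip", "home", "shopping", "work",
            "Loop trip", "school", "Looptrip", "work"),
  loop  = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L),
  stringsAsFactors = FALSE
)

# 1 where the purpose contains "Loop", 0 otherwise
is_loop <- as.integer(grepl("Loop", df$TPURP))

# Within each PERNO, flag the rows that come after the last match
shift <- ave(is_loop, df$PERNO, FUN = function(x) {
  if (!any(x == 1L)) return(rep(0L, length(x)))  # no match in this group
  as.integer(seq_along(x) > max(which(x == 1L)))
})

df$loop1 <- df$loop + shift
print(df)
```

The explicit check for groups without a match keeps max() from being called on an empty vector, so no -Inf warning is raised.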