R: How to manipulate a row's value based on a condition and the row above?
I have a df that I want to group by link and sort by time. Then, whenever type == 'vehicle leaves traffic', the cell in the count column should be the previous row's value plus 1. If type == 'vehicle enters traffic', it should be the previous row's value minus 1.
To clarify: the previous row's value should not be changed. The current row's value should be set based on the previous row's value.
This is my approach, but I only get 0, 1, and 2. I expect higher values for some links:
parking_min <- cars %>%
  group_by(link) %>%
  dplyr::mutate(count = if_else(type == 'vehicle leaves traffic', lag(count, n = 1, order_by = time) + 1, lag(count))) %>%
  dplyr::mutate(count = if_else(type == 'vehicle enters traffic', lag(count, n = 1, order_by = time) - 1, lag(count)))
Desired output (example):
time type vehicle_id link count
18798 23707.0 vehicle enters traffic 1267069 90 0 #start point
64777 31209.0 vehicle leaves traffic 810534 90 1 #+1
64783 31210.0 vehicle enters traffic 810534 90 0 #-1
90025 36230.0 vehicle leaves traffic 51825 90 1
90030 36231.0 vehicle enters traffic 51825 90 0
102868 38925.0 vehicle leaves traffic 1326473 90 1
105834 39583.0 vehicle leaves traffic 1199672 90 2 #here as well 1+1 =2
108690 40198.0 vehicle leaves traffic 1111105 90 3 #2+1 =3
111727 40818.0 vehicle enters traffic 1111105 90 2 #3-1 =2
118283 41974.0 vehicle leaves traffic 532654 90 3
124349 42895.0 vehicle enters traffic 532654 90 2
125700 43099.0 vehicle leaves traffic 1267069 90 3
129642 43683.0 vehicle enters traffic 1199672 90 2
135888 44645.0 vehicle leaves traffic 1398907 90 3
142577 45730.0 vehicle enters traffic 1398907 90 2
148772 46785.0 vehicle leaves traffic 1239391 90 3
161264 48846.0 vehicle enters traffic 1239391 90 2
161590 48905.0 vehicle enters traffic 1326473 90 1
182778 52790.0 vehicle leaves traffic 46491 90 2
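Ultimately I want to find the maximum count for each link. That can be done in a separate step and does not need to be part of the solution, but maybe it helps clarify the problem.

I think this is exactly what you want: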
df %>%
  group_by(link) %>%
  arrange(time) %>%
  mutate(
    # +1 when a vehicle leaves traffic, -1 when it enters traffic
    adder = case_when(
      type == "vehicle leaves traffic" ~ 1,
      type == "vehicle enters traffic" ~ -1,
      TRUE ~ 0),
    # running total within each link
    count = count + cumsum(adder)
  ) %>%
  select(-adder)

Or, using two cumulative sums:
df %>%
  arrange(link, time) %>%
  group_by(link) %>%
  mutate(
    vehicles_entered_traffic = cumsum(type == "vehicle enters traffic"),
    vehicles_left_traffic = cumsum(type == "vehicle leaves traffic"),
    # start from the first count of the link, then add leaves and subtract enters
    count = count[1] + vehicles_left_traffic - vehicles_entered_traffic
  )
which gives:
time type vehicle_id link count
1 23707.0 vehicle enters traffic 1267069 90 0
2 31209.0 vehicle leaves traffic 810534 90 1
3 31210.0 vehicle enters traffic 810534 90 0
4 36230.0 vehicle leaves traffic 51825 90 1
5 36231.0 vehicle enters traffic 51825 90 0
6 38925.0 vehicle leaves traffic 1326473 90 1
7 39583.0 vehicle leaves traffic 1199672 90 2
8 40198.0 vehicle leaves traffic 1111105 90 3
9 40818.0 vehicle enters traffic 1111105 90 2
10 41974.0 vehicle leaves traffic 532654 90 3
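
For anyone who wants to try the cumulative-sum idea above without the original data, here is a minimal self-contained sketch. The `events` data frame, the `step` column, and the names `running` and `max_per_link` are made up for illustration and are not from the question. The sketch assumes every link starts at zero and applies every event, so a link whose first event is "vehicle enters traffic" would start at -1; add the known starting value if that is not what you want. It also shows the follow-up step mentioned in the question, the maximum count per link.

```r
library(dplyr)

# Hypothetical toy data (not from the question): two links, rows out of time order.
events <- data.frame(
  time = c(30, 10, 20, 40, 15, 25),
  type = c("vehicle leaves traffic", "vehicle leaves traffic",
           "vehicle leaves traffic", "vehicle enters traffic",
           "vehicle leaves traffic", "vehicle enters traffic"),
  link = c(90, 90, 90, 90, 91, 91)
)

running <- events %>%
  arrange(link, time) %>%   # sort first so cumsum() follows time order
  group_by(link) %>%
  mutate(
    step = case_when(
      type == "vehicle leaves traffic" ~ 1,
      type == "vehicle enters traffic" ~ -1,
      TRUE ~ 0
    ),
    count = cumsum(step)    # running total, restarting for each link
  ) %>%
  ungroup()

# Highest running count reached on each link
max_per_link <- running %>%
  group_by(link) %>%
  summarise(max_count = max(count), .groups = "drop")

print(running)
print(max_per_link)
```

Sorting by `link` and `time` before grouping keeps the running total independent of the row order in the input.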
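
It would be very helpful if you could provide a minimal reproducible example (and possibly the expected output).
@Georgery, I believe I did provide a minimal reproducible example. I will try to describe the desired output; doing it by hand is a bit tricky.
@Georgery I added an example of the expected result :)
I meant that it would be better to provide a sample dataset with maybe 10 rows rather than 300.
@Georgery Ha, sorry, I always worry about providing too little ^^ I will try to think harder about the minimum number of rows needed :) Thanks for the tip and for your solution!
You might want to add some sorting by time?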
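
A likely reason the attempt in the question never climbs past small values: `lag()` only shifts the column as it exists when `mutate()` evaluates it, so each row sees the original value of the row above, not the value just computed for it. The sketch below illustrates this with made-up vectors (`count`, `step`, `via_lag`, and `via_cumsum` are hypothetical names, and the count is assumed to be initialised to 0).

```r
library(dplyr)

# Hypothetical example: a start row followed by three "leaves" events.
count <- c(0, 0, 0, 0)
step  <- c(0, 1, 1, 1)

# lag() shifts the ORIGINAL vector by one position; it never sees
# values computed for earlier rows, so nothing accumulates.
via_lag <- lag(count, default = 0) + step

# cumsum() carries the running total forward from row to row.
via_cumsum <- count[1] + cumsum(step)

print(via_lag)     # 0 1 1 1
print(via_cumsum)  # 0 1 2 3
```

This is why both answers replace the row-by-row update with a cumulative sum over the +1 and -1 steps.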