R: summarize the first row that meets a condition

Suppose I have this data frame:

    df <- data.frame(
      party = c("A", "A", "B", "A", "B"),
      votes = c(100, 99, 98, 97, 96),
      elected = c(1, 1, 1, 0, 0)  # five values, one per row
    )

  party votes elected
1     A   100       1
2     A    99       1
3     B    98       1
4     A    97       0
5     B    96       0

I tried first() and lag() with conditions using which(), but no luck so far. Any help would be much appreciated.

Here is an option using the fuzzyjoin package:

library(fuzzyjoin)
library(tidyverse)

fuzzy_left_join(df, df %>% 
                  arrange(party, elected, desc(votes)) %>% 
                  group_by(party) %>% slice(1) , 
                by = c("party", "elected"), match_fun = list(`!=`, `>`)) %>%
select(ends_with("x"), votes.y)  

  party.x votes.x elected.x votes.y
1       A     100         1      96
2       A      99         1      96
3       B      98         1      97
4       A      97         0      NA
5       B      96         0      NA


Maybe this works for you.

You could try using a function:

library(dplyr)

get_opposite_votes <- function(df, group) { 
   df %>% filter(party != group & elected == 0) %>% slice(1L) %>% pull(votes)
}


df %>%
  group_by(party) %>%
  mutate(new = get_opposite_votes(., first(party))) %>%
  ungroup() %>%
  # Optional: set new to NA in rows where elected == 0
  mutate(new = replace(new, elected == 0, NA))

#  party votes elected   new
#  <fct> <dbl>   <dbl> <dbl>
#1 A       100       1    96
#2 A        99       1    96
#3 B        98       1    97
#4 A        97       0    NA
#5 B        96       0    NA
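What is the logic? How do you identify the row with the "challenger candidate"? There may be a way to reproduce the result for this data, but I think the approach would be more reproducible if you could provide a contest id. For example, if the data looked more like this:

All observations belong to the same election. The challenger candidate is the first non-elected candidate belonging to another party. For example, the first row is from party A, so the challenger is the non-elected candidate with the most votes from a different party (i.e. B), which is in row 5.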
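The rule described in that last comment can also be written in base R without relying on row order. This is only a sketch under stated assumptions: challenger_votes() and the challenger column are hypothetical names that do not appear in the answers above, and "first" is read as "highest vote count", which matches the example because the data is sorted by votes.

```r
# Example data from the question (elected has five values, one per row).
df <- data.frame(
  party   = c("A", "A", "B", "A", "B"),
  votes   = c(100, 99, 98, 97, 96),
  elected = c(1, 1, 1, 0, 0)
)

# Hypothetical helper (not from the original answers): for a given party,
# return the highest vote count among non-elected candidates of any other
# party, or NA if no such candidate exists.
challenger_votes <- function(data, own_party) {
  rivals <- data$votes[data$party != own_party & data$elected == 0]
  if (length(rivals) == 0) NA_real_ else max(rivals)
}

# Look up the challenger's votes for every row's party.
df$challenger <- vapply(
  as.character(df$party),
  function(p) challenger_votes(df, p),
  numeric(1),
  USE.NAMES = FALSE
)

# Only elected candidates have a challenger; blank out the other rows.
df$challenger[df$elected == 0] <- NA

df
```

Because the helper takes max() over all rival parties, it returns a single value even with more than two parties, at the cost of no longer distinguishing which rival party the challenger comes from.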