R: create a loop to paste or remove elements depending on the scenario

Suppose I have the following dataset:

mydf <- data.frame( "MemberID"=c("111","0111A","0111B","112","0112A","113","0113B"),
                    "resign.date"=c("2013/01/01",NA,NA,"2014/03/01",NA,NA,NA))                                            
Note: 111, 112 and 113 are the IDs of the family representatives.

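I want to do two things:

a) If I have a resign date for the family representative, as with 111, I want to paste the same resign date onto 0111A and 0111B (in case you are wondering, these are the spouse and children of 111).
b) If I have no resign date for the family representative, as with 113, I simply want to delete the rows for 113 and 0113B.

My resulting data frame should look like this: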

mydf <- data.frame("MemberID"=c("111","0111A","0111B","112","0112A"),
                    "resign.date"=c("2013/01/01","2013/01/01","2013/01/01","2014/03/01","2014/03/01"))
Thanks in advance.

A solution using data.table, assuming resign.date is only present for MemberIDs without a trailing letter:

We can also use tidyverse:


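Comments:

Is resign.date only present for MemberIDs without a trailing letter?

@simone Yes, resign.date is only present for MemberIDs without a trailing letter.

In that case, please check whether the solution below does what you need.

Yes, it works. I have never used data.table before. Could you tell me what lines 3 and 4 do? Also, the actual file has some inconsistencies: for example, sometimes the ID is "113" while the spouse/child IDs are "0113A" and "0113B". Better code would probably search for 113 and apply the paste and/or delete to the IDs with trailing letters.

I have edited my question.

Glad it helped. You should accept the answer if it is what you were after.

I am new here. What do you mean by "accept the answer"? Do I just click "This answer is useful"? I have already done that.

To mark an answer as accepted, click the check mark next to the answer to toggle it from greyed out to filled in.
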
# data.table solution (all IDs share the same zero-padded prefix)
library(data.table)

df <- data.table( "MemberID"=c("0111","0111A","0111B","0112","0112A","0113","0113B"),
                "resign.date"=c("2013/01/01",NA,NA,"2014/03/01",NA,NA,NA))

df <- df[order(MemberID)] ## sort so that, within a family, the ID without a trailing letter comes first
df[, myID := gsub("\\D+", "", MemberID)] ## create myID col: MemberID with the trailing letter removed

df[ , my.resign.date := resign.date[1L], by = myID] ## assign the first resign.date in each myID group to every row of the group
df <- df[!is.na(my.resign.date)] ## drop rows whose family has no resign date
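
One commenter asked what the grouping statements do. As a rough illustration only, the sketch below runs the two key steps on a small toy table; the name `toy` and its four rows are assumptions made for this example and are not part of the original answer.

```r
library(data.table)

# Toy data (hypothetical): one family with a resign date, one without.
toy <- data.table(
  MemberID    = c("0111", "0111A", "0113", "0113B"),
  resign.date = c("2013/01/01", NA, NA, NA)
)

# Step 1: "\\D+" matches runs of non-digit characters, so removing them
# strips the trailing letter and leaves a shared family key.
toy[, myID := gsub("\\D+", "", MemberID)]

# Step 2: within each myID group, resign.date[1L] is the value on the
# group's first row; := recycles that single value to every row of the group.
toy[, my.resign.date := resign.date[1L], by = myID]

print(toy)
```

Because the first row of each group supplies the date, the representative has to be sorted to the top of its group beforehand, which is presumably why the answer orders the table first.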
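# data.table solution for inconsistent IDs (e.g. "113" alongside "0113A" and "0113B")
df <- data.table( "MemberID"=c("111","0111A","0111B","112","0112A","113","0113B"),
              "resign.date"=c("2013/01/01",NA,NA,"2014/03/01",NA,NA,NA))

## keep digits only, then strip leading zeros, so "111" and "0111A" share myID "111"
df[, myID := gsub("(?<![0-9])0+", "", gsub("\\D+", "", MemberID), perl = TRUE)]
## descending MemberID within myID puts the representative ("111") ahead of "0111B", "0111A"
df <- df[order(myID, -MemberID)]

df[ , my.resign.date := resign.date[1L], by = myID] ## first resign.date per family
df <- df[!is.na(my.resign.date)] ## drop families with no resign date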
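# tidyverse solution
library(tidyverse)
mydf %>%
     # parse_number() expects character input, so convert in case MemberID is a factor
     group_by(grp = parse_number(as.character(MemberID))) %>%
     # take the first resign.date in each family (the representative's row must come first)
     mutate(resign.date = first(resign.date)) %>%
     na.omit() %>%
     ungroup() %>%
     select(-grp)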
# A tibble: 5 x 2
#   MemberID resign.date
#    <fctr>      <fctr>
#1     0111  2013/01/01
#2    0111A  2013/01/01
#3    0111B  2013/01/01
#4     0112  2014/03/01
#5    0112A  2014/03/01
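
For comparison, the same two rules can be sketched in base R without extra packages. This is only a minimal sketch under stated assumptions: the helper name `fill_or_drop` is hypothetical, a family representative is assumed to be any ID with no trailing letter, and leading zeros are assumed to be insignificant (so "111" and "0111A" belong to the same family).

```r
# Data from the question, kept as character columns.
mydf <- data.frame(
  MemberID    = c("111", "0111A", "0111B", "112", "0112A", "113", "0113B"),
  resign.date = c("2013/01/01", NA, NA, "2014/03/01", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: copy each representative's date to the family,
# then drop families whose representative has no date.
fill_or_drop <- function(d) {
  # Family key: keep digits only, then drop leading zeros ("0111A" -> "111").
  key <- sub("^0+", "", gsub("\\D+", "", d$MemberID))
  # Assumption: representatives are the IDs with no trailing letter.
  is_rep <- !grepl("[A-Za-z]$", d$MemberID)
  # Lookup vector: family key -> representative's resign date (may be NA).
  rep_date <- setNames(d$resign.date[is_rep], key[is_rep])
  d$resign.date <- unname(rep_date[key])
  # Remove rows belonging to families without a resign date.
  d <- d[!is.na(d$resign.date), , drop = FALSE]
  rownames(d) <- NULL
  d
}

result <- fill_or_drop(mydf)
print(result)
```

Unlike the grouped "first value" approach, this lookup does not depend on row order, since the representative is identified by the shape of its ID rather than by its position in the group.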