R: Remove individuals that lack the required sequence of dummy-variable values? (panel data)


I have panel data and only want to keep individuals that have x = 0 at time = 1 and x = 1 at time = 2. Example data:

df <- data.frame(
    ID = c(1,1,2,2,3,3,4,4), 
    time = c(1,2,1,2,1,2,1,2), 
    x = c(0,1,0,0,1,1,1,0)
)
  ID time x
1  1    1 0
2  1    2 1
3  2    1 0
4  2    2 0
5  3    1 1
6  3    2 1
7  4    1 1
8  4    2 0 

I tried to get there, but without success.

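I extended the sample data so that ID 1 also contains rows that do not meet the criteria. You can do this with the dplyr library and a grouped filter, as follows: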

library(dplyr)

# Add two rows for ID 1 that match neither condition
df <- rbind(df, data.frame(ID = c(1, 1), time = c(2, 1), x = c(0, 1)))
df
   ID time x
1   1    1 0
2   1    2 1
3   2    1 0
4   2    2 0
5   3    1 1
6   3    2 1
7   4    1 1
8   4    2 0
9   1    2 0
10  1    1 1

# First, get all IDs where both conditions are present
df <- df %>% group_by(ID) %>% filter(any(time == 1 & x == 0) & any(time == 2 & x == 1))
df
Source: local data frame [4 x 3]
Groups: ID [1]

     ID  time     x
  (dbl) (dbl) (dbl)
1     1     1     0
2     1     2     1
3     1     2     0
4     1     1     1

# Filter within those IDs for the specific conditions
df %>% filter((time == 1 & x == 0 | time == 2 & x == 1))
Source: local data frame [2 x 3]
Groups: ID [1]

     ID  time     x
  (dbl) (dbl) (dbl)
1     1     1     0
2     1     2     1
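
The two steps above can also be written as a single pipeline that does not overwrite df. This is only a minimal sketch, assuming dplyr is installed; the name result is chosen here for illustration and is not part of the original answer.

```r
library(dplyr)

# Extended example data from the answer: ID 1 has two extra, non-matching rows
df <- data.frame(
    ID   = c(1, 1, 2, 2, 3, 3, 4, 4, 1, 1),
    time = c(1, 2, 1, 2, 1, 2, 1, 2, 2, 1),
    x    = c(0, 1, 0, 0, 1, 1, 1, 0, 0, 1)
)

# 'result' is a hypothetical name; df itself is left unchanged
result <- df %>%
    group_by(ID) %>%
    filter(
        any(time == 1 & x == 0),                      # ID has x == 0 at time 1
        any(time == 2 & x == 1),                      # ID has x == 1 at time 2
        (time == 1 & x == 0) | (time == 2 & x == 1)   # keep only the matching rows
    ) %>%
    ungroup()

result
```

Conditions separated by commas in filter() are combined with a logical AND, so the first two decide whether an ID is kept at all and the third selects the rows inside it.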

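There are more rows that satisfy your condition than your output shows.

Sorry, I did not say that it is per ID. It has to hold within the same ID number.

So you only want to keep the IDs that have both cases, and discard the rest?

Yes, that is what I want. If individual 2 has x = 0 at time = 2 and individual 3 has x = 1 at time = 1, that is not the sequence of x I want to keep. I want to "condition" it on the individual, if that makes sense!

Yes, I want the sequence only within an individual. So ID 2 and ID 3 do not stay. Only ID = 1 stays, because that individual has the right x values.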
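
The "within the same ID" requirement discussed in these comments can be checked on the question's original data without dplyr. The following is a rough base R sketch, not the answer's method; the names ids_start, ids_end, keep_ids and kept are made up for illustration.

```r
# Original example data from the question
df <- data.frame(
    ID   = c(1, 1, 2, 2, 3, 3, 4, 4),
    time = c(1, 2, 1, 2, 1, 2, 1, 2),
    x    = c(0, 1, 0, 0, 1, 1, 1, 0)
)

# IDs that have x == 0 at time 1
ids_start <- df$ID[df$time == 1 & df$x == 0]

# IDs that have x == 1 at time 2
ids_end <- df$ID[df$time == 2 & df$x == 1]

# Only IDs that satisfy both conditions are kept
keep_ids <- intersect(ids_start, ids_end)

kept <- df[df$ID %in% keep_ids, ]
kept
```

Here ID 2 satisfies only the first condition and ID 3 only the second, so neither survives the intersection; only ID 1 remains.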