R: select IDs that have specified column attributes


The data I am trying to select from looks like this:

   ID Field  Rank
8   6 Other  Prof
9   6 Other  Prof
13  7 Other Assoc
16  7 Other Assoc
17  7 Other  Prof
18  8 Other Assoc
19  8 Other Assoc
22  9 Other Assoc
23  9 Other Assoc
24  9 Other  Prof

Let's do this with base R (although plyr is beckoning). Edit: adjusted and tested against the newly provided dput output.

dfr <- df.promotion               # work on a copy so little has to change below
colnames(dfr) <- c("ID", "Rank")  # rename the columns for the same reason
# keep the IDs for which neither "Assoc" nor "Prof" is missing from their ranks
promotedIDs <- unique(dfr$ID)[sapply(unique(dfr$ID), function(curID) {
  sum(is.na(match(c("Assoc", "Prof"), dfr$Rank[dfr$ID == curID]))) == 0
})]
result <- dfr[dfr$ID %in% promotedIDs, ]
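
To try this without the original dput output, here is a minimal sketch. The data frame is retyped by hand from the table in the question (an assumption, not the asker's actual data), and the Field column is left out to match the two-column renaming above. The membership test is written with all(... %in% ...), which should be equivalent to the match/is.na check for this data.

```r
# Sample rows retyped from the question; not the asker's dput output
dfr <- data.frame(
  ID   = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  Rank = c("Prof", "Prof", "Assoc", "Assoc", "Prof",
           "Assoc", "Assoc", "Assoc", "Assoc", "Prof"),
  stringsAsFactors = FALSE
)

# TRUE for an ID only if both ranks occur among its rows
hasBoth <- sapply(unique(dfr$ID), function(curID) {
  all(c("Assoc", "Prof") %in% dfr$Rank[dfr$ID == curID])
})

promotedIDs <- unique(dfr$ID)[hasBoth]
result <- dfr[dfr$ID %in% promotedIDs, ]
print(result)
```

With these rows, only IDs 7 and 9 have both an Assoc and a Prof entry, so result keeps their six rows.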

You can use xtabs to tabulate the data by ID and Rank:
tab <- xtabs(~ID+Rank,dfr)
tab
   Rank
ID  Assoc Prof
  6     0    2
  7     2    1
  8     2    0
  9     2    1
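
One way to go from this table to the matching rows is to keep the IDs whose count is non-zero in every Rank column. The sketch below assumes the sample rows from the question, retyped by hand; the names bothRanks and selected are only illustrative.

```r
# Sample rows retyped from the question; not the asker's dput output
dfr <- data.frame(
  ID    = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  Field = "Other",
  Rank  = c("Prof", "Prof", "Assoc", "Assoc", "Prof",
            "Assoc", "Assoc", "Assoc", "Assoc", "Prof"),
  stringsAsFactors = FALSE
)

# Contingency table: one row per ID, one column per Rank
tab <- xtabs(~ ID + Rank, dfr)

# Row names (IDs, as character) whose counts are non-zero in every column
bothRanks <- rownames(tab)[apply(tab != 0, 1, all)]

# %in% compares the numeric ID with the character row names after coercion
selected <- subset(dfr, ID %in% bothRanks)
print(selected)
```

Testing tab != 0 row by row states the intent ("every rank present") more directly than multiplying the counts, although both pick the same IDs here.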

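Here is a fairly easy-to-follow approach that sticks with your first inclination, which was to use subset(). I create p, the ids of everyone who has been a professor. Then I create a, the ids of everyone who has been an associate. Then I use %in% to pick out everyone who has been both an associate and a professor, which gives me a set of keys that I can use to subset the original data.frame.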
p <- unique(subset(df.promotion, rank=="Prof")$id)
a <- unique(subset(df.promotion, rank=="Assoc")$id)

mySet <- a[a %in% p]
subset(df.promotion, id %in% mySet)
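
The a[a %in% p] step is a set intersection, so it could also be written with intersect(). A minimal sketch, assuming df.promotion has lowercase id and rank columns as in the code above, and using sample rows retyped by hand from the question:

```r
# Sample rows retyped from the question; column names id/rank are assumed
df.promotion <- data.frame(
  id   = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  rank = c("Prof", "Prof", "Assoc", "Assoc", "Prof",
           "Assoc", "Assoc", "Assoc", "Assoc", "Prof"),
  stringsAsFactors = FALSE
)

p <- unique(subset(df.promotion, rank == "Prof")$id)   # ids seen as Prof
a <- unique(subset(df.promotion, rank == "Assoc")$id)  # ids seen as Assoc

# ids that appear in both sets
mySet <- intersect(a, p)

promoted <- subset(df.promotion, id %in% mySet)
print(promoted)
```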

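Here is the usual one-liner using plyr. The code works by (a) splitting the data frame by id, and (b) keeping only those subsets that have more than one unique rank (which serves as a proxy for promotion).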
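Welcome to SO! I made the example reproducible for you and for others; next time try to do this yourself, e.g. with dput().

No need for mine or yours anymore :)

Thanks, I left yours in. Still getting used to posting and to R.

Thanks, I'll play around with it and let you know how it goes.

Would something like rownames(tab)[apply(tab != 0, 1, all)] be a bit clearer?

@BenBolker yes, that looks a lot better

@BenBolker thanks, I used the first example you provided with great success.

This works as well. Thanks.
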
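# xtabs approach, continued: keep the IDs whose counts are non-zero for every rank
# (the product of a row's counts is zero as soon as one rank is missing)
subset(dfr, ID %in% rownames(tab[as.logical(apply(tab, 1, prod)), ]))
   ID Field  Rank
13  7 Other Assoc
16  7 Other Assoc
17  7 Other  Prof
22  9 Other Assoc
23  9 Other Assoc
24  9 Other  Prof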

# plyr one-liner: split by id, keep the groups with more than one unique rank
library(plyr)
ddply(df.promotion, .(id), subset, length(unique(rank)) > 1)
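
The same split-and-filter idea can be sketched in base R with tapply(), for cases where plyr is not installed. This swaps in a different function from the answer's, and the sample rows are retyped by hand from the question; nRanks and keep are illustrative names.

```r
# Sample rows retyped from the question; column names id/rank are assumed
df.promotion <- data.frame(
  id   = c(6, 6, 7, 7, 7, 8, 8, 9, 9, 9),
  rank = c("Prof", "Prof", "Assoc", "Assoc", "Prof",
           "Assoc", "Assoc", "Assoc", "Assoc", "Prof"),
  stringsAsFactors = FALSE
)

# Number of distinct ranks per id (result is named by id)
nRanks <- tapply(df.promotion$rank, df.promotion$id,
                 function(r) length(unique(r)))

# ids (as character) with more than one distinct rank
keep <- names(nRanks)[nRanks > 1]

promoted <- df.promotion[df.promotion$id %in% keep, ]
print(promoted)
```

Like the plyr version, this treats "more than one distinct rank" as a proxy for promotion, so it would also keep an id with ranks other than Assoc and Prof if the data contained any.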