Removing erroneous codes and their associated records from an R data.table


I have a data.table in R, say dt, which looks like this:

> dt <- data.table(adr = c("A", "A", "A","A","A","A","A","B", "B", "C", "C", "C", "D", "E", "E"),
                  code=c("0001","0001","0001","0001","0001","0001","0001","0001","0001", "0002", "0002", "0002", "0003", "0003", "0003"),
                  num = c(1,67,875,467,986,34,987,876,785, 67,9078,45,907,451,987))
> dt
    adr code  num
 1:   A 0001    1
 2:   A 0001   67
 3:   A 0001  875
 4:   A 0001  467
 5:   A 0001  986
 6:   A 0001   34
 7:   A 0001  987
 8:   B 0001  876
 9:   B 0001  785
10:   C 0002   67
11:   C 0002 9078
12:   C 0002   45
13:   D 0003  907
14:   E 0003  451
15:   E 0003  987

How can I achieve this in R using data.table?

I built your dt with data.frame() instead of data.table() so that I would not have to load another package, but you can do it as follows:

require(dplyr)

dt <- dt %>% group_by(code, adr) %>% mutate(count = n()) %>% group_by(code) %>% filter(count == max(count)) %>% select(-count)

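dt = dt[dt$adr != "B", ]
What happens if there is a tie for the most frequent adr, or if the most frequent adr accounts for no more than 50% of the rows?
If there is a tie, the first one is correct.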
data.table version:
dt2 0.5*.N],by=code][,N:=NULL]
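The line above is incomplete, but the pieces that survive (`0.5*.N`, `by=code`, `N:=NULL`) suggest a filter that keeps an adr only when it accounts for more than half of the rows of its code. Here is a minimal sketch under that assumption. It is not the original author's code, and the helper columns `n_pair` and `n_code` are names made up for illustration.

```r
library(data.table)

dt <- data.table(
  adr  = c("A", "A", "A", "A", "A", "A", "A", "B", "B", "C", "C", "C", "D", "E", "E"),
  code = c(rep("0001", 9), rep("0002", 3), rep("0003", 3)),
  num  = c(1, 67, 875, 467, 986, 34, 987, 876, 785, 67, 9078, 45, 907, 451, 987)
)

# Work on a copy so the original dt is not modified by reference.
dt2 <- copy(dt)

# n_pair: number of rows for each (code, adr) pair (hypothetical helper column).
dt2[, n_pair := .N, by = .(code, adr)]
# n_code: number of rows for each code (hypothetical helper column).
dt2[, n_code := .N, by = code]

# Keep only rows whose adr covers a strict majority of its code's rows,
# then drop the helper columns.
dt2 <- dt2[n_pair > 0.5 * n_code]
dt2[, c("n_pair", "n_code") := NULL]

print(dt2)
```

Note that with this rule a code whose addresses are tied, or whose most frequent adr covers 50% or less of the rows, loses all of its rows, which is the situation raised in the comments.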
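This gives a result in which the erroneous records are still present.
@Jack If what you want is the mode (the most common value), this will do it:
dt[, {uadr = unique(adr); .SD[which(adr == uadr[which.max(tabulate(match(adr, uadr)))]), ]}, by = c('code')]
This may help if there are more than two alternative addresses.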
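To make the tie rule from the comments concrete (the first adr wins), here is a small sketch that spells out the same "keep the mode" idea as a count followed by a join. The table `tie_dt` and the code "0009" are invented for illustration. It relies on `by` returning groups in order of first appearance and on `which.max` returning the first maximum.

```r
library(data.table)

# Hypothetical data with a tie: X and Y each appear twice for code "0009".
tie_dt <- data.table(
  adr  = c("X", "Y", "X", "Y", "Z"),
  code = rep("0009", 5),
  num  = 1:5
)

# Count rows per (code, adr); groups come back in order of first appearance.
counts <- tie_dt[, .N, by = .(code, adr)]

# For each code, take the first adr that reaches the maximal count.
keep <- counts[, .(adr = adr[which.max(N)]), by = code]

# Join back so that only rows with the selected (code, adr) pair remain.
result <- tie_dt[keep, on = .(code, adr)]

print(result)
```

Here X is kept because it appears before Y in the data. A filter of the form `count == max(count)` would instead keep the rows of both X and Y.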