Subsetting in R with setDT to remove values


Hello, I am using RStudio to filter out the wine varieties that appear fewer than 5000 times in my dataset.

I have run the code below:

# Create a new data frame with only the varieties that occur more than 5000 times
wineVar <- setDT(wineNew)[, if(.N > 5000) .SD, by = variety]
# List the unique varieties to show there are only 5
unique(wineVar$variety)

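Is there a way to remove these varieties completely? This is causing problems with my training set: the training set can still see the values of the removed varieties, but there is no data for them.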

I think this is what you are looking for. You were almost there:

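library(data.table)
# Convert to a data.table by reference, then keep varieties with more than 5000 rows
wineVar <- setDT(wineNew)
wineVar <- wineVar[, .SD[.N > 5000], by = variety]
# Rebuild the factor so that levels with no remaining rows are dropped
wineVar[, variety := as.factor(as.character(variety))]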

Just re-factor the column, as is done for wineVar$variety above.

Thank you very much, that solved my problem.

Another, slightly simpler option:
wineVar[, variety := droplevels(variety)]
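To see why the removed varieties remain visible, here is a minimal sketch on made-up data. `wineToy`, its `points` column and the threshold of 2 are hypothetical stand-ins for `wineNew` and 5000, and the sketch assumes `variety` is stored as a factor, which is what the symptom in the question suggests.

```r
library(data.table)

# Hypothetical toy data standing in for wineNew; variety is assumed to be a factor
wineToy <- data.table(
  variety = factor(c("Pinot Noir", "Pinot Noir", "Pinot Noir",
                     "Merlot",
                     "Riesling", "Riesling", "Riesling")),
  points  = c(90, 88, 91, 85, 87, 89, 86)
)

# Keep only varieties with more than 2 rows (2 plays the role of 5000)
wineKept <- wineToy[, if (.N > 2) .SD, by = variety]

# The Merlot rows are gone, but the factor still carries the unused level
levelsBefore <- levels(wineKept$variety)
print(levelsBefore)
print(table(wineKept$variety))

# Drop the unused level by reference
wineKept[, variety := droplevels(variety)]
levelsAfter <- levels(wineKept$variety)
print(levelsAfter)
```

Subsetting rows does not touch the factor's level set, so anything that enumerates levels (for example `table()` or a model fitted on the training set) still sees the removed varieties until the levels are dropped.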