R: Removing a table from a data set

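I'm having trouble removing selected data from a data set. I have a data set (Prod) and another table of selected rows (toremove). I am trying to remove the toremove rows from the original set.

I tried setdiff, but although rows appear to be cut (judging by the variables shown in the environment), the selected data is not removed. The result was stored in a new data set, Prod1.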

The popular dplyr package also has a setdiff function. However, it requires identical data structures; in your case, the same factor levels:
## factors to character vectors if needed...
# idx <- sapply(Prod, class) == "factor"
# Prod[idx] <- sapply(Prod[idx], as.character)
# toremove[idx] <- sapply(toremove[idx], as.character)
library(dplyr)
setdiff(Prod, toremove)
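If matching whole rows turns out to be too strict, a key-based removal is one possible alternative. The following is a minimal sketch using dplyr's anti_join on made-up toy data; the data frames prod_demo and remove_demo and the choice of CountryCode and ItemCode as keys are assumptions for illustration, not the poster's actual Prod.

```r
library(dplyr)

# Hypothetical toy data standing in for Prod and toremove
prod_demo <- data.frame(
  CountryCode = c(5000L, 5100L, 5400L),
  ItemCode    = c(116L, 116L, 1717L),
  Y1961       = c(2.71e8, 1.0e7, 2.64e8),
  stringsAsFactors = FALSE
)
remove_demo <- data.frame(
  CountryCode = c(5000L, 5400L),
  ItemCode    = c(116L, 1717L),
  stringsAsFactors = FALSE
)

# Keep the rows of prod_demo that have no key match in remove_demo
kept <- anti_join(prod_demo, remove_demo, by = c("CountryCode", "ItemCode"))
print(kept)
```

Because only the key columns are compared, differences in the other columns (or in factor levels outside the keys) do not affect which rows are dropped.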

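Prod[!(rownames(Prod) %in% rownames(toremove)), ] maybe? Though I don't see why you would create a whole new data set in this case if all you need is the row names.

I create the new data set Prod1 only as good practice, so I can make sure it works properly. I don't need the new set in the final script.

Either way, I would not recommend relying on row names, because they tend to get messed up after the first subset. You are better off with some other, more reliable index.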
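The row-name caveat raised in the comments can be illustrated with a small base R sketch. The toy data and the helper key() are hypothetical; the sketch assumes that CountryCode plus ItemCode identify a row, which may not hold for the real Prod.

```r
# Hypothetical toy data; not the poster's Prod
prod_demo <- data.frame(
  CountryCode = c(5000L, 5100L, 5400L, 5400L),
  ItemCode    = c(116L, 116L, 1717L, 1817L),
  stringsAsFactors = FALSE
)

# Rows 1 and 3 are selected for removal, then the row names are reset
# (as happens when a table is rebuilt or re-imported)
remove_demo <- prod_demo[c(1, 3), ]
rownames(remove_demo) <- NULL  # row names are now "1" and "2"

# Row-name matching now drops rows 1 and 2 instead of rows 1 and 3
by_rowname <- prod_demo[!(rownames(prod_demo) %in% rownames(remove_demo)), ]

# Matching on a key built from the identifying columns is not affected
key <- function(df) paste(df$CountryCode, df$ItemCode, sep = "_")
kept <- prod_demo[!(key(prod_demo) %in% key(remove_demo)), ]

print(by_rowname)
print(kept)
```

The toremove table shown in this thread has row names 1 to 5, so row-name matching against it would remove the first five rows of Prod regardless of their content.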
toremove <- structure(list(CountryCode = c(5000L, 5400L, 5300L, 5400L, 5200L
), Country = structure(c(4L, 3L, 2L, 3L, 1L), .Label = c("Americas + (Total)", 
"Asia + (Total)", "Europe + (Total)", "World + (Total)"), class = "factor"), 
ItemCode = c(116L, 1717L, 1817L, 1817L, 1717L), Item = structure(c(3L, 
2L, 1L, 1L, 2L), .Label = c("Cereals (Rice Milled Eqv) + (Total)", 
"Cereals,Total + (Total)", "Potatoes"), class = "factor"), 
ElementGroup = c(51L, 51L, 51L, 51L, 51L), ElementCode = c(5510L, 
5510L, 5510L, 5510L, 5510L), Element = structure(c(1L, 1L, 
1L, 1L, 1L), .Label = "Production", class = "factor"), Unit = structure(c(1L, 
1L, 1L, 1L, 1L), .Label = "tonnes", class = "factor"), Y1961 = c(2.71e+08, 
2.64e+08, 2.63e+08, 2.63e+08, 2.28e+08), Y1962 = c(2.53e+08, 
2.81e+08, 2.78e+08, 2.81e+08, 2.4e+08), Y1963 = c(2.7e+08, 
2.5e+08, 2.95e+08, 2.49e+08, 2.62e+08), Y1964 = c(2.85e+08, 
2.96e+08, 3.1e+08, 2.96e+08, 2.49e+08)), .Names = c("CountryCode", 
"Country", "ItemCode", "Item", "ElementGroup", "ElementCode", 
"Element", "Unit", "Y1961", "Y1962", "Y1963", "Y1964"), class = "data.frame", row.names = c(NA, 
-5L))

# Answer #1 ---------------------------------------------------------------
# Works only if toremove still carries the row names its rows had in Prod
AnswerinComments <- Prod[!(rownames(Prod) %in% rownames(toremove)), ]

# Answer #2 ---------------------------------------------------------------
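library(sqldf)
# SQLite has no DELETE ... JOIN and sqldf leaves Prod itself untouched,
# so select the non-matching rows instead (add ItemCode etc. to the keys if needed)
AnotherWay <- sqldf("SELECT a.* FROM Prod a WHERE NOT EXISTS (
                       SELECT 1 FROM toremove b
                       WHERE a.CountryCode = b.CountryCode
                         AND a.ElementCode = b.ElementCode)")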

# Answer #3 ---------------------------------------------------------------
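# 'combined' rather than 'all', to avoid masking base::all()
combined <- rbind(Prod, toremove)
# Keep only rows that occur exactly once in the combined table,
# i.e. rows of Prod with no identical copy in toremove
YetAnother <- combined[!duplicated(combined, fromLast = FALSE) &
                       !duplicated(combined, fromLast = TRUE), ]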
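One caveat of the rbind/duplicated approach is worth checking before relying on it: it keeps only rows that occur exactly once, so rows that are already duplicated inside the original table disappear as well. A minimal sketch on hypothetical toy data (prod_demo and remove_demo are made up for illustration):

```r
# Hypothetical toy data: id 2 appears twice in the original table
prod_demo <- data.frame(
  id    = c(1L, 2L, 2L, 3L),
  value = c("a", "b", "b", "c"),
  stringsAsFactors = FALSE
)
remove_demo <- data.frame(id = 3L, value = "c", stringsAsFactors = FALSE)

combined <- rbind(prod_demo, remove_demo)

# TRUE only for rows that have no identical copy anywhere in 'combined'
occurs_once <- !duplicated(combined) & !duplicated(combined, fromLast = TRUE)
result <- combined[occurs_once, ]

print(result)
```

Here the row with id 3 is removed as intended, but both copies of id 2 are dropped too, even though they were never listed for removal. Rows that appear only in the removal table would likewise end up in the result.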
