rbind a list of an arbitrary number of data frames
I have a list of data frames, each with some overlapping columns. The number of data frames in the list is unknown. How can I efficiently bind the data frames together in base R and fill the non-overlapping columns with zeros?

Example data:
x <- data.frame(a=1:2, b=1:2, c=1:2)
y <- data.frame(a=1:2, r=1:2, f=1:2)
z <- data.frame(b=1:3, c=1:3, v=1:3, t=c("A", "A", "D"))
L1 <- list(x, y, z)
Desired output:
  a b c f r t v
1 1 1 1 0 0 0 0
2 2 2 2 0 0 0 0
3 1 0 0 1 1 0 0
4 2 0 0 2 2 0 0
5 0 1 1 0 0 A 1
6 0 2 2 0 0 A 2
7 0 3 3 0 0 D 3
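Pad each data frame with the columns it is missing, then rbind them:
allnames <- unique(unlist(lapply(L1, names)))
do.call(rbind, lapply(L1, function(df) {
  # Columns present somewhere in the list but absent from this data frame
  not <- allnames[!allnames %in% names(df)]
  df[, not] <- 0
  df
}))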
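As a hedged variant of the answer above, the sketch below wraps the same idea in a helper with a configurable fill value and sorts the columns alphabetically to match the layout shown in the question. The name `rbind_fill0` and its `fill` argument are hypothetical, introduced only for illustration. The resulting column types assume R 4.0 or later, where `data.frame()` no longer converts strings to factors by default.

```r
# Example data from the question
x <- data.frame(a = 1:2, b = 1:2, c = 1:2)
y <- data.frame(a = 1:2, r = 1:2, f = 1:2)
z <- data.frame(b = 1:3, c = 1:3, v = 1:3, t = c("A", "A", "D"))
L1 <- list(x, y, z)

# rbind_fill0 is a hypothetical helper name, not part of base R
rbind_fill0 <- function(dfs, fill = 0) {
  # Union of all column names, sorted alphabetically
  allnames <- sort(unique(unlist(lapply(dfs, names))))
  padded <- lapply(dfs, function(df) {
    # Add each missing column, repeated once per row
    for (nm in setdiff(allnames, names(df))) df[[nm]] <- rep(fill, nrow(df))
    # Put the columns in the same order in every data frame
    df[, allnames, drop = FALSE]
  })
  out <- do.call(rbind, padded)
  rownames(out) <- NULL
  out
}

res <- rbind_fill0(L1)
print(res)
```

Because the numeric fill value is combined with the character column `t`, that column ends up as character, which is consistent with the mixed `0`/`A`/`D` column in the desired output. Passing `fill = NA` instead would keep each column's type intact.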
I have a function that does this. I've modified it here so that the fill value can be specified:
RBIND <- function(datalist, keep.rownames = TRUE, fill = NA) {
  # Use plain rbind() when every data frame has the same columns
  Len <- sapply(datalist, ncol)
  if (all(diff(Len) == 0)) {
    temp <- names(datalist[[1]])
    if (all(sapply(datalist, function(x) names(x) %in% temp))) tryme <- "basic"
    else tryme <- "complex"
  }
  else tryme <- "complex"
  almost <- switch(
    tryme,
    basic = { do.call("rbind", datalist) },
    complex = {
      # Union of all column names across the list
      Names <- unique(unlist(lapply(datalist, names)))
      out <- lapply(seq_along(datalist), function(x) {
        # Matrix pre-filled with `fill`, one column per name
        emptyMat <- matrix(fill, nrow = nrow(datalist[[x]]), ncol = length(Names))
        colnames(emptyMat) <- Names
        # Copy this data frame's values into its matching columns
        emptyMat[, match(names(datalist[[x]]),
                         colnames(emptyMat))] <- as.matrix(datalist[[x]])
        emptyMat
      })
      do.call("rbind", out)
    })
  Final <- as.data.frame(almost, row.names = 1:nrow(almost))
  # A matrix holds a single type, so re-infer each column's type
  Final <- data.frame(lapply(Final, function(x) type.convert(as.character(x))))
  if (isTRUE(keep.rownames)) {
    # Reuse the original row names, made unique across data frames
    row.names(Final) <- make.unique(unlist(lapply(datalist, row.names)))
  }
  Final
}
Much better than my monster of a function :-)
@Ananda I'd encourage you to leave your function up, since it is a learning experience for others. A future searcher may run into a related problem that your function happens to solve.
See plyr::rbind.fill.
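A minimal sketch of the `plyr::rbind.fill` suggestion, assuming the plyr package is installed (it is not part of base R, so this steps outside the base-only constraint in the question). `rbind.fill` pads missing columns with `NA`, so the zero fill is applied in a separate step afterwards. The `requireNamespace` guard is only there so the snippet does nothing when plyr is absent.

```r
# Example data from the question
x <- data.frame(a = 1:2, b = 1:2, c = 1:2)
y <- data.frame(a = 1:2, r = 1:2, f = 1:2)
z <- data.frame(b = 1:3, c = 1:3, v = 1:3, t = c("A", "A", "D"))
L1 <- list(x, y, z)

# Assumption: plyr is installed, e.g. via install.packages("plyr")
if (requireNamespace("plyr", quietly = TRUE)) {
  # rbind.fill accepts a list of data frames and fills gaps with NA
  out <- plyr::rbind.fill(L1)

  # Replace the NA padding with zeros, as the question asks
  out[is.na(out)] <- 0
  print(out)
}
```

Note that the replacement step would also overwrite any `NA` values that were already present in the input data, so it is only appropriate when the inputs contain no genuine missing values.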