merge (base R) - unexpected results
I am trying to merge two data frames in R using the merge function. The two data frames are merged on the common column advBucket, but only the first factor level of advBucket appears in the new data frame, and I don't understand why.
library("dplyr")
dmB <- c(0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 15.0, 20.0, 30.0, 100.0, 200.0)
ordersDM$advBucket <- cut(ordersDM$adv, breaks=dmB, include.lowest=TRUE, right=TRUE)
temp1 <- summarize( group_by(ordersDM, advBucket),
eISmedian = median(is),
eISmean = mean(is))
orders1 <- merge(ordersDM, temp1, by = "advBucket", all=TRUE)
Unexpected result: ordersDM$advBucket contains only (0.5,1], and ordersDM$adv contains only values between 0.5 and 1.0.
identical( levels(ordersDM$advBucket), levels( temp1$advBucket) )
[1] TRUE
dput(head(ordersDM$advBucket))
structure(c(13L, 9L, 13L, 13L, 13L, 6L), .Label = c("[0,0.5]",
"(0.5,1]", "(1,2]", "(2,3]", "(3,4]", "(4,5]", "(5,6]", "(6,8]",
"(8,10]", "(10,15]", "(15,20]", "(20,30]", "(30,100]", "(100,200]"
), class = "factor")
dput(head(temp1))
structure(list(advBucket = structure(1:6, .Label = c("[0,0.5]",
"(0.5,1]", "(1,2]", "(2,3]", "(3,4]", "(4,5]", "(5,6]", "(6,8]",
"(8,10]", "(10,15]", "(15,20]", "(20,30]", "(30,100]", "(100,200]"
), class = "factor"), eISmedian = c(0, -0.1612095, -4.8167, -4.478417,
-19.447492, -20.224064), eISmean = c(-2.28172053945819, -6.18051401299694,
-10.8404419365303, -16.4115004132139, -30.8983449604262, -31.3046641767241
)), .Names = c("advBucket", "eISmedian", "eISmean"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -6L))
Following the suggestion in the comments, the following works as expected:
orders1 <- full_join(ordersDM, temp1, by = "advBucket")
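Can you make your question fully reproducible? For example, what is ordersDM? summarize isn't in any default package. It's in two of the packages I have loaded, and I suspect your problem may come from a third. What does identical(levels(ordersDM$advBucket), levels(temp1$advBucket)) return? At a minimum, we need dput(head(ordersDM$advBucket)) and dput(head(temp1)).

It looks like you may be using dplyr? If so, I believe it drops unused factor levels when you call summarize. Try full_join(ordersDM, temp1) instead of base::merge. I bet you'll get a warning about joining factors with different levels, but the result should look as expected?

full_join(ordersDM, temp1, by = "advBucket") works as expected, without any warnings. Thank you.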
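The factor-level question raised in the comments can be checked in isolation. The sketch below uses made-up values, and left, right, k, f and g are hypothetical names. It drops an unused level from one copy of a factor, so that identical() on the levels is FALSE, and then merges on that key with base R. In this toy case merge still pairs rows by label, which suggests that differing level sets alone need not collapse the buckets.

```r
# A factor with one unused level, "(30,100]", and a copy with that level dropped
f <- cut(c(0.2, 0.7, 0.9, 25), breaks = c(0, 0.5, 1, 30, 100), include.lowest = TRUE)
g <- droplevels(f)

# The level sets now differ
print(identical(levels(f), levels(g)))

# left keeps the full level set, right uses the reduced one
left  <- data.frame(k = f, x = seq_along(f))
right <- data.frame(k = factor(levels(g), levels = levels(g)),
                    y = seq_along(levels(g)))

# Base merge on the factor key
m <- merge(left, right, by = "k", all = TRUE)
print(m)
```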