dplyr 0.3 cannot inner join data.tables?

I have loaded dplyr (0.3) and data.table (1.9.3).

Here are the data sets: two data.tables and two data.frames. Both pairs hold the same content.

DT_1 = data.table(x = rep(c("a","b","c"), each = 3), y = c(1,3,6), v = 1:9)
DT_2 = data.table(V1 = c("b","c"),foo = c(4,2))

DT_1_df = data.frame(x = rep(c("a","b","c"), each = 3), y = c(1,3,6), v = 1:9)
DT_2_df = data.frame(V1 = c("b","c"),foo = c(4,2))

The data.table way

Doing an inner join of the two data.tables the data.table way gives the following result:

setkey(DT_1, x); setkey(DT_2, V1)
DT_1[DT_2]
  x y v foo
1: b 1 4   4
2: b 3 5   4
3: b 6 6   4
4: c 1 7   2
5: c 3 8   2
6: c 6 9   2

dplyr 0.3 inner_join on data.tables

Using dplyr's inner_join on the two data.tables produces an error:

inner_join(DT_1, DT_2, by=("x"="V1"))
Error in setkeyv(x, by$x) : some columns are not in the data.table: V1
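
A side note on the call above: it is written by=("x"="V1"), without c(). A plausible reading of this particular error message is that R evaluates the parenthesised expression as an assignment and passes the plain string "V1" as by, so the join looks for a column V1 in DT_1. This is an inference from base R semantics, not something confirmed against the dplyr source; the variable names by_parens and by_vector are made up for illustration.

```r
# Base R only: compare the two spellings of the `by` argument.
by_parens <- ("x" = "V1")   # parenthesised assignment: sets x to "V1", value is "V1"
by_vector <- c("x" = "V1")  # named character vector: name "x", value "V1"

print(by_parens)            # a plain, unnamed string
print(names(by_parens))     # no names attached
print(names(by_vector))     # the left-hand column name is kept as the name
```

With c(), the mapping from x to V1 survives, and the error changes to the "must be on same key" message shown in the next case.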

dplyr 0.3 inner_join on a data.table and a data.frame

Joining a data.table with a data.frame produces a different error:

inner_join(DT_1, DT_2_df, by = c("x" = "V1"))
Error: Data table joins must be on same key

dplyr 0.3 inner_join on data.frames

However, inner_join works fine on the data.frames:

inner_join(DT_1_df, DT_2_df, by = c("x" = "V1"))
  x y v foo
1 b 1 4   4
2 b 3 5   4
3 b 6 6   4
4 c 1 7   2
5 c 3 8   2
6 c 6 9   2

Can anyone explain why this happens?

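Posting my findings here for completeness. On inspection, dplyr's join functionality for data.tables currently seems to be limited. Quoting: "Currently join variables must be the same on the left and the right." The test below seems to confirm this: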

library(dplyr); library(data.table)
DT_1 = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
DT_2 = data.table(V1=c("b","c"),foo=c(4,2)) # first column is named V1
DT_2b = data.table(x=c("b","c"),foo=c(4,2)) # same data, but first column is named x, as in DT_1

inner_join(DT_1, DT_2b, by= "x")
Source: local data table [6 x 4]
  x y v foo
1 b 1 4   4
2 b 3 5   4
3 b 6 6   4
4 c 1 7   2
5 c 3 8   2
6 c 6 9   2

inner_join(DT_1, DT_2, by = c("x" = "V1"))
Error: Data table joins must be on same key
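
Given that the join succeeds when both tables use the same column name, one possible workaround is to rename the key column on a copy before joining. This is only a sketch under that assumption: DT_2_renamed is a made-up name, and setnames() and copy() are data.table functions, not part of dplyr.

```r
library(data.table)
library(dplyr)

DT_1 <- data.table(x = rep(c("a", "b", "c"), each = 3), y = c(1, 3, 6), v = 1:9)
DT_2 <- data.table(V1 = c("b", "c"), foo = c(4, 2))

# setnames() modifies by reference, so rename on a copy to leave DT_2 untouched.
DT_2_renamed <- setnames(copy(DT_2), "V1", "x")

# Both tables now share the join column name "x".
joined <- inner_join(DT_1, DT_2_renamed, by = "x")
print(joined)
```

The trade-off is an extra copy of the right-hand table, which is negligible here but may matter for large data.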

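So the most obvious explanation is that this is a bug? @hadley: I suspected it was a bug too, unless it is intended design, which seems unlikely. dplyr and data.table are both very useful packages. It would be great if the packages' functions worked seamlessly on both data.frames and data.tables. Many thanks.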