Calculating Mahalanobis distance by group using data.table in R

I have the sample data below (d1 and d2), and I am trying to calculate the Mahalanobis distance by the variable carb and then append it to d1.

library(data.table)
library(StatMatch)  # provides mahalanobis.dist()

# keep two groups of carb
df <- as.data.table(mtcars)[carb %in% c(2, 4), .(mpg, carb, vs)]
d1 <- df[vs == 0, .(mpg, carb)]
d2 <- df[vs == 1, .(mpg, carb)]

# for carb == 2
md2 <- mahalanobis.dist(d1[carb == 2, mpg], d2[carb == 2, mpg])

             1        2        3         4         5
1 1.0416378 1.626417 1.681240 0.9502661 0.2923896
2 0.7492482 1.334027 1.388850 0.6578765 0.5847791
3 2.1380986 2.722878 2.777701 2.0467269 0.8040713
4 2.1380986 2.722878 2.777701 2.0467269 0.8040713
5 0.4934074 1.078186 1.133010 0.4020356 0.8406200
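For a single variable such as mpg, the Mahalanobis distance reduces to the absolute difference divided by a standard deviation. The base R sketch below illustrates this. It assumes the variance is estimated from the pooled values of both groups, which appears consistent with the values printed in this thread but is an assumption about how mahalanobis.dist behaves when no covariance is supplied. The helper name md_1d is hypothetical.

```r
# Hypothetical helper: 1-D Mahalanobis distance between every pair (x[i], y[j]).
# Assumption: the variance is estimated from the pooled values c(x, y).
md_1d <- function(x, y) {
  abs(outer(x, y, "-")) / sd(c(x, y))
}

cars2 <- mtcars[mtcars$carb == 2, ]
m <- md_1d(cars2$mpg[cars2$vs == 0], cars2$mpg[cars2$vs == 1])
dim(m)  # one row per vs == 0 car, one column per vs == 1 car
```

With several variables the full covariance matrix is needed, so this shortcut only applies to the one-column case used here.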

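You don't need separate datasets. Just compute the distances within the original dataset, subsetting by vs inside each carb group: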

df[, mahalanobis.dist(mpg[vs == 0], mpg[vs == 1]), keyby = carb]
#    carb        V1
# 1:    2 1.0416378
# 2:    2 1.6264169
# 3:    2 1.6812399
# 4:    2 0.9502661
# 5:    2 0.2923896
# 6:    2 0.7492482
# 7:    2 1.3340273
# 8:    2 1.3888504
# 9:    2 0.6578765
# ...
In fact, you can run this directly on mtcars, without creating any new datasets:

as.data.table(mtcars)[carb %in% c(2, 4), 
                      mahalanobis.dist(mpg[vs == 0], mpg[vs == 1]), 
                      keyby = carb]
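The grouped result above is one unlabeled column V1, so it is hard to tell which pair of rows each distance belongs to. A possible sketch that keeps the pair labels is shown below. It assumes that the rows of the matrix returned by mahalanobis.dist correspond to its first argument and the columns to its second, and the column names car, car_vs0, car_vs1 and dist are made up for illustration.

```r
library(data.table)
library(StatMatch)

# keep the car names so every distance can be traced back to a pair of rows
dt <- as.data.table(mtcars, keep.rownames = "car")[carb %in% c(2, 4)]

res <- dt[, {
  x <- mpg[vs == 0]
  y <- mpg[vs == 1]
  m <- mahalanobis.dist(x, y)  # assumed layout: rows = x, columns = y
  # as.vector() flattens column by column, so the vs == 0 label varies fastest
  list(car_vs0 = rep(car[vs == 0], times = length(y)),
       car_vs1 = rep(car[vs == 1], each = length(x)),
       dist    = as.vector(m))
}, keyby = carb]

head(res)
```

From this long format, the distances for any vs == 0 car can be joined or reshaped back onto d1 as needed.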

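Well, I would be glad to know what output you want. Also, you don't need with=FALSE at all.
@David: Thanks for the suggestion. The desired output is what I calculated separately for each value of carb.
Thank you, David. This is exactly what I wanted.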