R: combining two loop constructs to get a matrix output

r dataframe matrix

I am using two closely related formulas in R. I would like to know whether B1 and B2 can be combined to get the desired matrix output shown below.

z <- "group    y1    y2
1 1         2     3
2 1         3     4
3 1         5     4
4 1         2     5
5 2         4     8
6 2         5     6
7 2         6     7
8 3         7     6
9 3         8     7
10 3        10     8
11 3         9     5
12 3         7     6"

dat <- read.table(text = z, header = TRUE)

library(dplyr)  # group_split() and %>%
library(purrr)  # map()

(B1 = Reduce("+", group_split(dat, group, .keep = FALSE) %>%
  map(~ nrow(.)*(colMeans(.)-colMeans(dat[-1]))^2)))

#     y1       y2 
#61.86667 19.05000

(B2 = Reduce("+",group_split(dat, group, .keep = FALSE) %>%
              map(~ nrow(.)*prod(colMeans(.)-colMeans(dat[-1])))))

# 24.4

Maybe something like this:

mat <- matrix(B2, length(B1), length(B1))
diag(mat) <- B1
mat
#      [,1]  [,2]
#[1,] 61.87 24.40
#[2,] 24.40 19.05
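With two response columns, B1 and B2 can also be read as the diagonal and off-diagonal entries of a single matrix: the sum over groups of the group size times the outer product of the mean-difference vector with itself. Here is a minimal base R sketch under that reading. The names grand, d and B are illustrative only, and split()/lapply() stand in for group_split()/map().

```r
# Same data as in the question, built directly as a data frame
dat <- data.frame(
  group = rep(1:3, times = c(4, 3, 5)),
  y1 = c(2, 3, 5, 2, 4, 5, 6, 7, 8, 10, 9, 7),
  y2 = c(3, 4, 4, 5, 8, 6, 7, 6, 7, 8, 5, 6)
)

# Column means of the full data (y1, y2)
grand <- colMeans(dat[-1])

# For each group: n_g * d %o% d, where d = group means - overall means;
# then add the per-group 2 x 2 matrices together
B <- Reduce(`+`, lapply(split(dat[-1], dat$group), function(g) {
  d <- colMeans(g) - grand
  nrow(g) * outer(d, d)
}))

B  # diagonal should match B1, off-diagonal entries should match B2
```

Because outer(d, d) already contains both the squared terms and the cross-product, there is no separate step to fill the diagonal afterwards.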

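We can also do this in a single chain, without recomputing anything. One advantage of using sum over + in Reduce is that sum can handle missing values through its na.rm argument, whereas + returns NA as soon as any NA is present, because NA propagates through arithmetic.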
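A small illustration of that difference, using a made-up list of three values (the names x, with_plus and total are illustrative only):

```r
# Three values, one of them missing
x <- list(1, NA, 3)

# Reduce with `+` adds pairwise, so the NA propagates to the result
with_plus <- Reduce(`+`, x)

# sum() can drop the missing value before adding
total <- sum(unlist(x), na.rm = TRUE)

with_plus  # NA
total      # 4
```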

library(dplyr)
dat %>%
  # group by 'group'
  group_by(group) %>%
  # count the rows per group ('n') and, for y1 and y2, take the difference
  # between the group mean and the mean of the full data column
  summarise(n = n(),
            across(c(y1, y2), ~ mean(.x) - mean(dat[[cur_column()]])),
            .groups = 'drop') %>%
  # cross-product term: y1 * y2 * n
  transmute(y1y2 = y1 * y2 * n,
            # squared terms: n * y1^2 and n * y2^2
            across(c(y1, y2), list(new1 = ~ n * .x^2))) %>%
  # column-wise sums, then replicate the 'y1y2' column for the second
  # off-diagonal cell (a lambda is used because passing na.rm through
  # across() is deprecated in recent dplyr versions)
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)), y1y2new = y1y2) %>%
  # rearrange the columns: y1_new1, y1y2, y1y2new, y2_new1
  select(c(2, 1, 4, 3)) %>%
  # unlist to a vector
  unlist() %>%
  # fill a matrix with 2 rows and 2 columns, column by column
  matrix(2, 2)
#         [,1]  [,2]
#[1,] 61.86667 24.40
#[2,] 24.40000 19.05

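Do you think B1 and B2 need to be computed separately? (That is the question.)

@rnorouzian You can do Reduce("+", group_split(dat, group, .keep = FALSE) %>% map(~ c(nrow(.) * (colMeans(.) - colMeans(dat[-1]))^2, nrow(.) * prod(colMeans(.) - colMeans(dat[-1]))))) to perform all the calculations at once. But you would still need some manipulation to get the output into the desired format.

The two operations are almost identical, so you can do the grouping once and get the output. Please check the updated solution.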