Chi-squared test on a wide data frame in R


I have data like this:

ID  gamesAlone  gamesWithOthers  gamesRemotely  tvAlone  tvWithOthers  tvRemotely
1   1                                                    1
2                                1                       1
3                                1              1
4                                1              1
5                                1                       1
6                                1              1
7                                1              1
8               1                                        1
9   1                                                                   1

I would like code that does two things:

First, convert the data into a tidy contingency table, like this:

        Alone   WithOthers   Remotely
games   2       1            6
tv      4       4            1

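Second, use a chi-squared test to check whether these activities (games and TV) differ in their social context.

Here is the code that generates the data frame: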

data<-data.frame(ID=c(1,2,3,4,5,6,7,8,9),
             gamesAlone=c(1,NA,NA,NA,NA,NA,NA,NA,1),
             gamesWithOthers=c(NA,NA,NA,NA,NA,NA,NA,1,NA),
             gamesRemotely=c(NA,1,1,1,1,1,1,NA,NA),
             tvAlone=c(NA,NA,1,1,NA,1,1,NA,NA),
             tvWithOthers=c(1,1,NA,NA,1,NA,NA,1,NA),
             tvRemotely=c(NA,NA,NA,NA,NA,NA,NA,NA,1))

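This gets you to the contingency table in the form you gave. Suggestion: call the data frame data1 rather than data, to avoid confusion (data is also the name of a built-in R function).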
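library(dplyr)
library(tidyr)

# Copy of the question's data frame under a less ambiguous name
data1 <- data

data1_table <- data1 %>%
  # Wide to long: one row per ID / column-name pair
  gather(key, value, -ID) %>%
  # Split names such as "tvAlone" or "gamesRemotely" into activity and context
  mutate(activity = ifelse(grepl("^tv", key), substring(key, 1, 2), substring(key, 1, 5)),
         context = ifelse(grepl("^tv", key), substring(key, 3), substring(key, 6))) %>%
  group_by(activity, context) %>%
  # Count the 1s in each cell, ignoring NA
  summarise(n = sum(value, na.rm = TRUE)) %>%
  ungroup() %>%
  # Long to wide: one column per context
  spread(context, n)

data1_table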

# A tibble: 2 x 4
  activity Alone Remotely WithOthers
*    <chr> <dbl>    <dbl>      <dbl>
1    games     2        6          1
2       tv     4        1          4
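
The table above covers only the first half of the question. For the chi-squared part, a minimal sketch might look like the following. It assumes a plain data.frame stand-in for data1_table, filled with the counts printed above, so that it runs without dplyr or tidyr; the same lines should apply to the real tibble. The names m and res are illustrative only.

```r
# Stand-in for data1_table, using the counts printed above
# (an assumption made so the snippet runs on its own).
data1_table <- data.frame(
  activity   = c("games", "tv"),
  Alone      = c(2, 4),
  Remotely   = c(6, 1),
  WithOthers = c(1, 4)
)

# chisq.test() wants a numeric matrix or table, so drop the label column
# and keep the activity names as row names.
m <- as.matrix(data1_table[, -1])
rownames(m) <- data1_table$activity

# Pearson chi-squared test of independence between activity and context.
# Every expected count here is below 5, so R warns that the
# chi-squared approximation may be incorrect.
res <- chisq.test(m)
res
res$expected   # expected counts under independence

# Fisher's exact test is a common alternative for small counts.
fisher.test(m)
```

With only 18 observations, the exact test is probably the safer of the two to report.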

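Omit the first column, ID ([-1]), then take the sum of each column (colSums) while dropping NA values (na.rm=TRUE), and put the resulting length-6 vector into a matrix with 2 rows. If you like, you can also label the matrix accordingly (the dimnames argument):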
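
The steps just described can be sketched in base R roughly as follows. This is a reconstruction, not necessarily the original code: byrow = TRUE and the dimnames labels are assumptions, chosen so that the three games columns fill the first row and the three tv columns fill the second.

```r
# Data frame copied from the question
data <- data.frame(ID = 1:9,
                   gamesAlone      = c(1, NA, NA, NA, NA, NA, NA, NA, 1),
                   gamesWithOthers = c(NA, NA, NA, NA, NA, NA, NA, 1, NA),
                   gamesRemotely   = c(NA, 1, 1, 1, 1, 1, 1, NA, NA),
                   tvAlone         = c(NA, NA, 1, 1, NA, 1, 1, NA, NA),
                   tvWithOthers    = c(1, 1, NA, NA, 1, NA, NA, 1, NA),
                   tvRemotely      = c(NA, NA, NA, NA, NA, NA, NA, NA, 1))

# Drop ID, then count the 1s in each column while ignoring NA
sums <- colSums(data[-1], na.rm = TRUE)

# Reshape the 6 sums into 2 rows; byrow = TRUE keeps the games columns
# together in row 1 and the tv columns together in row 2.
m <- matrix(sums, nrow = 2, byrow = TRUE,
            dimnames = list(c("games", "tv"),
                            c("Alone", "WithOthers", "Remotely")))
m

# The matrix can be passed straight to the test
# (R warns here because the expected counts are small).
chisq.test(m)
```

Without byrow = TRUE, matrix() fills column by column, which would mix games and tv counts within a row.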
A simple and sufficient solution.