Chi-squared test on grouped data in dplyr

I am having trouble summarizing the data.frame shown below:

db <- data.frame(ID = c(rep(1, 3), rep(2,4), rep(3, 2), 4),
             Gender = factor(c(rep("woman", 7), rep("man", 2), "woman")),
             Grade = c(rep(3, 3), rep(1, 4), rep(2, 2), 1),
             Drug = c(1, 2, 2, 1, 2, 6, 9, 8, 5, 1),
             Group = c(rep(1, 3), rep(2,4), rep(1, 2), 2))
db

#    ID Gender Grade Drug Group
# 1   1  woman     3    1     1
# 2   1  woman     3    2     1
# 3   1  woman     3    2     1
# 4   2  woman     1    1     2
# 5   2  woman     1    2     2
# 6   2  woman     1    6     2
# 7   2  woman     1    9     2
# 8   3    man     2    8     1
# 9   3    man     2    5     1
# 10  4  woman     1    1     2

With one Gender and one Group value per ID, the test I am after can be run by hand:

gen <- factor(c("woman", "woman", "man", "woman"))
gr <- c(1, 2 ,1 ,2)
chisq.test(gen, gr)

#   Pearson's Chi-squared test with Yates' continuity correction
# 
# data:  gen and gr
# X-squared = 0, df = 1, p-value = 1
#
# Warning message:
# In chisq.test(gen, gr) : Chi-squared approximation may be incorrect

How can I compute this p-value from my data.frame using dplyr?


My failed attempt was:

db %>% 
  group_by(ID) %>% 
  distinct(ID, Gender, Group) %>% 
  summarise_all(funs(chisq.test(db$Gender, 
                               db$Group)$p.value))
# A tibble: 4 x 3
#      ID Gender Group
#  <dbl>  <dbl> <dbl>
# 1    1.  0.429 0.429
# 2    2.  0.429 0.429
# 3    3.  0.429 0.429
# 4    4.  0.429 0.429
# Warning messages:
# 1: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 2: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 3: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 4: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 5: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 6: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 7: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
# 8: In chisq.test(db$Gender, db$Group) :
#   Chi-squared approximation may be incorrect
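
The identical 0.429 in every cell suggests that the test is being run on the same data each time. A minimal base R sketch of why, not part of the original post: db$Gender and db$Group always refer to the full 10-row columns, whatever grouping or de-duplication happened earlier in the pipe. The names p_all_rows, per_id and p_per_id are illustrative, and suppressWarnings is used here only to keep the small-sample warning out of the way.

```r
# Same data frame as in the question
db <- data.frame(ID = c(rep(1, 3), rep(2, 4), rep(3, 2), 4),
                 Gender = factor(c(rep("woman", 7), rep("man", 2), "woman")),
                 Grade = c(rep(3, 3), rep(1, 4), rep(2, 2), 1),
                 Drug = c(1, 2, 2, 1, 2, 6, 9, 8, 5, 1),
                 Group = c(rep(1, 3), rep(2, 4), rep(1, 2), 2))

# db$Gender and db$Group are the full 10-row columns, regardless of grouping
p_all_rows <- suppressWarnings(chisq.test(db$Gender, db$Group)$p.value)

# One row per ID (base R counterpart of distinct(ID, Gender, Group))
per_id <- unique(db[, c("ID", "Gender", "Group")])
p_per_id <- suppressWarnings(chisq.test(per_id$Gender, per_id$Group)$p.value)

# Compare the two p-values side by side
round(c(all_rows = p_all_rows, per_id = p_per_id), 3)
```

The first value should match the 0.429 printed by the failed attempt, and the second the p-value of 1 from the hand-typed gen and gr vectors.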

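We can ungroup and then use summarise, referring to the columns by their bare names (Gender, Group) rather than db$Gender and db$Group, so that the test runs on the de-duplicated rows instead of the full data frame: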

library(dplyr)

db %>% 
  group_by(ID) %>% 
  distinct(ID, Gender, Group) %>%   # keep one row per ID
  ungroup() %>%                     # drop the grouping so the test sees all IDs at once
  summarise(pval = chisq.test(Gender, Group)$p.value)
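
Since db is not grouped to begin with, the group_by and ungroup steps may not be needed at all. The following is a shorter sketch under that assumption, not the answerer's code; the name res is illustrative, and suppressWarnings is an optional addition that hides the "Chi-squared approximation may be incorrect" warning rather than addressing the small sample behind it.

```r
library(dplyr)

# Same data frame as in the question
db <- data.frame(ID = c(rep(1, 3), rep(2, 4), rep(3, 2), 4),
                 Gender = factor(c(rep("woman", 7), rep("man", 2), "woman")),
                 Grade = c(rep(3, 3), rep(1, 4), rep(2, 2), 1),
                 Drug = c(1, 2, 2, 1, 2, 6, 9, 8, 5, 1),
                 Group = c(rep(1, 3), rep(2, 4), rep(1, 2), 2))

res <- db %>%
  distinct(ID, Gender, Group) %>%   # one row per ID, no grouping involved
  summarise(pval = suppressWarnings(chisq.test(Gender, Group)$p.value))

res$pval
```

The resulting p-value should agree with the hand-computed test on gen and gr shown in the question.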