R binomial test of preference, data frame

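This is a Coursera course that expects us to program in R without any prior R experience. I am really struggling to understand it and have no leads. I have even gone through basic R tutorials, but I am still lost.

We have a csv file with the following contents:

  • Subject: 30
  • Disability: 0, 1
  • Pref: trackball, touchpad

For people without disabilities, run a binomial test to see whether their preference for the touchpad differs significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: run a binomial test comparing the number of non-disabled rows that prefer the touchpad with the number of all non-disabled rows. There are two possible preferences, touchpad and trackball, so the probability is 1/2. Do not correct for multiple comparisons; treat this as a single test on a subset of the data.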

The solution is supposed to be:

  • First, get an intuition by plotting the preferences of the people without disabilities:

    plot(df[df$Disability == "0",]$Pref)
    
  • Second, test the preference for touchpad versus trackball against chance, i.e. against no preference:

    binom.test(sum(df[df$Disability == "0",]$Pref == "touchpad"), 
               nrow(df[df$Disability == "0",]), p=1/2)
    plot(df[df$Disability == "0",]$Pref)
    

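I understand that this should give us a visual representation of the preferences where Disability = 0, but `df` produces an error and I don't know how to fix it. Can anyone help?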

I simulated a random dataset with the given characteristics, and everything works fine:

df <- data.frame(Subject = c("Sub1", "Sub2", "Sub3", "Sub4", "Sub5", "Sub6", "Sub7", "Sub8", "Sub9", "Sub10", "Sub11", "Sub12", "Sub13", "Sub14", "Sub15", "Sub16", "Sub17", "Sub18", "Sub19", "Sub20", "Sub21", "Sub22",     "Sub23", "Sub24", "Sub25", "Sub26", "Sub27", "Sub28", "Sub29", "Sub30"),
                 Disability = c("0", "0", "1", "1", "1", "1", "0", "0", "0", "1", "1", "0", "0", "0", "0", "1", "0", "0", "1", "0", "0", "0", "0", "1", "1", "1", "0", "0", "1", "0"),
                 Pref = c("touchpad", "touchpad", "touchpad", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "trackball", "touchpad", "trackball", "trackball", "touchpad", "touchpad", "trackball", "touchpad", "trackball", "touchpad", "touchpad", "trackball", "touchpad", "touchpad", "touchpad", "touchpad", "touchpad", "trackball", "trackball"))
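As a rough sketch of how the course solution plays out on this simulated data, the block below rebuilds the same data frame in compact form, counts the non-disabled rows that prefer the touchpad, and runs the binomial test against p = 1/2. The 0/1 strings and the names `dis`, `pref`, `nondis`, `x`, `n`, `res` and `p_rounded` are my own shorthand, not part of the course material. The result only describes the simulated data, not the real file.

```r
# Compact re-encoding of the simulated vectors above (an assumption made
# here for brevity): one character per subject, in order Sub1..Sub30.
dis  <- strsplit("001111000110000100100001110010", "")[[1]]  # Disability
pref <- strsplit("111000000000100110101101111100", "")[[1]]  # 1 = touchpad

df <- data.frame(
  Subject    = paste0("Sub", 1:30),
  Disability = dis,
  Pref       = ifelse(pref == "1", "touchpad", "trackball"),
  stringsAsFactors = FALSE
)

# Keep only the rows for people without disabilities.
nondis <- df[df$Disability == "0", ]

x <- sum(nondis$Pref == "touchpad")  # successes: non-disabled, prefer touchpad
n <- nrow(nondis)                    # trials: all non-disabled rows

# Two-sided exact binomial test against chance (p = 1/2).
res <- binom.test(x, n, p = 1/2)

# p-value rounded to four digits, as the exercise asks.
p_rounded <- round(res$p.value, 4)
print(p_rounded)  # 0.8145 for this simulated data
```

Storing the subset in `nondis` once avoids repeating `df[df$Disability == "0", ]` in every call and makes it easier to check `x` and `n` before running the test.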

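Edit

To apply the same test to the actual data (the file linked in the comments), replace the first step with a command that reads the values into a data frame: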

df <- read.csv("deviceprefs-1.csv")
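One thing that may be worth checking after reading the real file: since R 4.0.0, `read.csv` no longer turns strings into factors by default, and calling `plot()` on a plain character column fails. The sketch below uses a tiny made-up csv written to a temporary file as a stand-in for `deviceprefs-1.csv`. Its rows, and the assumption that the real file has columns named `Subject`, `Disability` and `Pref`, are hypothetical.

```r
# Hypothetical stand-in for deviceprefs-1.csv; the rows are invented
# purely for illustration and are NOT the course data.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Subject,Disability,Pref",
             "1,0,touchpad",
             "2,1,trackball",
             "3,0,trackball",
             "4,0,touchpad"), tmp)

df <- read.csv(tmp, stringsAsFactors = FALSE)

# Inspect the column types: here Disability is read as integer
# and Pref as character.
str(df)

# Convert Pref to a factor so that plot() can draw a bar chart of it.
df$Pref <- factor(df$Pref)

# Subset the non-disabled rows and tabulate their preferences.
nondis <- df[df$Disability == 0, ]
counts <- table(nondis$Pref)
print(counts)
```

With `Pref` stored as a factor, `plot(nondis$Pref)` draws one bar per preference, and `barplot(counts)` gives the same picture from the table.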

If you provide the data you are using, we will be better able to reproduce your code. Please try dput, or upload the csv somewhere and post a link. Please also add the error message to the question.

Thanks for your help! I just found out that I need to replace "df" with the name of the xtab I built. Files:

@testimo If you like, you can add that as an answer.

Thanks for trying, @vincent guillemot. I submitted the answer p-value = 0.8145 and the test says it is incorrect.

I think you misunderstood my answer: when I said "simulated", I meant that I randomly generated some data, so the p-value does not match your data.