Final test for statistical significance of bird aggression scores in R
Introduction
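I am running a small pilot study on aggressive behaviour in birds on breeding grounds that are being colonised.
Background
The study ran over several years. Colonising (southern) and resident (northern) collared flycatcher males were presented with conspecific males and with pied flycatcher males, and their behaviour was scored on quantifiable aggressive behaviours.
The collared flycatchers reached the island 60 years ago and have spread steadily from a single point in the breeding grounds, pushing their relative, the pied flycatcher, out of the areas richest in insects.
Previous studies suggest that the more aggressive males are at the front of this colonisation. The northern area is now almost 100% collared flycatchers, while the south still has a mixed population.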
Hypotheses
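- Male collared flycatchers are more aggressive towards both species in the southern area.
- Males respond relatively more strongly to conspecifics in the north than in the south.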
Data
'data.frame':   154 obs. of  8 variables:
 $ location     : Factor w/ 2 levels "N","S": 1 1 1 1 1 1 1 1 2 1 ...
 $ score        : int  1 4 0 1 1 8 9 9 4 3 ...
 $ dummy_species: Factor w/ 2 levels "CF","PF": 1 1 2 2 1 1 1 1 1 2 ...

> model.tables(aov(scoreCF$score~scoreCF$location),"means")
Tables of means
Grand mean
2.993506

 dummy_species
         CF    PF
      3.529  1.88
rep 104.000 50.00

 location
          N      S
      2.742  3.686
rep 113.000 41.000

 dummy_species:location
             location
dummy_species     N     S
          CF   3.19  4.48
          rep 77.00 27.00
          PF   1.81  2.07
          rep 36.00 14.00
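Problem
After scoring all the interactions, I don't know which test to use to present the data. People give me different suggestions, or just say to run a simple ANOVA, and so on.
I have been learning R and statistics at the same time, but much of the terminology still confuses me, and I find it hard to apply the questions and answers I find online to my own data. (This is where I come to bother you.)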
Question
Which of the following three tests is best suited to showing whether or not there is statistical significance?
- Anova(lm(score~dummy_species*location))
- summary(aov(score~dummy_species*location))
- summary(lm(score~dummy_species*location))

> TukeyHSD(aov(score~dummy_species*location))
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = score ~ dummy_species * location)

$dummy_species
           diff       lwr        upr     p adj
PF-CF -1.648846 -2.613568 -0.6841239 0.0009332

$location
         diff         lwr    upr     p adj
S-N 0.9440487 -0.07800284 1.9661 0.0699746

$`dummy_species:location`
               diff        lwr        upr     p adj
PF:N-CF:N -1.389250 -2.8774793 0.09898005 0.0766924
CF:S-CF:N  1.286676 -0.3619293 2.93528192 0.1824646
PF:S-CF:N -1.123377 -3.2649782 1.01822492 0.5246337
CF:S-PF:N  2.675926  0.7993571 4.55249475 0.0016744
PF:S-PF:N  0.265873 -2.0557788 2.58752484 0.9908082
PF:S-CF:S -2.410053 -4.8376320 0.01752615 0.0524523
Results
> Anova(lm(score~dummy_species*location))
Anova Table (Type II tests)

Response: score
                        Sum Sq  Df F value    Pr(>F)
dummy_species            93.91   1 11.6673 0.0008186 ***
location                 26.82   1  3.3326 0.0699100 .
dummy_species:location    6.98   1  0.8675 0.3531437
Residuals              1207.39 150
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> summary(aov(score~dummy_species*location))
                        Df Sum Sq Mean Sq F value   Pr(>F)
dummy_species            1   91.8   91.80  11.405 0.000933 ***
location                 1   26.8   26.82   3.333 0.069910 .
dummy_species:location   1    7.0    6.98   0.868 0.353144
Residuals              150 1207.4    8.05
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> summary(lm(score~dummy_species*location))

Call:
lm(formula = score ~ dummy_species * location)

Residuals:
    Min      1Q  Median      3Q     Max
-4.4815 -2.1948 -0.8056  2.1280  6.9286

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                 3.1948     0.3233   9.881   <2e-16 ***
dummy_speciesPF            -1.3892     0.5728  -2.425   0.0165 *
locationS                   1.2867     0.6346   2.028   0.0444 *
dummy_speciesPF:locationS  -1.0208     1.0960  -0.931   0.3531
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.837 on 150 degrees of freedom
Multiple R-squared:  0.09423,   Adjusted R-squared:  0.07611
F-statistic: 5.202 on 3 and 150 DF,  p-value: 0.001909
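
The three calls above all fit the same linear model; they differ only in how the result is tabulated. The following is a minimal sketch on simulated counts, not the study's data: only the cell sizes (77, 36, 27, 14) are taken from the tables of means, and the data frame `d` is a hypothetical stand-in. It uses base R only and illustrates two points: `summary(aov(...))` and `anova(lm(...))` give the same sequential table, and the squared t value of the interaction coefficient in `summary(lm(...))` equals the interaction F value. It also shows why the `dummy_species` sum of squares differs between the sequential and the Type II table in an unbalanced design.

```r
# Simulated stand-in data: NOT the study's data, only the same unbalanced layout.
set.seed(1)
cells <- c(77, 36, 27, 14)
d <- data.frame(
  dummy_species = factor(rep(c("CF", "PF", "CF", "PF"), times = cells)),
  location      = factor(rep(c("N", "N", "S", "S"), times = cells))
)
d$score <- rpois(nrow(d), lambda = 3)

fit <- lm(score ~ dummy_species * location, data = d)

# Sequential (Type I) table from lm, and the same table from aov.
seq_tab <- anova(fit)
aov_tab <- summary(aov(score ~ dummy_species * location, data = d))[[1]]

# Interaction: t test on the coefficient vs. F test in the ANOVA table.
t_int <- coef(summary(fit))["dummy_speciesPF:locationS", "t value"]
f_int <- seq_tab["dummy_species:location", "F value"]

# Entering location first gives SS(dummy_species | location), which is the
# quantity a Type II table reports for dummy_species.
alt_tab <- anova(lm(score ~ location * dummy_species, data = d))

print(seq_tab)
print(alt_tab)
print(all.equal(t_int^2, f_int))
```

Because the design is unbalanced, the sum of squares for a main effect depends on whether it is adjusted for the other main effect. That is the reason the source tables show 91.8 in one place and 93.91 in the other for `dummy_species`, while the `location` and interaction rows agree.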
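Are you at a university? If so, you may want to contact the statistics department, since they will be able to help you better than a post on the internet. Also, this is more of a statistics question than a coding question, so the Cross Validated site may be a better fit. Are you interested here in the how rather than the why?

I will post my query there, thanks for your reply. Also, I'm not at university at the moment; I wanted to think this through properly before I go back :)

Maybe a few thoughts will help. One assumption of linear models is that the errors are normally distributed. If you have count data that are far enough from 0, this may be a reasonable assumption. If the counts are close to 0, a Poisson distribution is more appropriate. For this reason you may want to look into GLMs or, if your data contain many zeros, negative binomial or zero-inflated Poisson (ZIP) models.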
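
As a rough illustration of the GLM suggestion, here is a minimal sketch that fits a Poisson GLM and a negative binomial model to simulated counts. The data frame `d` and its simulated `score` column are assumptions made up for this example (only the cell sizes come from the question), so the numbers it prints say nothing about the real birds. `glm.nb()` comes from the MASS package, which ships with standard R installations.

```r
library(MASS)  # provides glm.nb()

# Simulated, overdispersed counts: NOT the study's data.
set.seed(2)
cells <- c(77, 36, 27, 14)
d <- data.frame(
  dummy_species = factor(rep(c("CF", "PF", "CF", "PF"), times = cells)),
  location      = factor(rep(c("N", "N", "S", "S"), times = cells))
)
d$score <- rnbinom(nrow(d), mu = 3, size = 1.5)

# Poisson GLM with the same model formula as in the question.
pois_fit <- glm(score ~ dummy_species * location, family = poisson, data = d)

# Rough overdispersion check: Pearson chi-square over residual df.
# Values well above 1 suggest the Poisson variance assumption is too tight.
dispersion <- sum(residuals(pois_fit, type = "pearson")^2) / df.residual(pois_fit)

# Negative binomial alternative, which estimates an extra dispersion parameter.
nb_fit <- glm.nb(score ~ dummy_species * location, data = d)

print(dispersion)
print(AIC(pois_fit, nb_fit))
```

Zero-inflated models are not shown here because they need an add-on package; whether any of these models suits the real scores is a question for the actual data.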