Excel statistics: How to calculate the p-value of a 2x2 contingency table?


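Given data such as:

        A              B           C
1                   Group 1     Group 2
2   Property 1         56         651
3   Property 2         97       1,380

how can I calculate the p-value (i.e., the "right-tail" probability of the chi-squared distribution) directly, without separately calculating the table of expected values?

In Excel, the p-value is calculated with the function CHISQ.DIST.RT if you know the table's chi-squared value, or with CHISQ.TEST if you know the table of "expected values". The chi-squared value is itself calculated from the expected values, and the expected values are calculated from the original table by a slightly complicated formula. Either way, Excel wants us to calculate the expected values ourselves in order to get the p-value, which seems a bit silly. So how can I get the p-value in Excel without separately calculating the expected values?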
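
For reference, the "slightly complicated formula" is usually written as below. The symbols are notation chosen here for illustration and do not appear in the original post: O_ij is the observed count in row i and column j, R_i and C_j are the row and column totals, and N is the grand total. A 2x2 table has one degree of freedom.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
p = P\left(X \ge \chi^2\right), \quad X \sim \chi^2_{1}
```

For the table above, for example, the expected value for Property 1 in Group 1 is 707 × 153 / 2184 ≈ 49.53, against an observed count of 56.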


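Edit: This question was originally titled "How do I calculate the Pearson correlation coefficient with a 2-property array?" and asked why the function gave the wrong answer. The answer was that I had confused the p-value with the Pearson correlation coefficient, which are different things. So I have reworked the question to ask what I actually need to know, and I am posting an answer. I'll wait a while before accepting my own answer, in case someone else has a better one.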

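I think this needs VBA. I wrote the following VBA function to calculate chi-squared or the p-value, as well as two other measures of association, for a 2x2 contingency table: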

Public Function nStatAssoc_2x2(sType As String, nGrp1PropCounts As Range, nGrp2PropCounts As Range) As Single

' Return one of several measures of statistical association of a 2×2 contingency table:
'                   Property 1      Property 2
'       Group 1     nCount(1, 1)    nCount(1, 2)
'       Group 2     nCount(2, 1)    nCount(2, 2)

' sType is:     to calculate:
'   "OR"        Odds ratio
'   "phi"       Phi coefficient
'   "chi-sq"    Chi-squared
'   "p"         p-value, i.e., right-tailed probability of the chi-squared distribution

' nGrp<n>PropCounts is a range of two cells containing the number of members of group n that have each of two properties.
' These arguments are 1-D arrays in order to allow the data to appear in non-adjacent ranges in the spreadsheet.

' References:
    ' Contingency table:        https://en.wikipedia.org/wiki/Contingency_table
    ' Measure of association:   www.britannica.com/topic/measure-of-association
    ' Odds ratio:               https://en.wikipedia.org/wiki/Odds_ratio
    '                           https://en.wikipedia.org/wiki/Effect_size#Odds_ratio
    ' Phi coefficient:          https://en.wikipedia.org/wiki/Phi_coefficient
    ' Chi-sq:                   https://en.wikipedia.org/wiki/Pearson's_chi-squared_test#Calculating_the_test-statistic
    '                           www.mathsisfun.com/data/chi-square-test.html
    '                               Shows calculation of expected values.
    ' p-value:                  https://docs.microsoft.com/en-us/office/vba/api/excel.worksheetfunction.ChiSq_Dist_RT

' Counts and totals are Long rather than Integer so that values above 32,767 can be stored.
Dim nCount(1 To 2, 1 To 2) As Long
Dim nSumGrp(1 To 2) As Long, nSumProp(1 To 2) As Long, nSumAll As Long
Dim nExpect(1 To 2, 1 To 2) As Single
Dim nIndex1 As Byte, nIndex2 As Byte
Dim nRetVal As Single

' Combine input arguments into contingency table:
For nIndex1 = 1 To 2
    nCount(1, nIndex1) = nGrp1PropCounts(nIndex1)
    nCount(2, nIndex1) = nGrp2PropCounts(nIndex1)
  Next nIndex1

' Calculate totals of group counts, property counts, and all counts (used for phi and chi-sq):
For nIndex1 = 1 To 2
    For nIndex2 = 1 To 2
        nSumGrp(nIndex1) = nSumGrp(nIndex1) + nCount(nIndex1, nIndex2)
        nSumProp(nIndex2) = nSumProp(nIndex2) + nCount(nIndex1, nIndex2)
      Next nIndex2
  Next nIndex1
nSumAll = nSumGrp(1) + nSumGrp(2)

If nSumAll <> nSumProp(1) + nSumProp(2) Then
    nRetVal = -2           ' Error: Sums differ.
    GoTo Finished
  End If

Select Case sType

    ' Odds ratio
    Case "OR":
        nRetVal = (nCount(1, 1) / nCount(1, 2)) / (nCount(2, 1) / nCount(2, 2))
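        ' Cross-check against the algebraically equivalent form. The comparison uses a relative
        '   tolerance because an exact <> test between a Single and a Double fails on rounding alone.
        If Abs(nRetVal - (nCount(1, 1) / nCount(2, 1)) / (nCount(1, 2) / nCount(2, 2))) > 0.000001 * Abs(nRetVal) Then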
            nRetVal = -3            ' Error: OR calculation results differ.
            GoTo Finished
          End If

    ' Phi coefficient
    Case "phi":
        ' CDbl keeps the products in floating point, so large counts cannot overflow.
        nRetVal = ((CDbl(nCount(1, 1)) * nCount(2, 2)) - (CDbl(nCount(1, 2)) * nCount(2, 1))) / _
                    (CDbl(nSumGrp(1)) * nSumGrp(2) * nSumProp(1) * nSumProp(2)) ^ 0.5

    ' Chi-squared
    Case "chi-sq", "p":     ' For "p", nRetVal is passed to the next select case statement.
        ' Calculate table of expected values:
        For nIndex1 = 1 To 2
            For nIndex2 = 1 To 2
                    ' In next line, the division is done first to prevent integer overflow,
                    '   which can happen if the multiplication is done first.
                nExpect(nIndex1, nIndex2) = nSumGrp(nIndex1) / nSumAll * nSumProp(nIndex2)
                If nExpect(nIndex1, nIndex2) < 5 Then
                    ' https://en.wikipedia.org/wiki/Pearson's_chi-squared_test#Assumptions
                    nRetVal = -4        ' Error: Expected value too small.
                    GoTo Finished
                  Else
                    nRetVal = nRetVal + _
                        (nCount(nIndex1, nIndex2) - nExpect(nIndex1, nIndex2)) ^ 2 / nExpect(nIndex1, nIndex2)
                  End If
              Next nIndex2
          Next nIndex1

    Case Else:
        nRetVal = -1           ' Error: Invalid measure type.
        GoTo Finished
  End Select

Select Case sType
    Case "OR", "phi", "chi-sq":

    ' p-value       ' Uses value of nRetVal passed from the previous select case statement.
    Case "p": nRetVal = WorksheetFunction.ChiSq_Dist_RT(nRetVal, 1)
  End Select

Finished: nStatAssoc_2x2 = nRetVal

End Function        ' nStatAssoc_2x2()
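
As a cross-check outside Excel, the same four measures can be sketched in a few lines of Python. This is not the method used in the function above:

- The function name `assoc_2x2` and its argument order are made up for illustration.
- It uses the standard 2x2 shortcut, chi-squared = N(ad - bc)^2 divided by the product of the two row totals and the two column totals, instead of building the expected-value table.
- It leaves out the error codes: there is no check for expected values below 5 and no guard against zero counts.

```python
import math


def assoc_2x2(a, b, c, d):
    """Return (odds_ratio, phi, chi_sq, p) for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper for illustration only; no input validation.
    """
    n = a + b + c + d
    # Product of the two row totals and the two column totals.
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    det = a * d - b * c

    odds_ratio = (a * d) / (b * c)
    phi = det / math.sqrt(margins)
    # Shortcut form of Pearson's chi-squared for a 2x2 table (equals n * phi**2).
    chi_sq = n * det ** 2 / margins
    # With 1 degree of freedom, the right-tail probability is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi_sq / 2))
    return odds_ratio, phi, chi_sq, p


# Counts from the table in the question.
odds_ratio, phi, chi_sq, p = assoc_2x2(56, 651, 97, 1380)
print(f"OR={odds_ratio:.4f}  phi={phi:.4f}  chi-sq={chi_sq:.4f}  p={p:.4f}")
```

For the table in the question this gives a chi-squared of roughly 1.34 and a p-value of roughly 0.25. Assuming the four counts sit in B2:C3 as laid out in the question, the same shortcut can also be entered in a single cell as `=CHISQ.DIST.RT(SUM(B2:C3)*(B2*C3-C2*B3)^2/(SUM(B2:C2)*SUM(B3:C3)*SUM(B2:B3)*SUM(C2:C3)),1)`, which avoids both the expected-value table and VBA, though without the small-expected-value check.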