Stata xtreg omits year dummy variables when using i.year


I have a panel dataset covering the following years:

tab year

       year |      Freq.     Percent        Cum.
------------+-----------------------------------
       2000 |         31       12.55       12.55
       2001 |         31       12.55       25.10
       2002 |         30       12.15       37.25
       2003 |         31       12.55       49.80
       2004 |         31       12.55       62.35
       2005 |         31       12.55       74.90
       2006 |         31       12.55       87.45
       2007 |         31       12.55      100.00
------------+-----------------------------------
      Total |        247      100.00

When I run xtreg dv iv i.year, I find that the results omit both the year 2000 and the year 2007:

xtreg local_gr rtxdum i.year
note: 2007.year omitted because of collinearity

Random-effects GLS regression                   Number of obs     =        247
Group variable: province_n~e                    Number of groups  =         31

R-sq:                                           Obs per group:
     within  = 0.6194                                         min =          7
     between = 0.0016                                         avg =        8.0
     overall = 0.2356                                         max =          8

                                                Wald chi2(7)      =     341.51
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
    local_gr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      rtxdum |  -753799.7   291543.7    -2.59   0.010     -1325215   -182384.5
             |
        year |
       2001  |     388246   291543.7     1.33   0.183    -183169.2    959661.2
       2002  |   745406.4   294294.5     2.53   0.011     168599.8     1322213
       2003  |    1175610   291543.7     4.03   0.000     604194.4     1747025
       2004  |    1773982   291543.7     6.08   0.000      1202567     2345397
       2005  |    2600005   291543.7     8.92   0.000      2028589     3171420
       2006  |    4425318   291543.7    15.18   0.000      3853903     4996734
       2007  |          0  (omitted)
             |
       _cons |    1564670   447832.4     3.49   0.000     686934.1     2442405
-------------+----------------------------------------------------------------
     sigma_u |  2217878.8
     sigma_e |  1150064.9
         rho |  .78809251   (fraction of variance due to u_i)
------------------------------------------------------------------------------

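The note says that 2007 was omitted because of collinearity, but I don't understand why 2000 does not appear in the results either.

2000 does not appear because it is the base level. You can use the allbaselevels option to display it: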

webuse nlswork, clear
xtset idcode

xtreg ln_w grade tenure i.race not_smsa south, allbaselevels


Random-effects GLS regression                   Number of obs     =     28,091
Group variable: idcode                          Number of groups  =      4,697

R-sq:                                           Obs per group:
     within  = 0.1005                                         min =          1
     between = 0.4498                                         avg =        6.0
     overall = 0.3305                                         max =         15

                                                Wald chi2(6)      =    6509.50
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grade |     .07605   .0018128    41.95   0.000     .0724969    .0796031
      tenure |   .0361319   .0006298    57.37   0.000     .0348975    .0373663
             |
        race |
      white  |          0  (base)
      black  |  -.0530121   .0102916    -5.15   0.000    -.0731832   -.0328409
      other  |   .0762678   .0415911     1.83   0.067    -.0052492    .1577849
             |
    not_smsa |  -.1289554   .0074296   -17.36   0.000    -.1435172   -.1143936
       south |  -.0786512   .0075533   -10.41   0.000    -.0934555    -.063847
       _cons |   .6759773   .0244723    27.62   0.000     .6280125    .7239421
-------------+----------------------------------------------------------------
     sigma_u |  .26440074
     sigma_e |  .30295598
         rho |  .43235646   (fraction of variance due to u_i)
------------------------------------------------------------------------------
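
If a different reference category is wanted, the base level can be changed with the ib. factor-variable operator. The following is a minimal sketch on the same nlswork example, assuming the usual coding of race in that dataset (1 = white, 2 = black, 3 = other); it only changes which level is reported as (base), not the fit of the model.

```stata
webuse nlswork, clear
xtset idcode

* Default: the lowest level of race (white) is the base
xtreg ln_wage grade tenure i.race not_smsa south, allbaselevels

* Make level 2 (black, assuming the standard coding) the base instead
xtreg ln_wage grade tenure ib2.race not_smsa south, allbaselevels

* Or use the last level as the base
xtreg ln_wage grade tenure ib(last).race not_smsa south, allbaselevels
```

Applied to the question, ib2007.year or ib(last).year would make 2007 the reference year instead of 2000.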

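As @PearlySpuncer implies, with this syntax you always see results for one fewer indicator (dummy) variable than you feed into the regression, because one level serves as the base. 2007 is omitted as well because of collinearity in your dataset. That part is a matter of statistical understanding.

It is evident that you haven't read the question. I said that the results omit two years, not one: 2000 and 2007. So it is not one fewer, it is two fewer. I don't understand why I have to explain myself again in this comment.

I did read and understand the question. To repeat: you always see one fewer indicator (that's one); 2007 is omitted too because of collinearity in your dataset (that makes two in your case).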
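
One way such collinearity can arise is when another regressor is an exact function of the year. The sketch below reproduces that situation on the nlswork data; post80 is a hypothetical variable created only for illustration, and nothing in the question confirms that rtxdum behaves this way.

```stata
webuse nlswork, clear
xtset idcode

* Hypothetical indicator that depends only on the year
generate byte post80 = (year >= 80) if !missing(year)

* Each year maps to a single value of post80
tabulate year post80

* post80 equals the sum of the year indicators for 80 and later,
* so one term has to be dropped in addition to the base year
xtreg ln_wage post80 i.year
```

On the data from the question, the analogous check would be tabulate year rtxdum: if every year shows only one value of rtxdum, the variable is a linear combination of the year indicators, which would explain the extra omitted year.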