Python equivalent of R's stepAIC for logistic regression (direction='backward')

I tried to replicate R's stepAIC function "by hand", but that takes far too long (I only posted my first two attempts). Is there a stepAIC-like function in Python for logistic regression, i.e. one that iteratively drops the variable with the highest p-value and minimizes the AIC?

import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.preprocessing import PolynomialFeatures

#create design matrix with all pairwise interactions
datapol = data.drop(['flag'], axis=1)  #drop the target column 'flag' from the data
poly = PolynomialFeatures(interaction_only=True, include_bias=False)

#calculate AIC for the model with double interactions
m_sat = poly.fit_transform(datapol)
y = np.asarray(flag.astype(int))
m1 = sm.Logit(y, m_sat.astype(int))
res1 = m1.fit()
print(res1.summary2())

#create new model without the variable whose p-value is > 0.05
mx1 = pd.DataFrame(m_sat)
mx2 = np.asarray(mx1.drop(mx1.columns[[3]], axis=1))
m2 = sm.Logit(y, mx2.astype(int))
res2 = m2.fit()
print(res2.summary2())
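
The two manual steps above can be wrapped into a loop that mimics stepAIC(direction='backward'): at every iteration, try dropping each remaining column, keep the single drop that lowers the AIC the most, and stop when no drop improves it. This is only a minimal sketch (the function name backward_aic_logit and the arguments X, a DataFrame of predictors, and y, the binary target, are my own, not from the original code); the loop only relies on the .aic attribute that statsmodels Logit results expose.

import statsmodels.api as sm

def backward_aic_logit(X, y):
    #backward elimination for a logistic regression, minimizing AIC
    #X: pandas DataFrame of candidate predictors, y: binary target
    cols = list(X.columns)
    best = sm.Logit(y, sm.add_constant(X[cols])).fit(disp=0)
    improved = True
    while improved and len(cols) > 1:
        improved = False
        #fit one model per candidate drop and keep the best one
        trial = {}
        for c in cols:
            reduced = [k for k in cols if k != c]
            trial[c] = sm.Logit(y, sm.add_constant(X[reduced])).fit(disp=0)
        drop = min(trial, key=lambda c: trial[c].aic)
        if trial[drop].aic < best.aic:
            cols.remove(drop)
            best = trial[drop]
            improved = True
    return best

With the data above this could be called as backward_aic_logit(pd.DataFrame(m_sat), flag.astype(int)); the returned result's .summary2() then shows the surviving terms.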

EDIT: I found an approach that mimics stepwise selection: check the function called RFE (recursive feature elimination) from the sklearn package.

from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import RFE

# Running RFE, keeping 9 features
lm = LinearRegression()
rfe = RFE(lm, n_features_to_select=9)
rfe = rfe.fit(X_train, y_train)
print(rfe.support_)   # boolean mask of the selected features
print(rfe.ranking_)   # ranking of all features (1 = selected)
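
If X_train is a pandas DataFrame (an assumption; the question does not show how it was built), the boolean mask can be mapped back to the selected column names:

selected = X_train.columns[rfe.support_]   #names of the 9 selected features
print(list(selected))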

stepAIC just searches for the combination of features that lowers the AIC: the lower the AIC, the better. So I think that if you have a fixed set of features you want, you can compare the AICs explicitly using OLS:

import statsmodels.api as sm

# x is the design matrix; change which features it contains and refit to compare
regressor_OLS = sm.OLS(Y, x).fit()
print(regressor_OLS.summary())
print(regressor_OLS.aic)  # AIC of the fitted model
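
As a sketch of that idea (the names X, Y and candidate_sets are assumptions, not taken from the question): fit one OLS model per candidate feature subset and keep the subset with the lowest AIC.

import statsmodels.api as sm

#hypothetical candidate subsets of the columns of a DataFrame X
candidate_sets = [['x1', 'x2'], ['x1', 'x2', 'x3'], ['x2', 'x3']]

aics = {}
for cols in candidate_sets:
    res = sm.OLS(Y, sm.add_constant(X[cols])).fit()
    aics[tuple(cols)] = res.aic

best_cols = min(aics, key=aics.get)
print(best_cols, aics[best_cols])  #subset with the lowest AIC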

I find this a bit different, though, because stepAIC returns the optimal number of predictors by itself, while RFE requires the user to specify the number of features in advance.
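
One way around that limitation (a pointer of my own, not mentioned in the question or the answer) is sklearn's RFECV, which picks the number of features itself via cross-validated score rather than AIC; a minimal sketch, reusing the X_train and y_train names from the RFE snippet:

from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import RFECV

#RFECV chooses how many features to keep by cross-validation
selector = RFECV(LinearRegression(), step=1, cv=5)
selector = selector.fit(X_train, y_train)
print(selector.n_features_)  #number of features selected
print(selector.support_)     #boolean mask of the selected features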