Scikit-learn model performance is "good", but the coefficient weights look strange [scikit-learn, logistic-regression]


I am training a model to detect good/bad customers. My input features are:

'Net Receivables', 'Sales', 'Cost of Goods sold', 'Current Assets',
       'Property, plant and equipment', 'Securities', 'Total assets',
       'Depreciation', 'Selling, General & Administrative Expense',
       'Total long term debt', 'Current Liabilites', 'Net Receivables.1',
       'Sales.1', 'Cost of Goods sold.1', 'Current Assets.1',
       'Property, plant and equipment.1', 'Securities.1', 'Total assets.1',
       'Depreciation.1', 'Selling, General & Administrative Expense.1',
       'Total long term debt.1', 'Current Liabilites.1',
       'Income from Continuing Operations', 'Cash Flows from Operations'

I trained a simple model using logistic regression:

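import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# X holds the 24 features listed above, y the good/bad label
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = LogisticRegression()
clf.fit(X_train, y_train)
pred = clf.predict(X_test)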

Then I evaluated the model using AUC and accuracy:

print(roc_auc_score(y_test, pred))
print(accuracy_score(y_test, pred))

The result is:

0.765625
0.7727272727272727
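One thing worth noting about the evaluation above: roc_auc_score is being given hard class labels (pred) rather than scores. For a binary problem that reduces AUC to balanced accuracy, so it is usually more informative to pass probabilities from predict_proba. Here is a minimal sketch of the difference. The real customer data is not shown, so the synthetic data from make_classification (and its size) is only an assumed stand-in, and max_iter=1000 is my choice, not part of the original code.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, balanced_accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the customer data (24 features).
X, y = make_classification(n_samples=218, n_features=24, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)               # hard 0/1 labels
proba = clf.predict_proba(X_test)[:, 1]  # estimated probability of class 1

auc_from_labels = roc_auc_score(y_test, pred)   # equals balanced accuracy
auc_from_scores = roc_auc_score(y_test, proba)  # uses the full ranking of test rows
bal_acc = balanced_accuracy_score(y_test, pred)

print("AUC from labels:", auc_from_labels)
print("Balanced accuracy:", bal_acc)
print("AUC from probabilities:", auc_from_scores)
print("Accuracy:", accuracy_score(y_test, pred))
```

The first two printed numbers coincide, which is why an AUC computed from predict output says little beyond what accuracy already says.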

But when I tried to inspect the feature weights as odds ratios with

odds = np.exp(clf.coef_[0])

I got some strange coefficients. No feature seems to be more important than any other:

array([1.00000001, 1.00000035, 0.99999963, 0.99999987, 0.99999928,
       1.        , 1.        , 0.99999993, 1.00000019, 0.9999994 ,
       0.99999976, 1.00000016, 0.99999996, 1.00000003, 0.99999967,
       0.99999967, 1.        , 1.00000035, 0.99999995, 0.99999985,
       1.00000035, 1.00000021, 1.00000008, 1.00000051])

My training set is relatively small: 174 rows × 24 features.

Can I trust the model's score?
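With a dataset this small, a single 80/20 split can give a score that moves a lot depending on which rows end up in the test set. One way to get a feel for that is repeated stratified cross-validation. The sketch below assumes synthetic stand-in data from make_classification, and the fold and repeat counts are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Assumption: synthetic stand-in for the customer data (24 features).
X, y = make_classification(n_samples=218, n_features=24, n_informative=6, random_state=0)

clf = LogisticRegression(max_iter=1000)

# 5 folds repeated 10 times gives 50 train/test splits instead of one.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=42)

# The "roc_auc" scorer ranks test rows by the model's continuous scores, not hard labels.
scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")

print("mean AUC:", scores.mean())
print("std of AUC across splits:", scores.std())
print("min / max:", scores.min(), scores.max())
```

If the spread between the minimum and maximum is wide, a single-split score such as 0.77 should be read as a rough estimate only.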

Why use np.exp, and why coef[0]? The usual way to get the logistic regression coefficients is:

print(clf.coef_, clf.intercept_)

Note the trailing underscore.
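On the coef_[0] part: for a binary problem, coef_ is a 2-D array with a single row, so indexing with [0] only flattens it to one weight per feature. A small sketch on assumed synthetic data (not the asker's dataset):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Assumption: synthetic binary data with 24 features, for illustration only.
X, y = make_classification(n_samples=200, n_features=24, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

print(clf.coef_.shape)       # (1, 24): one row of weights for a binary problem
print(clf.intercept_.shape)  # (1,)

log_odds = clf.coef_[0]         # 1-D array, one log-odds weight per feature
odds_ratios = np.exp(log_odds)  # multiplicative change in odds per unit of each feature
print(odds_ratios[:5])
```

So clf.coef_[0] and np.exp are both legitimate. They just show the same weights on a different scale.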

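I was following a linked reference. I used np.exp because the coefficients are log-odds, so exponentiating them gives odds ratios.

np.exp changes the scale of the numbers, but you should also look at the raw values:

clf.coef_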

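I think the problem is that the financial data are expressed in units of one dollar, so the coefficients are very small. Each coefficient is the change in log-odds for a one-dollar change in that feature, and one dollar is negligible next to balance-sheet figures, so every odds ratio ends up almost exactly 1.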
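If that is the cause, standardizing the features before fitting should give coefficients that can be compared across features. The sketch below is a guess at what that could look like, not the asker's actual setup. The data is synthetic and multiplied by 1e6 to imitate dollar amounts (an assumption), and StandardScaler in a pipeline is one common choice among several.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in for dollar-denominated features (values in the millions).
X_unit, y = make_classification(n_samples=218, n_features=24, n_informative=6, random_state=0)
X_dollars = X_unit * 1e6

# Standardize each feature to zero mean and unit variance, then fit.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_dollars, y)

scaler = model.named_steps["standardscaler"]
clf = model.named_steps["logisticregression"]

coef_per_std = clf.coef_[0]                     # log-odds change per one standard deviation
coef_per_dollar = coef_per_std / scaler.scale_  # the same model expressed per one dollar

print(np.exp(coef_per_dollar)[:5])  # odds ratios per dollar: all extremely close to 1
print(np.exp(coef_per_std)[:5])     # odds ratios per standard deviation: visibly different
```

The per-dollar odds ratios look just like the array in the question, while the per-standard-deviation ones show which features actually carry weight. Both describe the same fitted model, only in different units.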