Python 2.7: What is the class_weight parameter in scikit-learn SGD?


python-2.7 machine-learning scikit-learn


I am a regular user of scikit-learn, and I would like some insight into the class_weight parameter of SGD.

I have traced it as far as this function call:

plain_sgd(coef, intercept, est.loss_function,
                 penalty_type, alpha, C, est.l1_ratio,
                 dataset, n_iter, int(est.fit_intercept),
                 int(est.verbose), int(est.shuffle), est.random_state,
                 pos_weight, neg_weight,
                 learning_rate_type, est.eta0,
                 est.power_t, est.t_, intercept_decay)

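After this it is handed off to sgd_, and I am not very good with CPython. Could you shed some light on the following questions?

I have a class imbalance in my dev set: the positive class has 15k samples and the negative class has 36k. Can class_weight resolve this problem, or would undersampling be a better idea? I am getting better numbers with it, but the result is hard to explain.

If yes, how does it actually do it? Is it applied to the feature penalty, or is it a weight on the optimization function? How can I explain this to a layman?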

class_weight can indeed help improve the ROC AUC or F1 score of a classification model trained on imbalanced data.

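You can try class_weight="auto" to select weights that are inversely proportional to the class frequencies. You can also pass your own weights as a Python dictionary with class labels as keys and weights as values.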
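To make "inversely proportional to class frequencies" concrete, here is a minimal sketch using the class sizes from the question. The helper name inverse_frequency_weights is hypothetical. The formula n_samples / (n_classes * count) is the one documented for class_weight="balanced", the name this option carries in later scikit-learn releases; whether the older "auto" heuristic normalizes in exactly the same way is an assumption.

```python
def inverse_frequency_weights(class_counts):
    """Return {label: weight} with weight = n_samples / (n_classes * count).

    Hypothetical helper, not part of scikit-learn.
    """
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {label: n_samples / (n_classes * count)
            for label, count in class_counts.items()}


# Class sizes from the question: 15k positive (label 1), 36k negative (label 0).
weights = inverse_frequency_weights({1: 15000, 0: 36000})
print(weights)
```

The rarer positive class ends up with the larger weight, and a dictionary of this shape is what the class_weight argument accepts.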

The weights can be tuned by grid search with cross-validation.
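A rough sketch of such a search follows. It assumes a recent scikit-learn, where GridSearchCV lives in sklearn.model_selection and the inverse-frequency option is spelled "balanced"; the synthetic data and the candidate weights are made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic imbalanced data (roughly 70% negative, 30% positive), illustration only.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.7, 0.3], random_state=0)

# Candidate class weights: none, inverse frequency, and two hand-picked dicts.
param_grid = {
    "class_weight": [None, "balanced", {0: 1.0, 1: 2.0}, {0: 1.0, 1: 3.0}],
}

# 5-fold cross-validated search, scored by F1 on the positive class.
search = GridSearchCV(SGDClassifier(random_state=0), param_grid,
                      scoring="f1", cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

In practice alpha would usually be tuned in the same grid, since the best class weights can depend on the amount of regularization.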

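Internally, this is done by deriving sample_weight from class_weight (depending on the class label of each sample). The sample weights are then used to scale the contribution of each individual sample to the loss function that is used to train the linear classification model with stochastic gradient descent.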
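The idea can be sketched in a few lines of plain Python. This is an illustration of the principle, not scikit-learn's actual code: the function names, the scores, and the weight values are all hypothetical, and the hinge loss is used because it is the default loss of SGDClassifier.

```python
def sample_weights_from_class_weights(labels, class_weight):
    """Give each sample the weight of its class."""
    return [class_weight[label] for label in labels]


def weighted_hinge_loss(scores, labels, sample_weight):
    """Sum of per-sample hinge losses, each scaled by its sample weight.

    Labels are expected to be -1 or +1.
    """
    total = 0.0
    for score, label, weight in zip(scores, labels, sample_weight):
        total += weight * max(0.0, 1.0 - label * score)
    return total


labels = [1, -1, -1, 1]
scores = [0.2, -0.5, 0.3, 2.0]       # decision-function outputs, made up
class_weight = {1: 1.7, -1: 0.7}     # illustrative values

sample_weight = sample_weights_from_class_weights(labels, class_weight)
loss = weighted_hinge_loss(scores, labels, sample_weight)
print(sample_weight, loss)
```

For a layman: a mistake on an example from the rare class simply costs the model more than the same mistake on an example from the common class.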

The feature penalty is controlled independently via the penalty and alpha hyperparameters. sample_weight / class_weight have no impact on it.
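A single update step makes the separation visible. The sketch below is a simplified, hypothetical SGD step for an L2-penalized hinge loss, not the library's Cython routine: the penalty shrinks the coefficients using only alpha and the learning rate eta, while the sample weight scales only the loss part of the update.

```python
def sgd_step(w, x, label, sample_weight, alpha, eta):
    """One simplified SGD step for L2-regularized hinge loss (label is -1 or +1)."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    # Penalty term: shrink all coefficients; depends on alpha and eta only.
    w = [wi * (1.0 - eta * alpha) for wi in w]
    # Loss term: applied only when the margin is violated, scaled by the sample weight.
    if label * score < 1.0:
        w = [wi + eta * sample_weight * label * xi for wi, xi in zip(w, x)]
    return w


w0 = [1.0, -1.0]

# Margin violated (score = 0): the sample weight changes the size of the loss update.
light = sgd_step(w0, [0.5, 0.5], 1, sample_weight=1.0, alpha=0.1, eta=0.1)
heavy = sgd_step(w0, [0.5, 0.5], 1, sample_weight=2.0, alpha=0.1, eta=0.1)

# Margin satisfied (score = 2): only the penalty acts, so the weight is irrelevant.
only_penalty_a = sgd_step(w0, [2.0, 0.0], 1, sample_weight=1.0, alpha=0.1, eta=0.1)
only_penalty_b = sgd_step(w0, [2.0, 0.0], 1, sample_weight=2.0, alpha=0.1, eta=0.1)
print(light, heavy, only_penalty_a, only_penalty_b)
```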