Label values for binary classifiers in Python sklearn

python machine-learning scikit-learn


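I have a dataset whose samples are labeled "0" or "1".

Should the labels be "-1" and "1" instead for classification to work correctly?

I'm not sure which loss functions the sklearn classifiers minimize. Perhaps the result depends on whether the values are "1" or "-1".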

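sklearn classifiers can generally use different loss functions or penalties. Although I couldn't find this documented anywhere, in my experience they handle whatever class labels you pass in sensibly. The actual solvers use external libraries, so some cleanup is probably happening under the hood. In general, I've found that these work out of the box: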

>>> from sklearn.linear_model import LogisticRegression
>>> import numpy as np
>>> X = np.random.randint(0,10,(20,5))
>>> y1 = np.random.choice([-1,1], 20)
>>> y2 = np.random.choice([0,1], 20)
>>> y1
array([-1, -1,  1, -1, -1,  1, -1,  1, -1,  1,  1, -1,  1, -1, -1, -1,  1,
        1, -1,  1])
>>> y2
array([0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0])
>>> model1, model2 = LogisticRegression(), LogisticRegression()
>>> model1.fit(X,y1)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
>>> model2.fit(X, y2)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
>>> model1.predict(X)
array([-1,  1,  1, -1,  1, -1, -1,  1, -1, -1,  1,  1,  1, -1, -1, -1, -1,
        1, -1,  1])
>>> model2.predict(X)
array([1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0])
>>> y1
array([-1, -1,  1, -1, -1,  1, -1,  1, -1,  1,  1, -1,  1, -1, -1, -1,  1,
        1, -1,  1])
>>> y2
array([0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0])
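The session above uses two unrelated random label vectors, so it cannot show whether the encoding itself matters. A minimal sketch of a more direct check follows: it fits the same labeling twice, once as 0/1 and once as -1/+1, and compares the results. The names `y01`, `y_pm`, `m01` and `m_pm` are illustrative, and the sketch assumes default `LogisticRegression` settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X = rng.randint(0, 10, (20, 5))

# One fixed labeling, written in two encodings (both classes are present).
y01 = np.array([0, 1] * 10)
y_pm = np.where(y01 == 1, 1, -1)  # same labeling, as -1/+1

m01 = LogisticRegression().fit(X, y01)
m_pm = LogisticRegression().fit(X, y_pm)

# Each model remembers the labels it was given, in sorted order.
print(m01.classes_)
print(m_pm.classes_)

# Compare the learned coefficients and the "positive class" predictions.
coef_close = np.allclose(m01.coef_, m_pm.coef_)
same_predictions = np.array_equal(m01.predict(X) == 1, m_pm.predict(X) == 1)
print(coef_close, same_predictions)
```

The fitted estimator keeps the original labels in `classes_` and returns predictions in that same encoding, which is consistent with the labels being mapped to an internal representation before the solver runs.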

Even string labels work:

>>> y3 = np.random.choice(['a','b'], 20)
>>> model3 = LogisticRegression()
>>> model3.fit(X,y3)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
>>> model3.classes_
array(['a', 'b'],
      dtype='<U1')
>>> model3.predict(X)
array(['b', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'a', 'a', 'a',
       'a', 'b', 'b', 'a', 'a', 'a', 'a'],
      dtype='<U1')

>>> from sklearn.svm import LinearSVC
>>> svm1 = LinearSVC()
>>> svm1.fit(X,y1)
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)
>>> svm1.predict(X)
array([-1,  1, -1, -1,  1, -1, -1,  1, -1, -1,  1,  1,  1, -1, -1, -1,  1,
        1, -1,  1])
>>> svm2 = LinearSVC()
>>> svm2.fit(X,y3)
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='squared_hinge', max_iter=1000,
     multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,
     verbose=0)
>>> svm2.predict(X)
array(['b', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a',
       'a', 'a', 'a', 'a', 'a', 'a', 'a'],
      dtype='<U1')
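To see how a linear classifier maps its internal score back to the labels you supplied, here is a small sketch using `LinearSVC` with string labels. The toy data and the name `pred_from_sign` are made up for illustration; the sketch only relies on the `classes_` attribute and the `decision_function` method.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Toy one-feature data with string labels (illustrative only).
X = np.array([[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]])
y = np.array(['a', 'a', 'a', 'b', 'b', 'b'])

svm = LinearSVC().fit(X, y)

# Signed scores: for a binary problem, a positive score means classes_[1].
scores = svm.decision_function(X)

# Rebuild the predictions by hand from the sign of the score.
pred_from_sign = svm.classes_[(scores > 0).astype(int)]

print(svm.classes_)
print(np.array_equal(pred_from_sign, svm.predict(X)))
```

For binary problems the sign of the score selects between `classes_[0]` and `classes_[1]`, so the numeric values of the original labels never enter the loss directly.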