Machine learning: RBF kernel with a chi-squared distance metric in a support vector machine

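How can I accomplish the task described in the title? Does the RBF kernel have any parameter that sets the distance metric to the chi-squared distance? I can see a chi2_kernel in the sklearn library.

Here is the code I have written: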

import numpy as np
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix

from sklearn.impute import SimpleImputer
from numpy import genfromtxt
from sklearn.metrics.pairwise import chi2_kernel


file_csv = 'dermatology.data.csv'
dataset = genfromtxt(file_csv, delimiter=',')

# Imputer was removed in scikit-learn 0.22; SimpleImputer replaces it and
# always imputes column-wise (it has no axis argument).
imp = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
dataset = imp.fit_transform(dataset)

target = dataset[:, [34]].flatten()
data = dataset[:, range(0,34)]

X_train, X_test, y_train, y_test = train_test_split(data, target, test_size=0.3)

# TODO : willing to set chi-squared distance metric instead. How to do that ?
clf = svm.SVC(kernel='rbf', C=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(f1_score(y_test, y_pred, average="macro"))
print(precision_score(y_test, y_pred, average="macro"))
print(recall_score(y_test, y_pred, average="macro"))

Are you sure you want to combine RBF and chi2? Chi2 by itself defines a valid kernel, so all you have to do is

clf = svm.SVC(kernel=chi2_kernel, C=1)
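because sklearn accepts a function as the kernel (note that this requires O(N^2) memory and time). If you want to compose the two, it gets a bit more complicated: you have to implement your own kernel to do that. For more control (and other kernels) you could also try a dedicated kernel library, but composing kernels is not supported there yet.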
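The options above can be sketched as follows. This is a minimal illustration, not a tuned solution: the synthetic data, the `gamma` values, and the `rbf_times_chi2` helper are assumptions made for the example. The product of an RBF kernel and a chi2 kernel is only one possible reading of "composing" the two. Note that sklearn's `chi2_kernel` is already the exponential form `exp(-gamma * chi2_distance)`, which may be what "RBF with a chi-squared distance" means in practice, and that it requires non-negative features.

```python
from functools import partial

import numpy as np
from sklearn import svm
from sklearn.metrics.pairwise import additive_chi2_kernel, chi2_kernel, rbf_kernel

# Synthetic non-negative data (assumption: stands in for the real dataset;
# chi-squared kernels require features >= 0).
rng = np.random.default_rng(0)
X = rng.random((120, 8))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
X_train, X_test = X[:90], X[90:]
y_train, y_test = y[:90], y[90:]

gamma = 0.5  # illustrative value, not tuned

# chi2_kernel equals exp(gamma * additive_chi2_kernel), where
# additive_chi2_kernel returns minus the chi-squared distance.
K_train = chi2_kernel(X_train, X_train, gamma=gamma)
K_additive = additive_chi2_kernel(X_train, X_train)
same_kernel = np.allclose(K_train, np.exp(gamma * K_additive))

# Option 1: pass the kernel as a callable; partial() fixes gamma.
clf_callable = svm.SVC(kernel=partial(chi2_kernel, gamma=gamma), C=1)
clf_callable.fit(X_train, y_train)
pred_callable = clf_callable.predict(X_test)

# Option 2: precompute the Gram matrices yourself.
clf_pre = svm.SVC(kernel="precomputed", C=1)
clf_pre.fit(K_train, y_train)
pred_pre = clf_pre.predict(chi2_kernel(X_test, X_train, gamma=gamma))


# Option 3 (hypothetical helper): compose RBF and chi2 as a product kernel.
# A product of two valid kernels is itself a valid kernel.
def rbf_times_chi2(A, B, gamma_rbf=0.5, gamma_chi2=0.5):
    return rbf_kernel(A, B, gamma=gamma_rbf) * chi2_kernel(A, B, gamma=gamma_chi2)


clf_combo = svm.SVC(kernel=rbf_times_chi2, C=1)
clf_combo.fit(X_train, y_train)
pred_combo = clf_combo.predict(X_test)
```

Options 1 and 2 fit the same model. The precomputed variant is mainly useful when you want to cache the Gram matrix or reuse it across several values of `C`.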