Python implementation of a precomputed RBF kernel with a Gram matrix?
There is very little information on precomputed kernels, and few examples. sklearn provides only a simple example of a linear kernel.
Here is the code for the linear kernel:
import numpy as np
from scipy.spatial.distance import cdist
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
# import data
iris = datasets.load_iris()
X = iris.data
Y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, Y)
clf = svm.SVC(kernel='precomputed')
# Linear kernel
G_train = np.dot(X_train, X_train.T)
clf.fit(G_train, y_train)
G_test = np.dot(X_test, X_train.T)
y_pred = clf.predict(G_test)
This does not help much in understanding how to implement other non-trivial kernels, for example the RBF kernel, which would be:
K(X, X') = np.exp(np.divide(-cdist(X, X, 'euclidean'), 2*np.std(X**2)))
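How can I do the same train/test split and implement a precomputed kernel for RBF?
What if the kernel becomes more complicated and depends on other parameters that have to be computed in a separate function, for example a parameter alpha >= 0: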
K(X, X') = alpha('some function depending on X_train, X_test')*np.exp(np.divide(-cdist(X, X, 'euclidean'), 2*np.std(X**2)))
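We need examples of such non-trivial kernels. Any suggestions would be appreciated.
We can write kernel PCA by hand. Let's start with the polynomial kernel.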
from sklearn.datasets import make_circles
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
X_c, y_c = make_circles(n_samples=100, random_state=654)
plt.figure(figsize=(8,6))
plt.scatter(X_c[y_c==0, 0], X_c[y_c==0, 1], color='red')
plt.scatter(X_c[y_c==1, 0], X_c[y_c==1, 1], color='blue')
plt.ylabel('y coordinate')
plt.xlabel('x coordinate')
plt.show()
Data:
Now transform the data and plot it:
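The `degree_pca` function called in the next snippet is not defined anywhere in this answer. Below is a minimal sketch of what it might look like, assuming a polynomial kernel of the form (gamma * <x, x'> + 1)^degree followed by the same centering and eigendecomposition steps used for the RBF version further down. The kernel form is an assumption, not a confirmed definition from the author.

```python
import numpy as np
from scipy.linalg import eigh


def degree_pca(X, gamma, degree, n_components):
    """Kernel PCA sketch with an ASSUMED polynomial kernel
    (gamma * <x, x'> + 1) ** degree. Samples are the rows of X."""
    # Polynomial kernel matrix between all pairs of samples.
    K = (gamma * X.dot(X.T) + 1.0) ** degree
    # Center the kernel matrix in feature space.
    N = K.shape[0]
    one_n = np.ones((N, N)) / N
    K = K - one_n.dot(K) - K.dot(one_n) + one_n.dot(K).dot(one_n)
    # eigh returns eigenvalues in ascending order, so the eigenvectors
    # of the largest eigenvalues are the last columns.
    eigvals, eigvecs = eigh(K)
    return np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])
```

Whether the circles end up linearly separable depends on the kernel form and on `gamma`, so the plot may differ from the original if the author used a different definition.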
X_c1 = degree_pca(X_c, gamma=5, degree=2, n_components=2)
plt.figure(figsize=(8,6))
plt.scatter(X_c1[y_c==0, 0], X_c1[y_c==0, 1], color='red')
plt.scatter(X_c1[y_c==1, 0], X_c1[y_c==1, 1], color='blue')
plt.ylabel('y coordinate')
plt.xlabel('x coordinate')
plt.show()
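Linearly separable:
The points can now be separated linearly.
Next, let's write the RBF kernel. For demonstration, let's look at the moons dataset: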
from sklearn.datasets import make_moons
X, y = make_moons(n_samples=100, random_state=654)
plt.figure(figsize=(8,6))
plt.scatter(X[y==0, 0], X[y==0, 1], color='red')
plt.scatter(X[y==1, 0], X[y==1, 1], color='blue')
plt.ylabel('y coordinate')
plt.xlabel('x coordinate')
plt.show()
Moons:
Kernel PCA transformation:
def stepwise_kpca(X, gamma, n_components):
    """
    X: A MxN dataset as NumPy array where the samples are stored as rows (M), features as columns (N).
    gamma: coefficient for the RBF kernel.
    n_components: number of components to be returned.
    """
    # Calculating the squared Euclidean distances for every pair of points
    # in the MxN dimensional dataset.
    sq_dists = pdist(X, 'sqeuclidean')
    # Converting the pairwise distances into a symmetric MxM matrix.
    mat_sq_dists = squareform(sq_dists)
    # Computing the MxM RBF kernel matrix.
    K = np.exp(-gamma * mat_sq_dists)
    # Centering the symmetric kernel matrix (here N is the number of samples).
    N = K.shape[0]
    one_n = np.ones((N, N)) / N
    K = K - one_n.dot(K) - K.dot(one_n) + one_n.dot(K).dot(one_n)
    # Obtaining eigenvalues (eigh returns them in ascending order) with
    # corresponding eigenvectors from the symmetric matrix.
    eigvals, eigvecs = eigh(K)
    # Obtaining the i eigenvectors that correspond to the i highest eigenvalues.
    # A list is used because np.column_stack does not accept a generator.
    X_pc = np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])
    return X_pc
Let's plot it:
X_4 = stepwise_kpca(X, gamma=15, n_components=2)
plt.scatter(X_4[y==0, 0], X_4[y==0, 1], color='red')
plt.scatter(X_4[y==1, 0], X_4[y==1, 1], color='blue')
plt.ylabel('y coordinate')
plt.xlabel('x coordinate')
plt.show()
Result:
Thanks for your effort on kernel PCA, but this is not related to my question.
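For the question as asked, here is a minimal sketch of the same train/test workflow with a precomputed RBF Gram matrix. It assumes the common form exp(-gamma * ||x - x'||^2) rather than the exact expression in the question, and the helper names `rbf_gram` and `alpha_from_train`, the value of `gamma`, and the formula for `alpha` are all hypothetical choices made for illustration. The key point is the shapes: `fit` takes the (n_train, n_train) matrix, and `predict` takes an (n_test, n_train) matrix of test samples against training samples.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split


def rbf_gram(A, B, gamma):
    """Gram matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    return np.exp(-gamma * cdist(A, B, 'sqeuclidean'))


def alpha_from_train(X_train):
    """Hypothetical scale factor alpha >= 0, computed from training data only."""
    return 1.0 / (1.0 + np.var(X_train))


X, Y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, Y, random_state=0)

gamma = 0.5  # assumed value, not tuned
alpha = alpha_from_train(X_train)

# Training Gram matrix: training samples against training samples.
G_train = alpha * rbf_gram(X_train, X_train, gamma)
# Test Gram matrix: rows are test samples, columns are training samples.
G_test = alpha * rbf_gram(X_test, X_train, gamma)

clf = svm.SVC(kernel='precomputed')
clf.fit(G_train, y_train)
y_pred = clf.predict(G_test)
print('accuracy:', np.mean(y_pred == y_test))
```

Any extra parameter such as `alpha` should be computed once, from the training data, and reused unchanged for the test Gram matrix, so that both matrices come from the same kernel function. Multiplying by a non-negative constant keeps the kernel positive semidefinite.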