An example of using the Python bindings for the SVM library LIBSVM


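I urgently need an example of a classification task using LibSVM in Python. I don't know what the input should look like, which function is responsible for training, and which one for testing.

Thanks

LIBSVM reads the data from a tuple containing two lists. The first list contains the classes and the second list contains the input data. The code below creates a simple dataset with two possible classes. You also need to specify which kernel to use by creating an svm_parameter.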


>>> from libsvm import *
>>> prob = svm_problem([1, -1], [[1, 0, 1], [-1, 0, -1]])
>>> param = svm_parameter(kernel_type=LINEAR, C=10)
>>> # train the model
>>> m = svm_model(prob, param)
>>> # test the model
>>> m.predict([1, 1, 1])

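This example demonstrates a one-class SVM classifier; it is about as simple as possible while still showing the complete LIBSVM workflow.

Step 1: Import NumPy and LIBSVM

  import numpy as NP
  from svm import *

Step 2: Generate synthetic data: for this example, 500 points within a given boundary (note: quite a few real data sets are provided on the LIBSVM website)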
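One way to produce such data, as a minimal sketch: the name `Data` matches the later steps, while the seed and the bounds of the square are illustrative assumptions rather than values stated in the answer.

```python
import numpy as NP

NP.random.seed(0)  # fixed seed so the run is repeatable (an arbitrary choice)

# 1000 random integers in [-5, 5), viewed as 500 (x, y) points
Data = NP.random.randint(-5, 5, 1000).reshape(500, 2)

print(Data.shape)  # (500, 2)
```

Because LIBSVM's Python interface expects plain lists rather than NumPy arrays, `Data.tolist()` is a convenient conversion before the data is handed over.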

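Step 3: Now, choose some non-linear decision boundary for a one-class classifier:

rx = [1 if (x**2 + y**2) < 9 else 0 for (x, y) in Data]

Step 6: Choose a kernel function for the non-linear mapping

For this example, I chose RBF (radial basis function) as my kernel function:

pm = svm_parameter(kernel_type=RBF)

Step 7: Train the classifier by calling svm_model, passing in the problem description (px) and the kernel (pm)

Step 8: Finally, test the trained classifier by calling predict on the trained model object ('v')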
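For LIBSVM 3.x, where training and testing go through `svm_train` and `svm_predict` in `svmutil`, the same workflow can be sketched end to end. This is a sketch under assumptions, not the answer's own code: the import path depends on how LIBSVM was installed, and the 400/100 split, the seed, and the variable names are illustrative.

```python
import random

try:  # layout used by the pip-installed package
    from libsvm.svmutil import svm_train, svm_predict
except ImportError:
    try:  # layout used when running from the LIBSVM source tree
        from svmutil import svm_train, svm_predict
    except ImportError:
        svm_train = svm_predict = None  # LIBSVM is not installed

random.seed(0)

# 500 integer points in a square, as plain lists (not NumPy arrays)
points = [[random.randint(-5, 4), random.randint(-5, 4)] for _ in range(500)]
# Label 1 inside the circle of radius 3, otherwise 0
labels = [1 if x * x + y * y < 9 else 0 for x, y in points]

# Hold out the last 100 points for testing
train_x, test_x = points[:400], points[400:]
train_y, test_y = labels[:400], labels[400:]

if svm_train is not None:
    # '-t 2' selects the RBF kernel, '-q' silences the training output
    model = svm_train(train_y, train_x, '-t 2 -q')
    # svm_predict returns predicted labels, an accuracy tuple, decision values
    predicted, accuracy, _ = svm_predict(test_y, test_x, model)
```

Here `svm_train` plays the role of the `svm_model(px, pm)` call in step 7, and `svm_predict` replaces calling `predict` on the model object in step 8.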

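For the example above, I used version 3.0 of LIBSVM (the current stable release at the time this answer was posted).
Finally, regarding the part of your question about the choice of kernel function: Support Vector Machines are not specific to a particular kernel function. For example, I could have chosen a different kernel (Gaussian, polynomial, etc.).
LIBSVM includes all of the most commonly used kernel functions, which is a big help because you can see all the plausible alternatives. Selecting one for your model is just a matter of calling svm_parameter and passing in a value for kernel_type (an abbreviation for the chosen kernel, such as RBF).

Finally, the kernel function you choose for training must match the kernel function used on the testing data.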
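You might consider using a library that wraps libsvm instead: it has a great libsvm Python binding and should be easy to install.
The code examples listed here don't work with LibSVM 3.1, so I have more or less ported one of them to the newer interface.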

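Adding to @shinNoNoir:
param.kernel_type represents the type of kernel function to use: 0: linear, 1: polynomial, 2: radial basis function, 3: sigmoid.
Also keep in mind that in svm_problem(y, x), y is the class labels and x is the class instances, and x and y can only be lists, tuples, or dictionaries (no NumPy arrays).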
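To illustrate the accepted input shapes, here is a small sketch that converts dense rows into the dictionary form, where keys are 1-based feature indices and zero-valued features are left out. The helper name `to_libsvm_rows` is hypothetical and not part of LIBSVM; for NumPy data, calling `.tolist()` on the array first is usually enough.

```python
def to_libsvm_rows(rows):
    """Convert dense rows (lists or tuples) to sparse {index: value} dicts.

    Keys are 1-based feature indices; zero-valued features are omitted.
    """
    sparse = []
    for row in rows:
        sparse.append({i + 1: float(v) for i, v in enumerate(row) if v != 0})
    return sparse


dense = [[1, 0, 1], [-1, 0, -1]]
print(to_libsvm_rows(dense))
# → [{1: 1.0, 3: 1.0}, {1: -1.0, 3: -1.0}]
```

Either form, the dense lists or the dictionaries, can then be passed as x to svm_problem(y, x).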
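I don't know about earlier versions, but in LibSVM 3.xx the method svm_parameter('options') takes just one argument. In my case, C, G, p, and nu are dynamic values; change them to suit your own code.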
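Since svm_parameter takes a single options string, dynamic values have to be formatted into it. A minimal sketch of one way to do that; the helper name `build_options` and the sample values are assumptions made for illustration.

```python
def build_options(C, gamma, epsilon, nu, svm_type=0, kernel_type=2, degree=3):
    """Assemble a LIBSVM options string from runtime values."""
    return '-s {} -t {} -d {} -c {} -g {} -p {} -n {}'.format(
        svm_type, kernel_type, degree, C, gamma, epsilon, nu)


options = build_options(C=10, gamma=0.5, epsilon=0.1, nu=0.5)
print(options)
# → -s 0 -t 2 -d 3 -c 10 -g 0.5 -p 0.1 -n 0.5
```

The resulting string is what would be passed as the single argument, as in svm_parameter(options).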

Options:

-s svm_type : set type of SVM (default 0)
    0 -- C-SVC (multi-class classification)
    1 -- nu-SVC (multi-class classification)
    2 -- one-class SVM
    3 -- epsilon-SVR (regression)
    4 -- nu-SVR (regression)
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
    4 -- precomputed kernel (kernel values in training_set_file)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n: n-fold cross validation mode
-q : quiet mode (no outputs)
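The kernel formulas listed under -t can be written out directly, which makes the roles of gamma, coef0, and degree easier to see. This is a plain-Python sketch of the formulas as listed, not LIBSVM's internal implementation; the function names are chosen here for illustration.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    return dot(u, v)


def polynomial(u, v, gamma, coef0=0.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree


def rbf(u, v, gamma):
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid(u, v, gamma, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 0.0, 1.0], [-1.0, 0.0, -1.0]
gamma = 1.0 / len(u)  # the documented default, 1/num_features

print(linear(u, v))      # -2.0
print(rbf(u, u, gamma))  # 1.0, since the distance from u to itself is zero
print(rbf(u, v, gamma))
```

The RBF value falls toward 0 as the two points move apart, which is what lets it model the closed, circular boundaries that a linear kernel cannot.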

Documentation source:
SVM via scikit-learn:

from sklearn.svm import SVC

X = [[0, 0], [1, 1]]
y = [0, 1]
model = SVC().fit(X, y)

tests = [[0., 0.], [0.49, 0.49], [0.5, 0.5], [2., 2.]]
print(model.predict(tests))
# prints [0 0 1 1]

For more details, see:
Here is a dummy example I put together:

import numpy
import matplotlib.pyplot as plt
from random import seed
from random import randrange

import svmutil as svm

seed(1)

# Creating Data (Dense)
train = list([randrange(-10, 11), randrange(-10, 11)] for i in range(10))
labels = [-1, -1, -1, 1, 1, -1, 1, 1, 1, 1]
options = '-t 0'  # linear model

# Training Model
model = svm.svm_train(labels, train, options)

# Line Parameters
w = numpy.matmul(numpy.array(train)[numpy.array(model.get_sv_indices()) - 1].T, model.get_sv_coef())
b = -model.rho.contents.value
if model.get_labels()[1] == -1:  # No idea here but it should be done :|
    w = -w
    b = -b

print(w)
print(b)

# Plotting
plt.figure(figsize=(6, 6))
for i in model.get_sv_indices():
    plt.scatter(train[i - 1][0], train[i - 1][1], color='red', s=80)
train = numpy.array(train).T
plt.scatter(train[0], train[1], c=labels)
plt.plot([-5, 5], [-(-5 * w[0] + b) / w[1], -(5 * w[0] + b) / w[1]])
plt.xlim([-13, 13])
plt.ylim([-13, 13])
plt.show()
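The "Line Parameters" step relies on the fact that, for a linear kernel, the weight vector is the sum of the support vectors weighted by their coefficients (the values get_sv_coef() returns), with offset b = -rho. A worked sketch in plain Python follows; the numbers are hand-computed for a two-point toy problem rather than read from a LIBSVM model, so treat them as assumptions for illustration.

```python
# Two training points, one per class; both are support vectors.
support_vectors = [[1.0, 1.0], [-1.0, -1.0]]
# Coefficients alpha_i * y_i for the hard-margin solution of this toy
# problem (derived by hand, not taken from LIBSVM output).
sv_coef = [0.25, -0.25]
rho = 0.0

# w = sum_i (alpha_i * y_i) * x_i, computed one dimension at a time
w = [sum(c * sv[d] for c, sv in zip(sv_coef, support_vectors))
     for d in range(2)]
b = -rho


def decision(x):
    """Value of w.x + b; the separating line is where this equals 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


print(w)                        # [0.5, 0.5]
print(decision([1.0, 1.0]))     # 1.0
print(decision([-1.0, -1.0]))   # -1.0
```

Both support vectors land exactly on the margins (decision values +1 and -1), and the plotted line in the example above is the set of points where the decision value is 0.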

This code doesn't seem to work with the latest version of libsvm. svm_parameter requires different keywords, I think.

@JeremyKun I had the same problem; it looks like you need from svmutil import * instead. See @ShinNoNoir's answer below.

At step 5, I get:

Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/pymodules/python2.7/svm.py", line 83, in __init__
    tmp_xi, tmp_idx = gen_svm_nodearray(xi, isKernel=isKernel)
  File "/usr/lib/pymodules/python2.7/svm.py", line 51, in gen_svm_nodearray
    raise TypeError('xi should be a dictionary, list or tuple')
TypeError: xi should be a dictionary, list or tuple

and for step 6 I get TypeError: __init__() got an unexpected keyword argument 'kernel_type'.

Code from the answers above.

Step 8, calling predict on the trained model:

v.predict([3, 1])  # returns the class label (either '1' or '0')

Port to LibSVM 3.1:

from svmutil import *
svm_model.predict = lambda self, x: svm_predict([0], [x], self)[0][0]

prob = svm_problem([1, -1], [[1, 0, 1], [-1, 0, -1]])

param = svm_parameter()
param.kernel_type = LINEAR
param.C = 10

m = svm_train(prob, param)

m.predict([1, 1, 1])

svm_parameter with an options string in LibSVM 3.xx:

param = svm_parameter('-s 0 -t 2 -d 3 -c '+str(C)+' -g '+str(G)+' -p '+str(self.epsilon)+' -n '+str(self.nu))