
Visualization of a Support Vector Machine in Python (2D)


I have an assignment, given below. I have completed the first five tasks, but I am stuck on the last one: plotting the result. Please explain how to do it. Thanks in advance.

*(I only started learning SVM and ML a few days ago, please keep that in mind.)

**(I believe the sequence of operations should be the same for all kernel types. It would be great if you showed it for one of the kernels; I will try to adapt your code to the others.)

Procedure to follow:

  • Take random samples from this map (#100) and bring them into Python for SVC. The dataset includes Easting, Northing, and rock information.

  • Using these 100 randomly selected samples, split randomly again into training and test datasets.

  • Try running SVC with linear, polynomial, radial basis function, and tangent (sigmoid) kernels.

  • For example, if you are using the radial basis function, 'C' and 'gamma' can be optimized according to the accuracy you obtain from the accuracy score.

  • After obtaining the fitted model and computing the accuracy score (from the test dataset), feed the whole dataset into the fitted model and predict the output for all 90,000 sample points in reference.csv.

  • Show me the resulting map and the accuracy score obtained from each fitted model.

  • The dataset looks like this: 90,000 points in the same style.

The code is as follows:

    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt
    
    ### Importing Info
    
    df = pd.read_csv("C:/Users/Admin/Desktop/RA/step 1/reference.csv", header=0)
    df_model = df.sample(n = 100)
    df_model.shape
    
    ## X-y split
    
    X = df_model.loc[:,df_model.columns!="Rock"]
    y = df_model["Rock"]
    y_initial = df["Rock"]
    
    ### for whole dataset
    
    X_wd = df.loc[:, df.columns != "Rock"]
    y_wd = df["Rock"]
    
    ## Test-train split
    
    from sklearn.model_selection import train_test_split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
    
    ## Standardizing the Data
    
    from sklearn.preprocessing import StandardScaler
    
    sc = StandardScaler().fit(X_train)
    X_train_std = sc.transform(X_train)
    X_test_std = sc.transform(X_test)
    
    ## Linear
    ### Grid Search
    
    from sklearn.model_selection import GridSearchCV
    from sklearn import svm
    from sklearn.metrics import accuracy_score, confusion_matrix
    
    params_linear = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500,1000)}
    clf_svm_l = svm.SVC(kernel = 'linear')
    svm_grid_linear = GridSearchCV(clf_svm_l, params_linear, n_jobs=-1,
                                  cv = 3, verbose = 1, scoring = 'accuracy')
    
    svm_grid_linear.fit(X_train_std, y_train)
    svm_grid_linear.best_params_
    linsvm_clf = svm_grid_linear.best_estimator_
    accuracy_score(y_test, linsvm_clf.predict(X_test_std))
    
    ### training svm
    
    clf_svm_l = svm.SVC(kernel = 'linear', C = 0.1)
    clf_svm_l.fit(X_train_std, y_train)
    
    ### predicting model
    
    y_train_pred_linear = clf_svm_l.predict(X_train_std)
    y_test_pred_linear = clf_svm_l.predict(X_test_std)
    y_test_pred_linear
    clf_svm_l.n_support_
    
    ### whole dataset
    
    # scale with the scaler fitted on the training data before predicting
    y_pred_linear_wd = clf_svm_l.predict(sc.transform(X_wd))
    
    ### map
            
    
    
    ## Poly
    ### grid search for poly
    
    params_poly = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500,1000),
             'degree' : (1,2,3,4,5,6)}
    clf_svm_poly = svm.SVC(kernel = 'poly')
    svm_grid_poly = GridSearchCV(clf_svm_poly, params_poly, n_jobs = -1,
                                cv = 3, verbose = 1, scoring = 'accuracy')
    svm_grid_poly.fit(X_train_std, y_train)
    svm_grid_poly.best_params_
    polysvm_clf = svm_grid_poly.best_estimator_
    accuracy_score(y_test, polysvm_clf.predict(X_test_std))
    
    ### training svm
    
    clf_svm_poly = svm.SVC(kernel = 'poly', C = 50, degree = 2)
    clf_svm_poly.fit(X_train_std, y_train)
    
    ### predicting model
    
    y_train_pred_poly = clf_svm_poly.predict(X_train_std)
    y_test_pred_poly = clf_svm_poly.predict(X_test_std)
    
    clf_svm_poly.n_support_
    
    ### whole dataset
    
    y_pred_poly_wd = clf_svm_poly.predict(sc.transform(X_wd))  # scale first
    
    ### map            
    
    
    ## RBF
    
    ### grid search rbf
    
    params_rbf = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50, 100, 500,1000),
             'gamma' : (0.001, 0.01, 0.1, 0.5, 1)}
    clf_svm_r = svm.SVC(kernel = 'rbf')
    svm_grid_r = GridSearchCV(clf_svm_r, params_rbf, n_jobs = -1,
                             cv = 10, verbose = 1, scoring = 'accuracy')
    svm_grid_r.fit(X_train_std, y_train)
    svm_grid_r.best_params_
    rsvm_clf = svm_grid_r.best_estimator_
    accuracy_score(y_test, rsvm_clf.predict(X_test_std))
    
    ### training svm
    
    clf_svm_r = svm.SVC(kernel = 'rbf', C = 500, gamma = 0.5)
    clf_svm_r.fit(X_train_std, y_train)
    
    ### predicting model
    
    y_train_pred_r = clf_svm_r.predict(X_train_std)
    y_test_pred_r = clf_svm_r.predict(X_test_std)
    
    ### whole dataset
    
    y_pred_r_wd = clf_svm_r.predict(sc.transform(X_wd))  # scale first
    
    ### map            
    
    
    ## Tangent
    
    ### grid search
    
    params_tangent = {'C' : (0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10, 50),
             'gamma' : (0.001, 0.01, 0.1, 0.5, 1)}
    clf_svm_tangent = svm.SVC(kernel = 'sigmoid')
    svm_grid_tangent = GridSearchCV(clf_svm_tangent, params_tangent, n_jobs = -1,
                                cv = 10, verbose = 1, scoring = 'accuracy')
    svm_grid_tangent.fit(X_train_std, y_train)
    svm_grid_tangent.best_params_
    tangentsvm_clf = svm_grid_tangent.best_estimator_
    accuracy_score(y_test, tangentsvm_clf.predict(X_test_std))
    
    ### training svm
    
    clf_svm_tangent = svm.SVC(kernel = 'sigmoid', C = 1, gamma = 0.1)
    clf_svm_tangent.fit(X_train_std, y_train)
    
    ### predicting model
    
    y_train_pred_tangent = clf_svm_tangent.predict(X_train_std)
    y_test_pred_tangent = clf_svm_tangent.predict(X_test_std)
    
    ### whole dataset
    
    y_pred_tangent_wd = clf_svm_tangent.predict(sc.transform(X_wd))  # scale first
    
    ### map
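
The empty `### map` steps above are where the question stalls. A minimal, standalone sketch of that step (with synthetic data standing in for reference.csv, and names such as `X_wd`/`y_pred_wd` assumed from the code above) simply colors every point of the full dataset by its predicted class:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# synthetic stand-in for reference.csv (columns: Easting, Northing, Rock)
rng = np.random.default_rng(0)
X_wd = rng.uniform(0, 300, size=(500, 2))            # Easting, Northing
y_wd = (X_wd[:, 0] + X_wd[:, 1] > 300).astype(int)   # fake rock classes

sc = StandardScaler().fit(X_wd)
clf = SVC(kernel="linear", C=0.1).fit(sc.transform(X_wd), y_wd)

# predict every point (scaled with the same scaler), then draw the map
y_pred_wd = clf.predict(sc.transform(X_wd))
plt.scatter(X_wd[:, 0], X_wd[:, 1], c=y_pred_wd, s=8)
plt.xlabel("Easting")
plt.ylabel("Northing")
plt.title("Predicted rock map (linear kernel)")
plt.show()
```

With the real 90,000-point grid the same scatter call works, although an image plot (as in the answer below) is faster for regularly spaced points.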
    

From the sample data it looks like you are dealing with regularly spaced points, iterated row by row in monotonically increasing order. Here is one way to reshape this dataset into a 2-D array (one row per Northing step) and plot it accordingly:

    import pandas as pd
    import matplotlib.pyplot as plt
    import numpy as np
    
    # create sample data
    data = {
        'Easting': [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3],
        'Northing': [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
        'Rocks': [0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0],
    }
    df = pd.DataFrame(data)
    
    # reshape data into a 2d matrix (rows = Northing steps, assuming
    # Easting / Northing steps run from 0 to their max values)
    max_easting = np.max(df['Easting'])
    img_data = np.reshape(df['Rocks'].values, (-1, max_easting + 1))
    
    # plot as image
    plt.imshow(img_data)
    plt.show()
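
One caveat with the snippet above: `imshow` draws the first array row at the top, so Northing would increase downwards. A variant (same sample data; the `extent` values are an assumption for a 0..3 by 0..2 grid) that uses `origin='lower'` so the axes read as map coordinates:

```python
import numpy as np
import matplotlib.pyplot as plt

rocks = np.array([0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0])
img_data = rocks.reshape(3, 4)  # rows = Northing steps, cols = Easting steps

plt.imshow(img_data, origin="lower",       # put row 0 (Northing = 0) at the bottom
           extent=(-0.5, 3.5, -0.5, 2.5))  # Easting 0..3, Northing 0..2
plt.xlabel("Easting")
plt.ylabel("Northing")
plt.colorbar(label="Rock class")
plt.show()
```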
    

If you are dealing with irregularly spaced data, i.e. not every Easting/Northing combination has a value, you could have a look at interpolation- or triangulation-based plotting instead.
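For the irregular case, one option (my suggestion, not from the original answer) is matplotlib's triangulation-based `tripcolor`, which builds a Delaunay triangulation of the scattered points and shades it:

```python
import numpy as np
import matplotlib.pyplot as plt

# scattered samples: not every Easting/Northing pair has a value
rng = np.random.default_rng(1)
easting = rng.uniform(0, 10, 200)
northing = rng.uniform(0, 10, 200)
rock = (easting > northing).astype(float)

# tripcolor triangulates the points; gouraud shading interpolates point values
tpc = plt.tripcolor(easting, northing, rock, shading="gouraud")
plt.colorbar(tpc, label="Rock class")
plt.xlabel("Easting")
plt.ylabel("Northing")
plt.show()
```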

For anyone who runs into the same problem as me, here is the answer for plotting the linear visualization. Adapting this code to the other kernels is straightforward:

    # Visualising the Training set results
    from matplotlib.colors import ListedColormap
    X_set, y_set = X_train_std, y_train
    X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                         np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
    plt.contourf(X1, X2, clf_svm_l.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
                 alpha = 0.75, cmap = ListedColormap(('darkblue', 'yellow')))
    plt.xlim(X1.min(), X1.max())
    plt.ylim(X2.min(), X2.max())
    for i, j in enumerate(np.unique(y_set)):
        plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                    c = ListedColormap(('blue', 'gold'))(i), label = j)
    plt.title('SVM (Training set)')
    plt.xlabel('Easting')
    plt.ylabel('Northing')
    plt.legend()
    plt.show()
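
Since the question asks to adapt the plot to the other kernels, the code above can be wrapped in a small helper that accepts any fitted classifier; the function name and the synthetic demo data below are my own, not from the original post:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.svm import SVC

def plot_boundary(clf, X, y, title):
    """Shade the 2-D decision regions of any fitted 2-feature classifier."""
    X1, X2 = np.meshgrid(
        np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.02),
        np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.02))
    # predict on the grid and draw filled contours of the predicted class
    Z = clf.predict(np.c_[X1.ravel(), X2.ravel()]).reshape(X1.shape)
    plt.contourf(X1, X2, Z, alpha=0.75, cmap=ListedColormap(("darkblue", "yellow")))
    for i, cls in enumerate(np.unique(y)):
        plt.scatter(X[y == cls, 0], X[y == cls, 1],
                    c=ListedColormap(("blue", "gold"))(i), label=cls)
    plt.title(title)
    plt.xlabel("Easting")
    plt.ylabel("Northing")
    plt.legend()
    plt.show()

# the same helper works for 'linear', 'poly', 'rbf' and 'sigmoid' kernels
rng = np.random.default_rng(2)
X = rng.normal(size=(80, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)
clf = SVC(kernel="rbf", C=500, gamma=0.5).fit(X, y)
plot_boundary(clf, X, y, "SVM (rbf)")
```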
    

How would you like to plot the data? As an image, i.e. Easting as columns, Northing as rows, and rock as the value?

Yes, an image is right: the x-axis represents Easting and the y-axis Northing. Thanks for clarifying the output.