Scikit-learn: the length of the LBP histogram differs from image to image


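I am running the LBP algorithm to classify images by their texture features. The classifier is LinearSVC from the sklearn.svm package. I compute a histogram for each image and fit the SVM on those histograms, but sometimes the length of the histogram differs from one image to another. Here is an example: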
from skimage import feature
from scipy.stats import itemfreq
from sklearn.svm import LinearSVC
import numpy as np
import cv2
import cvutils
import csv
import os

def __get_hist(image, radius):
    NumPoint = radius*8
    lbp = feature.local_binary_pattern(image, NumPoint, radius, method="uniform")
    x = itemfreq(lbp.ravel())
    hist = x[:,1]/sum(x[:,1])
    return hist
def get_trainHist_list(train_txt):
    train_dic = {}
    with open(train_txt, 'r') as csvfile:
        reader = csv.reader(csvfile, delimiter = ' ')
        for row in reader:
            train_dic[row[0]] = int(row[1])

    hist_list=[]
    key_list=[]
    label_list=[]
    for key, label in train_dic.items():
        img = cv2.imread("D:/Python36/images/texture/%s" %key, cv2.IMREAD_GRAYSCALE)
        key_list.append(key)
        label_list.append(label)
        hist_list.append(__get_hist(img,3))
    bundle = [np.array(key_list), np.array(label_list), np.array(hist_list)]
    return bundle

train_txt = 'D:/Python36/images/class_train.txt'
train_hist = get_trainHist_list(train_txt)
model = LinearSVC(C=100.0, random_state=42)
model.fit(train_hist[2], train_hist[1])
for i in train_hist[2]:
    print(len(i))

test_img = cv2.imread("D:/Python36/images/texture_test/flat-3.png", cv2.IMREAD_GRAYSCALE)
hist= np.array(__get_hist(test_img, 3))
print(len(hist))
prediction = model.predict([hist])
print(prediction)

Result:
26
26
26
26
26
26
25
Traceback (most recent call last):
  File "D:\Python36\texture.py", line 44, in <module>
    prediction = model.predict([hist])
  File "D:\Python36\lib\site-packages\sklearn\linear_model\base.py", line 324, in predict
    scores = self.decision_function(X)
  File "D:\Python36\lib\site-packages\sklearn\linear_model\base.py", line 305, in decision_function
    % (X.shape[1], n_features))
ValueError: X has 25 features per sample; expecting 26

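As you can see, the histograms of the training images all have length 26, but the histogram of test_img has length 25. As a result, predict on the SVM fails.
My guess is that test_img has empty bins in its histogram and that those empty bins are skipped (I'm not sure).
Does anyone know how to solve this?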
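There are 59 different uniform LBPs for a neighbourhood of 8 points. That should be the dimension of your feature vectors, but it is not, because you used itemfreq to compute the histograms (as a side note, itemfreq is deprecated). The length of a histogram obtained through itemfreq is the number of distinct uniform LBPs that actually occur in the image. If some uniform LBPs are not present in the image, the resulting histogram has fewer bins than expected. In your setup (24 points with method="uniform"), skimage produces codes from 0 to 25, so the expected length is 26; that matches your training histograms, while the test image is missing one code. The issue is easily fixed by using np.bincount with a fixed minlength, as the following toy example shows: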
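import numpy as np
from scipy.stats import itemfreq  # deprecated; np.unique(..., return_counts=True) gives the same counts

# Toy "LBP image" that contains only the codes 0, 1, 8 and 9
lbp = np.array([[0, 0, 0, 0],
                [1, 1, 1, 1],
                [8, 8, 9, 9]])

hi = itemfreq(lbp.ravel())[:, 1]  # wrong approach: one bin per code that occurs
hb = np.bincount(lbp.ravel(), minlength=59)  # proposed method: always at least 59 bins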

The output looks like this:
In [815]: hi
Out[815]: array([4, 4, 2, 2], dtype=int64)

In [816]: hb
Out[816]: 
array([4, 4, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0], dtype=int64)
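Applied to the code in the question, the same idea could look roughly like the sketch below. The helper name lbp_histogram and the two synthetic arrays are made up for illustration. The bin count n_points + 2 assumes the codes come from method="uniform", which is what the question uses. The cast to integers is there because np.bincount only accepts non-negative integers, while skimage returns the LBP image as floats.

```python
import numpy as np

def lbp_histogram(lbp, n_points):
    """Return a normalized histogram with exactly n_points + 2 bins."""
    # Assumption: codes were produced with method="uniform",
    # so they lie in the range 0 .. n_points + 1.
    n_bins = n_points + 2
    # np.bincount needs non-negative integers, so cast the codes first.
    codes = np.asarray(lbp).astype(np.int64).ravel()
    counts = np.bincount(codes, minlength=n_bins)
    return counts / counts.sum()

# Synthetic stand-ins for two LBP images computed with 24 points.
full = np.arange(26, dtype=float).reshape(2, 13)          # every code 0..25 occurs
sparse = np.array([[0.0, 0.0, 3.0], [24.0, 24.0, 3.0]])   # code 25 never occurs

h_full = lbp_histogram(full, 24)
h_sparse = lbp_histogram(sparse, 24)
print(len(h_full), len(h_sparse))  # both histograms have 26 bins
```

In the question's __get_hist, the two itemfreq lines would be replaced by a call such as lbp_histogram(lbp, NumPoint), so that training and test images always yield feature vectors of the same length.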
