Deep learning: pixel-wise classification with Caffe and HDF5


I am trying to implement pixel-wise binary classification of images using Caffe. For each image of dimension 3x256x256 I have a 256x256 label array in which each entry is marked 0 or 1. When I read my HDF5 file with the code below:

import os
import h5py
import numpy as np

dirname = "examples/hdf5_classification/data"

f = h5py.File(os.path.join(dirname, 'train.h5'), "r")
ks = list(f.keys())
data = np.array(f[ks[0]])
label = np.array(f[ks[1]])
print("Data dimension from HDF5", np.shape(data))
print("Label dimension from HDF5", np.shape(label))
f.close()
the data and label dimensions I get are:

Data dimension from HDF5 (402, 3, 256, 256)
Label dimension from HDF5 (402, 256, 256)
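For reference, here is a minimal sketch of how such an HDF5 file can be written with h5py (small random arrays stand in for the real images, and the file name is hypothetical). Note that for pixel-wise labels Caffe's "SoftmaxWithLoss" expects shape (N, 1, H, W), so a singleton channel axis is added before saving:

```python
import h5py
import numpy as np

# Hypothetical stand-ins for the real images and masks.
n = 4  # 402 in the question; kept small here
data = np.random.rand(n, 3, 256, 256).astype(np.float32)
label = np.random.randint(0, 2, size=(n, 256, 256)).astype(np.float32)

# Reshape labels to (N, 1, H, W) for pixel-wise SoftmaxWithLoss.
label = label[:, np.newaxis, :, :]

with h5py.File('train_demo.h5', 'w') as f:
    f.create_dataset('data', data=data)
    f.create_dataset('label', data=label)
```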
I tried to feed this data into the given hdf5 classification network. While training (with the default solver, but in GPU mode), the run aborts with a "Check failed" error from softmax_loss_layer.cpp; the full log is at the end of this post.

I cannot understand why the expected number of labels is the same as my batch size. How exactly should I solve this problem? Is it a problem with my labeling method?

Your problem is that the "SoftmaxWithLoss" layer tries to compare a prediction vector of 2 elements per input image to a label of size 256-by-256 per image. This makes no sense.

Root cause of the error: I suppose what you actually want to do is apply a binary classifier to each pixel of the image. To that end you defined "fc1" as an "InnerProduct" layer with num_output: 2. However, the way Caffe sees this, you are applying a single binary classifier to the entire image. Thus Caffe gives you a single binary prediction for the whole image.
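This flattening behavior can be illustrated with a small NumPy sketch (hypothetical random weights stand in for the learned ones): an "InnerProduct" layer with num_output: 2 reshapes each (3, 256, 256) image into one long vector and multiplies it by a single weight matrix, yielding two scores per image rather than per pixel:

```python
import numpy as np

batch = np.random.rand(10, 3, 256, 256).astype(np.float32)
W = np.random.rand(2, 3 * 256 * 256).astype(np.float32)  # hypothetical weights

flat = batch.reshape(10, -1)  # each image flattened to one 196608-vector
pred = flat @ W.T             # (10, 2): two scores per whole image, not per pixel
print(pred.shape)
```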

Solution: when making pixel-wise predictions you no longer need an "InnerProduct" layer, and you have a "fully convolutional network". If you replace "fc1" with a conv layer (for example, a kernel that examines the 5-by-5 environment of each pixel and makes a decision based on that patch; see the "bin_class" layer definition at the end of this post), then applying "SoftmaxWithLoss" with

bottom: "bin_class"
bottom: "label"

should work.

I tried it, but the problem still persists. Now my dimensions are: Data dimension from HDF5 (402, 3, 256, 256); Label dimension from HDF5 (402, 1, 256, 256).
@Unni I have changed my answer.
You are awesome! I made a silly mistake. Thank you so much.
@Unni if you were silly you couldn't have installed caffe in the first place ;)
Is it possible to do non-binary classification of pixels? Like using a depth image as the ground truth? Can a similar method be used? Something like this:
@Shai To perform a similar task, see: . I would like to know what your input images and ground-truth images look like. Could you provide them? Unlike you, I want to do a regression task, not a classification task. It would be great if you could help me :)
!cd /home/unni/MTPMain/caffe-master/ && ./build/tools/caffe train -solver examples/hdf5_classification/solver.prototxt
I1119 01:29:02.222512 11910 caffe.cpp:184] Using GPUs 0
I1119 01:29:02.509752 11910 solver.cpp:47] Initializing solver from parameters: 
train_net: "examples/hdf5_classification/train_val.prototxt"
test_net: "examples/hdf5_classification/train_val.prototxt"
test_iter: 250
test_interval: 1000
base_lr: 0.01
display: 1000
max_iter: 10000
lr_policy: "step"
gamma: 0.1
momentum: 0.9
weight_decay: 0.0005
stepsize: 5000
snapshot: 10000
snapshot_prefix: "examples/hdf5_classification/data/train"
solver_mode: GPU
device_id: 0
I1119 01:29:02.519805 11910 solver.cpp:80] Creating training net from train_net file: examples/hdf5_classification/train_val.prototxt
I1119 01:29:02.520031 11910 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I1119 01:29:02.520053 11910 net.cpp:322] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I1119 01:29:02.520104 11910 net.cpp:49] Initializing net from parameters: 
name: "LogisticRegressionNet"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "examples/hdf5_classification/data/train.txt"
    batch_size: 10
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "data"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc1"
  bottom: "label"
  top: "loss"
}
I1119 01:29:02.520256 11910 layer_factory.hpp:76] Creating layer data
I1119 01:29:02.520277 11910 net.cpp:106] Creating Layer data
I1119 01:29:02.520290 11910 net.cpp:411] data -> data
I1119 01:29:02.520331 11910 net.cpp:411] data -> label
I1119 01:29:02.520352 11910 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: examples/hdf5_classification/data/train.txt
I1119 01:29:02.529341 11910 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I1119 01:29:02.542645 11910 hdf5.cpp:32] Datatype class: H5T_FLOAT
I1119 01:29:10.601307 11910 net.cpp:150] Setting up data
I1119 01:29:10.612926 11910 net.cpp:157] Top shape: 10 3 256 256 (1966080)
I1119 01:29:10.612963 11910 net.cpp:157] Top shape: 10 256 256 (655360)
I1119 01:29:10.612969 11910 net.cpp:165] Memory required for data: 10485760
I1119 01:29:10.612983 11910 layer_factory.hpp:76] Creating layer fc1
I1119 01:29:10.624948 11910 net.cpp:106] Creating Layer fc1
I1119 01:29:10.625015 11910 net.cpp:454] fc1 <- data
I1119 01:29:10.625039 11910 net.cpp:411] fc1 -> fc1
I1119 01:29:10.645814 11910 net.cpp:150] Setting up fc1
I1119 01:29:10.645864 11910 net.cpp:157] Top shape: 10 2 (20)
I1119 01:29:10.645875 11910 net.cpp:165] Memory required for data: 10485840
I1119 01:29:10.645912 11910 layer_factory.hpp:76] Creating layer loss
I1119 01:29:10.657094 11910 net.cpp:106] Creating Layer loss
I1119 01:29:10.657133 11910 net.cpp:454] loss <- fc1
I1119 01:29:10.657147 11910 net.cpp:454] loss <- label
I1119 01:29:10.657163 11910 net.cpp:411] loss -> loss
I1119 01:29:10.657189 11910 layer_factory.hpp:76] Creating layer loss
F1119 01:29:14.883095 11910 softmax_loss_layer.cpp:42] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (10 vs. 655360) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be N*H*W, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
    @     0x7f0652e1adaa  (unknown)
    @     0x7f0652e1ace4  (unknown)
    @     0x7f0652e1a6e6  (unknown)
    @     0x7f0652e1d687  (unknown)
    @     0x7f0653494219  caffe::SoftmaxWithLossLayer<>::Reshape()
    @     0x7f065353f50f  caffe::Net<>::Init()
    @     0x7f0653541f05  caffe::Net<>::Net()
    @     0x7f06535776cf  caffe::Solver<>::InitTrainNet()
    @     0x7f0653577beb  caffe::Solver<>::Init()
    @     0x7f0653578007  caffe::Solver<>::Solver()
    @     0x7f06535278b3  caffe::Creator_SGDSolver<>()
    @           0x410831  caffe::SolverRegistry<>::CreateSolver()
    @           0x40a16b  train()
    @           0x406908  main
    @     0x7f065232cec5  (unknown)
    @           0x406e28  (unknown)
    @              (nil)  (unknown)
Aborted
softmax_loss_layer.cpp:42] Check failed: 
outer_num_ * inner_num_ == bottom[1]->count() (10 vs. 655360) 
Number of labels must match number of predictions; 
e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), 
label count (number of labels) must be N*H*W, 
with integer values in {0, 1, ..., C-1}.
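The numbers in this check can be reproduced by hand (a sketch of the arithmetic, with outer_num and inner_num named after the counters in Caffe's source): the "fc1" output has shape (N, C) = (10, 2), so along softmax axis 1 there are 10 × 1 predictions, while the label blob holds 10 × 256 × 256 entries:

```python
# Prediction blob from "fc1" is (N, C) = (10, 2).
outer_num = 10                # N (batch size)
inner_num = 1                 # spatial size of the prediction (1x1)
label_count = 10 * 256 * 256  # one label per pixel per image

print(outer_num * inner_num, "vs", label_count)  # 10 vs 655360
```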
layer {
  name: "bin_class"
  type: "Convolution"
  bottom: "data"
  top: "bin_class"
  convolution_param {
    num_output: 2 # binary class output
    kernel_size: 5 # 5-by-5 patch for prediction
    pad: 2 # make sure spatial output size equals size of label 
  }
}
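As a sanity check on this layer (using the standard conv output-size formula, with stride 1 assumed): kernel_size: 5 with pad: 2 preserves the 256×256 spatial size, so "bin_class" has shape (10, 2, 256, 256) and the label count now matches the loss layer's check:

```python
H = 256
kernel, pad, stride = 5, 2, 1
out = (H + 2 * pad - kernel) // stride + 1  # standard conv output size
print(out)  # 256: spatial size is preserved

outer_num = 10         # N (batch size)
inner_num = out * out  # H*W of the per-pixel prediction
print(outer_num * inner_num)  # 655360, matching the label count
```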