Backpropagation for a convolutional layer with NumPy in Python
I'm having trouble implementing the backward pass of a Conv2D layer with NumPy. The input has shape [channels, height, width]. The filters have shape [n_filters, channels, height, width]. This is what I do in the forward pass:
ch, h, w = x.shape
Hout = (h - self.filters.shape[-2]) // self.stride + 1
Wout = (w - self.filters.shape[-1]) // self.stride + 1
a = np.lib.stride_tricks.as_strided(
    x,
    (Hout, Wout, ch, self.filters.shape[2], self.filters.shape[3]),
    (x.strides[1] * self.stride, x.strides[2] * self.stride,
     x.strides[0], x.strides[1], x.strides[2]))
out = np.einsum('ijckl,ackl->aij', a, self.filters)
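For readers who want to verify the strided forward pass, here is a self-contained sketch (hypothetical small shapes, stride 1, and local variables in place of the `self.` attributes) that compares it against a naive triple loop:

```python
import numpy as np

# Hypothetical shapes: 3 input channels, 8x8 image, 4 filters of 3x3, stride 1.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
filters = rng.standard_normal((4, 3, 3, 3))
stride = 1

ch, h, w = x.shape
kh, kw = filters.shape[-2], filters.shape[-1]
Hout = (h - kh) // stride + 1
Wout = (w - kw) // stride + 1

# Strided view of all (kh, kw) windows, as in the question:
# a[i, j, c, k, l] == x[c, i*stride + k, j*stride + l]
a = np.lib.stride_tricks.as_strided(
    x, (Hout, Wout, ch, kh, kw),
    (x.strides[1] * stride, x.strides[2] * stride,
     x.strides[0], x.strides[1], x.strides[2]))
out = np.einsum('ijckl,ackl->aij', a, filters)

# Naive reference: slide each filter over the input explicitly.
ref = np.zeros((filters.shape[0], Hout, Wout))
for f in range(filters.shape[0]):
    for i in range(Hout):
        for j in range(Wout):
            patch = x[:, i*stride:i*stride+kh, j*stride:j*stride+kw]
            ref[f, i, j] = np.sum(patch * filters[f])

print(np.allclose(out, ref))  # True
```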
I tried this to compute dF, but it doesn't work:
F = np.lib.stride_tricks.as_strided(x, (n_filt, size_filt, size_filt, dim_filt, size_filt, size_filt),
(x.strides[0], x.strides[1] * self.stride, x.strides[2] * self.stride) + (
x.strides[0], x.strides[1], x.strides[2]))
F = np.einsum('aijckl,anm->acij', F, dA_prev)
This version works, but it is slow:
dA = np.zeros(shape=x.shape)             # shape: [input channels, input height, input width]
dF = np.zeros(shape=self.filters.shape)  # shape: [n_filters, channels, height, width]
dB = np.zeros(shape=self.bias.shape)     # shape: [n_filters, 1]
size_filt = self.filters.shape[2]
for filt in range(n_filt):
    y_filt = y_out = 0
    while y_filt + size_filt <= size_img:
        x_filt = x_out = 0
        while x_filt + size_filt <= size_img:
            dF[filt] += dA_prev[filt, y_out, x_out] * x[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt]
            dA[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt] += (
                dA_prev[filt, y_out, x_out] * self.filters[filt])
            x_filt += self.stride
            x_out += 1
        y_filt += self.stride
        y_out += 1
    dB[filt] += np.sum(dA_prev[filt])
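Since the loop above serves as the reference implementation, it is worth confirming its gradients numerically. A minimal sketch, assuming stride 1, small hypothetical shapes, and plain variables instead of the `self.` attributes, checks one entry of dF against a central finite difference (the forward pass is linear in the filters, so the match is essentially exact):

```python
import numpy as np

# Hypothetical shapes: 2 channels, 5x5 input, 3 filters of 3x3, stride 1.
rng = np.random.default_rng(1)
x = rng.standard_normal((2, 5, 5))
filters = rng.standard_normal((3, 2, 3, 3))
n_filt, _, size_filt, _ = filters.shape
Hout = Wout = x.shape[1] - size_filt + 1
dA_prev = rng.standard_normal((n_filt, Hout, Wout))  # upstream gradient

def forward(x, filters):
    out = np.zeros((n_filt, Hout, Wout))
    for f in range(n_filt):
        for i in range(Hout):
            for j in range(Wout):
                out[f, i, j] = np.sum(
                    x[:, i:i+size_filt, j:j+size_filt] * filters[f])
    return out

# Loop-based gradients, as in the question (stride 1).
dF = np.zeros(filters.shape)
dA = np.zeros(x.shape)
for f in range(n_filt):
    for i in range(Hout):
        for j in range(Wout):
            dF[f] += dA_prev[f, i, j] * x[:, i:i+size_filt, j:j+size_filt]
            dA[:, i:i+size_filt, j:j+size_filt] += dA_prev[f, i, j] * filters[f]

# Finite-difference check of one filter entry, using L = sum(out * dA_prev).
eps = 1e-6
fp = filters.copy(); fp[0, 0, 0, 0] += eps
fm = filters.copy(); fm[0, 0, 0, 0] -= eps
num = np.sum((forward(x, fp) - forward(x, fm)) * dA_prev) / (2 * eps)
print(np.isclose(num, dF[0, 0, 0, 0]))  # True
```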
I managed to find a solution. The tensordot that computes dA still takes too much time, but at least it works:
as_strided = np.lib.stride_tricks.as_strided
F = as_strided(x,
shape=(ch_img, h_filt, w_filt, dA_h, dA_w),
strides=(x.strides[0], x.strides[1] * self.stride,
x.strides[2] * self.stride,
x.strides[1], x.strides[2])
)
F = np.tensordot(F, dA_prev, axes=[(-2, -1), (1, 2)])
dF = F.transpose((3, 0, 1, 2))
pad_h = dA_h - 1
pad_w = dA_w - 1
pad_filt = np.pad(self.filters, ((0, 0), (0, 0), (pad_h, pad_h), (pad_w, pad_w)), 'constant')
sub_windows = as_strided(pad_filt,
shape=(n_filt, h, w, dA_h, dA_w, ch_filt),
strides=(pad_filt.strides[0], pad_filt.strides[2] * self.stride,
pad_filt.strides[3] * self.stride, pad_filt.strides[2],
pad_filt.strides[3], pad_filt.strides[1])
)
dA = np.tensordot(sub_windows, dA_prev[:, ::-1, ::-1], axes=[(0, 3, 4), (0, 1, 2)])
dA = dA.transpose((2, 0, 1))
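To see that the strided dF/dA above agree with the slow loop, here is a self-contained comparison for stride 1 (hypothetical shapes; the `self.` attributes replaced by local variables):

```python
import numpy as np

as_strided = np.lib.stride_tricks.as_strided
rng = np.random.default_rng(2)

# Hypothetical shapes, stride 1: 2 channels, 6x6 input, 3 filters of 3x3.
x = rng.standard_normal((2, 6, 6))
filters = rng.standard_normal((3, 2, 3, 3))
stride = 1
ch_img, h, w = x.shape
n_filt, ch_filt, h_filt, w_filt = filters.shape
dA_h = (h - h_filt) // stride + 1
dA_w = (w - w_filt) // stride + 1
dA_prev = rng.standard_normal((n_filt, dA_h, dA_w))  # upstream gradient

# dF as in the answer: contract input windows with the upstream gradient.
F = as_strided(x,
               shape=(ch_img, h_filt, w_filt, dA_h, dA_w),
               strides=(x.strides[0], x.strides[1] * stride,
                        x.strides[2] * stride,
                        x.strides[1], x.strides[2]))
dF = np.tensordot(F, dA_prev, axes=[(-2, -1), (1, 2)]).transpose((3, 0, 1, 2))

# dA as in the answer: windows over zero-padded filters, correlated
# with the spatially flipped upstream gradient (full convolution).
pad_h, pad_w = dA_h - 1, dA_w - 1
pad_filt = np.pad(filters, ((0, 0), (0, 0), (pad_h, pad_h), (pad_w, pad_w)))
sub_windows = as_strided(pad_filt,
                         shape=(n_filt, h, w, dA_h, dA_w, ch_filt),
                         strides=(pad_filt.strides[0],
                                  pad_filt.strides[2] * stride,
                                  pad_filt.strides[3] * stride,
                                  pad_filt.strides[2],
                                  pad_filt.strides[3],
                                  pad_filt.strides[1]))
dA = np.tensordot(sub_windows, dA_prev[:, ::-1, ::-1],
                  axes=[(0, 3, 4), (0, 1, 2)]).transpose((2, 0, 1))

# Naive loop reference for both gradients.
dF_ref = np.zeros(filters.shape)
dA_ref = np.zeros(x.shape)
for f in range(n_filt):
    for i in range(dA_h):
        for j in range(dA_w):
            dF_ref[f] += dA_prev[f, i, j] * x[:, i:i+h_filt, j:j+w_filt]
            dA_ref[:, i:i+h_filt, j:j+w_filt] += dA_prev[f, i, j] * filters[f]

print(np.allclose(dF, dF_ref), np.allclose(dA, dA_ref))  # True True
```

Note that the comparison is done with stride 1; for larger strides the placement of `* stride` in the strided views would need a separate check.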