Backpropagation for a convolutional layer with NumPy in Python
I'm having trouble implementing the backward pass of a Conv2D layer with NumPy. The input has shape [channels, height, width]. The filters have shape [n_filters, channels, height, width]. This is what I do in the forward pass:
ch, h, w = x.shape
Hout = (h - self.filters.shape[-2]) // self.stride + 1
Wout = (w - self.filters.shape[-1]) // self.stride + 1
a = np.lib.stride_tricks.as_strided(
    x,
    (Hout, Wout, ch, self.filters.shape[2], self.filters.shape[3]),
    (x.strides[1] * self.stride, x.strides[2] * self.stride,
     x.strides[0], x.strides[1], x.strides[2]))
out = np.einsum('ijckl,ackl->aij', a, self.filters)
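For readers who want to verify the strided forward pass, here is a self-contained sketch (hypothetical small shapes, stride 1, and local variables in place of the `self.` attributes) that compares it against a naive triple loop:

```python
import numpy as np

# Hypothetical shapes: 3 input channels, 8x8 image, 4 filters of 3x3, stride 1.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))
filters = rng.standard_normal((4, 3, 3, 3))
stride = 1

ch, h, w = x.shape
kh, kw = filters.shape[-2], filters.shape[-1]
Hout = (h - kh) // stride + 1
Wout = (w - kw) // stride + 1

# Strided view of all (kh, kw) windows, as in the question:
# a[i, j, c, k, l] == x[c, i*stride + k, j*stride + l]
a = np.lib.stride_tricks.as_strided(
    x, (Hout, Wout, ch, kh, kw),
    (x.strides[1] * stride, x.strides[2] * stride,
     x.strides[0], x.strides[1], x.strides[2]))
out = np.einsum('ijckl,ackl->aij', a, filters)

# Naive reference: slide each filter over the input explicitly.
ref = np.zeros((filters.shape[0], Hout, Wout))
for f in range(filters.shape[0]):
    for i in range(Hout):
        for j in range(Wout):
            patch = x[:, i*stride:i*stride+kh, j*stride:j*stride+kw]
            ref[f, i, j] = np.sum(patch * filters[f])

print(np.allclose(out, ref))  # True
```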
I tried this to compute dF, but it doesn't work:
F = np.lib.stride_tricks.as_strided(x, (n_filt, size_filt, size_filt, dim_filt, size_filt, size_filt),
(x.strides[0], x.strides[1] * self.stride, x.strides[2] * self.stride) + (
x.strides[0], x.strides[1], x.strides[2]))
F = np.einsum('aijckl,anm->acij', F, dA_prev)
This version works, but it is slow:
dA = np.zeros(shape=x.shape)             # shape: [input channels, input height, input width]
dF = np.zeros(shape=self.filters.shape)  # shape: [n_filters, channels, height, width]
dB = np.zeros(shape=self.bias.shape)     # shape: [n_filters, 1]
size_filt = self.filters.shape[2]
for filt in range(n_filt):
    y_filt = y_out = 0
    while y_filt + size_filt <= size_img:
        x_filt = x_out = 0
        while x_filt + size_filt <= size_img:
            dF[filt] += dA_prev[filt, y_out, x_out] * x[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt]
            dA[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt] += (
                dA_prev[filt, y_out, x_out] * self.filters[filt])
            x_filt += self.stride
            x_out += 1
        y_filt += self.stride
        y_out += 1
    dB[filt] += np.sum(dA_prev[filt])
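Since the loop above serves as the reference implementation, it is worth confirming its gradients numerically. A minimal sketch, assuming stride 1, small hypothetical shapes, and plain variables instead of the `self.` attributes, checks one entry of dF against a central finite difference (the forward pass is linear in the filters, so the match is essentially exact):

```python
import numpy as np

# Hypothetical shapes: 2 channels, 5x5 input, 3 filters of 3x3, stride 1.
rng = np.random.default_rng(1)
x = rng.standard_normal((2, 5, 5))
filters = rng.standard_normal((3, 2, 3, 3))
n_filt, _, size_filt, _ = filters.shape
Hout = Wout = x.shape[1] - size_filt + 1
dA_prev = rng.standard_normal((n_filt, Hout, Wout))  # upstream gradient

def forward(x, filters):
    out = np.zeros((n_filt, Hout, Wout))
    for f in range(n_filt):
        for i in range(Hout):
            for j in range(Wout):
                out[f, i, j] = np.sum(
                    x[:, i:i+size_filt, j:j+size_filt] * filters[f])
    return out

# Loop-based gradients, as in the question (stride 1).
dF = np.zeros(filters.shape)
dA = np.zeros(x.shape)
for f in range(n_filt):
    for i in range(Hout):
        for j in range(Wout):
            dF[f] += dA_prev[f, i, j] * x[:, i:i+size_filt, j:j+size_filt]
            dA[:, i:i+size_filt, j:j+size_filt] += dA_prev[f, i, j] * filters[f]

# Finite-difference check of one filter entry, using L = sum(out * dA_prev).
eps = 1e-6
fp = filters.copy(); fp[0, 0, 0, 0] += eps
fm = filters.copy(); fm[0, 0, 0, 0] -= eps
num = np.sum((forward(x, fp) - forward(x, fm)) * dA_prev) / (2 * eps)
print(np.isclose(num, dF[0, 0, 0, 0]))  # True
```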
I managed to find a solution. The tensordot that computes dA still takes too much time, but at least it works:
as_strided = np.lib.stride_tricks.as_strided
F = as_strided(x,
shape=(ch_img, h_filt, w_filt, dA_h, dA_w),
strides=(x.strides[0], x.strides[1] * self.stride,
x.strides[2] * self.stride,
x.strides[1], x.strides[2])
)
F = np.tensordot(F, dA_prev, axes=[(-2, -1), (1, 2)])
dF = F.transpose((3, 0, 1, 2))
pad_h = dA_h - 1
pad_w = dA_w - 1
pad_filt = np.pad(self.filters, ((0, 0), (0, 0), (pad_h, pad_h), (pad_w, pad_w)), 'constant')
sub_windows = as_strided(pad_filt,
shape=(n_filt, h, w, dA_h, dA_w, ch_filt),
strides=(pad_filt.strides[0], pad_filt.strides[2] * self.stride,
pad_filt.strides[3] * self.stride, pad_filt.strides[2],
pad_filt.strides[3], pad_filt.strides[1])
)
dA = np.tensordot(sub_windows, dA_prev[:, ::-1, ::-1], axes=[(0, 3, 4), (0, 1, 2)])
dA = dA.transpose((2, 0, 1))
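To see that the strided dF/dA above agree with the slow loop, here is a self-contained comparison for stride 1 (hypothetical shapes; the `self.` attributes replaced by local variables):

```python
import numpy as np

as_strided = np.lib.stride_tricks.as_strided
rng = np.random.default_rng(2)

# Hypothetical shapes, stride 1: 2 channels, 6x6 input, 3 filters of 3x3.
x = rng.standard_normal((2, 6, 6))
filters = rng.standard_normal((3, 2, 3, 3))
stride = 1
ch_img, h, w = x.shape
n_filt, ch_filt, h_filt, w_filt = filters.shape
dA_h = (h - h_filt) // stride + 1
dA_w = (w - w_filt) // stride + 1
dA_prev = rng.standard_normal((n_filt, dA_h, dA_w))  # upstream gradient

# dF as in the answer: contract input windows with the upstream gradient.
F = as_strided(x,
               shape=(ch_img, h_filt, w_filt, dA_h, dA_w),
               strides=(x.strides[0], x.strides[1] * stride,
                        x.strides[2] * stride,
                        x.strides[1], x.strides[2]))
dF = np.tensordot(F, dA_prev, axes=[(-2, -1), (1, 2)]).transpose((3, 0, 1, 2))

# dA as in the answer: windows over zero-padded filters, correlated
# with the spatially flipped upstream gradient (full convolution).
pad_h, pad_w = dA_h - 1, dA_w - 1
pad_filt = np.pad(filters, ((0, 0), (0, 0), (pad_h, pad_h), (pad_w, pad_w)))
sub_windows = as_strided(pad_filt,
                         shape=(n_filt, h, w, dA_h, dA_w, ch_filt),
                         strides=(pad_filt.strides[0],
                                  pad_filt.strides[2] * stride,
                                  pad_filt.strides[3] * stride,
                                  pad_filt.strides[2],
                                  pad_filt.strides[3],
                                  pad_filt.strides[1]))
dA = np.tensordot(sub_windows, dA_prev[:, ::-1, ::-1],
                  axes=[(0, 3, 4), (0, 1, 2)]).transpose((2, 0, 1))

# Naive loop reference for both gradients.
dF_ref = np.zeros(filters.shape)
dA_ref = np.zeros(x.shape)
for f in range(n_filt):
    for i in range(dA_h):
        for j in range(dA_w):
            dF_ref[f] += dA_prev[f, i, j] * x[:, i:i+h_filt, j:j+w_filt]
            dA_ref[:, i:i+h_filt, j:j+w_filt] += dA_prev[f, i, j] * filters[f]

print(np.allclose(dF, dF_ref), np.allclose(dA, dA_ref))  # True True
```

Note that the comparison is done with stride 1; for larger strides the placement of `* stride` in the strided views would need a separate check.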