Python: how to reshape a vector into TensorFlow's filters?
I want to transfer some weights trained by another network to TensorFlow. The weights are stored in a single vector, like this:
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
Using numpy, I can reshape it into two 3-by-3 filters, like this:
1 2 3 9 10 11
3 4 5 12 13 14
6 7 8 15 16 17
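So the shape of my filters is (1,2,3,3). In TensorFlow, however, the filter shape is (3,3,2,1).
After reshaping tf_weight to the expected shape, the weights get scrambled and I can't get the expected convolution result.
Specifically, I wrote a convolution function for images and filters shaped [number, channel, size, size]. It gives the correct answer, but it is too slow: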
import numpy as np

def convol(images, weights, biases, stride):
    """
    Args:
        images: input images or features, 4-D tensor shaped [n, c, h, w]
        weights: weights, 4-D tensor shaped [n, c, size, size]
        biases: biases, 1-D tensor
        stride: stride, an integer
    Returns:
        conv_features: convolved feature maps
    """
    image_num = images.shape[0]    # number of input images or feature maps
    channel = images.shape[1]      # channels of an image
    weight_num = weights.shape[0]  # number of filters
    ksize = weights.shape[2]
    h = images.shape[2]
    w = images.shape[3]
    # integer arithmetic, because array shapes must be ints
    out_h = (h + (ksize // 2) * 2 - ksize) // stride + 1
    out_w = out_h  # assumes square inputs
    conv_features = np.zeros([image_num, weight_num, out_h, out_w])
    for i in range(image_num):
        image = images[i]
        for j in range(weight_num):
            sum_convol_feature = np.zeros([out_h, out_w])
            for c in range(channel):
                # extract a single-channel image
                channel_image = image[c]
                # pad the image (im_pad is a helper that is not shown here)
                padded_image = im_pad(channel_image, ksize // 2)
                # unroll the image patches into rows (im2col is a helper that is not shown here)
                im_col = im2col(padded_image, ksize, stride)
                weight = weights[j, c]
                weight_col = np.reshape(weight, [-1])
                mul = np.dot(im_col, weight_col)
                convol_feature = np.reshape(mul, [out_h, out_w])
                sum_convol_feature = sum_convol_feature + convol_feature
            conv_features[i, j] = sum_convol_feature + biases[j]
    return conv_features
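A possible way to speed up the NumPy version is to replace the three Python loops with one windowed view and one einsum. The sketch below is not the asker's code: the name convol_vectorized is made up for illustration, and it assumes an odd kernel size, symmetric zero padding of ksize // 2 (which is what the function above appears to do), and NumPy 1.20 or newer for sliding_window_view.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def convol_vectorized(images, weights, biases, stride):
    """Cross-correlate [n, c, h, w] images with [m, c, k, k] filters.

    Assumes an odd kernel size k and symmetric zero padding of k // 2.
    """
    k = weights.shape[2]
    pad = k // 2
    padded = np.pad(images, ((0, 0), (0, 0), (pad, pad), (pad, pad)))
    # All k x k windows: shape (n, c, h', w', k, k)
    windows = sliding_window_view(padded, (k, k), axis=(2, 3))
    # Keep only the windows that start on the stride grid
    windows = windows[:, :, ::stride, ::stride]
    # Sum over channels and kernel positions for every filter
    out = np.einsum("nchwij,ocij->nohw", windows, weights)
    return out + biases[None, :, None, None]

images = np.ones((1, 1, 4, 4))
weights = np.ones((1, 1, 3, 3))
biases = np.zeros(1)
result = convol_vectorized(images, weights, biases, stride=2)
print(result.shape)  # (1, 1, 2, 2)
print(result[0, 0])
```

One caveat when comparing against TensorFlow: as far as I know, padding='SAME' may pad asymmetrically (for a 224-wide input, a 7-wide kernel and stride 2 it pads 2 on one side and 3 on the other), so border values from a symmetric-padding implementation can differ from conv2d even when the weight layout is right.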
By contrast, here is what I do with TensorFlow's conv2d:
img = np.zeros([1,3,224,224])
img = img - 1
img = np.rollaxis(img, 1, 4)
weight_array = googleNet.layers[1].weights
weight_array = np.reshape(weight_array,[64,3,7,7])
biases_array = googleNet.layers[1].biases
tf_weight = tf.Variable(weight_array)
tf_img = tf.Variable(img)
tf_img = tf.cast(tf_img,tf.float32)
tf_biases = tf.Variable(biases_array)
conv_feature = tf.nn.bias_add(tf.nn.conv2d(tf_img,tf_weight,strides=[1,2,2,1],padding='SAME'),tf_biases)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
feature = sess.run(conv_feature)
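The feature map I get is wrong.
Don't use np.reshape. Perhaps apply np.rollaxis twice instead, moving the first two axes to the end.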
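Note that the order of the two axes of size 3 does not change. If I label them, the two rollaxis operations change the shape as (1, 2, 3_1, 3_2) -> (1, 3_1, 3_2, 2) -> (3_1, 3_2, 2, 1). Your final array looks like this: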
>>> b
array([[[[ 1],
         [10]],

        [[ 2],
         [11]],

        [[ 3],
         [12]]],


       [[[ 4],
         [13]],

        [[ 5],
         [14]],

        [[ 6],
         [15]]],


       [[[ 7],
         [16]],

        [[ 8],
         [17]],

        [[ 9],
         [18]]]])
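The snippet that produced b does not seem to have survived on this page. The following is a reconstruction from the shape chain described above, assuming the vector holds 1 through 18 and that a is the (1, 2, 3, 3) array; the intermediate name step is only illustrative.

```python
import numpy as np

a = np.arange(1, 19).reshape(1, 2, 3, 3)  # (out, in, h, w)

# First roll: move axis 1 (channels) to the end -> (1, 3, 3, 2)
step = np.rollaxis(a, 1, 4)
# Second roll: move axis 0 (filter count) to the end -> (3, 3, 2, 1)
b = np.rollaxis(step, 0, 4)

print(step.shape, b.shape)   # (1, 3, 3, 2) (3, 3, 2, 1)
print(b[0, 0, :, 0])         # [ 1 10]

# The same permutation in a single call
assert np.array_equal(b, np.transpose(a, (2, 3, 1, 0)))
# A plain reshape gives the right shape but a different element layout
assert not np.array_equal(b, a.reshape(3, 3, 2, 1))
```

The last two checks show why reshape alone is not enough: it regroups the flat order 1, 2, 3, ... without moving any axis, whereas the axis permutation keeps each weight attached to its channel and kernel position.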
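Sample tensor operations
I don't know whether this will help. Consider the reshape, gather, dynamic_partition and split operations, and adapt them to your needs.
Below are illustrations of these operations that can be adapted to your situation. I copied them from my git repo. I believe that if you run these examples in IPython, you can work out what you really want and gain better insight.
Reshape, gather, dynamic_partition and split
Gather operation (tf.gather())
Generate an array and test the gather operations. Note this approach to fast prototyping:
- we generate an array in NumPy and test the TensorFlow operations on it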
array = np.array([[1,2,3],[4,9,6],[2,3,4],[7,8,0]])
array.shape
(4, 3)
In [27]:
gather_output0 = tf.gather(array,1)
gather_output01 = tf.gather(array,2)
gather_output02 = tf.gather(array,3)
gather_output11 = tf.gather(array,[1,2])
gather_output12 = tf.gather(array,[1,3])
gather_output13 = tf.gather(array,[3,2])
gather_output = tf.gather(array,[1,0,2])
gather_output1 = tf.gather(array,[1,1,2])
gather_output2 = tf.gather(array,[1,2,1])
In [28]:
with tf.Session() as sess:
    print (gather_output0.eval());print("\n")
    print (gather_output01.eval());print("\n")
    print (gather_output02.eval());print("\n")
    print (gather_output11.eval());print("\n")
    print (gather_output12.eval());print("\n")
    print (gather_output13.eval());print("\n")
    print (gather_output.eval());print("\n")
    print (gather_output1.eval());print("\n")
    print (gather_output2.eval());print("\n")
    #print (gather_output2.eval());print("\n")
[4 9 6]

[2 3 4]

[7 8 0]

[[4 9 6]
 [2 3 4]]

[[4 9 6]
 [7 8 0]]

[[7 8 0]
 [2 3 4]]

[[4 9 6]
 [1 2 3]
 [2 3 4]]

[[4 9 6]
 [4 9 6]
 [2 3 4]]

[[4 9 6]
 [2 3 4]
 [4 9 6]]
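For a quick check without a session, the same row selections can be reproduced in NumPy alone, since tf.gather with its default axis picks entries along axis 0. This is only a sketch for comparison, using the same array as above; the name picked is illustrative.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 9, 6], [2, 3, 4], [7, 8, 0]])

# A scalar index picks one row; a list of indices picks rows in that order.
print(array[1])            # [4 9 6]
print(array[[1, 0, 2]])    # rows 1, 0, 2

# np.take with axis=0 does the same and allows repeated indices
picked = np.take(array, [1, 2, 1], axis=0)
print(picked)
```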
* We define a (5,30) array in NumPy.
* We split the array along axis 1.
* We specify the split sizes as a 1-D tensor with 3 entries, so we get 3 splits along axis 1.
Specify an array
Create a (5 by 30) NumPy array. The NumPy syntax is shown below.
In [2]:
ArrayBeforeSplitting = np.arange(150).reshape(5,30)
print ("Array shape without split operation is : " ,ArrayBeforeSplitting.shape)
('Array shape without split operation is : ', (5, 30))
Specify the number of splits
In [3]:
split_1D = tf.Variable([8,13,9])
print("specify number of partions using 1-Dimen Variable:" , tf.shape(split_1D))
('specify number of partions using 1-Dimen Variable:', <tf.Tensor 'Shape:0' shape=(1,) dtype=int32>)
Use tf.split
Make 3 splits along the y axis (axis 1) so that we get pieces of shape (5,8), (5,13) and (5,9). Axis 1 has 30 elements, so the split sizes along that axis must add up to 30; otherwise tf.split gives an error.
In [6]:
split1,split2,split3 = tf.split(ArrayBeforeSplitting,split_1D,1)
# We get 3 splits along axis 1, with sizes given by split_1D: axis 1 (30 elements)
# is split into partitions of 8, 13 and 9 elements, while axis 0 stays unchanged.
In [7]:
# Initialise global variables: split_1D is a Variable and must be initialised
# before it is used in a computational graph.
init_op = tf.global_variables_initializer()
In [16]:
with tf.Session() as sess:
    sess.run(init_op) # run variable initialisation.
    result=split1.eval();print("\n")
    print(result)
    print("the shape of the first split operation is : ",result.shape)
    result2=split2.eval();print("\n")
    print(result2)
    print("the shape of the second split operation is : ",result2.shape)
    result3=split3.eval();print("\n")
    print(result3)
    print("the shape of the third split operation is : ",result3.shape)
[[  0   1   2   3   4   5   6   7]
 [ 30  31  32  33  34  35  36  37]
 [ 60  61  62  63  64  65  66  67]
 [ 90  91  92  93  94  95  96  97]
 [120 121 122 123 124 125 126 127]]
('the shape of the first split operation is : ', (5, 8))
[[  8   9  10  11  12  13  14  15  16  17  18  19  20]
 [ 38  39  40  41  42  43  44  45  46  47  48  49  50]
 [ 68  69  70  71  72  73  74  75  76  77  78  79  80]
 [ 98  99 100 101 102 103 104 105 106 107 108 109 110]
 [128 129 130 131 132 133 134 135 136 137 138 139 140]]
('the shape of the second split operation is : ', (5, 13))
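The same split can be sketched in plain NumPy for comparison. One difference to keep in mind is that np.split expects cut positions rather than piece sizes, so the sizes are converted with a cumulative sum. The names arr, sizes and cuts are illustrative and not part of the notebook above.

```python
import numpy as np

arr = np.arange(150).reshape(5, 30)
sizes = [8, 13, 9]
assert sum(sizes) == arr.shape[1]  # sizes must cover axis 1 exactly

# np.split takes cut positions, not sizes: [8, 21]
cuts = np.cumsum(sizes)[:-1]
split1, split2, split3 = np.split(arr, cuts, axis=1)

print(split1.shape, split2.shape, split3.shape)  # (5, 8) (5, 13) (5, 9)
print(split3[0])  # first row of the third piece
```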
Look at this simple example too:
- initialise a simple array
- test the gather operation
In [11]:
array_simple = np.array([1,2,3])
In [15]:
print("shape of simple array is: ", array_simple.shape)
shape of simple array is:  (3,)
In [57]:
gather1 = tf.gather(array_simple,[0])
gather01 = tf.gather(array_simple,[1])
gather02 = tf.gather(array_simple,[2])
gather2 = tf.gather(array_simple,[1,2])
gather3 = tf.gather(array_simple,[0,1])
with tf.Session() as sess:
    print (gather1.eval());print("\n")
    print (gather01.eval());print("\n")
    print (gather02.eval());print("\n")
    print (gather2.eval());print("\n")
    print (gather3.eval());print("\n")
[1]

[2]

[3]

[2 3]

[1 2]
tf.reshape( )
Note:
* Use the same array that was created above
* Do the reshape using tf.reshape( )
In [64]:
array.shape # Confirm array shape
Out[64]:
(4, 3)
In [74]:
print ("This is the array\n" ,array) # see the output and compare with the initial array
This is the array
 [[1 2 3]
 [4 9 6]
 [2 3 4]
 [7 8 0]]
In [84]:
reshape_ops= tf.reshape(array,[-1,4]) # Note the parameters in reshape
reshape_ops1= tf.reshape(array,[-1,3]) # Note the parameters in reshape
reshape_ops2= tf.reshape(array,[-1,6]) # Note the parameters in reshape
reshape_ops_back1= tf.reshape(array,[6,-1]) # Note the parameters in reshape
reshape_ops_back2= tf.reshape(array,[3,-1]) # Note the parameters in reshape
reshape_ops_back3= tf.reshape(array,[4,-1]) # Note the parameters in reshape
In [86]:
with tf.Session() as sess:
    print(reshape_ops.eval());print("\n")
    print(reshape_ops1.eval());print("\n")
    print(reshape_ops2.eval());print("\n")
    print ("Output when we reverse the parameters:");print("\n")
    print(reshape_ops_back1.eval());print("\n")
    print(reshape_ops_back2.eval());print("\n")
    print(reshape_ops_back3.eval());print("\n")
[[1 2 3 4]
 [9 6 2 3]
 [4 7 8 0]]

[[1 2 3]
 [4 9 6]
 [2 3 4]
 [7 8 0]]

[[1 2 3 4 9 6]
 [2 3 4 7 8 0]]

Output when we reverse the parameters:

[[1 2]
 [3 4]
 [9 6]
 [2 3]
 [4 7]
 [8 0]]

[[1 2 3 4]
 [9 6 2 3]
 [4 7 8 0]]

[[1 2 3]
 [4 9 6]
 [2 3 4]
 [7 8 0]]
Note: the input size and the output size must be the same, otherwise reshape gives an error. A simple way to check this is to multiply the reshape parameters together and make sure the number of input elements can be divided by the result.
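The size check from the note above can be sketched as follows, here in NumPy, whose reshape follows the same -1 convention. The helper can_reshape is hypothetical and only illustrates the divisibility test.

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 9, 6], [2, 3, 4], [7, 8, 0]])  # 12 elements

def can_reshape(n_elements, known_dim):
    """True if n_elements can be reshaped to [-1, known_dim]."""
    return n_elements % known_dim == 0

print(can_reshape(array.size, 4))   # True  -> shape (3, 4)
print(can_reshape(array.size, 5))   # False -> reshape would fail

print(np.reshape(array, [-1, 4]).shape)   # (3, 4)

# 12 elements cannot be arranged in rows of 5, so this raises ValueError
try:
    np.reshape(array, [-1, 5])
    failed = False
except ValueError:
    failed = True
print(failed)   # True
```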
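Hope this helps.
Thanks for your answer! I tried this code, but the reshaped array does not work as expected. I think TensorFlow's dimension order is rather strange. In NumPy we shape images as 4-D tensors like [number, channel, height, width], but in TensorFlow the layout is [number, height, width, channel]. So after reshaping, the weights or images become a mess, and I can't get the expected convolution features.
Then, based on your comment, you may only need b = np.rollaxis(a, 1, 4). Does that not work either? Unfortunately, I'm not familiar with TensorFlow, but this is definitely a way to approach the problem.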
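To make the layout difference from the comments concrete, here is a minimal sketch with the shapes from the question. It assumes the flat weight vector is stored in [out, in, height, width] order, which is what the reshape to [64,3,7,7] in the question implies; the arrays are random stand-ins and the names img_nchw, weights_oihw and so on are illustrative.

```python
import numpy as np

# Stand-ins with the shapes used in the question (random values, not real weights)
rng = np.random.default_rng(0)
img_nchw = rng.normal(size=(1, 3, 224, 224))    # [number, channel, height, width]
weights_oihw = rng.normal(size=(64, 3, 7, 7))   # [out, in, height, width]

# Images: move the channel axis to the end -> [number, height, width, channel]
img_nhwc = np.rollaxis(img_nchw, 1, 4)

# Filters: [out, in, h, w] -> [h, w, in, out]
weights_hwio = np.transpose(weights_oihw, (2, 3, 1, 0))

print(img_nhwc.shape)       # (1, 224, 224, 3)
print(weights_hwio.shape)   # (7, 7, 3, 64)

# Every weight keeps its meaning: filter 5, channel 2, kernel position (3, 4)
assert weights_hwio[3, 4, 2, 5] == weights_oihw[5, 2, 3, 4]
```

A single rollaxis(a, 1, 4) is enough for the images, while the filters need both leading axes moved to the end, which is why the two cases are handled separately here.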