Understanding tensors in Python
After having learned how to use einsum, I am now trying to understand how np.tensordot works. However, I feel a bit lost, especially regarding the various possibilities for the axes parameter.

To understand it, since I have never practiced tensor calculus, I use the following example:

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))
In this case, what are the different possible np.tensordot products, and how would you compute them manually?

tensordot swaps axes and reshapes the inputs so it can apply np.dot to two 2d arrays. It then swaps and reshapes back to the target. It may be easier to experiment than to explain. There's no special tensor math going on, just an extension of dot to work in higher dimensions. tensor just means arrays with more than 2 dimensions. If you are already comfortable with einsum, the simplest approach will be to compare the results against that.
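To check the swap-and-reshape claim, here is a minimal sketch (the axis choice (0, 1) is my own example, not from the answer) reproducing np.tensordot with moveaxis, reshape, and a plain 2d np.dot:

```python
import numpy as np

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

# Sum over A's axis 0 and B's axis 1: move the contracted axis
# to the end of A and to the front of B, flatten the rest, then dot.
A2 = np.moveaxis(A, 0, -1).reshape(-1, 2)   # (3*5, 2)
B2 = np.moveaxis(B, 1, 0).reshape(2, -1)    # (2, 3*4)
out = A2.dot(B2).reshape(3, 5, 3, 4)

assert np.allclose(out, np.tensordot(A, B, [0, 1]))
```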
Sample test, summing on 1 pair of axes:
In [823]: np.tensordot(A,B,[0,1]).shape
Out[823]: (3, 5, 3, 4)
In [824]: np.einsum('ijk,lim',A,B).shape
Out[824]: (3, 5, 3, 4)
In [825]: np.allclose(np.einsum('ijk,lim',A,B),np.tensordot(A,B,[0,1]))
Out[825]: True
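To see what that one-axis contraction computes entry by entry, a sketch with explicit loops (my own illustration of the same contraction):

```python
import numpy as np

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

# out[j, k, l, m] = sum_i A[i, j, k] * B[l, i, m]
# (A's axis 0 is contracted against B's axis 1)
out = np.zeros((3, 5, 3, 4))
for j in range(3):
    for k in range(5):
        for l in range(3):
            for m in range(4):
                out[j, k, l, m] = sum(A[i, j, k] * B[l, i, m] for i in range(2))

assert np.allclose(out, np.tensordot(A, B, [0, 1]))
```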
Another, summing on two:
In [826]: np.tensordot(A,B,[(0,1),(1,0)]).shape
Out[826]: (5, 4)
In [827]: np.einsum('ijk,jim',A,B).shape
Out[827]: (5, 4)
In [828]: np.allclose(np.einsum('ijk,jim',A,B),np.tensordot(A,B,[(0,1),(1,0)]))
Out[828]: True
We could do the same with the (1, 0) pair. Given the mix of dimensions, I don't think there is any other combination.

The idea with tensordot is pretty simple: we input the arrays and the respective axes along which the sum-reductions are intended. The axes that take part in sum-reduction are removed in the output, and all of the remaining axes from the input arrays are spread out as different axes in the output, keeping the order in which the input arrays are fed.

Let's look at a few sample cases with one and two axes of sum-reduction, and also swap the input places to see how the order is kept in the output.
I. One axis of sum-reduction

Inputs:
In [7]: A = np.random.randint(2, size=(2, 6, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
In [11]: A = np.random.randint(2, size=(2, 3, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
Case #1:

Case #2 (same as Case #1, but with the inputs swapped):
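The code for these cases did not survive the page extraction; a minimal sketch of what they likely looked like (the specific axis pair (0, 1) is my own choice), contracting A's axis 0 against B's axis 1:

```python
import numpy as np

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

# Case #1: contract A's axis 0 against B's axis 1.
# A's leftover axes (3, 5) come first in the output, then B's (3, 4).
case1 = np.tensordot(A, B, axes=((0,), (1,)))
print(case1.shape)  # (3, 5, 3, 4)

# Case #2: same contraction, inputs swapped.
# Now B's leftover axes (3, 4) lead the output.
case2 = np.tensordot(B, A, axes=((1,), (0,)))
print(case2.shape)  # (3, 4, 3, 5)
```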
II. Two axes of sum-reduction

Inputs:
In [7]: A = np.random.randint(2, size=(2, 6, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
In [11]: A = np.random.randint(2, size=(2, 3, 5))
...: B = np.random.randint(2, size=(3, 2, 4))
...:
Case #1:

Case #2:
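As above, the code for these cases is missing; a sketch of the two-axis version (axis pairs are my own choice, matching the earlier einsum comparison):

```python
import numpy as np

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

# Case #1: contract A's axes (0, 1) against B's axes (1, 0);
# only A's axis 2 (length 5) and B's axis 2 (length 4) survive.
case1 = np.tensordot(A, B, axes=((0, 1), (1, 0)))
print(case1.shape)  # (5, 4)

# Case #2: inputs swapped, so B's leftover axis leads the output.
case2 = np.tensordot(B, A, axes=((1, 0), (0, 1)))
print(case2.shape)  # (4, 5)
```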
We can extend this to as many axes as possible.

The answers above are great and helped me a lot in understanding tensordot. But they don't show the actual math behind the operations. That's why I did the equivalent operations in TF2 for myself and decided to share them here:
a = tf.constant([1,2.])
b = tf.constant([2,3.])
print(f"{tf.tensordot(a, b, 0)}\t tf.einsum('i,j', a, b)\t\t- ((the last 0 axes of a), (the first 0 axes of b))")
print(f"{tf.tensordot(a, b, ((),()))}\t tf.einsum('i,j', a, b)\t\t- ((() axis of a), (() axis of b))")
print(f"{tf.tensordot(b, a, 0)}\t tf.einsum('i,j->ji', a, b)\t- ((the last 0 axes of b), (the first 0 axes of a))")
print(f"{tf.tensordot(a, b, 1)}\t\t tf.einsum('i,i', a, b)\t\t- ((the last 1 axes of a), (the first 1 axes of b))")
print(f"{tf.tensordot(a, b, ((0,), (0,)))}\t\t tf.einsum('i,i', a, b)\t\t- ((0th axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (0,0))}\t\t tf.einsum('i,i', a, b)\t\t- ((0th axis of a), (0th axis of b))")
[[2. 3.]
[4. 6.]] tf.einsum('i,j', a, b) - ((the last 0 axes of a), (the first 0 axes of b))
[[2. 3.]
[4. 6.]] tf.einsum('i,j', a, b) - ((() axis of a), (() axis of b))
[[2. 4.]
[3. 6.]] tf.einsum('i,j->ji', a, b) - ((the last 0 axes of b), (the first 0 axes of a))
8.0 tf.einsum('i,i', a, b) - ((the last 1 axes of a), (the first 1 axes of b))
8.0 tf.einsum('i,i', a, b) - ((0th axis of a), (0th axis of b))
8.0 tf.einsum('i,i', a, b) - ((0th axis of a), (0th axis of b))
For (2, 2) shapes:
a = tf.constant([[1,2],
[-2,3.]])
b = tf.constant([[-2,3],
[0,4.]])
print(f"{tf.tensordot(a, b, 0)}\t tf.einsum('ij,kl', a, b)\t- ((the last 0 axes of a), (the first 0 axes of b))")
print(f"{tf.tensordot(a, b, (0,0))}\t tf.einsum('ij,ik', a, b)\t- ((0th axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (0,1))}\t tf.einsum('ij,ki', a, b)\t- ((0th axis of a), (1st axis of b))")
print(f"{tf.tensordot(a, b, 1)}\t tf.matmul(a, b)\t\t- ((the last 1 axes of a), (the first 1 axes of b))")
print(f"{tf.tensordot(a, b, ((1,), (0,)))}\t tf.einsum('ij,jk', a, b)\t- ((1st axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (1, 0))}\t tf.matmul(a, b)\t\t- ((1st axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, 2)}\t tf.reduce_sum(tf.multiply(a, b))\t- ((the last 2 axes of a), (the first 2 axes of b))")
print(f"{tf.tensordot(a, b, ((0,1), (0,1)))}\t tf.einsum('ij,ij->', a, b)\t\t- ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))")
[[[[-2. 3.]
[ 0. 4.]]
[[-4. 6.]
[ 0. 8.]]]
[[[ 4. -6.]
[-0. -8.]]
[[-6. 9.]
[ 0. 12.]]]] tf.einsum('ij,kl', a, b) - ((the last 0 axes of a), (the first 0 axes of b))
[[-2. -5.]
[-4. 18.]] tf.einsum('ij,ik', a, b) - ((0th axis of a), (0th axis of b))
[[-8. -8.]
[ 5. 12.]] tf.einsum('ij,ki', a, b) - ((0th axis of a), (1st axis of b))
[[-2. 11.]
[ 4. 6.]] tf.matmul(a, b) - ((the last 1 axes of a), (the first 1 axes of b))
[[-2. 11.]
[ 4. 6.]] tf.einsum('ij,jk', a, b) - ((1st axis of a), (0th axis of b))
[[-2. 11.]
[ 4. 6.]] tf.matmul(a, b) - ((1st axis of a), (0th axis of b))
16.0 tf.reduce_sum(tf.multiply(a, b)) - ((the last 2 axes of a), (the first 2 axes of b))
16.0 tf.einsum('ij,ij->', a, b) - ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))
What exactly does "sum-reduction" mean?

@floflo29 You might know that matrix multiplication involves elementwise multiplication keeping an axis aligned, and then summation of the elements along that common aligned axis. With that summation we lose the common axis, which is termed reduction, hence the short name sum-reduction.

@BryanHead The only way to reorder the output axes using np.tensordot is to swap the inputs. If that doesn't get you your desired output, transpose would be the way to go.

It would have been better if @Divakar had added an example starting from a 1-D tensor, along with how each entry is computed. E.g.:

t1 = K.variable([[1, 2], [2, 3]])
t2 = K.variable([2, 3])
print(K.eval(tf.tensordot(t1, t2, axes=0)))

Output:

[[[2. 3.]
  [4. 6.]]
 [[4. 6.]
  [6. 9.]]]

Not sure how the output shape is 2x2x2.
@dereks The sum-reduction term used in this post is an umbrella term for elementwise multiplication followed by sum-reduction. In the context of dot/tensordot, I assumed it was safe to put it that way. Apologies if that was confusing. Now, in matrix multiplication there is one axis of sum-reduction (the second axis of the first array against the first axis of the second array), whereas in tensordot there can be more than one axis of sum-reduction. The examples presented show how the axes are aligned in the input arrays and how the output axes are obtained from those.

I still don't fully understand :(. In the first example they are elementwise-multiplying two arrays with shape (4, 3) and then doing a sum over those two axes. How could you get that same result using a dot product?

The way I could replicate the first result was by using np.dot on flattened 2d arrays:

for aa in a.T:
    for bb in b.T:
        print(aa.ravel().dot(bb.T.ravel()))
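A minimal sketch of the "elementwise multiply, then sum away the common axis" view of matrix multiplication discussed above (shapes are my own illustration):

```python
import numpy as np

a = np.random.rand(2, 3)
b = np.random.rand(3, 4)

# Broadcast to (2, 3, 4): every a[i, j] multiplied with every b[j, k],
# keeping the shared axis (length 3) aligned.
products = a[:, :, None] * b[None, :, :]

# Summing over the shared axis removes ("reduces") it -> shape (2, 4).
manual = products.sum(axis=1)

assert np.allclose(manual, a.dot(b))
```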
The einsum equivalent of tensordot with axes=([1, 0], [0, 1]) is np.einsum('ijk,jil->kl', a, b). This dot also does it: a.T.reshape(5, 12).dot(b.reshape(12, 2)). The dot is between a (5, 12) and a (12, 2). The a.T puts the 5 first, and swaps the (3, 4) to match b.
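A runnable version of that comment, assuming a has shape (3, 4, 5) and b has shape (4, 3, 2), which is consistent with the reshapes quoted above:

```python
import numpy as np

# Shapes inferred from the comment: a is (3, 4, 5), b is (4, 3, 2).
a = np.random.rand(3, 4, 5)
b = np.random.rand(4, 3, 2)

td = np.tensordot(a, b, axes=([1, 0], [0, 1]))   # shape (5, 2)
es = np.einsum('ijk,jil->kl', a, b)

# a.T has shape (5, 4, 3); flattening its trailing (4, 3) axes lines them
# up with b's leading (4, 3) axes, so a plain 2d dot does the contraction.
dt = a.T.reshape(5, 12).dot(b.reshape(12, 2))

assert np.allclose(td, es) and np.allclose(td, dt)
```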