TensorFlow: merging duplicate indices in a sparse tensor


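Suppose I have a sparse tensor with duplicate indices, and wherever an index repeats I want to merge the values by summing them. What is the best way to do this?

For example: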

indices = [[1, 1], [1, 2], [1, 2], [1, 3]]
values = [1, 2, 3, 4]

sparse_tensor = tf.SparseTensor(indices, values, dense_shape=[10, 10])

# tf.MAGIC is a placeholder for the op I am looking for
result = tf.MAGIC(sparse_tensor)

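The result should be a sparse tensor (or a dense one!) in which the duplicates are summed, i.e. indices [[1, 1], [1, 2], [1, 3]] with values [1, 5, 4].

The only thing I could think of is to string the index components together into an index hash, append that hash as a third dimension, and then reduce-sum over the third dimension: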

indices = [[1, 1, 11], [1, 2, 12], [1, 2, 12], [1, 3, 13]]
# sparse_tensor here stands for a SparseTensor built from the 3-D indices above
sparse_result = tf.sparse.reduce_sum(sparse_tensor, axis=2, keepdims=True)

But this feels very ugly.

Here is a solution using tf.segment_sum. The idea is to linearize the indices into 1-D space, get the unique indices with tf.unique, run tf.segment_sum, and convert the indices back to N-D space.

import tensorflow as tf

indices = tf.constant([[1, 1], [1, 2], [1, 2], [1, 3]])
values = tf.constant([1, 2, 3, 4])

# Linearize the indices. If the dimensions of the original array are
# [N_{k}, N_{k-1}, ..., N_0], then simply matrix-multiply the indices
# by [..., N_1 * N_0, N_0, 1]^T. For example, if the sparse tensor
# has dimensions [10, 6, 4, 5], then multiply by [120, 20, 5, 1]^T.
# In your case, the dimensions are [10, 10], so multiply by [10, 1]^T.

linearized = tf.matmul(indices, [[10], [1]])

# Get the unique linear indices and, for every input row, the position
# of its index in that unique list
y, idx = tf.unique(tf.squeeze(linearized, axis=1))

# Use those positions as segment ids to sum the values of duplicate
# indices. tf.math.segment_sum expects non-decreasing segment ids, which
# holds here because the duplicate indices are adjacent.
values = tf.math.segment_sum(values, idx)

# Go back to N-D indices
y = tf.expand_dims(y, 1)
indices = tf.concat([y // 10, y % 10], axis=1)

# TensorFlow 2.x executes eagerly, so no session is needed
print(indices.numpy())
print(values.numpy())

So, following the solution above, here is another example. For shape [12, 5], the lines to change in the code are:

linearized = tf.matmul(indices, [[5], [1]])

indices = tf.concat([y//5, y%5], axis=1)
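The hardcoded multipliers in both examples ([10, 1] and [5, 1]) are the row-major strides of the dense shape. As a rough sketch in plain Python, they could be derived from the shape instead of written by hand; the helper names row_major_strides, linearize and unlinearize are made up for this illustration and are not TensorFlow functions.

```python
def row_major_strides(shape):
    """Return the multipliers that turn an N-D index into a linear one."""
    strides = [1] * len(shape)
    # Walk from the second-to-last axis back to the first one
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return strides


def linearize(index, shape):
    """Dot product of the index with the strides."""
    return sum(i * s for i, s in zip(index, row_major_strides(shape)))


def unlinearize(linear, shape):
    """Invert linearize with repeated floor division and remainder."""
    index = []
    for stride in row_major_strides(shape):
        index.append(linear // stride)
        linear %= stride
    return index


print(row_major_strides([10, 6, 4, 5]))  # [120, 20, 5, 1]
print(row_major_strides([12, 5]))        # [5, 1]
print(linearize([1, 2], [10, 10]))       # 12
print(unlinearize(12, [10, 10]))         # [1, 2]
```

For a 2-D shape this reduces to the y // N_0 and y % N_0 pair used in the answer.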

Using unsorted_segment_sum may be simpler:

import tensorflow as tf

def deduplicate(tensor):
    """Sum the rows of a tf.IndexedSlices that share the same index."""
    if not isinstance(tensor, tf.IndexedSlices):
        return tensor
    # new_index_positions maps every original slice to its unique index
    unique_indices, new_index_positions = tf.unique(tensor.indices)
    summed_values = tf.math.unsorted_segment_sum(
        tensor.values, new_index_positions, tf.shape(unique_indices)[0])
    return tf.IndexedSlices(indices=unique_indices, values=summed_values,
                            dense_shape=tensor.dense_shape)
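The function above handles tf.IndexedSlices, whose indices are 1-D. The same idea might be carried over to a tf.SparseTensor by linearizing the N-D indices first. The following is only a sketch, assuming TensorFlow 2.x with eager execution; merge_duplicates is a name chosen for this example, not a TensorFlow API.

```python
import tensorflow as tf


def merge_duplicates(sp):
    """Sum the values of a tf.SparseTensor that share the same index."""
    shape = sp.dense_shape  # int64 vector of length rank
    # Row-major strides, e.g. [10, 6, 4, 5] -> [120, 20, 5, 1]
    strides = tf.math.cumprod(shape, exclusive=True, reverse=True)
    linear = tf.reduce_sum(sp.indices * strides, axis=1)
    unique_linear, positions = tf.unique(linear)
    # Unsorted variant: duplicates do not have to be adjacent
    summed = tf.math.unsorted_segment_sum(
        sp.values, positions, num_segments=tf.size(unique_linear))
    # tf.unravel_index returns shape [rank, n]; transpose to [n, rank]
    new_indices = tf.transpose(tf.unravel_index(unique_linear, shape))
    # Reorder so the result has canonical row-major index order
    return tf.sparse.reorder(tf.SparseTensor(new_indices, summed, shape))


sp = tf.SparseTensor(indices=[[1, 1], [1, 2], [1, 2], [1, 3]],
                     values=[1, 2, 3, 4], dense_shape=[10, 10])
merged = merge_duplicates(sp)
print(merged.indices.numpy())
print(merged.values.numpy())
```

Because unsorted_segment_sum does not require sorted segment ids, this version should also cope with duplicates that are scattered through the index list.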

Maybe you could try:

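import tensorflow as tf

indices = [[1, 1], [1, 2], [1, 2], [1, 3]]
values = [1, 2, 3, 4]

sparse_tensor = tf.SparseTensor(indices, values, dense_shape=[10, 10])
# validate_indices=False skips the check that indices are sorted and unique.
# The documentation asks for indices without repeats, so how duplicates are
# combined is not guaranteed.
tf.sparse.to_dense(sparse_tensor, validate_indices=False)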
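If a dense result is acceptable, tf.scatter_nd is a different op that may be worth considering, since its documentation states that updates for duplicate indices are accumulated. A minimal sketch, assuming TensorFlow 2.x with eager execution:

```python
import tensorflow as tf

indices = tf.constant([[1, 1], [1, 2], [1, 2], [1, 3]])
values = tf.constant([1, 2, 3, 4])

# tf.scatter_nd starts from zeros and adds each value at its index,
# so the two entries at [1, 2] contribute 2 + 3 = 5
dense = tf.scatter_nd(indices, values, shape=[10, 10])

print(dense[1, :4].numpy())  # [0 1 5 4]
```

Note that this builds the full dense tensor, so it only suits shapes that fit comfortably in memory.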

This is much nicer than I had imagined. I used your technique to implement a confusion matrix in TF. Saved me a lot of time!!
