How do I implement a sliding window in TensorFlow?

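I have written a sliding window algorithm in numpy that slides over a wav audio file and feeds slices of it to a NN in TensorFlow, which detects features in the audio slices. Once TensorFlow has done its work, it returns its output to numpy land, where I reassemble the slices into an array of predictions that matches each sample position of the original file: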

import tensorflow as tf
import numpy as np
import nn

def slide_predict(layers, X, modelPath):
    output = None

    graph = tf.Graph()
    with graph.as_default():
        input_layer_size, hidden_layer_size, num_labels = layers

        X_placeholder = tf.placeholder(tf.float32, shape=(None, input_layer_size), name='X')
        Theta1 = tf.Variable(nn.randInitializeWeights(input_layer_size, hidden_layer_size), name='Theta1')
        bias1 = tf.Variable(nn.randInitializeWeights(hidden_layer_size, 1), name='bias1')
        Theta2 = tf.Variable(nn.randInitializeWeights(hidden_layer_size, num_labels), name='Theta2')
        bias2 = tf.Variable(nn.randInitializeWeights(num_labels, 1), name='bias2')
        hypothesis = nn.forward_prop(X_placeholder, Theta1, bias1, Theta2, bias2)

        sess = tf.Session(graph=graph)
        saver = tf.train.Saver()
        init = tf.global_variables_initializer()
        sess.run(init)

        saver.restore(sess, modelPath)

        window_size = layers[0]

        pad_amount = (window_size * 2) - (X.shape[0] % window_size)
        X = np.pad(X, (pad_amount, 0), 'constant')

        for w in range(window_size):
            start = w
            end = -window_size + w
            X_shifted = X[start:end]
            X_matrix = X_shifted.reshape((-1, window_size))

            prediction = sess.run(hypothesis, feed_dict={X_placeholder: X_matrix})

            output = prediction if (output is None) else np.hstack((output, prediction))

        sess.close()

    output.shape = (X.size, -1)

    return output

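Unfortunately, this algorithm is quite slow. I placed some logging along the way, and by far the slowest part is where I actually run the TensorFlow graph. This could be because the TensorFlow computation itself is slow (in which case I'm probably just SOL), but I wonder whether a large part of the slowness comes from transferring large audio files back and forth in and out of TensorFlow. So my questions are: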

1) Is feeding a placeholder repeatedly like this noticeably slower than feeding it once and computing the window slices inside TensorFlow?

2) If so, what is the best way to implement a sliding window algorithm in TensorFlow?

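The first problem is that your algorithm has quadratic time complexity in window_size, because every iteration calls np.hstack() to build the output array, and each call copies the current values of both output and prediction into a new array: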
for w in range(window_size):
    # ...
    output = prediction if (output is None) else np.hstack((output, prediction))

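Instead of calling np.hstack() in every iteration, it is more efficient to build a Python list of the prediction arrays and call np.hstack() on them once after the loop terminates: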
output_list = []
for w in range(window_size):
    # ...
    prediction = sess.run(...)
    output_list.append(prediction)
output = np.hstack(output_list)
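
To make the cost difference concrete, here is a minimal NumPy-only sketch that counts how many array elements each strategy copies. The helper names stack_incrementally and stack_once and the dummy chunk shapes are made up for illustration; they stand in for the prediction arrays returned by sess.run().

```python
import numpy as np

def stack_incrementally(chunks):
    """Call np.hstack once per chunk; return the result and elements copied."""
    output, copied = None, 0
    for chunk in chunks:
        if output is None:
            output = chunk
        else:
            # Each call allocates a new array holding the old output plus the chunk.
            output = np.hstack((output, chunk))
            copied += output.size
    return output, copied

def stack_once(chunks):
    """Collect everything first, then call np.hstack a single time."""
    output = np.hstack(chunks)
    return output, output.size

# 100 dummy "predictions" of shape (2, 3), standing in for real model output.
chunks = [np.full((2, 3), w, dtype=np.float32) for w in range(100)]
slow, slow_copied = stack_incrementally(chunks)
fast, fast_copied = stack_once(chunks)
print(slow_copied, fast_copied)
```

Both strategies produce the same array, but the incremental version copies roughly 50 times as many elements for 100 chunks, and the gap grows linearly with the number of chunks.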

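The second problem is that feeding large values into TensorFlow can be inefficient if the amount of computation in the sess.run() call is small, because those values are (currently) copied into C++ (and the results are copied back out). One useful strategy is to try moving the sliding window loop into the TensorFlow graph using the tf.map_fn() construct. For example, you could restructure your program as follows: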
# NOTE: If you call this function often, you may want to (i) move the `np.pad()`
# into the graph as `tf.pad()`, and (ii) replace `X_t` with a placeholder.
X = np.pad(X, (pad_amount, 0), 'constant')
X_t  = tf.convert_to_tensor(X)

def window_func(w):
    start = w
    end = w - window_size
    X_matrix = tf.reshape(X_t[start:end], (-1, window_size))
    return nn.forward_prop(X_matrix, Theta1, bias1, Theta2, bias2)

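# The elements are int32 while the outputs are floats, so the output dtype is
# stated explicitly (float32 is assumed here, matching the placeholder above).
output_t = tf.map_fn(window_func, tf.range(window_size), dtype=tf.float32)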
# ...
output = sess.run(output_t)
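
For intuition about what the offset loop computes, here is a minimal NumPy-only sketch (not TensorFlow, and not part of the original program). It assumes NumPy 1.20 or later for sliding_window_view, and the helper names windows_by_offset and windows_in_order are hypothetical. It checks that the per-offset reshapes, once interleaved, are exactly the stride-1 windows in sample order.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windows_by_offset(x, window_size):
    """Offset loop as in the code above: shape (window_size, n, window_size)."""
    stop = len(x) - window_size
    return np.stack([x[w:stop + w].reshape(-1, window_size)
                     for w in range(window_size)])

def windows_in_order(x, window_size):
    """All stride-1 windows in sample order, as a view over x."""
    return sliding_window_view(x, window_size)[:len(x) - window_size]

ws = 4
x = np.arange(12, dtype=np.float32)   # length is a multiple of ws, as after padding

by_offset = windows_by_offset(x, ws)  # by_offset[w, k] starts at sample w + k * ws
# Interleave the offsets so that row s is the window starting at sample s.
in_order = by_offset.transpose(1, 0, 2).reshape(-1, ws)
print(np.array_equal(in_order, windows_in_order(x, ws)))
```

The same interleaving applies to the predictions: the result for offset w and row k belongs to the window that starts at sample w + k * window_size.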

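Does tf.map_fn have a GPU implementation? If a CNN uses map_fn on the GPU, will it only work if a GPU kernel is registered for it?

Yes, tf.map_fn() will work with functions that use the GPU. (The tf.range() and the control flow will probably execute on the CPU, but it will dispatch the CNN to the GPU.)

Doh, thanks! My stacking approach was silly. Do you know whether slicing a tensor like that (X_t[start:end]) copies the values or just moves a view over them? If I'm copying things every time, that could be another cause of the slowness. It would be great if there were a way to just increment a pointer into the array!

The slicing operator can sometimes avoid copying... I think the main requirements are that the resulting slice is 32-byte aligned (because many operator implementations require this), and that it may need to be dense (which should be fine if you're slicing a vector).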
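
On the numpy side of the pipeline the view-versus-copy question can be checked directly. The sketch below is NumPy only and says nothing about how TensorFlow slices tensors internally; the array and variable names are made up for illustration.

```python
import numpy as np

x = np.arange(10, dtype=np.float32)

window = x[2:6]            # basic slicing returns a view onto x's buffer
copied = x[[2, 3, 4, 5]]   # integer-array ("fancy") indexing returns a copy

window_is_view = np.shares_memory(x, window)
copy_is_view = np.shares_memory(x, copied)
print(window_is_view, copy_is_view)
```

np.shares_memory reports whether two arrays overlap in memory, so it is a quick way to confirm that a given slicing step is not silently copying the audio buffer.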