Format of input text data for a CNN in Flux, in Julia


deep-learning nlp julia


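I am implementing Yoon Kim's CNN for text classification in Julia, using Flux as the deep learning framework and single sentences as input data points. The model zoo has been useful to some extent, but it does not include an NLP example that uses a CNN. I would like to check whether my input data format is correct.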

Flux has no explicit 1D Conv implementation, so I am using its regular Conv. Here is the part of the docstring that explains the input data format:

Data should be stored in WHCN order (width, height, # channels, # batches).
In other words, a 100×100 RGB image would be a `100×100×3×1` array,
and a batch of 50 would be a `100×100×3×50` array.

My format is as follows:

1. width: since text in a sentence is 1D, the width is always 1 
2. height: this is the maximum number of tokens allowable in a sentence
3. \# of channels: this is the embedding size
4. \# of batches: the number of sentences in each batch
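
To make the four dimensions above concrete, here is a minimal sketch that allocates an array in that order. The sizes are hypothetical values picked only for illustration, and only base Julia is used, so nothing here depends on Flux itself.

```julia
# Hypothetical sizes, chosen only for illustration.
MAX_LEN = 50        # height: maximum number of tokens in a sentence
emb_dims = 300      # channels: embedding size
num_sentences = 32  # batch: number of sentences per batch

# WHCN order as listed above: width, height, channels, batch.
X_batch = zeros(1, MAX_LEN, emb_dims, num_sentences)

println(size(X_batch))  # (1, 50, 300, 32)
```

With this layout, `X_batch[1, t, :, i]` is the embedding vector of token `t` in sentence `i`.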

Following the MNIST example in the model zoo, I have:

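function make_minibatch(X, Y, idxs)
    # num_sentences, emb_dims and MAX_LEN are globals defined elsewhere.
    # Dimension order allocated here: (1, num_sentences, emb_dims, MAX_LEN).
    X_batch = zeros(1, num_sentences, emb_dims, MAX_LEN)

    # Stack the word embeddings of one sentence column-wise
    # (result is emb_dims × number of words).
    function get_sentence_matrix(sentence)
        embeddings = Vector{Array{Float64, 1}}()
        for word in sentence
            embedding = get_embedding(word)
            push!(embeddings, embedding)
        end
        embeddings = hcat(embeddings...)
        return embeddings
    end

    # Copy each selected sentence into its slot of the batch array.
    for i in 1:length(idxs)
        X_batch[1, i, :, :] = get_sentence_matrix(X[idxs[i]])
    end
    # One-hot encode the labels (label+1 maps labels 0 and 1 onto 1:2).
    Y_batch = [Flux.onehot(label+1, 1:2) for label in Y[idxs]]
    return (X_batch, Y_batch)
end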

where X is an array of arrays of words, and the get_embedding function returns an embedding as an array. X_batch is then an Array{Float64,4}. Is this the correct approach?
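
For comparison, here is a minimal sketch of a batch builder whose dimension order follows the four-item list in the question (1, MAX_LEN, emb_dims, number of sentences) rather than the order passed to zeros in make_minibatch. The sizes, the stand-in get_embedding, and the helper names sentence_matrix and make_batch are all hypothetical. Zero-padding short sentences and truncating long ones are assumptions as well, since the question does not say how sentence length is handled. It uses base Julia only and does not claim this layout is the one Flux's Conv works best with.

```julia
# Hypothetical sizes, for illustration only.
const MAX_LEN = 5    # maximum tokens per sentence
const EMB_DIMS = 3   # embedding size

# Stand-in for the question's get_embedding: returns a fixed-length vector.
get_embedding(word) = fill(Float64(length(word)), EMB_DIMS)

# Build one sentence as a MAX_LEN × EMB_DIMS matrix, zero-padded at the end.
function sentence_matrix(sentence)
    M = zeros(MAX_LEN, EMB_DIMS)
    for (t, word) in enumerate(sentence)
        t > MAX_LEN && break          # truncate overly long sentences
        M[t, :] = get_embedding(word)
    end
    return M
end

# Fill a (1, MAX_LEN, EMB_DIMS, batch) array, one sentence per batch slot.
function make_batch(X, idxs)
    X_batch = zeros(1, MAX_LEN, EMB_DIMS, length(idxs))
    for (i, idx) in enumerate(idxs)
        X_batch[1, :, :, i] = sentence_matrix(X[idx])
    end
    return X_batch
end

X = [["the", "cat", "sat"], ["hello", "world"]]
X_batch = make_batch(X, 1:2)
println(size(X_batch))  # (1, 5, 3, 2)
```

Here the sentence index is the last dimension, so the batch size is taken from `length(idxs)` instead of a global, and each sentence matrix is stored tokens-by-embedding rather than embedding-by-tokens as `hcat` would produce.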