Java: preprocessing between LSTM and dense layers
I am trying to build a neural network with LSTM and dense layers. My net is:
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(123)
.weightInit(WeightInit.XAVIER)
.updater(new Adam(0.1))
.list()
.layer(0, new LSTM.Builder().activation(Activation.TANH).nIn(numInputs).nOut(120).build())
.layer(1, new DenseLayer.Builder().activation(Activation.RELU).nIn(120).nOut(1000).build())
.layer(2, new DenseLayer.Builder().activation(Activation.RELU).nIn(1000).nOut(20).build())
.layer(new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD).activation(Activation.SOFTMAX).nIn(20).nOut(numOutputs).build())
.inputPreProcessor(1, new RnnToFeedForwardPreProcessor())
.build();
This is how I read the data:
SequenceRecordReader reader = new CSVSequenceRecordReader(0, ",");
reader.initialize(new NumberedFileInputSplit("TRAIN_%d.csv", 1, 17476));
DataSetIterator trainIter = new SequenceRecordReaderDataSetIterator(reader, miniBatchSize, 6, 7, false);
allData = trainIter.next();
//Load the test/evaluation data:
SequenceRecordReader testReader = new CSVSequenceRecordReader(0, ",");
testReader.initialize(new NumberedFileInputSplit("TEST_%d.csv", 1, 8498));
DataSetIterator testIter = new SequenceRecordReaderDataSetIterator(testReader, miniBatchSize, 6, 7, false);
allData = testIter.next();
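So when the data goes into the network it has the shape [batch, features, time steps] = [32, 7, 60].
I can confirm this by provoking an error on purpose: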
Received input with size(1) = 7 (input array shape = [32, 7, 60]); input.size(1) must match layer nIn size (nIn = 9)
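So the data enters the network normally. After the first LSTM layer it has to be reshaped to two dimensions before it goes through the dense layers.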
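The reshape described above can be sketched in plain Java, without DL4J. This is only an illustration of the idea behind RnnToFeedForwardPreProcessor, not its actual implementation: the class name, the method name, and the row order (all time steps of example 0, then example 1, and so on) are assumptions made for the sketch.

```java
public class RnnToFeedForwardSketch {
    /** Flattens [miniBatch][size][timeSteps] into [miniBatch * timeSteps][size]. */
    public static double[][] flatten(double[][][] activations) {
        int miniBatch = activations.length;
        int size = activations[0].length;
        int timeSteps = activations[0][0].length;
        double[][] out = new double[miniBatch * timeSteps][size];
        for (int b = 0; b < miniBatch; b++) {
            for (int t = 0; t < timeSteps; t++) {
                for (int f = 0; f < size; f++) {
                    // Assumed row order: every time step of one example is kept together.
                    out[b * timeSteps + t][f] = activations[b][f][t];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Shape of the LSTM output in the question: batch 32, nOut 120, 60 time steps.
        double[][][] lstmOut = new double[32][120][60];
        double[][] flat = flatten(lstmOut);
        System.out.println(flat.length + " x " + flat[0].length); // prints "1920 x 120"
    }
}
```

Each time step becomes its own row, so a dense layer sees 32 × 60 = 1920 independent rows of 120 values.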
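But then I run into another problem:
Labels and preOutput must have equal shapes: got shapes [32, 6, 60] vs. [1920, 6]
The data was not reshaped before going into the dense layer, and I seem to have lost one feature (the shape is now [32, 6, 60] instead of [32, 7, 60]). Why?
If possible, you can use setInputType, which sets up the preprocessors for you.
Here is an example configuration that goes from LSTM to dense: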
MultiLayerConfiguration conf1 = new NeuralNetConfiguration.Builder()
.trainingWorkspaceMode(wsm)
.inferenceWorkspaceMode(wsm)
.seed(12345)
.updater(new Adam(0.1))
.list()
.layer(new LSTM.Builder().nIn(3).nOut(3).dataFormat(rnnDataFormat).build())
.layer(new DenseLayer.Builder().nIn(3).nOut(3).activation(Activation.TANH).build())
.layer(new RnnOutputLayer.Builder().nIn(3).nOut(3).activation(Activation.SOFTMAX).dataFormat(rnnDataFormat)
.lossFunction(LossFunctions.LossFunction.MCXENT).build())
.setInputType(InputType.recurrent(3, rnnDataFormat))
.build();
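Applied to the network from the question, the same idea might look like the sketch below. It assumes numInputs = 7 and numOutputs = 6, values read off the shapes in the two error messages rather than stated in the question, and the class and method names are hypothetical. It also replaces the feed-forward OutputLayer with RnnOutputLayer, so that the predictions are three-dimensional like the [32, 6, 60] labels instead of the flattened [1920, 6].

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.weights.WeightInit;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class LstmDenseConfigSketch {
    /** Hypothetical helper; numInputs = 7 and numOutputs = 6 are assumed for the question's data. */
    public static MultiLayerConfiguration build(int numInputs, int numOutputs) {
        return new NeuralNetConfiguration.Builder()
                .seed(123)
                .weightInit(WeightInit.XAVIER)
                .updater(new Adam(0.1))
                .list()
                .layer(new LSTM.Builder().activation(Activation.TANH).nIn(numInputs).nOut(120).build())
                .layer(new DenseLayer.Builder().activation(Activation.RELU).nIn(120).nOut(1000).build())
                .layer(new DenseLayer.Builder().activation(Activation.RELU).nIn(1000).nOut(20).build())
                // RnnOutputLayer works on per-time-step labels of shape [batch, classes, timeSteps].
                .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .activation(Activation.SOFTMAX).nIn(20).nOut(numOutputs).build())
                // No manual inputPreProcessor calls: setInputType is expected to add
                // the RNN-to-feed-forward and feed-forward-to-RNN preprocessors.
                .setInputType(InputType.recurrent(numInputs))
                .build();
    }
}
```

Calling LstmDenseConfigSketch.build(7, 6) would then give a configuration to pass to MultiLayerNetwork. Whether the layer sizes and learning rate suit the data is a separate question.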
The RNN format is:
import org.deeplearning4j.nn.conf.RNNFormat;
This is an enum that specifies what your data format should be (channels last or channels first).
From the javadoc:
/**
* NCW = "channels first" - arrays of shape [minibatch, channels, width]<br>
* NWC = "channels last" - arrays of shape [minibatch, width, channels]<br>
* "width" corresponds to sequence length and "channels" corresponds to sequence item size.
*/
Source:
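To make the two layouts from the javadoc concrete, here is a small plain-Java sketch that moves the same data from NCW to NWC. It does not use DL4J, and the class and method names are hypothetical.

```java
public class RnnFormatSketch {
    /** Converts [miniBatch][channels][width] (NCW) into [miniBatch][width][channels] (NWC). */
    public static double[][][] ncwToNwc(double[][][] ncw) {
        int miniBatch = ncw.length;
        int channels = ncw[0].length;
        int width = ncw[0][0].length;
        double[][][] nwc = new double[miniBatch][width][channels];
        for (int b = 0; b < miniBatch; b++) {
            for (int c = 0; c < channels; c++) {
                for (int w = 0; w < width; w++) {
                    // Same value, with the channel and width indices swapped.
                    nwc[b][w][c] = ncw[b][c][w];
                }
            }
        }
        return nwc;
    }

    public static void main(String[] args) {
        // The features in the question are NCW: batch 32, 7 features, 60 time steps.
        double[][][] nwc = ncwToNwc(new double[32][7][60]);
        System.out.println(nwc.length + " x " + nwc[0].length + " x " + nwc[0][0].length);
        // prints "32 x 60 x 7"
    }
}
```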
There is more in our tests:
In layer 0 there is a variable called numInputs. What is the value of numInputs? If numInputs = 9, then your TRAIN_%d.csv files must have 9 features.
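The shapes in the question follow from the iterator arguments. In SequenceRecordReaderDataSetIterator(reader, miniBatchSize, 6, 7, false), the 6 is the number of possible labels and the 7 is the label column index. The sketch below works through that arithmetic in plain Java. It assumes 8 columns per CSV row with the label in the last column and 60 rows per file. Neither is stated in the question, and the class and method names are hypothetical.

```java
import java.util.Arrays;

public class SequenceShapeSketch {
    /** Feature array shape when exactly one column of each CSV row holds the label. */
    public static int[] featureShape(int miniBatch, int columnsPerRow, int timeSteps) {
        return new int[] {miniBatch, columnsPerRow - 1, timeSteps};
    }

    /** Label array shape for classification: one one-hot vector per time step. */
    public static int[] labelShape(int miniBatch, int numPossibleLabels, int timeSteps) {
        return new int[] {miniBatch, numPossibleLabels, timeSteps};
    }

    public static void main(String[] args) {
        int miniBatch = 32;
        int timeSteps = 60;        // assumed rows per CSV file
        int columnsPerRow = 8;     // assumed: columns 0..6 are features, column 7 is the label
        int numPossibleLabels = 6; // third constructor argument in the question
        System.out.println(Arrays.toString(featureShape(miniBatch, columnsPerRow, timeSteps)));
        // prints "[32, 7, 60]"
        System.out.println(Arrays.toString(labelShape(miniBatch, numPossibleLabels, timeSteps)));
        // prints "[32, 6, 60]"
    }
}
```

Under these assumptions, the [32, 6, 60] in the second error message is the label array with 6 classes, not the feature array with one feature missing. It also means that nIn of the first layer would need to be 7 rather than 9.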