Deep learning: designing a multi-task learning network structure


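I am trying to use a CNN for text classification. Here are some of my labels: dog, cat, bird, football, basketball, ... Because these classes are too fine-grained to reach good accuracy, and I have relatively little training data, I also grouped them into coarse classes: animal, sports.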
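To make the two label sets concrete, here is a minimal sketch of deriving both label ids from a single fine-grained label. The mapping covers only the labels named above (the full label set is elided in the question), and `FINE_TO_COARSE` and `build_label_ids` are hypothetical names used for illustration.

```python
# Hypothetical fine-to-coarse mapping, built only from the labels named in the question.
FINE_TO_COARSE = {
    "dog": "animal",
    "cat": "animal",
    "bird": "animal",
    "football": "sports",
    "basketball": "sports",
}


def build_label_ids(fine_labels):
    """Turn fine label strings into integer ids for both tasks.

    Returns (fine_ids, coarse_ids, fine_vocab, coarse_vocab).
    """
    # Sort names so the id assignment is deterministic.
    fine_vocab = {name: i for i, name in enumerate(sorted(FINE_TO_COARSE))}
    coarse_names = sorted(set(FINE_TO_COARSE.values()))
    coarse_vocab = {name: i for i, name in enumerate(coarse_names)}

    fine_ids = [fine_vocab[label] for label in fine_labels]
    coarse_ids = [coarse_vocab[FINE_TO_COARSE[label]] for label in fine_labels]
    return fine_ids, coarse_ids, fine_vocab, coarse_vocab


if __name__ == "__main__":
    fine_ids, coarse_ids, _, _ = build_label_ids(["dog", "football", "cat"])
    print(fine_ids, coarse_ids)
```

In the network from the question, `coarse_ids` would presumably be fed as `softmax_label` and `fine_ids` as `softmax_label_finegrained`.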

Then I designed a simple multi-task learning structure, shown below, but it did not improve the final performance on my fine-grained labels.

    data = mx.symbol.Variable('data')
    softmax_label = mx.symbol.Variable('softmax_label')
    softmax_label_finegrained = mx.symbol.Variable('softmax_label_finegrained')

    # embedding layer
    if not with_embedding:
        word_embed = mx.symbol.Embedding(data=data, input_dim=vocab_size,
                                         output_dim=embedding_size, name='word_embedding')
        # the convolution layer needs 4D input
        conv_input = mx.symbol.Reshape(data=word_embed, shape=(batch_size, 1, sentence_size, embedding_size))
    else:
        logging.info('with pretrained embedding.')
        conv_input = data

    # convolution and pooling layers, one branch per filter size
    pooled_outputs = []
    for i, filter_size in enumerate(filter_list):
        convi = mx.symbol.Convolution(data=conv_input, kernel=(filter_size, embedding_size), num_filter=num_filter)
        acti = mx.symbol.Activation(data=convi, act_type='relu')
        # max pooling over the entire sentence feature map
        pooli = mx.symbol.Pooling(data=acti, pool_type='max', kernel=(sentence_size - filter_size + 1, 1), stride=(1, 1))
        pooled_outputs.append(pooli)

    # combine all pooled outputs (max-over-time pooling): concat all feature maps
    # into one long feature before the final dropout and fully connected layers
    num_feature_maps = num_filter * len(filter_list)
    concat = mx.symbol.Concat(*pooled_outputs, dim=1)
    h_pool = mx.symbol.Reshape(data=concat, shape=(batch_size, num_feature_maps))  # flatten to 2D

    # dropout
    if dropout > 0.0:
        logging.info('use dropout.')
        drop = mx.symbol.Dropout(data=h_pool, p=dropout)
    else:
        logging.info('Do not use dropout.')
        drop = h_pool

    # fully connected and softmax outputs: one head per task on the shared features
    logging.info('num_classes: %d', num_classes)
    logging.info('num_fine_classes: %d', num_fine_classes)
    fc = mx.symbol.FullyConnected(data=drop, num_hidden=num_classes, name='fc')
    fc_fine = mx.symbol.FullyConnected(data=drop, num_hidden=num_fine_classes, name='fc_fine')
    softmax = mx.symbol.SoftmaxOutput(data=fc, label=softmax_label)
    softmax_fine = mx.symbol.SoftmaxOutput(data=fc_fine, label=softmax_label_finegrained)

    return mx.symbol.Group([softmax, softmax_fine])

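As a rough illustration of what grouping two softmax heads on shared features trains for, the sketch below computes a weighted sum of two cross-entropy losses for a single example in plain Python. The function names and the weights are assumptions for illustration and do not come from the question's code. In MXNet, a per-head weight would presumably correspond to the `grad_scale` argument of `SoftmaxOutput`, which the code above leaves at its default.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def joint_loss(coarse_logits, coarse_label, fine_logits, fine_label,
               coarse_weight=1.0, fine_weight=1.0):
    """Weighted sum of the coarse and fine losses (weights are hypothetical knobs)."""
    return (coarse_weight * cross_entropy(coarse_logits, coarse_label)
            + fine_weight * cross_entropy(fine_logits, fine_label))


if __name__ == "__main__":
    # 2 coarse classes and 5 fine classes, with made-up logits.
    print(joint_loss([2.0, 0.5], 0, [1.0, 0.2, 0.1, -1.0, -0.5], 0))
```

Lowering `coarse_weight` makes the shared layers favor the fine-grained task, which is one knob to try when the auxiliary task does not help.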
I also tried to incorporate more information by adding an internal SoftmaxActivation layer after fc, but that did not work either:

    # fully connected and softmax output.
    logging.info('num_classes: %d', num_classes)
    logging.info('num_fine_classes: %d', num_fine_classes)
    fc = mx.symbol.FullyConnected(data=drop, num_hidden=num_classes, name='fc')
    softmax = mx.symbol.SoftmaxOutput(data=fc, label=softmax_label)
    softmax_act = mx.symbol.SoftmaxActivation(data=fc)
    # softmax_act is an internal layer that emits the coarse activations,
    # which are used as extra input to the downstream fine-grained task
    drop_act = mx.symbol.Concat(drop, softmax_act, dim=1)
    fc_fine = mx.symbol.FullyConnected(data=drop_act, num_hidden=num_fine_classes, name='fc_fine')
    softmax_fine = mx.symbol.SoftmaxOutput(data=fc_fine, label=softmax_label_finegrained)

    return mx.symbol.Group([softmax, softmax_fine])

So, do you have any ideas or experience in designing this kind of network? Any suggestions are welcome, thanks~

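To be honest, I have never seen multi-task training used to improve performance on fine-grained classification. Both of your tasks use the same network; the only difference is that the output of the coarse-class softmax is fed as input to the fine-grained-class softmax.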

I don't see where any new information would come from that could improve the classification of the fine-grained classes.
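One way to read this argument: each fine-grained class belongs to exactly one coarse class, so the coarse label is fully determined by the fine-grained one. The sketch below shows that a coarse prediction can be recovered from fine-grained probabilities just by summing within each group. The class indices and the name `coarse_from_fine` are hypothetical, chosen to match the labels in the question.

```python
# Hypothetical index layout, for illustration only:
# fine classes:   0=dog, 1=cat, 2=bird, 3=football, 4=basketball
# coarse classes: 0=animal, 1=sports
FINE_TO_COARSE_ID = [0, 0, 0, 1, 1]


def coarse_from_fine(fine_probs, fine_to_coarse=FINE_TO_COARSE_ID):
    """Sum fine-grained probabilities within each coarse group."""
    num_coarse = max(fine_to_coarse) + 1
    coarse_probs = [0.0] * num_coarse
    for fine_id, p in enumerate(fine_probs):
        coarse_probs[fine_to_coarse[fine_id]] += p
    return coarse_probs


if __name__ == "__main__":
    # Made-up fine-grained probabilities for one example.
    print(coarse_from_fine([0.5, 0.2, 0.1, 0.15, 0.05]))
```

Because the coarse target adds no label information beyond what the fine-grained target already carries, any gain from the auxiliary head would have to come from regularizing the shared layers rather than from new supervision.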

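Indeed, people usually use multi-task learning to learn two things at once, but the things being learned are independent of each other. Here is an example: finding the color and the type of a flower. And here is one that is a bit simpler than yours, because it does not concatenate the outputs together.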

Hope this helps.