Python: pad_sequences (Keras) raises "got multiple values for argument 'maxlen'"


I am trying to use a Keras model inside a genetic algorithm for text classification, but the call to pad_sequences fails with this error:

TypeError: pad_sequences() got multiple values for argument 'maxlen'

The actual pad_sequences call is:

data = self.pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)

It appears in the following method:

def get_data(self):
    """Retrieve the dataset and process the data."""

    batch_size = 128
    VALIDATION_SPLIT = 0.2
    MAX_SEQUENCE_LENGTH = 1000
    MAX_NUM_WORDS = 20000
    csv = 'VocabCSV.csv'
    my_df = self.pd.read_csv(csv,index_col=0,encoding = 'latin-1')
    my_df.dropna(inplace=True)
    my_df.reset_index(drop=True,inplace=True)
    print(my_df.info())

    texts = my_df.Text # list of text samples
    labellist = my_df.Target # list of labels
    label_vals = [] # label values list
    labels_index = {} # dictionary mapping label name to numeric id
    labels = [] # list of label ids


    for label in labellist:
        if label not in label_vals:
            label_vals.append(label)

    for idx, text in enumerate(texts):
        for label in label_vals:
            if label == labellist[idx]:
                label_id = label_vals.index(label)
        labels_index[text] = label_id
        labels.append(label_id)

    print("labels index {}".format(len(labels_index)))
    print("labels size: %s " % len(labels))

    print("found %s texts." % len(texts))

    # finally, vectorize the text samples into a 2D integer tensor
    tokenizer = self.Tokenizer(num_words=MAX_NUM_WORDS)
    tokenizer.fit_on_texts(texts)
    sequences = tokenizer.texts_to_sequences(texts)

    word_index = tokenizer.word_index
    print('Found %s unique tokens.' % len(word_index))

    data = self.pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
    print(self.np.asarray(labels).shape)
    labels = self.to_categorical(labels)

    print('Shape of data tensor:', data.shape)
    print('Shape of label tensor:', labels.shape)

    # split the data into a training set and a validation set
    indices = self.np.arange(data.shape[0])
    self.np.random.shuffle(indices)
    data = data[indices]
    labels = labels[indices]
    num_validation_samples = int(VALIDATION_SPLIT * data.shape[0])

    x_train = data[:-num_validation_samples]
    y_train = labels[:-num_validation_samples]
    x_test = data[-num_validation_samples:]
    y_test = labels[-num_validation_samples:]

    print(x_train.shape, y_train.shape)
    print(x_test.shape, y_test.shape)

    print(len(x_test))
    print(len(y_test))

    input_shape = MAX_SEQUENCE_LENGTH

    print(input_shape)

    nb_classes = len(label_vals)

    return (nb_classes, batch_size, input_shape, x_train, x_test, y_train, y_test, word_index)

The error seems to occur when another function calls get_data, but I cannot pin down the actual cause.

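The problem is that you are calling self.pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH). pad_sequences is not a method of your class; it is a function from keras.preprocessing.sequence. If it was attached to the class as an attribute, calling it through self passes the instance as the first positional argument, so sequences lands in the maxlen slot and the maxlen= keyword then supplies a second value for it. To make this work, import it like this: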

from keras.preprocessing import sequence

Then call pad_sequences like this, without self:

data = sequence.pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
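To see why going through self can produce this exact TypeError, here is a minimal sketch that does not need Keras. It assumes the class stored the function as a class attribute, which the question does not show, so treat that part as a guess. The pad_sequences below is a simplified stand-in with the same leading parameters (sequences, maxlen), not the Keras implementation, and the Broken and Fixed classes are hypothetical names used only for illustration.

```python
# Simplified stand-in for Keras' pad_sequences (assumption: only the
# default "pre" padding and "pre" truncating behaviour is imitated).
def pad_sequences(sequences, maxlen=None, value=0):
    if maxlen is None:
        maxlen = max(len(s) for s in sequences)
    padded = []
    for s in sequences:
        # Keep at most the last maxlen items, then pad on the left.
        kept = list(s)[max(len(s) - maxlen, 0):]
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded


class Broken:
    # A plain function stored as a class attribute becomes a bound method:
    # self.pad_sequences(x, maxlen=3) is really pad_sequences(self, x, maxlen=3),
    # so x fills the maxlen slot and maxlen=3 is a second value for it.
    pad_sequences = pad_sequences

    def get_data(self, sequences):
        return self.pad_sequences(sequences, maxlen=3)


class Fixed:
    def get_data(self, sequences):
        # Call the module-level function directly, without self.
        return pad_sequences(sequences, maxlen=3)


def run_demo():
    seqs = [[1, 2], [3, 4, 5, 6]]
    try:
        Broken().get_data(seqs)
        error = None
    except TypeError as exc:
        error = str(exc)
    return error, Fixed().get_data(seqs)


if __name__ == "__main__":
    error, fixed = run_demo()
    print(error)
    print(fixed)
```

If the function really has to live on the class, wrapping it as staticmethod(pad_sequences) would also stop Python from inserting self, but calling the imported function directly is the simpler option.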