TensorFlow on Google Colab is not training on the complete dataset


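I ran into a problem while training a neural network in Google Colab. My model does not appear to train on the full training dataset, even though I uploaded the data to Drive and supplied the correct path. Here is the code I wrote: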

import tensorflow as tf
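# Import everything from tensorflow.keras; mixing it with the standalone keras package can cause conflicts
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Activation, Dropout
from tensorflow.keras.optimizers import Adam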
from sklearn.metrics import mean_squared_error, mean_absolute_error, max_error, r2_score
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Raw strings keep "\s" from being treated as an invalid escape sequence
X=pd.read_csv('/content/drive/My Drive/ML Data/prob_232_full.dat',sep=r"\s+",header=None)
y=pd.read_csv('/content/drive/My Drive/ML Data/pGuess_232_full.dat',sep=r"\s+",header=None)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X.astype(np.float64), y.astype(np.float64), test_size = 0.25, random_state = 1)

X_train = np.array(X_train)
X_test = np.array(X_test)

# Sklearn wants the labels as one-dimensional vectors
y_train = np.array(y_train).reshape((-1,))
y_test = np.array(y_test).reshape((-1,))

ncols=X_train.shape[1]

model = Sequential()

model.add(Dense(activation="relu", input_dim=ncols, units=64, kernel_initializer="uniform"))
model.add(Dense(activation="relu", units=128, kernel_initializer="uniform"))
model.add(Dense(activation="relu", units=256, kernel_initializer="uniform"))
model.add(Dense(activation="relu", units=64, kernel_initializer="uniform"))
model.add(Dense(activation="relu", units=1, kernel_initializer="uniform"))

opt=keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer = opt, loss='mean_squared_error', metrics=['mean_absolute_error'])
history=model.fit(X_train, y_train, validation_data=(X_test, y_test), 
                  batch_size = 32, epochs = 40, verbose=1)

Although the training set contains 457500 samples, the output shows the model training on only 14297.

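Welcome to Stack Overflow.

Your training set has 457500 samples and you pass a batch size of 32 to model.fit. The number of iterations (batches) per epoch is therefore 457500/32 = 14296.875, which rounds up to 14297. The progress bar counts batches, not samples: 14296 full batches plus one final batch that holds 28 examples, 4 fewer than the others. That final, smaller batch is still used. So training is behaving correctly and covers the whole dataset; it is only a matter of how to read the output.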
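
The arithmetic in this answer can be checked with a few lines of Python. This is a minimal sketch that assumes the progress bar counts batches per epoch, as TF 2.x model.fit does; the helper names steps_per_epoch and last_batch_size are hypothetical and are not part of Keras.

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int = 32) -> int:
    """Batches needed to see every sample once (hypothetical helper)."""
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples: int, batch_size: int = 32) -> int:
    """Size of the final batch, which may be smaller than batch_size."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


num_samples = 457500  # training set size reported in the question
print(steps_per_epoch(num_samples))  # 14297
print(last_batch_size(num_samples))  # 28
```

To compare against the actual run, len(X_train) can be passed in place of the hard-coded 457500.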

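+1, I had the same "problem", and everything is actually fine in Google Colab. The video I was following used a Jupyter notebook on a local machine. Neither of us specified a batch size; his output showed 50000 and mine showed 1563. 1563*32 = 50,016. Good to know everything is fine.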
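
The numbers in this comment can be reproduced by walking through the batches one at a time. The sketch below assumes the default batch size of 32 that model.fit uses when none is given; batch_sizes is a hypothetical helper written only for illustration, not a Keras function.

```python
def batch_sizes(num_samples, batch_size=32):
    """Yield the size of each consecutive batch, including a final partial one."""
    for start in range(0, num_samples, batch_size):
        yield min(batch_size, num_samples - start)


sizes = list(batch_sizes(50000))
print(len(sizes))  # 1563 batches per epoch
print(sum(sizes))  # 50000 samples covered in total
print(sizes[-1])   # 16 samples in the last batch
```

The product 1563*32 overshoots 50000 by 16 because the last batch is only half full, so no samples are skipped.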