Batch size appears smaller than the one specified in Keras
I wrote a data generator for Keras that reads a large dataset of .mat files. Here is my code. I am trying to solve a three-class problem; the data for each class is in its own folder (one, two, three), and each batch is filled with files picked at random from these folders.
import random

import h5py
import numpy

# batch_size, img_rows and img_cols are module-level settings defined elsewhere.

def generate_arrays_from_file(path,nc1,nc2,nc3):
    # fetch_data already builds one full batch, so yield one batch per call.
    while True:
        Data,y=fetch_data(path,nc1,nc2,nc3)
        yield (Data, y)

def fetch_data(path,nc1,nc2,nc3):
    trainData = numpy.empty(shape=[batch_size,img_rows, img_cols ])
    y = []
    for line in range(batch_size):
        # Pick one of the three classes at random, then a random file from its folder.
        labelClass = random.randint(0, 2)
        if labelClass == 0:
            random_num = random.randint(1, nc1)
            file_name = path + '/' + 'one/one-' + str(random_num) + '.mat'
        elif labelClass == 1:
            random_num = random.randint(1, nc2)
            file_name = path + '/' + 'two/two-' + str(random_num) + '.mat'
        else:
            random_num = random.randint(1, nc3)
            file_name = path + '/' + 'three/three-' + str(random_num) + '.mat'
        # Read the whole 'data' dataset and close the file again.
        with h5py.File(file_name, 'r') as matfile:
            x = numpy.transpose(matfile['data'][()], axes=(1, 0))
        trainData[line,:,:]=x
        y.append(labelClass)
    # Add a trailing channel axis: (batch_size, img_rows, img_cols, 1).
    trainData = trainData.reshape(trainData.shape[0], img_rows, img_cols, 1)
    return trainData,y
This code works, but although the batch size is set to 16, the Keras output looks like this:
1/50000 [..............................] - ETA: 65067s - loss: 1.1666 - acc: 0.2500
2/50000 [..............................] - ETA: 34057s - loss: 1.4812 - acc: 0.2188
3/50000 [..............................] - ETA: 24202s - loss: 1.6554 - acc: 0.1875
4/50000 [..............................] - ETA: 18799s - loss: 1.5569 - acc: 0.2344
5/50000 [..............................] - ETA: 15611s - loss: 1.4662 - acc: 0.2625
6/50000 [..............................] - ETA: 13863s - loss: 1.4563 - acc: 0.2500
8/50000 [..............................] - ETA: 10978s - loss: 1.3903 - acc: 0.2734
9/50000 [..............................] - ETA: 10402s - loss: 1.3595 - acc: 0.2778
10/50000 [..............................] - ETA: 10253s - loss: 1.3333 - acc: 0.2875
11/50000 [..............................] - ETA: 10389s - loss: 1.3195 - acc: 0.2784
12/50000 [..............................] - ETA: 10411s - loss: 1.3063 - acc: 0.2760
13/50000 [..............................] - ETA: 10360s - loss: 1.2896 - acc: 0.2788
14/50000 [..............................] - ETA: 10424s - loss: 1.2772 - acc: 0.2768
15/50000 [..............................] - ETA: 10464s - loss: 1.2660 - acc: 0.2750
16/50000 [..............................] - ETA: 10483s - loss: 1.2545 - acc: 0.2852
17/50000 [..............................] - ETA: 10557s - loss: 1.2446 - acc: 0.3015
It seems the batch size is not being taken into account. Can you tell me why?
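Thanks.

Each step of training from the generator (the training call is not shown in the question) is one batch.
So:
- The batch size is defined by the generator, but it is not shown in the printed output.
- The steps_per_epoch argument passed to fit_generator is how many batches will be drawn from the generator. Each step (or batch) is printed in that output.
- The epochs argument defines how many times all of this is repeated.
Clearly, from the output, you chose steps_per_epoch=50000. So Keras assumes you want to train on 50000 batches and will retrieve 50000 batches from the generator. (The batch size itself is still defined by the generator.)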
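To make the step/batch distinction concrete, here is a minimal sketch that does not use Keras at all. The names dummy_batches and count_epoch are hypothetical stand-ins, not part of the question's code or of any library: one plays the role of the data generator, the other draws a fixed number of batches the way one epoch would.

```python
import itertools


def dummy_batches(batch_size):
    """Hypothetical stand-in generator: yields (inputs, labels) batches forever."""
    while True:
        inputs = [[0.0] for _ in range(batch_size)]
        labels = [0] * batch_size
        yield inputs, labels


def count_epoch(generator, steps_per_epoch):
    """Draw steps_per_epoch batches, as one epoch would, and tally them."""
    steps = 0
    samples = 0
    for inputs, labels in itertools.islice(generator, steps_per_epoch):
        steps += 1             # what the progress bar counts
        samples += len(labels)  # what the generator's batch size controls
    return steps, samples


steps, samples = count_epoch(dummy_batches(batch_size=16), steps_per_epoch=500)
print(steps, samples)  # 500 8000
```

The counter on the left of the progress bar corresponds to steps here, so it rises by 1 per batch regardless of how many samples each batch holds.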
There are two possible ways to check the batch size:
- Take one sample from the generator and check its length.
- Create a LambdaCallback that prints the logs.
From the generator:
generator = generate_arrays_from_file(path,nc1,nc2,nc3)
generatorSampleX, generatorSampleY = next(generator)  # generator.next() works only on Python 2
print(generatorSampleX.shape)
# this advances the generator to its second batch, so it is a good idea to create the generator again before passing it to training
From the callback:
from keras.callbacks import LambdaCallback
callback = LambdaCallback(on_batch_end=lambda batch,logs:print(logs))
model.fit_generator(........, callbacks = [callback])
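Peeking at a batch consumes it, which is why recreating the generator is suggested. As an alternative, the sketch below puts the peeked batch back with itertools.chain. Both peek_batch and numbered_batches are hypothetical helpers introduced only for illustration, and whether a given training call accepts the resulting iterator in place of a generator is an assumption; recreating the generator remains the safe route.

```python
import itertools


def peek_batch(generator):
    """Return the first batch plus an iterator that still starts with it."""
    first = next(generator)  # built-in next() works on Python 2.6+ and 3
    restored = itertools.chain([first], generator)
    return first, restored


def numbered_batches(batch_size):
    """Hypothetical stand-in: batch i holds batch_size copies of i."""
    i = 0
    while True:
        yield [i] * batch_size, [i] * batch_size
        i += 1


(first_x, first_y), batches = peek_batch(numbered_batches(batch_size=16))
print(len(first_x))         # 16
print(next(batches)[0][0])  # 0, so the peeked batch was not lost
```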
Thanks Daniel, but I got an error: generatorSampleX, generatorSampleY = generator.next() AttributeError: 'generator' object has no attribute 'next'. The output of your second suggestion is:
{'loss': 1.0628247, 'acc': 0.3125, 'batch': 0, 'size': 16}
1/500 [...] - ETA: 577s - loss: 1.0628 - acc: 0.3125
{'loss': 1.2004553, 'acc': 0.4375, 'batch': 1, 'size': 16}
2/500 [...] - ETA: 300s - loss: 1.1316 - acc: 0.3750
{'loss': 1.7897727, 'acc': 0.4375, 'batch': 2, 'size': 16}
3/500 [...] - ETA: 210s - loss: 1.3510 - acc: 0.3958
{'loss': 1.2571058, 'acc': 0.375, 'batch': 3, 'size': 16}

Oh... that is probably your Python version; try next(generator). Your batch size is 'size': 16. You can customize the callback to print only what you want, to format the output better, or even to do other things.

The output of print(generatorSampleX.shape) is (16,2850,1), which shows that the batch size is 16.
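As a sketch of the "print only what you want" idea, the formatting can live in a plain function that is then handed to the callback. The function name format_batch_log is hypothetical; the 'size' and 'loss' keys are taken from the logs shown in the comment above, and other Keras versions may report different keys, so missing values are tolerated.

```python
def format_batch_log(batch, logs):
    """Reduce a per-batch logs dict to the batch index, batch size and loss."""
    logs = logs or {}
    size = logs.get('size')
    loss = logs.get('loss')
    loss_text = 'n/a' if loss is None else '{:.4f}'.format(loss)
    return 'batch {}: size={} loss={}'.format(batch, size, loss_text)


print(format_batch_log(0, {'loss': 1.0628247, 'acc': 0.3125, 'batch': 0, 'size': 16}))
# batch 0: size=16 loss=1.0628

# Assumed wiring into training, mirroring the callback from the answer:
# callback = LambdaCallback(
#     on_batch_end=lambda batch, logs: print(format_batch_log(batch, logs)))
```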