Python: Using Keras with multiprocessing
This is essentially a duplicate of an earlier question, but my setup is slightly different and the solution given there does not work for me.
I need to train a keras model on the predictions of another model. Those predictions are tied to some CPU-heavy code, so I would like to parallelize them and run the code in worker processes. Here is the code I want to execute:
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

def create_model():
    input_layer = Input((10,))
    dense = Dense(10)(input_layer)
    return Model(inputs=input_layer, outputs=dense)

model_outside = create_model()
model_outside.compile(Adam(1e-3), "mse")

def subprocess_routine(weights):
    model_inside = create_model()
    model_inside.set_weights(weights)
    while True:
        # lots of CPU
        batch = np.random.rand(10, 10)
        prediction = model_inside.predict(batch)
        yield batch, prediction

weights = model_outside.get_weights()
model_outside.fit_generator(subprocess_routine(weights),
                            epochs=10,
                            steps_per_epoch=100,
                            use_multiprocessing=True,
                            workers=1)
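This produces an error:
E tensorflow/core/grappler/clusters/utils.cc:81] Failed to get device properties, error code: 3
I found the question mentioned above, where the answer is to move the keras imports into the subprocess. I added all the imports to subprocess_routine, but that does not change the error. It may be necessary to remove the keras imports from the main process entirely, but in my setup that would mean a huge refactoring.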
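One factor that may be relevant: CUDA does not support being used in a child that was forked after the parent already initialized it, so re-importing keras inside the function does not give the child a clean state. The sketch below shows one possible layout, assuming the worker is started with the "spawn" method and hands results back over a queue. The names build_stand_in_model, worker_main, start_worker and batches_from_queue are hypothetical, and the stand-in model is plain Python so the sketch runs without TensorFlow; in the real setup the keras imports and create_model() would go where the stand-in is built.

```python
import multiprocessing as mp
import random


def build_stand_in_model(weights):
    """Hypothetical stand-in for create_model() plus set_weights().

    In the real setup the keras/tensorflow imports would live here, so
    they only ever run inside the worker process.
    """
    def predict(batch):
        return [[w * x for w, x in zip(weights, row)] for row in batch]
    return predict


def worker_main(weights, n_batches, out_queue):
    """Runs in the child: build the model locally, emit (batch, prediction)."""
    predict = build_stand_in_model(weights)
    for _ in range(n_batches):
        batch = [[random.random() for _ in weights] for _ in range(4)]
        out_queue.put((batch, predict(batch)))
    out_queue.put(None)  # sentinel: no more batches


def start_worker(weights, n_batches):
    """Start the worker with 'spawn' so it gets a fresh interpreter."""
    ctx = mp.get_context("spawn")
    out_queue = ctx.Queue()
    proc = ctx.Process(target=worker_main, args=(weights, n_batches, out_queue))
    proc.start()
    return proc, out_queue


def batches_from_queue(in_queue):
    """Generator for the parent: yield items until the sentinel arrives."""
    while True:
        item = in_queue.get()
        if item is None:
            return
        yield item
```

In the parent this would be used roughly as proc, q = start_worker(weights, 1000), with batches_from_queue(q) passed to fit_generator and use_multiprocessing=False, all under an if __name__ == "__main__": guard, which the spawn method requires.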
Keras with multithreading seems to work; see the last comment in the linked issue.
In my code it looks like this:
import tensorflow as tf

model_inside = create_model()
model_inside._make_predict_function()
graph = tf.get_default_graph()

def subprocess_routine(model_inside, graph):
    while True:
        batch = np.random.rand(10, 10)
        with graph.as_default():
            prediction = model_inside.predict(batch)
        yield batch, prediction

model_outside.fit_generator(subprocess_routine(model_inside, graph),
                            epochs=10,
                            steps_per_epoch=100,
                            use_multiprocessing=True,
                            workers=1)
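But the error message is the same.
Since the problem is apparently related to the initialization of the subprocess, I tried creating a new session in each subprocess: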
def subprocess_routine(weights):
    import keras.backend as K
    import tensorflow as tf
    sess = tf.Session()
    K.set_session(sess)
    model_inside = create_model()
    model_inside.set_weights(weights)
    while True:
        batch = np.random.rand(10, 10)
        prediction = model_inside.predict(batch)
        yield batch, prediction
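This produces a variation of the same error message:
E tensorflow/stream_executor/cuda/cuda_driver.cc:1300] could not retrieve CUDA device count: CUDA_ERROR_NOT_INITIALIZED
So again, the initialization seems to be broken.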
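How can I run keras both in my main process and in the subprocesses spawned by multiprocessing?

The good news is that tensorflow sessions are thread safe. To use keras models in multiple processes, you have to do the following:
- set up your model
- call _make_predict_function()
- set up a session and use it to get the tensorflow graph
- finalize this graph
- every time you predict something, call this graph as_default()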
# the usual imports
import numpy as np
import tensorflow as tf
from keras.models import Model
from keras.layers import Input, Dense

# set up the model
i = Input(shape=(10,))
b = Dense(1)(i)
model = Model(inputs=i, outputs=b)

# now to use it in multiprocessing, the following is necessary
model._make_predict_function()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
default_graph = tf.get_default_graph()
default_graph.finalize()

# now you share the model and graph between processes
# in each process you can call this:
def predict(something):
    with default_graph.as_default():
        return model.predict(something)
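Because the recipe above rests on thread safety, a thread pool is the setting where sharing one model object is most direct: threads live in the same process and see the same memory. The following is only a sketch under that assumption. StandInModel and predict_many are hypothetical names, and the stand-in replaces the real Keras model so the example runs without TensorFlow; with a real model, the predict call would sit inside with default_graph.as_default():.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StandInModel:
    """Hypothetical placeholder for a Keras model with a thread-safe predict()."""

    def __init__(self, scale):
        self.scale = scale
        self.calls = 0
        self._lock = threading.Lock()

    def predict(self, batch):
        with self._lock:  # only guards the call counter, not the computation
            self.calls += 1
        return [x * self.scale for x in batch]


def predict_many(model, batches, max_workers=4):
    """Run model.predict on every batch using a pool of threads.

    Every thread uses the same model object; results come back in the
    same order as the input batches.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(model.predict, batches))
```

Whether threads help with the CPU-heavy part depends on how much of that work releases the GIL, so this is a starting point rather than a guaranteed speedup.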
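
This technique did not work for me. I am loading a saved model and passing it as an argument. My error message is slightly different from the one posted:
E tensorflow/core/grappler/clusters/utils.cc:83] Failed to get device properties, error code: 3
Running it outside multiprocessing works without any problems. Also, in case it matters, I am using the docker image tensorflow/tensorflow-gpu-py3, version 1.13.1.
Below is my object detection code. It takes an image and generates multiple scales of that image (called an image pyramid), then processes one scale at a time. For each scale it splits the image into smaller windows and sends each window to a processor. The processor then uses model.evaluate([window],[1]) to test whether the current window contains my object. If the probability is high, the window's box information is stored in a queue and retrieved later (along with the values from the other processes).
Here is my code: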
def start_detection_mp3(image,winDim, minSize, winStep=4, pyramidScale=1.5, minProb=0.7):
    # Code to use multiple processors (mp)
    boxes=[]
    probs=[]
    print("Loading CNN Keras Model .... ")
    checkpoint_path="trainedmodels/cp.ckpt"
    mymodel=create_CNN_model(2,winDim[0],winDim[1])
    mymodel.load_weights(checkpoint_path)
    mymodel._make_predict_function()
    (keepscale,keeplayer)=CalculateNumberOfScales(image,pyramidScale,minSize)
    printinfo("There are {} scales in this image.".format(len(keepscale)))
    for i in range(0,len(keepscale)):
        printinfo("Working on layer {0:4d}. Scale {1:.2f}".format(i,keepscale[i]))
        (b,p)=detect_single_layer_mp3(keeplayer[i],keepscale[i],winStep,winDim,minProb,mysess,mymodel)
        boxes =boxes + b
        probs =probs + p
    mysess.close()
    return(boxes,probs)

def detect_single_layer_mp3(layer,scale,winStep,winDim,minProb,mysess,mymodel):
    # Use multiple processors
    q=[]
    p=[]
    d=[]
    i=0
    boxes=[]
    probs=[]
    xx, yy, windows= sliding_window_return(layer, winStep, winDim)
    # process in chunks of 4 (for four processors)
    NumOfProcessors=4;
    for aa in range(0,len(xx)-1,4):
        for ii in range(0,NumOfProcessors):
            ##print("aa: {} ii: {}".format(aa,ii))
            printinfo("Processes {} of Loop {}".format(ii,aa))
            x=xx[aa]
            y=yy[aa]
            window=windows[aa]
            q=Queue() # Only need to create one Queue (FIFO buffer) to hold output from each process
            # when all processes are completed, the buffer will be emptied.
            p.append(Process(target=f2,args=(x,y,window,scale, minProb,winDim,q,mysess,mymodel)))
            pp=p[-1] # get last
            printinfo("Starting process {}".format(pp))
            pp.start()
            pp.join()
            while not q.empty():
                d=q.get()
                boxes = boxes + d[0]
                probs = probs + d[1]
        p=[] # Clear Processes
    p=[]
    q=[]
    return(boxes,probs)

def f2(x,y,window,scale,minProb,winDim,q,mysess,mymodel):
    processID = os.getpid()
    boxes=[]
    probs=[]
    isHOG = 0
    isCNN = 0
    isCNN_Keras=1
    (winH, winW) = window.shape[:2]
    if winW == winDim[0] and winH ==winDim[1]: # Check that window dimension is
        if isCNN_Keras ==1:
            ### TODO It appears that it is freezing at the prediction step
            printinfo("Process id: {} Starting test against CNN model".format(processID))
            window=window.reshape(-1,winH,winW,1)
            loss,prob = mymodel.evaluate([window],[1])
            print("Loss: {} Accuracy: {}".format(loss,prob))
        if prob > minProb:
            printinfo("*** [INFO] ProcessID: {0:7d} Probability: {1:.3f} Scale {2:.3f} ***".format(processID,prob,scale))
            # compute the (x, y)-coordinates of the bounding box using the current
            # scale of the image pyramid
            (startX, startY) = (int(scale * x), int(scale * y))
            endX = int(startX + (scale * winW))
            endY = int(startY + (scale * winH))
            # update the list of bounding boxes and probabilities
            boxes.append((startX, startY, endX, endY))
            probs.append(prob)
    # return a tuple of the bounding boxes and probabilities
    if q!=1:
        q.put([boxes,probs])
        q.close()
        q=[]
    else:
        return(boxes,probs)
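Separately from the TensorFlow error, the chunk loop above appears to read xx[aa], yy[aa] and windows[aa] on every pass of the inner loop, and each process is joined right after it is started, so the four workers would handle the same window one after another. The sketch below shows one way the chunking and the start-then-join ordering could look. The helper names chunk_indices, scale_box and run_chunk are hypothetical, and threads stand in for Process so the example runs anywhere; the same ordering applies to multiprocessing.Process.

```python
import threading


def chunk_indices(n_items, chunk_size):
    """Yield lists of indices, chunk_size at a time, covering 0..n_items-1."""
    for start in range(0, n_items, chunk_size):
        yield list(range(start, min(start + chunk_size, n_items)))


def scale_box(x, y, scale, win_w, win_h):
    """Map a window at (x, y) back to original-image coordinates."""
    start_x, start_y = int(scale * x), int(scale * y)
    end_x = int(start_x + scale * win_w)
    end_y = int(start_y + scale * win_h)
    return (start_x, start_y, end_x, end_y)


def run_chunk(indices, work):
    """Start one worker per index, then join them all; results keep index order."""
    results = [None] * len(indices)

    def call(slot, idx):
        results[slot] = work(idx)

    workers = [threading.Thread(target=call, args=(slot, idx))
               for slot, idx in enumerate(indices)]
    for w in workers:
        w.start()  # start every worker first ...
    for w in workers:
        w.join()   # ... and only then wait for them
    return results
```

With this layout each worker in a chunk receives its own index, and a trailing partial chunk is still covered instead of being skipped.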

Would you mind elaborating on how you share the model and graph between processes? Models are not picklable, so they cannot be passed as arguments, and I am struggling to find a viable workaround.
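One workaround, which the original question already hints at with get_weights() and set_weights(), is to send only the weights across the process boundary and rebuild the model on the other side. The sketch below illustrates the idea with a hypothetical StandInModel that holds a lock, used here as a placeholder for unpicklable session state; can_pickle and rebuild_in_worker are likewise made-up helper names.

```python
import pickle
import threading


class StandInModel:
    """Hypothetical model: holds a lock, so pickling the object itself fails."""

    def __init__(self, weights):
        self._weights = list(weights)
        self._lock = threading.Lock()  # placeholder for unpicklable state

    def get_weights(self):
        return list(self._weights)

    def set_weights(self, weights):
        self._weights = list(weights)


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


def rebuild_in_worker(payload):
    """What a worker would do: unpickle the weights and build its own model."""
    return StandInModel(pickle.loads(payload))


model = StandInModel([0.1, 0.2, 0.3])
payload = pickle.dumps(model.get_weights())  # plain data is safe to send
clone = rebuild_in_worker(payload)
```

With a real Keras model, get_weights() returns a list of NumPy arrays, which pickle without trouble, so the same pattern should carry over as long as each process can call the model-building function itself.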