Python: using keras with multiprocessing


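This is basically a duplicate of an existing question, but my setup is a bit different and the solution given there does not work for me.

I need to train a keras model on the predictions of another model. The predictions are tied to some CPU-heavy code, so I would like to parallelize them and run that code in worker processes. Here is the code I want to execute: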

import numpy as np

from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

def create_model():
    input_layer = Input((10,))
    dense = Dense(10)(input_layer)

    return Model(inputs=input_layer, outputs=dense)

model_outside = create_model()
model_outside.compile(Adam(1e-3), "mse")

def subprocess_routine(weights):
    model_inside = create_model()
    model_inside.set_weights(weights)

    while True:
        # lots of CPU
        batch = np.random.rand(10, 10)
        prediction = model_inside.predict(batch)

        yield batch, prediction

weights = model_outside.get_weights()

model_outside.fit_generator(subprocess_routine(weights),
                            epochs=10,
                            steps_per_epoch=100,
                            use_multiprocessing=True,
                            workers=1)
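This produces an error:

E tensorflow/core/grappler/clusters/utils.cc:81] Failed to get device properties, error code: 3

I found the question above, where the answer is to move the keras imports into the subprocess. I have added all imports to subprocess_routine, but that does not change the error. It may be necessary to remove the keras imports from the main process entirely, but in my setup that would mean a huge refactoring.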
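The "import inside the worker" idea, combined with a start method that does not fork the parent, might look like the sketch below. One commonly cited reason for errors like the one above is that GPU state created in the parent does not survive fork(); that is an assumption about this setup, not something confirmed here. The names build_model, init_worker, predict_in_worker and run_pool are hypothetical, and the "model" is a pure-Python stand-in so the sketch runs without keras or a GPU.

```python
import multiprocessing as mp

_model = None  # per-process global, filled in by init_worker


def build_model(weights):
    """Hypothetical stand-in for create_model() followed by set_weights()."""
    def predict(batch):
        # One dot product per row; a real worker would call model.predict.
        return [sum(w * x for w, x in zip(weights, row)) for row in batch]
    return predict


def init_worker(weights):
    # In real code the keras/tensorflow imports would go here, so the
    # framework is first initialised inside the child process.
    global _model
    _model = build_model(weights)


def predict_in_worker(batch):
    return _model(batch)


def run_pool(weights, batches, processes=2):
    # "spawn" starts a fresh interpreter instead of forking the parent.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes, initializer=init_worker,
                  initargs=(weights,)) as pool:
        return pool.map(predict_in_worker, batches)
```

With the spawn start method, run_pool has to be called from under an if __name__ == "__main__": guard, and only picklable objects (here a plain list of weights) can be handed to the workers.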

Keras plus multithreading seems to work, according to the last comment in a related issue. In my code it looks like this:

import tensorflow as tf

model_inside = create_model()
model_inside._make_predict_function()

graph = tf.get_default_graph()

def subprocess_routine(model_inside, graph):

    while True:
        batch = np.random.rand(10, 10)

        with graph.as_default():
            prediction = model_inside.predict(batch)

        yield batch, prediction

model_outside.fit_generator(subprocess_routine(model_inside, graph),
                            epochs=10,
                            steps_per_epoch=100,
                            use_multiprocessing=True,
                            workers=1)
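But the error message is the same.

Since the problem is apparently related to the initialization of the subprocess, I tried creating a new session in each subprocess: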

def subprocess_routine(weights):

    import keras.backend as K
    import tensorflow as tf
    sess = tf.Session()
    K.set_session(sess)

    model_inside = create_model()
    model_inside.set_weights(weights)

    while True:
        batch = np.random.rand(10, 10)
        prediction = model_inside.predict(batch)

        yield batch, prediction
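This produces a variation on the same error message:

E tensorflow/stream_executor/cuda/cuda_driver.cc:1300] could not retrieve CUDA device count: CUDA_ERROR_NOT_INITIALIZED

So initialization seems to be broken again.


How can I run keras both in my main process and in the subprocesses spawned by multiprocessing?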

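The good news is that tensorflow sessions are thread-safe.

To use a keras model in multiple processes, you have to do the following:

  • Build the model
  • Call _make_predict_function()
  • Set up a session and use it to get the tensorflow graph
  • Finalize this graph
  • Every time you predict something, use this graph via as_default()
Here is some sample code: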

# the usual imports
import numpy as np
import tensorflow as tf

from keras.models import *
from keras.layers import *

# set up the model
i = Input(shape=(10,))
b = Dense(1)(i)
model = Model(inputs=i, outputs=b)

# now to use it in multiprocessing, the following is necessary
model._make_predict_function()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
default_graph = tf.get_default_graph()
default_graph.finalize()

# now you share the model and graph between processes
# in each process you can call this:
def predict_with_shared_graph(something):
    with default_graph.as_default():
        return model.predict(something)
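Since the answer leans on thread safety, the "build once, freeze, then only read" steps listed above can be illustrated with threads. This is a minimal sketch under stated assumptions: FrozenModel is a hypothetical pure-Python stand-in for a finalized graph plus model, not a keras or tensorflow API, and it only shows why concurrent read-only prediction calls are unproblematic.

```python
from concurrent.futures import ThreadPoolExecutor


class FrozenModel:
    """Hypothetical stand-in for a model whose graph has been finalized."""

    def __init__(self, weights):
        # Stored as a tuple so nothing can modify it after construction.
        self._weights = tuple(weights)

    def predict(self, row):
        # Read-only access to shared state, safe to call from many threads.
        return sum(w * x for w, x in zip(self._weights, row))


model = FrozenModel([0.5, 2.0])
rows = [[2.0, 1.0], [4.0, 0.0], [0.0, 3.0]]

# Every thread shares the same model object; map() keeps the input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    predictions = list(pool.map(model.predict, rows))
```

Threads share one address space, so the model never has to be copied or pickled. Separate processes do not share it, which is where the approaches in this thread diverge.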

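This technique did not work for me.

I am loading a saved model and passing it as an argument. My error message is slightly different from the one posted. It is:

E tensorflow/core/grappler/clusters/utils.cc:83] Failed to get device properties, error code: 3

Running it outside of multiprocessing works without any problems. Also, in case it matters, I am using the docker image tensorflow/tensorflow-gpu-py3, version 1.13.1.

Below is my object detection code. It takes an image and generates multiple scales of it (called an image pyramid), then processes one scale at a time. For each scale, it splits the image into smaller windows and sends each window to a processor. The processor then uses model.evaluate([window],[1]) to test whether the current window contains my object. If the probability is high, the window's box information is stored in a queue and retrieved later (along with the values from the other processes).

Here is my code: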
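As a rough illustration of the pyramid and sliding-window steps just described, the sketch below computes layer sizes, window positions, and the mapping of a window back to original-image coordinates. The helpers pyramid_sizes, sliding_window_positions and box_in_original are hypothetical; the poster's own CalculateNumberOfScales and sliding_window_return are not shown in the post and may behave differently.

```python
def pyramid_sizes(width, height, scale=1.5, min_size=(30, 30)):
    """Yield (factor, width, height) for each layer until it gets too small."""
    factor = 1.0
    w, h = width, height
    while w >= min_size[0] and h >= min_size[1]:
        yield factor, w, h
        factor *= scale
        w, h = int(width / factor), int(height / factor)


def sliding_window_positions(width, height, step, win_w, win_h):
    """Yield the top-left (x, y) of every window that fits inside the layer."""
    for y in range(0, height - win_h + 1, step):
        for x in range(0, width - win_w + 1, step):
            yield x, y


def box_in_original(x, y, factor, win_w, win_h):
    """Map a window found on a scaled layer back to the original image."""
    start_x, start_y = int(factor * x), int(factor * y)
    end_x = int(start_x + factor * win_w)
    end_y = int(start_y + factor * win_h)
    return start_x, start_y, end_x, end_y
```

The box arithmetic mirrors what the posted f2 function does with scale, x and y; the rest is an assumption about how the layers and windows are generated.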

def start_detection_mp3(image,winDim, minSize,  winStep=4, pyramidScale=1.5, minProb=0.7):
    # Code to use multiple processors (mp)
    boxes=[]
    probs=[]
    print("Loading CNN Keras Model .... ")
    checkpoint_path="trainedmodels/cp.ckpt"
    mymodel=create_CNN_model(2,winDim[0],winDim[1])
    mymodel.load_weights(checkpoint_path)
    mymodel._make_predict_function()
    (keepscale,keeplayer)=CalculateNumberOfScales(image,pyramidScale,minSize)
    printinfo("There are {} scales in this image.".format(len(keepscale)))
    for i in range(0,len(keepscale)):
        printinfo("Working on layer {0:4d}. Scale {1:.2f}".format(i,keepscale[i]))
        (b,p)=detect_single_layer_mp3(keeplayer[i],keepscale[i],winStep,winDim,minProb,mysess,mymodel)

        boxes =boxes + b
        probs =probs + p
    mysess.close()
    return(boxes,probs)

def detect_single_layer_mp3(layer,scale,winStep,winDim,minProb,mysess,mymodel): 
    # Use multiple processors
    q=[]
    p=[]
    d=[]
    i=0
    boxes=[]
    probs=[]
    xx, yy, windows= sliding_window_return(layer, winStep, winDim)
    # process in chunks of 4 (for four processors)
    NumOfProcessors=4;
    for aa in range(0,len(xx)-1,4):
        for ii in range(0,NumOfProcessors):
            ##print("aa: {}  ii: {}".format(aa,ii))
            printinfo("Processes {} of Loop {}".format(ii,aa))
            x=xx[aa]
            y=yy[aa]
            window=windows[aa]
            q=Queue() # Only need to create one Queue (FIFO buffer) to hold output from each process
            # when all processes are completed, the buffer will be emptied.
            p.append(Process(target=f2,args=(x,y,window,scale, minProb,winDim,q,mysess,mymodel)))
            pp=p[-1] # get last
            printinfo("Starting process {}".format(pp))
            pp.start()
            pp.join()

        while not q.empty():
            d=q.get()
            boxes = boxes + d[0]
            probs = probs + d[1]

        p=[]  # Clear Processes    
        p=[]
        q=[]   

    return(boxes,probs)


def f2(x,y,window,scale,minProb,winDim,q,mysess,mymodel):
    processID = os.getpid()
    boxes=[]
    probs=[]
    isHOG = 0
    isCNN = 0
    isCNN_Keras=1
    (winH, winW) = window.shape[:2]
    if winW == winDim[0] and winH ==winDim[1]: # Check that window dimension is 
        if isCNN_Keras ==1:
            ### TODO  It appears that it is freezing at the prediction step                     
            printinfo("Process id: {} Starting test against CNN model".format(processID))
            window=window.reshape(-1,winH,winW,1)
            loss,prob = mymodel.evaluate([window],[1])
            print("Loss: {}  Accuracy: {}".format(loss,prob))

            if prob > minProb:
                printinfo("*** [INFO] ProcessID: {0:7d} Probability: {1:.3f}  Scale {2:.3f} ***".format(processID,prob,scale))
                # compute the (x, y)-coordinates of the bounding box using the current
                # scale of the image pyramid
                (startX, startY) = (int(scale * x), int(scale * y))
                endX = int(startX + (scale * winW))
                endY = int(startY + (scale * winH))

                # update the list of bounding boxes and probabilities
                boxes.append((startX, startY, endX, endY))
                probs.append(prob)      
    # return a tuple of the bounding boxes and probabilities            
    if q!=1:        
        q.put([boxes,probs])
        q.close()
        q=[]
    else:
        return(boxes,probs)
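In the code above, each process appears to be joined right after it is started, a new Queue is created on every inner iteration, and windows[aa] is reused for every ii, so the windows would be handled one at a time. A sketch of a chunked dispatch that starts all workers of a chunk before waiting might look like this. The names chunked, score_window and run_chunk are hypothetical, and the model call is replaced by a simple sum so the example does not depend on keras.

```python
import multiprocessing as mp


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def score_window(job, queue):
    """Hypothetical worker: puts exactly one (index, result) pair on the queue."""
    index, window = job
    queue.put((index, sum(window)))  # stand-in for the model call


def run_chunk(jobs, start_method="spawn"):
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()  # one queue shared by the whole chunk
    procs = [ctx.Process(target=score_window, args=(job, queue))
             for job in jobs]
    for proc in procs:
        proc.start()  # start every worker before waiting on any of them
    # Drain the queue before joining; each worker contributes one item.
    # A production version would add a timeout in case a worker crashes.
    results = [queue.get() for _ in procs]
    for proc in procs:
        proc.join()
    return sorted(results)
```

run_chunk would be called once per chunk returned by chunked, from under an if __name__ == "__main__": guard when the spawn start method is used.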

Would you mind elaborating on how to share the model and graph between processes? Models are not picklable, so they cannot be passed as arguments, and I am struggling to find a workable solution.
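The picklability concern can be reproduced without keras. In the sketch below, FakeModel is a hypothetical stand-in that owns a handle which cannot be pickled (a thread lock), while its weights are plain data. It is meant only to illustrate why passing weights to a worker and rebuilding the model there, as the question's subprocess_routine(weights) does, avoids pickling the model object itself.

```python
import pickle
import threading


class FakeModel:
    """Hypothetical stand-in for a model that owns an unpicklable handle."""

    def __init__(self, weights):
        self.weights = weights
        self._lock = threading.Lock()  # lock objects cannot be pickled

    def get_weights(self):
        return list(self.weights)


model = FakeModel([0.1, 0.2, 0.3])

# Pickling the whole object fails because of the lock it holds.
try:
    pickle.dumps(model)
    model_is_picklable = True
except TypeError:
    model_is_picklable = False

# The weights alone are plain data and survive a round trip.
restored = pickle.loads(pickle.dumps(model.get_weights()))
```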