Python: How to deploy EfficientNet on Azure ML?


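I performed transfer learning on Keras EfficientNetB4, B5, and B7. All of the EfficientNet models were trained on Azure ML, and the weights were saved in two different formats using
model.save(filepath, save_format='h5')
model.save(filepath, save_format='tf')
I need to deploy the trained models on Azure ML.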

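But the trained model weights are large: the saved weights for EfficientNetB4 are 1.64 GiB, and those for EfficientNetB5 and B7 exceed 2.50 GiB. Given deployment cost, scalability, and inference speed, deploying model weights this large is not feasible.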
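As a rough sanity check on these sizes, the sketch below estimates how large a float32 checkpoint would be from its parameter count. It is an illustration under stated assumptions, not a measurement of the models above: the parameter counts are the approximate published figures for the stock Keras EfficientNet models (with the ImageNet classification head), not the fine-tuned ones; the optimizer is assumed to be Adam, which keeps two extra slot variables per trainable weight; and `estimate_size_gib` is a hypothetical helper name. A full `model.save(...)` stores optimizer state by default (`include_optimizer=True`), which is why the estimate includes it.

```python
GIB = 1024 ** 3

# Approximate parameter counts of the stock Keras models (assumption:
# the fine-tuned models in this question may have more or fewer weights).
APPROX_PARAMS = {
    "EfficientNetB4": 19.5e6,
    "EfficientNetB5": 30.6e6,
    "EfficientNetB7": 66.7e6,
}


def estimate_size_gib(num_params, bytes_per_param=4, optimizer_slots=0):
    """Estimate checkpoint size in GiB.

    optimizer_slots is the number of extra per-weight tensors the optimizer
    stores (assumed 2 for Adam: first and second moment estimates).
    """
    total_bytes = num_params * bytes_per_param * (1 + optimizer_slots)
    return total_bytes / GIB


for name, n_params in APPROX_PARAMS.items():
    weights_only = estimate_size_gib(n_params)
    with_adam = estimate_size_gib(n_params, optimizer_slots=2)
    print(f"{name}: ~{weights_only:.2f} GiB weights only, "
          f"~{with_adam:.2f} GiB with Adam state")
```

Comparing such an estimate with the measured file size can hint at whether optimizer state or additional layers account for part of the difference.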

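All models were trained with Python 3.8 and tensorflow-gpu 2.3. The first thing I tried was converting the Keras model (saved in .h5 format) to .onnx format using keras2onnx.

But the conversion failed with the error below, apparently because of keras-onnx with TF 2.3: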

tf.keras model eager_mode: False
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-f36257e9acf1> in <module>
      1 output_model_path = "./saveModel/Model16_EffB7_No_meta.onnx"
----> 2 onnx_model = keras2onnx.convert_keras(model, model.name)
      3 keras2onnx.save_model(onnx_model, output_model_path)

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/main.py in convert_keras(model, name, doc_string, target_opset, channel_first_inputs, debug_mode, custom_op_conversions)
     78         parse_graph_modeless(topology, tf_graph, target_opset, input_names, output_names, output_dict)
     79     else:
---> 80         parse_graph(topology, tf_graph, target_opset, output_names, output_dict)
     81     topology.compile()
     82 

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/parser.py in parse_graph(topo, graph, target_opset, output_names, keras_node_dict)
    837         topo.raw_model.add_output_name(str_value)
    838 
--> 839     return _parse_graph_core_v2(
    840         graph, keras_node_dict, topo, top_level, output_names
    841     ) if is_tf2 and is_tf_keras else _parse_graph_core(

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/parser.py in _parse_graph_core_v2(graph, keras_node_dict, topology, top_scope, output_names)
    725             _on_parsing_time_distributed_layer(graph, layer_info.nodelist, layer_info.layer, model_, varset)
    726         elif layer_info.layer and get_converter(type(layer_info.layer)):
--> 727             on_parsing_keras_layer_v2(graph, layer_info, varset)
    728         else:
    729             _on_parsing_tf_nodes(graph, layer_info.nodelist, varset, topology.debug_mode)

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/_parser_tf.py in on_parsing_keras_layer_v2(graph, layer_info, varset, prefix)
    330             iname = prefix + i_.name
    331             k2o_logger().debug('\tinput : ' + iname)
--> 332             var_type = adjust_input_batch_size(infer_variable_type(i_, varset.target_opset))
    333             i0 = varset.get_local_variable_or_declare_one(iname, var_type)
    334             operator.add_input(i0)

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/_parser_tf.py in infer_variable_type(tensor, opset, inbound_node_shape)
     45         return BooleanTensorType(shape=tensor_shape)
     46     else:
---> 47         raise ValueError(
     48             "Unable to find out a correct type for tensor type = {} of {}".format(tensor_type, tensor.name))
     49 

ValueError: Unable to find out a correct type for tensor type = 20 of top_bn/ReadVariableOp/resource:0
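
The number in this error message is a value of TensorFlow's DataType enum. As an aid for reading it, the sketch below decodes the message with a small hand-written lookup table. The table is only a subset of the enum defined in TensorFlow's types.proto, and `decode_keras2onnx_error` is a hypothetical helper name, not part of keras2onnx.

```python
import re

# Subset of TensorFlow's DataType enum values (from types.proto).
TF_DTYPE_NAMES = {
    1: "DT_FLOAT",
    2: "DT_DOUBLE",
    3: "DT_INT32",
    9: "DT_INT64",
    10: "DT_BOOL",
    19: "DT_HALF",
    20: "DT_RESOURCE",
    21: "DT_VARIANT",
}


def decode_keras2onnx_error(message):
    """Return (dtype_name, tensor_name) parsed from the error text, or None."""
    match = re.search(r"tensor type = (\d+) of (\S+)", message)
    if match is None:
        return None
    code = int(match.group(1))
    return TF_DTYPE_NAMES.get(code, f"unknown({code})"), match.group(2)


msg = ("Unable to find out a correct type for tensor type = 20 "
       "of top_bn/ReadVariableOp/resource:0")
print(decode_keras2onnx_error(msg))
# prints ('DT_RESOURCE', 'top_bn/ReadVariableOp/resource:0')
```

Type 20 is DT_RESOURCE, so the tensor the converter could not type is a resource-variable handle feeding the top_bn layer, not a numeric tensor.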
2021-05-17 03:11:18.810323: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /azureml-envs/azureml_e4e400f0dbfa98d62da5c2ff4271f2f/lib:/azureml-envs/azureml_e4e400f0dbfa98d62da5c2ff4271f2f/lib:
2021-05-17 03:11:18.810546: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
File "/azureml-envs/azureml_e4e400f0d9bfa98d62da5c2ff4271f2f/lib/python3.6/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Worker exiting (pid: 39)
Shutting down: Master
Reason: Worker failed to boot.
2021-05-17T03:11:22,058543482+00:00 - gunicorn/finish 3 0
2021-05-17T03:11:22,059923289+00:00 - Exit code 3 is not normal. Killing image.
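
The log above names two shared libraries that could not be opened: libcudart.so.10.1 (a warning that TensorFlow says can be ignored without a GPU) and libGL.so.1 (the ImportError from cv2 that stops the gunicorn worker from booting). A minimal diagnostic sketch follows, assuming it is run inside the same container image as the scoring script. `shared_library_available` is a hypothetical helper name.

```python
import ctypes


def shared_library_available(soname):
    """Return True if the dynamic loader can open the given library name."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the loader cannot find or open the shared object.
        return False


# Library names taken from the log messages above.
for lib in ("libGL.so.1", "libcudart.so.10.1"):
    status = "found" if shared_library_available(lib) else "MISSING"
    print(f"{lib}: {status}")
```

If libGL.so.1 is reported missing, one commonly reported cause is that the opencv-python wheel links against it. The opencv-python-headless wheel is built without that GUI dependency, though whether it fits this deployment is an assumption to verify.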