Python: How do I deploy EfficientNet on Azure ML?
I applied transfer learning to Keras EfficientNetB4, B5, and B7. All of the EfficientNet models were trained on Azure ML, and the weights were saved in two different formats using model.save(filepath, save_format='h5') and model.save(filepath, save_format='tf'). I need to deploy the trained models on Azure ML.

However, the trained model weights are large: the saved weights for EfficientNetB4 are 1.64 GiB, and those for EfficientNetB5 and B7 exceed 2.50 GiB. Given deployment cost, scalability, and inference speed, deploying weights of this size is not feasible.
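For reference, the two save calls and the size comparison described above might look roughly like the sketch below. It assumes `model` is an already trained tf.keras model. The helper names `size_in_gib` and `save_both_formats`, and the output names `model.h5` and `saved_model`, are hypothetical and chosen only for illustration.

```python
from pathlib import Path

GIB = 1024 ** 3  # bytes per GiB


def size_in_gib(path):
    """Return the size of a file, or of all files under a directory, in GiB."""
    path = Path(path)
    if path.is_file():
        return path.stat().st_size / GIB
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file()) / GIB


def save_both_formats(model, out_dir):
    """Save `model` as HDF5 and as SavedModel; return {path: size in GiB}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    h5_path = out_dir / "model.h5"     # hypothetical file name
    tf_path = out_dir / "saved_model"  # hypothetical directory name
    # Same calls as in the question: one HDF5 file, one SavedModel directory.
    model.save(str(h5_path), save_format="h5")
    model.save(str(tf_path), save_format="tf")
    return {str(p): size_in_gib(p) for p in (h5_path, tf_path)}
```

Calling `save_both_formats(model, "./saveModel")` would then return the on-disk size of each artifact, which makes it easy to compare the two formats before choosing one to register for deployment.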
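All models were trained with Python 3.8 and tensorflow-gpu version 2.3. The first thing I tried was converting the Keras model (saved in .h5 format) to .onnx format with keras2onnx, but it fails on TF 2.3 with the following error: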
tf.keras model eager_mode: False
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-f36257e9acf1> in <module>
1 output_model_path = "./saveModel/Model16_EffB7_No_meta.onnx"
----> 2 onnx_model = keras2onnx.convert_keras(model, model.name)
3 keras2onnx.save_model(onnx_model, output_model_path)
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/main.py in convert_keras(model, name, doc_string, target_opset, channel_first_inputs, debug_mode, custom_op_conversions)
78 parse_graph_modeless(topology, tf_graph, target_opset, input_names, output_names, output_dict)
79 else:
---> 80 parse_graph(topology, tf_graph, target_opset, output_names, output_dict)
81 topology.compile()
82
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/parser.py in parse_graph(topo, graph, target_opset, output_names, keras_node_dict)
837 topo.raw_model.add_output_name(str_value)
838
--> 839 return _parse_graph_core_v2(
840 graph, keras_node_dict, topo, top_level, output_names
841 ) if is_tf2 and is_tf_keras else _parse_graph_core(
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/parser.py in _parse_graph_core_v2(graph, keras_node_dict, topology, top_scope, output_names)
725 _on_parsing_time_distributed_layer(graph, layer_info.nodelist, layer_info.layer, model_, varset)
726 elif layer_info.layer and get_converter(type(layer_info.layer)):
--> 727 on_parsing_keras_layer_v2(graph, layer_info, varset)
728 else:
729 _on_parsing_tf_nodes(graph, layer_info.nodelist, varset, topology.debug_mode)
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/_parser_tf.py in on_parsing_keras_layer_v2(graph, layer_info, varset, prefix)
330 iname = prefix + i_.name
331 k2o_logger().debug('\tinput : ' + iname)
--> 332 var_type = adjust_input_batch_size(infer_variable_type(i_, varset.target_opset))
333 i0 = varset.get_local_variable_or_declare_one(iname, var_type)
334 operator.add_input(i0)
/anaconda/envs/azureml_py38/lib/python3.8/site-packages/keras2onnx/_parser_tf.py in infer_variable_type(tensor, opset, inbound_node_shape)
45 return BooleanTensorType(shape=tensor_shape)
46 else:
---> 47 raise ValueError(
48 "Unable to find out a correct type for tensor type = {} of {}".format(tensor_type, tensor.name))
49
ValueError: Unable to find out a correct type for tensor type = 20 of top_bn/ReadVariableOp/resource:0
2021-05-17 03:11:18.810323: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /azureml-envs/azureml_e4e400f0dbfa98d62da5c2ff4271f2f/lib:/azureml-envs/azureml_e4e400f0dbfa98d62da5c2ff4271f2f/lib:
2021-05-17 03:11:18.810546: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
File "/azureml-envs/azureml_e4e400f0d9bfa98d62da5c2ff4271f2f/lib/python3.6/site-packages/cv2/__init__.py", line 5, in <module>
from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Worker exiting (pid: 39)
Shutting down: Master
Reason: Worker failed to boot.
2021-05-17T03:11:22,058543482+00:00 - gunicorn/finish 3 0
2021-05-17T03:11:22,059923289+00:00 - Exit code 3 is not normal. Killing image.
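The log above shows the worker failing to boot because importing cv2 cannot load libGL.so.1, while the libcudart.so.10.1 warning is one the log itself says can be ignored on a machine without a GPU. As a diagnostic only, a minimal sketch like the following could be run inside the scoring environment to check which shared libraries the dynamic loader can actually open. The helper name `shared_library_available` is hypothetical.

```python
import ctypes


def shared_library_available(soname):
    """Return True if the dynamic loader can open `soname`, else False."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library (or one of its dependencies) cannot be loaded.
        return False


if __name__ == "__main__":
    # Library names taken from the log messages above.
    for name in ("libGL.so.1", "libcudart.so.10.1"):
        status = "found" if shared_library_available(name) else "missing"
        print(f"{name}: {status}")
```

If libGL.so.1 is reported missing, the image lacks the system OpenGL library that the regular opencv-python wheel links against. One commonly used option, offered here as a suggestion and not as a confirmed fix for this deployment, is to depend on opencv-python-headless in the environment definition, since that build does not require libGL.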