Azure Machine Learning Service: torchvision 0.3.0 for training a model in the AML service


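I'm building an image for training in the AML (Azure Machine Learning) service and trying to get torchvision==0.3.0 into that image. The notebook VM I'm using has torchvision 0.3.0 and pytorch 1.1.0, which lets me do what I want... but only on the notebook VM. When I submit the job to AML, I get an error: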

Error occurred: module 'torchvision.models' has no attribute 'googlenet'

I've managed to capture the logs while the image was being created. Here is an excerpt that partly shows what is happening:

  Created wheel for dill: filename=dill-0.3.0-cp36-none-any.whl size=77512 sha256=b39463bd613a2337f86181d449e55c84446bb76c2fad462b0ff7ed721872f817

  Stored in directory: /root/.cache/pip/wheels/c9/de/a4/a91eec4eea652104d8c81b633f32ead5eb57d1b294eab24167

Successfully built horovod future json-logging-py psutil absl-py pathspec liac-arff dill

Installing collected packages: tqdm, ptvsd, gunicorn, applicationinsights, urllib3, idna, chardet, requests, asn1crypto, cryptography, pyopenssl, isodate, oauthlib, requests-oauthlib, msrest, jsonpickle, azure-common, PyJWT, python-dateutil, adal, msrestazure, azure-mgmt-authorization, azure-mgmt-containerregistry, pyasn1, ndg-httpsclient, pathspec, azure-mgmt-keyvault, websocket-client, docker, contextlib2, azure-mgmt-resource, backports.weakref, backports.tempfile, jeepney, SecretStorage, pytz, azure-mgmt-storage, ruamel.yaml, azure-graphrbac, jmespath, azureml-core, configparser, json-logging-py, werkzeug, click, MarkupSafe, Jinja2, itsdangerous, flask,liac-arff, pandas, dill, azureml-model-management-sdk, azureml-defaults, torchvision, cloudpickle, psutil, horovod, markdown, protobuf, grpcio, absl-py, tensorboard, future

  Found existing installation: torchvision 0.3.0

    Uninstalling torchvision-0.3.0:

      Successfully uninstalled torchvision-0.3.0

Successfully installed Jinja2-2.10.1 MarkupSafe-1.1.1 PyJWT-1.7.1 SecretStorage-3.1.1 absl-py-0.7.1 adal-1.2.2 applicationinsights-0.11.9 asn1crypto-0.24.0 azure-common-1.1.23 azure-graphrbac-0.61.1 azure-mgmt-authorization-0.60.0 azure-mgmt-containerregistry-2.8.0 azure-mgmt-keyvault-2.0.0 azure-mgmt-resource-3.1.0 azure-mgmt-storage-4.0.0 azureml-core-1.0.55 azureml-defaults-1.0.55 azureml-model-management-sdk-1.0.1b6.post1 backports.tempfile-1.0 backports.weakref-1.0.post1 chardet-3.0.4 click-7.0 cloudpickle-1.2.1 configparser-3.7.4 contextlib2-0.5.5 cryptography-2.7 dill-0.3.0 docker-4.0.2 flask-1.0.3 future-0.17.1 grpcio-1.22.0 gunicorn-19.9.0 horovod-0.16.1 idna-2.8 isodate-0.6.0 itsdangerous-1.1.0 jeepney-0.4.1 jmespath-0.9.4 json-logging-py-0.2 jsonpickle-1.2 liac-arff-2.4.0 markdown-3.1.1 msrest-0.6.9 msrestazure-0.6.1 ndg-httpsclient-0.5.1 oauthlib-3.1.0 pandas-0.25.0 pathspec-0.5.9 protobuf-3.9.1 psutil-5.6.3 ptvsd-4.3.2 pyasn1-0.4.6 pyopenssl-19.0.0 python-dateutil-2.8.0 pytz-2019.2 requests-2.22.0 requests-oauthlib-1.2.0 ruamel.yaml-0.15.89 tensorboard-1.14.0 torchvision-0.2.1 tqdm-4.33.0 urllib3-1.25.3 websocket-client-0.56.0 werkzeug-0.15.5
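One way to read the excerpt above is to look for packages that pip removed and then reinstalled at a different version. The sketch below is only an illustration: the helper name `find_replaced_packages` and the shortened `sample_log` are made up for this example, and it assumes the log follows the usual pip wording ("Uninstalling name-version:" and "Successfully installed name-version ...").

```python
import re

# Matches pip lines such as "Uninstalling torchvision-0.3.0:"
UNINSTALL_RE = re.compile(
    r"Uninstalling (?P<name>[A-Za-z0-9_.\-]+?)-(?P<version>\d[^\s:]*):"
)


def find_replaced_packages(log_text):
    """Return {name: (removed_version, installed_version)} for replaced packages."""
    # Versions that pip reported as uninstalled.
    removed = {}
    for match in UNINSTALL_RE.finditer(log_text):
        removed[match.group("name").lower()] = match.group("version")

    # Versions listed on "Successfully installed ..." lines.
    installed = {}
    prefix = "Successfully installed "
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        for token in line[len(prefix):].split():
            # pip prints "name-version"; the version follows the last hyphen.
            name, _, version = token.rpartition("-")
            if name:
                installed[name.lower()] = version

    # Keep only packages whose version changed between removal and install.
    return {
        name: (old, installed[name])
        for name, old in removed.items()
        if name in installed and installed[name] != old
    }


# Shortened, hypothetical sample built from lines in the excerpt above.
sample_log = """\
  Found existing installation: torchvision 0.3.0
    Uninstalling torchvision-0.3.0:
      Successfully uninstalled torchvision-0.3.0
Successfully installed tensorboard-1.14.0 torchvision-0.2.1 tqdm-4.33.0
"""

print(find_replaced_packages(sample_log))
```

On the sample this reports torchvision going from 0.3.0 to 0.2.1, which matches what the full log shows: the pip step replaces the 0.3.0 that was already installed.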

Without going into too much detail, here is the code I use to create the estimator and then submit the job. Nothing particularly fancy.

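I tried to debug the image creation process by looking at the logs, which is where I captured what is shown above. I also tried attaching a Python debugger to the running process, and/or getting a bash shell inside the running docker container, to use interactive Python to see where my problem is. The original problem is that I can't use torchvision.models.googlenet, because it is not available in the version in use.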
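A lighter-weight check than attaching a debugger is to log the installed version from inside the job itself. The following is a minimal sketch, not part of the original code: `check_attribute` is a hypothetical helper, and the idea is to call it at the top of the entry script so the result shows up in the run logs.

```python
import importlib


def check_attribute(package_name, submodule, attribute):
    """Return (package_version, has_attribute).

    Returns (None, False) if the package or submodule cannot be imported.
    """
    try:
        package = importlib.import_module(package_name)
        module = importlib.import_module(f"{package_name}.{submodule}")
    except ImportError:
        return None, False
    # Not every package exposes __version__, so fall back to "unknown".
    version = getattr(package, "__version__", "unknown")
    return version, hasattr(module, attribute)


if __name__ == "__main__":
    # Reports which torchvision the job actually sees and whether
    # torchvision.models.googlenet is available in it.
    version, has_googlenet = check_attribute("torchvision", "models", "googlenet")
    print(f"torchvision version: {version}, googlenet available: {has_googlenet}")
```

If the printed version differs from the one pinned in the dependencies, the image build replaced it, which is consistent with the log excerpt.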

conda_packages=['pytorch', 'scikit-learn', 'torchvision==0.3.0']
pip_packages=['tqdm', 'ptvsd']

I use this to create my estimator:

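pyTorchEstimator = PyTorch(source_directory='./aml-image-models',
                           compute_target=ct,
                           entry_script='train_network.py',
                           script_params=script_params,
                           node_count=1,
                           process_count_per_node=1,
                           conda_packages=conda_packages,
                           pip_packages=pip_packages,
                           use_gpu=True,
                           framework_version='1.1')

and then submit it with the typical code.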

Given that I specified 0.3.0 in the dependencies, I expected it to work.

Thoughts?

torchvision 0.2.1 is preconfigured in the PyTorch estimator for torch versions 1.0/1.1.

However, it is still possible to override torchvision after the estimator has been initialized:

estimator.conda_dependencies.add_pip_package('torchvision==0.3.0')

Another option, if you are sure of the dependencies you need, is simply to use the generic Estimator:

from azureml.train.estimator import Estimator

conda_packages=['pytorch', 'scikit-learn', 'torchvision==0.3.0']
pip_packages=['tqdm', 'ptvsd']

estimator = Estimator(source_directory='./aml-image-models',
                      compute_target=ct,
                      entry_script='train_network.py',
                      script_params=script_params,
                      conda_packages=conda_packages,
                      pip_packages=pip_packages,
                      use_gpu=True)

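Hi Billy, this doesn't solve my problem with overriding them through conda_dependencies, since that is what I have been trying all along. It gets overwritten during the image creation process. I can try the generic estimator and see, thanks!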