Unity problem: mlagents-learn does not train when using --load

python unity3d tensorflow machine-learning


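Hi, I'm working on my first ML-Agents project. Until now, when I wanted to train my AI, I ran

mlagents-learn config/trainer_config.yaml --run-id=Taxi-1 --train

in the terminal, but the AI stopped training after 50,000 steps. I then tried to train it again with another

mlagents-learn config/trainer_config.yaml --run-id=Taxi-1 --train

Then I read that, to continue training the previous model instead of restarting the whole training from scratch, you have to add --load to the command. However, when I run

mlagents-learn config/trainer_config.yaml --load --run-id=Taxi-1 --train

it performs only one step and then stops. Here is what it prints in the terminal: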

INFO:mlagents.trainers:{'--curriculum': 'None',
 '--docker-target-name': 'None',
 '--env': 'None',
 '--help': False,
 '--keep-checkpoints': '5',
 '--lesson': '0',
 '--load': True,
 '--no-graphics': False,
 '--num-runs': '1',
 '--run-id': 'Taxi-1',
 '--save-freq': '50000',
 '--seed': '-1',
 '--slow': False,
 '--train': True,
 '--worker-id': '0',
 '<trainer-config-path>': 'config/trainer_config.yaml'}
INFO:mlagents.envs:Start training by pressing the Play button in the Unity Editor.
INFO:mlagents.envs:
'Academy' started successfully!
Unity Academy name: Academy
    Number of Brains: 2
    Number of Training Brains : 1
    Reset Parameters :

Unity brain name: CarLBrain
    Number of Visual Observations (per agent): 0
    Vector Observation space size (per agent): 12
    Number of stacked Vector Observation: 6
    Vector Action space type: continuous
    Vector Action space size (per agent): [2]
    Vector Action descriptions: , 
Unity brain name: CarPBrain
    Number of Visual Observations (per agent): 0
    Vector Observation space size (per agent): 12
    Number of stacked Vector Observation: 6
    Vector Action space type: discrete
    Vector Action space size (per agent): [10, 10]
    Vector Action descriptions: , 
INFO:mlagents.trainers:Loading Model for brain CarLBrain
INFO:tensorflow:Restoring parameters from ./models/Taxi-1-0/CarLBrain/model-50001.cptk
INFO:mlagents.envs:Hyperparameters for the PPO Trainer of brain CarLBrain: 
batch_size: 1024
beta:   0.005
buffer_size:    10240
epsilon:    0.2
gamma:  0.99
hidden_units:   128
lambd:  0.95
learning_rate:  0.0003
max_steps:  5.0e4
normalize:  False
num_epoch:  3
num_layers: 2
time_horizon:   64
sequence_length:    64
summary_freq:   1000
use_recurrent:  False
summary_path:   ./summaries/Taxi-1-0_CarLBrain
memory_size:    256
use_curiosity:  False
curiosity_strength: 0.01
curiosity_enc_size: 128
model_path: ./models/Taxi-1-0/CarLBrain
INFO:mlagents.envs:Saved Model
INFO:mlagents.trainers:List of nodes to export for brain :CarLBrain
INFO:mlagents.trainers: is_continuous_control
INFO:mlagents.trainers: version_number
INFO:mlagents.trainers: memory_size
INFO:mlagents.trainers: action_output_shape
INFO:mlagents.trainers: action
INFO:mlagents.trainers: action_probs
INFO:mlagents.trainers: value_estimate
INFO:tensorflow:Restoring parameters from ./models/Taxi-1-0/CarLBrain/model-50002.cptk
INFO:tensorflow:Froze 17 variables.
Converted 17 variables to const ops.


Do you know how I can continue training beyond 50,000 steps? Thanks for your help! Please don't hesitate to ask for any clarification.

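Try increasing the max_steps parameter in the config/trainer_config.yaml file. It is currently set to 5.0e4, which means 50,000, so if you set it to 5.0e6 it should run for 5,000,000 steps before stopping.

@MichaelGlazunov That worked, thank you!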
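To see why the run stops right away, compare the step number in the restored checkpoint name (model-50001.cptk in the log) with max_steps. The snippet below is only a rough sketch of that arithmetic, not ML-Agents' own code; the helper name steps_remaining is made up for illustration, and it assumes the number in the checkpoint file name is the step count the trainer resumes from.

```python
import re


def steps_remaining(checkpoint_name: str, max_steps: str) -> int:
    """Hypothetical helper: steps left before max_steps is reached.

    Assumes the checkpoint file name ends in "model-<step>.cptk",
    as in the log output in the question.
    """
    match = re.search(r"model-(\d+)\.cptk$", checkpoint_name)
    if match is None:
        raise ValueError(f"no step count found in {checkpoint_name!r}")
    restored_step = int(match.group(1))
    limit = int(float(max_steps))  # "5.0e4" parses to 50000
    return max(0, limit - restored_step)


# With the original setting nothing is left to train:
print(steps_remaining("model-50001.cptk", "5.0e4"))  # 0
# After raising max_steps to 5.0e6 there is plenty of room:
print(steps_remaining("model-50001.cptk", "5.0e6"))  # 4949999
```

In other words, --load restores a model that has already reached the limit, so the trainer has nothing left to do until max_steps is raised.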