TensorFlow: distributed TensorForest

Please advise: I am new to distributed processing and would like to know how to train a TensorForest model using distributed TensorForest. I understand how this is done for neural networks, but not for TensorForest, which is a random forest implementation built on the TensorFlow framework.

I recently dug into this topic.
Since `TensorForestEstimator` is derived from `tf.contrib.learn.Estimator`, it should be possible to use it in a distributed training environment.
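
The problem I ran into is how to configure device assignment correctly. The `TensorForestEstimator` constructor accepts a `device_assigner` keyword argument, documented as follows:

device_assigner: An object instance that controls how trees get assigned to devices. If None, will use tensor_forest.RandomForestDeviceAssigner.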
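
The documentation is inaccurate: the default is actually an instance of `tf.contrib.framework.VariableDeviceChooser`. The code instantiates `VariableDeviceChooser` with no arguments, which corresponds to running without parameter servers. That works in a single-machine environment, but not in a distributed one. I tried passing a `VariableDeviceChooser` instantiated with the number of parameter servers inferred from the data in `TF_CONFIG`.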
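
For readers who want to reproduce the setup, here is a minimal sketch of inferring the parameter server count from `TF_CONFIG`. It assumes the usual `TF_CONFIG` layout, with a `cluster` dictionary that holds a `ps` list. The helper names `count_ps_tasks` and `ps_device_names` and the host names are hypothetical. The hand-off to `VariableDeviceChooser` is left as a comment so the snippet runs without TensorFlow 1.x installed. This only mirrors the attempt described above; it is not a fix for the error that follows.

```python
import json


def count_ps_tasks(tf_config_json):
    """Return the number of 'ps' entries in a TF_CONFIG-style JSON string."""
    if not tf_config_json:
        return 0  # no TF_CONFIG set: treat as a single-machine run
    cluster = json.loads(tf_config_json).get("cluster", {})
    return len(cluster.get("ps", []))


def ps_device_names(num_ps, device_type="CPU", device_index=0):
    """Build device strings in the format seen in the error message."""
    return [
        "/job:ps/task:{}/device:{}:{}".format(task, device_type, device_index)
        for task in range(num_ps)
    ]


# Hypothetical cluster layout, used only for illustration.
# In a real job this string would come from os.environ["TF_CONFIG"].
example_tf_config = json.dumps({
    "cluster": {
        "ps": ["ps0.example:2222", "ps1.example:2222"],
        "worker": ["worker0.example:2222"],
    },
    "task": {"type": "worker", "index": 0},
})

num_ps = count_ps_tasks(example_tf_config)
print(num_ps, ps_device_names(num_ps))

# In a TF 1.x job the count would then be handed to the chooser
# (assumed usage, not executed here):
#   chooser = tf.contrib.framework.VariableDeviceChooser(num_tasks=num_ps)
#   estimator = TensorForestEstimator(params, device_assigner=chooser)
```

With `num_tasks` greater than zero the chooser places variables on `/job:ps/...` devices, which is the kind of device string that shows up in the colocation error.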

This is the error message I observed when the session started during the training operation:
```
  File "/home/ubuntu/.pyenv/versions/cmle-1_12-py-3_5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/home/ubuntu/.pyenv/versions/cmle-1_12-py-3_5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1317, in _run_fn
    self._extend_graph()
  File "/home/ubuntu/.pyenv/versions/cmle-1_12-py-3_5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1352, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation device_dummy_0/Initializer/random_uniform/RandomUniform: Could not satisfy explicit device specification '' because the node {{colocation_node device_dummy_0/Initializer/random_uniform/RandomUniform}} was colocated with a group of nodes that required incompatible device '/job:ps/task:0/device:CPU:0'
Colocation Debug Info:
Colocation group had the following types and devices:
IsVariableInitialized: CPU
Assign: CPU
Identity: CPU XLA_CPU
VariableV2: CPU
Mul: CPU XLA_CPU
Add: CPU XLA_CPU
Sub: CPU XLA_CPU
RandomUniform: CPU XLA_CPU
Const: CPU XLA_CPU
Colocation members and user-requested devices:
  device_dummy_0/Initializer/random_uniform/shape (Const)
  device_dummy_0/Initializer/random_uniform/min (Const)
  device_dummy_0/Initializer/random_uniform/max (Const)
  device_dummy_0/Initializer/random_uniform/RandomUniform (RandomUniform)
  device_dummy_0/Initializer/random_uniform/sub (Sub)
  device_dummy_0/Initializer/random_uniform/mul (Mul)
  device_dummy_0/Initializer/random_uniform (Add)
  device_dummy_0 (VariableV2) /job:ps/task:0/device:CPU:0
  device_dummy_0/Assign (Assign) /job:ps/task:0/device:CPU:0
  device_dummy_0/read (Identity) /job:ps/task:0/device:CPU:0
  report_uninitialized_variables/IsVariableInitialized_1 (IsVariableInitialized) /job:ps/task:0/device:CPU:0
  report_uninitialized_variables_1/IsVariableInitialized_1 (IsVariableInitialized) /job:ps/task:0/device:CPU:0
  save/Assign_1 (Assign) /job:ps/task:0/device:CPU:0
  [[{{node device_dummy_0/Initializer/random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _class=["loc:@device_dummy_0"], dtype=DT_FLOAT, seed=0, seed2=0](device_dummy_0/Initializer/random_uniform/shape)]]
```