Python: Running a Keras model on a Colab TPU

python keras google-colaboratory

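I am currently trying to train a convolutional neural network using Keras and a Google Colab GPU. I found an article that discusses options for reducing the time needed to train a model. Since training on the GPU is currently very slow, I tried to implement the approach from that article. I have the following code: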

sgd = optimizers.SGD(lr=0.02)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])

def create_train_subsets():
    X_train = []
    y_train = []
    for i in range(80):
        cat = i + 1
        path = 'train_set/by_cat/{}'.format(cat)
        for img in os.listdir(path):
            actual_image = Image.open(("train_set/by_cat/{}/{}".format(cat, img)))
            X_train.append(actual_image)
            y_train.append(cat)
    return X_train, y_train

# This address identifies the TPU we'll use when configuring TensorFlow.
x_train, y_train = create_train_subsets()
TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
tf.logging.set_verbosity(tf.logging.INFO)

tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)))

history = tpu_model.fit(x_train, y_train, epochs=20, batch_size=128 * 8, validation_split=0.2)
tpu_model.save_weights('./tpu_model.h5', overwrite=True)
# tpu_model.evaluate(x_test, y_test, batch_size=128 * 8)

However, this code returns the following error:

InvalidArgumentError: No OpKernel was registered to support Op 'ConfigureDistributedTPU' used by node ConfigureDistributedTPU (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) with these attrs: [tpu_embedding_config="", is_global_init=false, embedding_config=""] Registered devices: [CPU, XLA_CPU] Registered kernels: <no registered kernels> [[ConfigureDistributedTPU]]

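I have searched extensively online, but I cannot find any clue as to what this means. I also do not understand the process well enough to work out the exact meaning of the error myself. Could someone help me understand what is going wrong, and perhaps suggest how to fix it?

Thanks in advance.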
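
First, test whether the TPU is working at all, as a sanity check. Create a variable (tensor) a and another one, b, and add them in tf to get c. Since you know what a, b and c are, write a tf.function that tests whether c = a + b. If the result is True when running on the TPU, TensorFlow is able to use the TPU on Colab. That should be the case, but it is still best to check first.

Thanks for your comment. I did get a message saying it had connected to the Google Colab TPU. After some more digging, I think the error is related to tensorflow and keras initializing models differently, which meant my Keras model could not find the TPU. :) Different problems are showing up now, but they are unrelated to this error lol. Thnx for the pointer.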