Python: TensorFlow profiler returns empty trace data when using tf.profiler.experimental.client.trace


I am unable to collect trace data using tf.profiler.experimental.client.trace. Can anyone help? There is a usage example for it (CPU/GPU), and it looks simple enough.

I have a very simple model, and I can collect trace data from it using tf.profiler.experimental.start and tf.profiler.experimental.stop.

But tf.profiler.experimental.client.trace gives me empty trace data.

My code is as follows:

import tensorflow as tf
import numpy as np
                                                                                                    
def mnist_dataset(batch_size):
    (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()                                              
    x_train = x_train / np.float32(255)
    y_train = y_train.astype(np.int64)
    train_dataset = tf.data.Dataset.from_tensor_slices(
        (x_train, y_train)).shuffle(60000).repeat().batch(batch_size)
    return train_dataset

batch_size = 64
dataset = mnist_dataset(batch_size)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Reshape(target_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
    metrics=['accuracy'])
                                                                                           
#tf.profiler.experimental.start('./logs/tb_log')                                                                        
tf.profiler.experimental.server.start(6009)

model.fit(dataset, epochs=10, steps_per_epoch=70)

tf.profiler.experimental.client.trace('grpc://localhost:6009', './logs/tbc_log', 20000)
#tf.profiler.experimental.stop()         
The code runs through all the epochs and then outputs:

2021-02-02 17:49:44.943933: I tensorflow/core/profiler/rpc/client/capture_profile.cc:198] Profiler delay_ms was 0, start_timestamp_ns set to 1612288184943887718 [2021-02-02T17:49:44.943887718+00:00]
Starting to trace for 20000 ms. Remaining attempt(s): 2
2021-02-02 17:49:44.944037: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:75] Deadline set to 2021-02-02T17:50:44.890124419+00:00 because max_session_duration_ms was 60000 and session_creation_timestamp_ns was 1612288184890124419 [2021-02-02T17:49:44.890124419+00:00]
2021-02-02 17:49:44.944197: I tensorflow/core/profiler/rpc/client/profiler_client.cc:113] Asynchronous gRPC Profile() to localhost:6009
2021-02-02 17:49:44.944316: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:96] Issued Profile gRPC to 1 clients
2021-02-02 17:49:44.944340: I tensorflow/core/profiler/rpc/client/profiler_client.cc:131] Waiting for completion.
2021-02-02 17:49:44.946274: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-02-02 17:49:44.947547: W tensorflow/core/profiler/lib/profiler_session.cc:144] Profiling is late (2021-02-02T17:49:44.946338176+00:00) for the scheduled start (2021-02-02T17:49:44.943887718+00:00) and will start immediately.
2021-02-02 17:49:44.947582: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-02-02 17:49:44.947660: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1365] Profiler found 2 GPUs
2021-02-02 17:49:44.949656: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcupti.so.11.0
2021-02-02 17:50:08.435260: I tensorflow/core/profiler/lib/profiler_session.cc:71] Profiler session collecting data.
2021-02-02 17:50:08.435591: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
2021-02-02 17:50:08.635192: I tensorflow/core/profiler/internal/gpu/cupti_collector.cc:228]  GpuTracer has collected 0 callback api events and 0 activity events. 
2021-02-02 17:50:08.648616: I tensorflow/core/profiler/rpc/profiler_service_impl.cc:67] Collecting XSpace to repository: ./logs/tbc_log/plugins/profile/2021_02_02_17_49_44/localhost_6009.xplane.pb
2021-02-02 17:50:08.650309: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2021-02-02 17:50:08.650676: W tensorflow/core/profiler/rpc/client/capture_profile.cc:133] No trace event is collected from localhost:6009
No trace event is collected. Automatically retrying.

2021-02-02 17:50:08.651046: I tensorflow/core/profiler/rpc/client/capture_profile.cc:198] Profiler delay_ms was 0, start_timestamp_ns set to 1612288208651017638 [2021-02-02T17:50:08.651017638+00:00]
Starting to trace for 20000 ms. Remaining attempt(s): 1
2021-02-02 17:50:08.651123: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:75] Deadline set to 2021-02-02T17:50:44.890124419+00:00 because max_session_duration_ms was 60000 and session_creation_timestamp_ns was 1612288184890124419 [2021-02-02T17:49:44.890124419+00:00]
2021-02-02 17:50:08.651274: I tensorflow/core/profiler/rpc/client/profiler_client.cc:113] Asynchronous gRPC Profile() to localhost:6009
2021-02-02 17:50:08.651391: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:96] Issued Profile gRPC to 1 clients
2021-02-02 17:50:08.651420: I tensorflow/core/profiler/rpc/client/profiler_client.cc:131] Waiting for completion.
2021-02-02 17:50:08.652492: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-02-02 17:50:08.652570: W tensorflow/core/profiler/lib/profiler_session.cc:144] Profiling is late (2021-02-02T17:50:08.652539729+00:00) for the scheduled start (2021-02-02T17:50:08.651017638+00:00) and will start immediately.
2021-02-02 17:50:08.652591: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-02-02 17:50:31.280828: I tensorflow/core/profiler/lib/profiler_session.cc:71] Profiler session collecting data.
2021-02-02 17:50:31.281134: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
2021-02-02 17:50:31.510697: I tensorflow/core/profiler/internal/gpu/cupti_collector.cc:228]  GpuTracer has collected 0 callback api events and 0 activity events. 
2021-02-02 17:50:31.515475: I tensorflow/core/profiler/rpc/profiler_service_impl.cc:67] Collecting XSpace to repository: ./logs/tbc_log/plugins/profile/2021_02_02_17_49_44/localhost_6009.xplane.pb
2021-02-02 17:50:31.518037: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2021-02-02 17:50:31.518440: W tensorflow/core/profiler/rpc/client/capture_profile.cc:133] No trace event is collected from localhost:6009
No trace event is collected. Automatically retrying.

2021-02-02 17:50:31.518819: I tensorflow/core/profiler/rpc/client/capture_profile.cc:198] Profiler delay_ms was 0, start_timestamp_ns set to 1612288231518793164 [2021-02-02T17:50:31.518793164+00:00]
Starting to trace for 20000 ms. Remaining attempt(s): 0
2021-02-02 17:50:31.518889: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:75] Deadline set to 2021-02-02T17:50:44.890124419+00:00 because max_session_duration_ms was 60000 and session_creation_timestamp_ns was 1612288184890124419 [2021-02-02T17:49:44.890124419+00:00]
2021-02-02 17:50:31.519021: I tensorflow/core/profiler/rpc/client/profiler_client.cc:113] Asynchronous gRPC Profile() to localhost:6009
2021-02-02 17:50:31.519124: I tensorflow/core/profiler/rpc/client/remote_profiler_session_manager.cc:96] Issued Profile gRPC to 1 clients
2021-02-02 17:50:31.519147: I tensorflow/core/profiler/rpc/client/profiler_client.cc:131] Waiting for completion.
2021-02-02 17:50:31.520067: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-02-02 17:50:31.520136: W tensorflow/core/profiler/lib/profiler_session.cc:144] Profiling is late (2021-02-02T17:50:31.520095781+00:00) for the scheduled start (2021-02-02T17:50:31.518793164+00:00) and will start immediately.
2021-02-02 17:50:31.520152: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-02-02 17:50:44.891412: W tensorflow/core/profiler/rpc/client/profiler_client.cc:152] Deadline exceeded: Deadline Exceeded
2021-02-02 17:50:44.891501: W tensorflow/core/profiler/rpc/client/capture_profile.cc:133] No trace event is collected from localhost:6009
2021-02-02 17:50:44.891526: W tensorflow/core/profiler/rpc/client/capture_profile.cc:145] localhost:6009 returned Deadline exceeded: Deadline Exceeded
No trace event is collected after 3 attempt(s). Perhaps, you want to try again (with more attempts?).
Tip: increase number of attempts with --num_tracing_attempts.
2021-02-02 17:50:44.891848: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
Traceback (most recent call last):
  File "keras_singleworker_2.py", line 37, in <module>
2021-02-02 17:50:44.893228: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
    tf.profiler.experimental.client.trace('grpc://localhost:6009', './logs/tbc_log', 20000)
  File "/fserver/jonathanb/miniconda3/envs/tf2.4/lib/python3.8/site-packages/tensorflow/python/profiler/profiler_client.py", line 131, in trace
    _pywrap_profiler.trace(
tensorflow.python.framework.errors_impl.UnavailableError: No trace event was collected because there were no responses from clients or the responses did not have trace data.
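
One plausible reading of the log above (an assumption, not something the source confirms) is that the trace request only reaches the profiler server after model.fit has returned, so there is nothing left to sample and GpuTracer reports 0 events. A minimal sketch of issuing the request while training is still running is shown here. `start_background_trace` and `train_with_trace` are hypothetical helper names, and the 1 s delay and 2000 ms duration are illustrative values, not recommendations.

```python
import threading


def start_background_trace(trace_fn, *args, delay_s=0.0):
    """Call trace_fn(*args) on a daemon thread after delay_s seconds.

    Hypothetical helper: it only schedules the call, it knows nothing
    about TensorFlow itself.
    """
    def _run():
        try:
            trace_fn(*args)
        except Exception as exc:  # e.g. UnavailableError if nothing was captured
            print(f"trace failed: {exc}")

    timer = threading.Timer(delay_s, _run)
    timer.daemon = True
    timer.start()
    return timer


def train_with_trace(model, dataset, port=6009, logdir="./logs/tbc_log"):
    """Start the profiler server, request a trace during fit, then wait for it."""
    # Imported here so the helper above can be used without TensorFlow installed.
    import tensorflow as tf

    tf.profiler.experimental.server.start(port)

    # Ask for a 2000 ms capture about 1 s into training (illustrative values).
    # Training has to outlast the capture window for any events to be recorded.
    tracer = start_background_trace(
        tf.profiler.experimental.client.trace,
        f"grpc://localhost:{port}", logdir, 2000,
        delay_s=1.0)

    model.fit(dataset, epochs=10, steps_per_epoch=70)
    tracer.join()  # wait until the capture has been written (or has failed)
```

With the `model` and `dataset` from the question, this would be called as `train_with_trace(model, dataset)`. Running the client from a separate process while the training script is busy should work the same way, since the request goes over gRPC to the server port in either case.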
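
The third attempt in the log ends with Deadline Exceeded rather than an empty capture. The log says the deadline was set from session_creation_timestamp_ns and max_session_duration_ms, so the sketch below redoes that arithmetic with the logged values to check which 20000 ms attempts could finish before the deadline. The function names are illustrative, and treating the deadline as creation time plus maximum duration is an assumption based on the wording of the log line.

```python
NS_PER_MS = 1_000_000


def deadline_ns(session_creation_ns, max_session_duration_ms):
    """Deadline as creation time plus the maximum session duration."""
    return session_creation_ns + max_session_duration_ms * NS_PER_MS


def fits_before_deadline(start_ns, duration_ms, deadline):
    """True if a trace starting at start_ns can run duration_ms before the deadline."""
    return start_ns + duration_ms * NS_PER_MS <= deadline


# Values copied from the log output.
deadline = deadline_ns(1612288184890124419, 60000)
attempt_starts_ns = [
    1612288184943887718,  # attempt 1
    1612288208651017638,  # attempt 2
    1612288231518793164,  # attempt 3
]

results = [fits_before_deadline(start, 20000, deadline) for start in attempt_starts_ns]
for attempt, ok in enumerate(results, start=1):
    print(f"attempt {attempt}: fits before deadline = {ok}")
```

Only the third attempt does not fit: it starts about 13.4 s before the deadline but asks for 20 s. That matches the Deadline Exceeded warning, and suggests that a shorter duration_ms would at least let all three attempts complete.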