Python: How to dump a confusion matrix using the TensorBoard logger in PyTorch Lightning?


The official documentation only states:

>>> import torch
>>> from pytorch_lightning.metrics import ConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confmat = ConfusionMatrix(num_classes=2)
>>> confmat(preds, target)
This does not show how to use the metric within the framework.
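For reference, the counts that the documentation snippet produces can be reproduced by hand. The following is a plain-Python sketch, not Lightning's implementation. It assumes that rows index the true class and columns index the predicted class, and `confusion_matrix` here is a hypothetical helper written only for this illustration.

```python
def confusion_matrix(preds, target, num_classes):
    """Count (true class, predicted class) pairs; rows are true classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, target):
        matrix[t][p] += 1
    return matrix


# Same inputs as in the documentation snippet
target = [1, 1, 0, 0]
preds = [0, 1, 0, 0]
cm = confusion_matrix(preds, target, num_classes=2)
print(cm)  # [[2, 0], [1, 1]]
```

Row 1 reads "two samples of class 1, one predicted as 0 and one predicted as 1", which is the single misclassification in the example.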

My attempt (the methods are incomplete; only the relevant parts are shown):

After epoch 0, this gives:

    Traceback (most recent call last):
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 521, in train
        self.train_loop.run_training_epoch()
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\training_loop.py", line 588, in run_training_epoch
        self.trainer.run_evaluation(test_mode=False)
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 613, in run_evaluation
        self.evaluation_loop.log_evaluation_step_metrics(output, batch_idx)
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\evaluation_loop.py", line 346, in log_evaluation_step_metrics
        self.__log_result_step_metrics(step_log_metrics, step_pbar_metrics, batch_idx)
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\evaluation_loop.py", line 350, in __log_result_step_metrics
        cached_batch_pbar_metrics, cached_batch_log_metrics = cached_results.update_logger_connector()
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 378, in update_logger_connector
        batch_log_metrics = self.get_latest_batch_log_metrics()
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 418, in get_latest_batch_log_metrics
        batch_log_metrics = self.run_batch_from_func_name("get_batch_log_metrics")
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 414, in run_batch_from_func_name
        results = [func(include_forked_originals=False) for func in results]
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 414, in <listcomp>
        results = [func(include_forked_originals=False) for func in results]
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 122, in get_batch_log_metrics
        return self.run_latest_batch_metrics_with_func_name("get_batch_log_metrics", *args, **kwargs)
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 115, in run_latest_batch_metrics_with_func_name
        for dl_idx in range(self.num_dataloaders)
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 115, in <listcomp>
        for dl_idx in range(self.num_dataloaders)
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\trainer\connectors\logger_connector\epoch_result_store.py", line 100, in get_latest_from_func_name
        results.update(func(*args, add_dataloader_idx=add_dataloader_idx, **kwargs))
      File "C:\code\EPMD\Kodex\Templates\Testing\venv\lib\site-packages\pytorch_lightning\core\step_result.py", line 298, in get_batch_log_metrics
        result[dl_key] = self[k]._forward_cache.detach()
    AttributeError: 'NoneType' object has no attribute 'detach'

                                                      
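It does pass the validation sanity check before training.

The failure happens on the return in validation_step_end, which makes little sense to me.

Exactly the same way of using metrics works fine with accuracy.


How do I get a correct confusion matrix?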

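This took a long time to figure out.

This is the simplest code I can paste that is still readable and reproducible.

I did not want to include the whole model, dataset, and parameters here, because they are not relevant to the question and would only add noise.


That said, here is the code needed to create a confusion matrix for each epoch and display it in TensorBoard.

Here is a single frame, for example:


And the call to the Trainer: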

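from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

# tb_logs_folder and classifier_checkpoints_path are paths defined elsewhere
logger = TensorBoardLogger(save_dir=tb_logs_folder, name='Classifier')
trainer = Trainer(deterministic=True,
                  max_epochs=10,
                  default_root_dir=classifier_checkpoints_path,
                  logger=logger,
                  gpus=1
                  )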

You can report the figure using self.logger.experiment.add_figure(tag, figure).

The variable self.logger.experiment is actually a SummaryWriter (from PyTorch, not Lightning). This class has the method add_figure().
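As a rough standalone illustration of the add_figure call described above, the sketch below writes a small matrix plot with a SummaryWriter created directly, outside Lightning. The helper name `log_matrix_figure`, the tag, the example matrix, and the temporary log directory are all assumptions made for this example.

```python
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, no display required
import matplotlib.pyplot as plt
from torch.utils.tensorboard import SummaryWriter


def log_matrix_figure(writer, matrix, tag, step):
    """Hypothetical helper: draw a matrix as a heatmap and send it to TensorBoard."""
    fig, ax = plt.subplots(figsize=(4, 4))
    ax.imshow(matrix, cmap="Blues")
    # Write each count into its cell
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            ax.text(j, i, str(value), ha="center", va="center")
    ax.set_xlabel("Predicted")
    ax.set_ylabel("True")
    # add_figure renders the figure to an image; by default it also closes the figure
    writer.add_figure(tag, fig, global_step=step)


log_dir = tempfile.mkdtemp()  # throwaway directory for the example
writer = SummaryWriter(log_dir=log_dir)
log_matrix_figure(writer, [[2, 0], [1, 1]], "confusion_matrix", step=0)
writer.close()
```

Inside a LightningModule the same call would go through self.logger.experiment rather than a writer you create yourself.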

You can use it as follows (MNIST example):

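import matplotlib.pyplot as plt
import pandas as pd
import pytorch_lightning as pl
import seaborn as sns
import torch
import torch.nn.functional as F

def validation_step(self, batch, batch_idx):
    x, y = batch
    preds = self(x)
    loss = F.nll_loss(preds, y)
    # Return predictions and targets so they can be gathered at epoch end
    return {'loss': loss, 'preds': preds, 'target': y}

def validation_epoch_end(self, outputs):
    # Concatenate the per-batch outputs of the whole validation epoch
    preds = torch.cat([tmp['preds'] for tmp in outputs])
    targets = torch.cat([tmp['target'] for tmp in outputs])
    confusion_matrix = pl.metrics.functional.confusion_matrix(preds, targets, num_classes=10)
    # .cpu() is needed before .numpy() when validating on a GPU
    df_cm = pd.DataFrame(confusion_matrix.cpu().numpy(), index=range(10), columns=range(10))
    plt.figure(figsize=(10, 7))
    fig_ = sns.heatmap(df_cm, annot=True, cmap='Spectral').get_figure()
    plt.close(fig_)
    self.logger.experiment.add_figure("Confusion matrix", fig_, self.current_epoch)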
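Please provide the required information. Show where the intermediate results deviate from what you expect. We should be able to paste a single block of your code into a file, run it, and reproduce your problem. This also lets us test any suggestions.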
# Per-epoch confusion matrix written to TensorBoard as an image.
# Only the relevant LightningModule methods are shown.
import io
from typing import Optional

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pytorch_lightning as pl
import seaborn as sn
import torchvision
from PIL import Image
from pytorch_lightning.loggers import TensorBoardLogger

def __init__(self, config, trained_vae, latent_dim):
    # Metric object that accumulates counts over all validation batches
    self.val_confusion = pl.metrics.classification.ConfusionMatrix(num_classes=self._config.n_clusters)
    self.logger: Optional[TensorBoardLogger] = None

def forward(self, x):
    ...
    return log_probs

def validation_step(self, batch, batch_index):
    if self._config.dataset == "mnist":
        orig_batch, label_batch = batch
        orig_batch = orig_batch.reshape(-1, 28 * 28)

    log_probs = self.forward(orig_batch)
    loss = self._criterion(log_probs, label_batch)

    # Only update the metric state here; the matrix is computed once per epoch
    self.val_confusion.update(log_probs, label_batch)
    return {"loss": loss, "labels": label_batch}

def validation_step_end(self, outputs):
    return outputs

def validation_epoch_end(self, outs):
    tb = self.logger.experiment  # the underlying SummaryWriter

    # confusion matrix accumulated over the epoch
    # (plain int replaces np.int, which was removed in NumPy 1.24)
    conf_mat = self.val_confusion.compute().detach().cpu().numpy().astype(int)
    df_cm = pd.DataFrame(
        conf_mat,
        index=np.arange(self._config.n_clusters),
        columns=np.arange(self._config.n_clusters))
    plt.figure()
    sn.set(font_scale=1.2)
    sn.heatmap(df_cm, annot=True, annot_kws={"size": 16}, fmt='d')
    buf = io.BytesIO()

    # Render the figure to an in-memory JPEG, then load it back as a tensor
    plt.savefig(buf, format='jpeg')
    plt.close()  # release the figure so figures do not pile up across epochs
    buf.seek(0)
    im = Image.open(buf)
    im = torchvision.transforms.ToTensor()(im)
    tb.add_image("val_confusion_matrix", im, global_step=self.current_epoch)
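
The code above relies on the metric keeping state between batches: update() is called in every validation step and compute() once at the end of the epoch. The sketch below imitates that pattern in plain PyTorch to show what is being accumulated. It is not Lightning's implementation; `RunningConfusionMatrix` is a hypothetical class, and it assumes the model outputs per-class log-probabilities and that rows index the true class.

```python
import torch


class RunningConfusionMatrix:
    """Hypothetical helper that accumulates confusion counts across batches."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.counts = torch.zeros(num_classes, num_classes, dtype=torch.long)

    def update(self, log_probs, labels):
        # Predicted class is the index of the largest log-probability
        preds = log_probs.argmax(dim=1)
        # Encode each (true, predicted) pair as a single index, then count occurrences
        flat = labels * self.num_classes + preds
        binc = torch.bincount(flat, minlength=self.num_classes ** 2)
        self.counts += binc.reshape(self.num_classes, self.num_classes).cpu()

    def compute(self):
        return self.counts.clone()

    def reset(self):
        self.counts.zero_()


metric = RunningConfusionMatrix(num_classes=2)
# Two "validation batches" with two samples each
metric.update(torch.log(torch.tensor([[0.9, 0.1], [0.2, 0.8]])), torch.tensor([1, 1]))
metric.update(torch.log(torch.tensor([[0.7, 0.3], [0.6, 0.4]])), torch.tensor([0, 0]))
result = metric.compute()
print(result.tolist())  # [[2, 0], [1, 1]]
metric.reset()  # start the next epoch from zero
```

Whether the library's compute() also clears the accumulated state may depend on the version in use, so an explicit reset after logging is a cautious choice when the metric is driven by hand like this.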