Python: Why does UNet have classes?

python pytorch image-segmentation unity3d-unet

While reading this UNet implementation, I noticed that it takes `n_classes` as the size of its output:

import torch
import torch.nn as nn
import torch.nn.functional as F


class double_conv(nn.Module):
    '''(conv => BN => ReLU) * 2'''
    def __init__(self, in_ch, out_ch):
        super(double_conv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        x = self.conv(x)
        return x


class inconv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(inconv, self).__init__()
        self.conv = double_conv(in_ch, out_ch)

    def forward(self, x):
        x = self.conv(x)
        return x


class down(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(down, self).__init__()
        self.mpconv = nn.Sequential(
            nn.MaxPool2d(2),
            double_conv(in_ch, out_ch)
        )

    def forward(self, x):
        x = self.mpconv(x)
        return x


class up(nn.Module):
    def __init__(self, in_ch, out_ch, bilinear=True):
        super(up, self).__init__()

        # It would be nice if the upsampling could be learned too,
        # but my machine does not have enough memory to handle all those weights.
        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
        else:
            self.up = nn.ConvTranspose2d(in_ch//2, in_ch//2, 2, stride=2)

        self.conv = double_conv(in_ch, out_ch)

    def forward(self, x1, x2):
        x1 = self.up(x1)
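        # Pad the upsampled x1 so that it matches the skip connection x2.
        # Tensors are (N, C, H, W): dim 2 is height, dim 3 is width.
        diffY = x2.size()[2] - x1.size()[2]
        diffX = x2.size()[3] - x1.size()[3]
        # F.pad takes (left, right, top, bottom) for the last two dims.
        x1 = F.pad(x1, (diffX // 2, diffX - diffX // 2,
                        diffY // 2, diffY - diffY // 2))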
        x = torch.cat([x2, x1], dim=1)
        x = self.conv(x)
        return x


class outconv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(outconv, self).__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        x = self.conv(x)
        return x


class UNet(nn.Module):
    def __init__(self, n_channels, n_classes):
        super(UNet, self).__init__()
        self.inc = inconv(n_channels, 64)
        self.down1 = down(64, 128)
        self.down2 = down(128, 256)
        self.down3 = down(256, 512)
        self.down4 = down(512, 512)
        self.up1 = up(1024, 256)
        self.up2 = up(512, 128)
        self.up3 = up(256, 64)
        self.up4 = up(128, 64)
        self.outc = outconv(64, n_classes)

    def forward(self, x):
        self.x1 = self.inc(x)
        self.x2 = self.down1(self.x1)
        self.x3 = self.down2(self.x2)
        self.x4 = self.down3(self.x3)
        self.x5 = self.down4(self.x4)
        self.x6 = self.up1(self.x5, self.x4)
        self.x7 = self.up2(self.x6, self.x3)
        self.x8 = self.up3(self.x7, self.x2)
        self.x9 = self.up4(self.x8, self.x1)
        self.y = self.outc(self.x9)
        return self.y

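But why does it have `n_classes`, given that it is used for image segmentation?

I am trying to use this code for image denoising, but I cannot figure out what the `n_classes` parameter should be, because I do not have any classes.

Does `n_classes` signify multiclass segmentation? If so, what is the output of binary UNet segmentation?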

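Answer

Does `n_classes` signify multiclass segmentation?

Yes. If you specify `n_classes=4`, the network outputs a tensor of shape `(batch, 4, height, width)`, where each pixel can be assigned to one of the 4 classes. It should also be trained with a loss suited to multiclass classification.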
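To make the shapes concrete, here is a minimal sketch of the multiclass case. It does not build the full UNet: `head` is a stand-in for the final `outconv` layer, the feature map is random, and the choice of `torch.nn.CrossEntropyLoss` is an assumption, since the answer text does not name the loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_classes = 4

# Stand-in for the final `outconv` layer: a 1x1 convolution that maps
# 64 feature channels to one score (logit) per class.
head = nn.Conv2d(64, n_classes, kernel_size=1)

# Random features in place of the decoder output: (batch, channels, height, width).
features = torch.randn(2, 64, 32, 48)
logits = head(features)  # (batch, n_classes, height, width)

# One integer class index per pixel. CrossEntropyLoss is an assumed choice here.
target = torch.randint(0, n_classes, (2, 32, 48))
loss = nn.CrossEntropyLoss()(logits, target)

# The predicted class of each pixel is the channel with the highest logit.
prediction = logits.argmax(dim=1)  # (batch, height, width)
```

Note that the target has no channel dimension: it stores a class index per pixel, while the network output stores one score per class per pixel.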

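If so, what is the output of binary UNet segmentation?

For binary segmentation, specify `n_classes=1` (0 for black, 1 for white) and train with a binary classification loss on the single-channel output.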
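The binary case can be sketched the same way. Again `head` is only a stand-in for the last layer, and `torch.nn.BCEWithLogitsLoss` is an assumed choice of loss (it applies the sigmoid internally, so the network outputs raw logits during training).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for `outconv` with n_classes=1: a single output channel.
head = nn.Conv2d(64, 1, kernel_size=1)

features = torch.randn(2, 64, 32, 48)
logits = head(features)  # (batch, 1, height, width)

# Binary ground-truth mask as floats: 0.0 for black, 1.0 for white.
target = torch.randint(0, 2, (2, 1, 32, 48)).float()

# Assumed loss: BCEWithLogitsLoss combines sigmoid and binary cross-entropy.
loss = nn.BCEWithLogitsLoss()(logits, target)

# At inference time, turn logits into probabilities and threshold them.
mask = (torch.sigmoid(logits) > 0.5).float()
```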

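I am trying to use this code for image denoising, but I cannot figure out what the `n_classes` parameter should be

It should equal `n_channels`, which is usually 3 for RGB and 1 for grayscale. To teach this model to denoise images, you should:

- Add some noise to the input images.
- Apply a `sigmoid` activation at the end, because pixel values lie between `0` and `1` (unless normalized).
- Train with a regression loss such as MSE.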
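The three steps above can be sketched as a single training step. This is an illustration under stated assumptions, not the answer's own code: `model` is a small hypothetical stand-in for `UNet(n_channels, n_classes=n_channels)`, the noise is Gaussian with an arbitrary standard deviation of 0.1, and the images are random tensors.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_channels = 3  # RGB; use 1 for grayscale

# Hypothetical stand-in for UNet(n_channels, n_classes=n_channels):
# any network whose output has as many channels as its input.
model = nn.Sequential(
    nn.Conv2d(n_channels, 16, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(16, n_channels, kernel_size=1),
)

# Step 1: clean images in [0, 1] and noisy copies (assumed Gaussian noise).
clean = torch.rand(4, n_channels, 32, 32)
noisy = (clean + 0.1 * torch.randn_like(clean)).clamp(0.0, 1.0)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

optimizer.zero_grad()
# Step 2: sigmoid squashes the raw outputs into the pixel range.
denoised = torch.sigmoid(model(noisy))
# Step 3: regress the denoised output onto the clean image.
loss = criterion(denoised, clean)
loss.backward()
optimizer.step()
```

The network input is the noisy image and the target is the clean one, so no class labels are involved at any point.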

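Why sigmoid?

Because the `[0, 255]` pixel range is represented as `[0, 1]` pixel values (at least without normalization). `sigmoid` does exactly that: it squashes values into the `[0, 1]` range, so the linear outputs (logits) are free to range from `-inf` to `+inf`.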
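A short sketch of this squashing, using a few hand-picked logits (the values are arbitrary and chosen only for illustration):

```python
import torch

# Arbitrary raw network outputs, including very large magnitudes.
logits = torch.tensor([-100.0, -5.0, 0.0, 5.0, 100.0])

# sigmoid maps every real value into the [0, 1] pixel range.
pixels = torch.sigmoid(logits)

# Rescale to the usual 8-bit range for display or saving.
as_uint8 = (pixels * 255).round().to(torch.uint8)
```

However extreme the logit, the resulting pixel value stays valid, which is the property the answer relies on.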

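Why not a linear output and a clamp?

For the output to stay within `[0, 1]` after clamping, the values coming out of the linear layer would have to be greater than 0 (logit range that fits the target: `[0, +inf]`).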
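A related observation, which is an addition here and not part of the answer's own argument: `torch.clamp` passes no gradient for values outside the clamping range, whereas `sigmoid` always passes a small positive one. A minimal check with arbitrary values:

```python
import torch

# Arbitrary raw outputs: one below the range, one inside, one above.
raw = torch.tensor([-2.0, 0.5, 3.0], requires_grad=True)

# Gradient of clamp: 1 inside [0, 1], 0 for the values that were clipped.
raw.clamp(0.0, 1.0).sum().backward()
clamp_grad = raw.grad.clone()

# Gradient of sigmoid: strictly positive everywhere.
raw.grad.zero_()
torch.sigmoid(raw).sum().backward()
sigmoid_grad = raw.grad.clone()
```

With a clamp, a pixel whose raw output lands outside `[0, 1]` receives no learning signal from that step.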

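Why not a linear output without a clamp?

The logits the network outputs would then have to fall within the `[0, 1]` range by themselves.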

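Why not some other method?

You could, but the idea behind `sigmoid` is:

- It helps the neural network, which can then output any logit value.
- The first derivative of `sigmoid` is a bell-shaped curve (the density of the logistic distribution, similar in shape to a Gaussian but not identical to it), so it models the probability of many real-life phenomena.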
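The shape of the derivative can be checked numerically. The sketch below compares the autograd derivative of `sigmoid` with the closed form `s * (1 - s)` and with the standard normal density; the grid of points is an arbitrary choice.

```python
import math
import torch

# Evenly spaced points around zero (arbitrary grid).
x = torch.linspace(-6.0, 6.0, steps=121, requires_grad=True)

s = torch.sigmoid(x)
s.sum().backward()
autograd_derivative = x.grad

# Closed form of the derivative: sigmoid(x) * (1 - sigmoid(x)).
closed_form = (s * (1 - s)).detach()

# Standard normal density, for comparison of the two bell shapes.
normal_pdf = torch.exp(-x.detach() ** 2 / 2) / math.sqrt(2 * math.pi)

peak = autograd_derivative.max().item()
```

Both curves are symmetric bells centred at zero, but the sigmoid derivative peaks at 0.25 while the standard normal density peaks at about 0.399, so the two are similar rather than equal.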

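I have seen this final sigmoid layer many times, but I cannot make sense of it. Why not a linear output and a clamp? Why not a linear output without a clamp? Why not some other method?

@Gulzar I updated my answer. Does it make more sense now?

I do not see how logits relate to the output of a linear layer. If there is no sigmoid and the output physically represents pixel intensity, I do not see the point of passing it through a logistic function. For example, the most basic online autoencoder examples on MNIST also work with a linear final layer (or with sigmoid), and a linear layer has no vanishing gradient problem. So I still do not understand using a sigmoid for non-logistic data.

@Gulzar It has nothing to do with vanishing gradients. In classification, logits are the output of the linear layer (sigmoid converts logits into probabilities), and they range over +/- infinity. Pixel values are in [0, 1], with MSE as the criterion (unless pixels are only 0 and 1, which is usually not the case). Your neural network can output any real value and sigmoid will simply squash it, so the network does not have to produce bounded values between 0 and 1 itself (that task is harder and error-prone, e.g. predicting a pixel value of 1.02, which is clearly wrong).

@Gulzar Remember that the OP's final goal is denoising, not segmentation. For segmentation, an appropriate loss in PyTorch would be enough.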