Python: loading FITS images with PyTorch


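I am trying to build a CNN with PyTorch, but my images need to be imported from FITS format rather than from conventional formats such as .png or .jpeg.

Is there an easy way to do this with torch.utils.data.DataLoader, or is there a place in the source code where I could add a clause that handles FITS files at load time?

I have looked through the documentation, and the most relevant thing I found is the ToPILImage transform, which converts a tensor or ndarray to a PIL image.

At the moment I am using an image loading routine like this: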

import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision

batch_size = 4

transform = transforms.Compose(
                   [transforms.Resize((32,32)),
                    transforms.ToTensor(),
                    ])

trainset = dset.ImageFolder(root="Documents/Image_data",transform=transform)
train_loader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,shuffle=True)


Update: perhaps using torchvision.datasets.DatasetFolder instead of ImageFolder, and plugging in my own FITS handler, would do the job.

When I try to use that class, I get the following error:

AttributeError: module 'torchvision.datasets' has no attribute 'DatasetFolder'

Does torchvision actually support DatasetFolder at this point?

You can use this approach to export a FITS image to any supported format:

from astropy.io import fits
import matplotlib.pyplot as plt

image_data = fits.getdata(r"/path/to/image.fits")
plt.imsave("/path/to/image.png", image_data, cmap="gray")

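From reading some combination of the documentation and the code, I don't think you necessarily want to use ImageFolder, since it knows nothing about FITS.

Instead, you should try the more generic DatasetFolder class (which is in fact the parent class of ImageFolder). You would pass it a list of the extensions it should handle (i.e. ['.fits']) and a "loader" function that takes a FITS file and, it seems, should return a PIL.Image. You could even create your own subclass along the same lines.

The exact details of a loader such as _fits_loader may depend on the details of your FITS files. A basic version uses only the high-level fits.getdata() function, which returns the first image array in the FITS file (some FITS files may have multiple image extensions, or contain tables, and so on). So that part is up to you.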

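A few weeks ago I ran into the same problem as @user8188120. @Iguananaut's answer works very well when the labels are read from the folder structure. In case anyone stumbles across this and needs to read the labels from a CSV file, this may also work: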

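import numpy as np
import pandas as pd
import torch.utils.data as data
import torchvision.transforms as transforms
from astropy.io import fits  # pyfits has been merged into astropy.io.fits

# Placeholder dimensions: set these to match your own images
IMG_WIDTH, IMG_HEIGHT, CHANNELS = 32, 32, 1

labels = []
transform = transforms.Compose([
    # here go your transforms
    ])


class MyFitsDataset(data.Dataset):
    def __init__(self, csv_path):
        # Read the csv file (no header row)
        self.data_info = pd.read_csv(csv_path, header=None)
        # First column contains the image paths
        self.image_arr = np.asarray(self.data_info.iloc[:, 0])
        # The rest contain the labels; keep only the line that matches your task
        # self.label_arr = np.asarray(self.data_info.iloc[:, 1:])  # for multi-label
        self.label_arr = np.asarray(self.data_info.iloc[:, 1])  # for single-label
        labels.append(self.label_arr)
        self.data_len = len(self.data_info.index)

    def __getitem__(self, index):
        single_image_name = self.image_arr[index]

        # Read the primary HDU and copy it into a native float32 array
        with fits.open(single_image_name) as hdul:
            image = hdul[0].data.astype('float32')
        # (height, width, channels) is the layout transforms.ToTensor expects
        image = image.reshape(IMG_HEIGHT, IMG_WIDTH, CHANNELS)

        img = transform(image)

        # Get label (class) of the image based on the pandas column
        single_image_label = self.label_arr[index]

        return (img, single_image_label)

    def __len__(self):
        return self.data_len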

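This also avoids the DatasetFolder class, which was not yet available in the latest released torchvision at the time of writing. I hope this helps someone else.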
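For reference, the CSV layout that this dataset reads (no header row, image path in the first column, label in the second) could be produced along these lines. The file names and integer labels below are hypothetical and exist only to illustrate the layout; the read-back at the end mirrors what the dataset's __init__ does in the single-label case.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical image paths and integer class labels, for illustration only.
rows = [
    ("images/galaxy_0001.fits", 0),
    ("images/galaxy_0002.fits", 1),
    ("images/star_0001.fits", 2),
]

# Write one "path,label" row per image, without a header row.
csv_path = os.path.join(tempfile.mkdtemp(), "labels.csv")
with open(csv_path, "w", newline="") as handle:
    csv.writer(handle).writerows(rows)

# Read the file back the way the dataset does (single-label case).
data_info = pd.read_csv(csv_path, header=None)
image_paths = data_info.iloc[:, 0].to_numpy()
single_labels = data_info.iloc[:, 1].to_numpy()
print(image_paths, single_labels)
```

For the multi-label variant, each row would simply carry more label columns after the path.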

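A nice idea, but unfortunately I need to keep the data in FITS format when archiving it, so that it can be used quickly and easily in astronomy pipelines.

I'm not sure what the problem is. This answer demonstrates loading the data from a FITS file and then writing it to a separate ".png" file. You don't lose the FITS data at all. Otherwise, I'm not very familiar with PyTorch, but perhaps there is a way to extend it to read FITS files.

Sorry, my point was that rather than converting FITS to png and saving the images to load into PyTorch, I'm more interested in loading the FITS data directly, without an intermediate stage that copies the images to png format. I believe your most recent answer addresses that.

Thanks for your reply. This does look like a good approach. However, when I try to implement the idea I get the following error: module 'torchvision.datasets' has no attribute 'DatasetFolder'.

In fact, it looks like this was only added recently, so if you can't use the latest version of the package you may have to reinvent the wheel. Unfortunately, the result would probably look basically the same (for example, you could subclass ImageFolder, but you would have to reimplement its __init__ method a bit).

Ah, that makes sense. I suppose if I just copied the source code locally, I could then call DatasetFolder and implement the approach above? That would be simpler.

You could, as a temporary measure,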