
Python MNIST dataset structure

Tags: python, machine-learning, neural-network, genetic-algorithm, mnist

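I downloaded some code that implements a genetic algorithm. It uses the default dataset mnist. I want to replace the default mnist dataset, but first I want to understand how that dataset is structured so that I can format my own data the same way. For example, if I have already formatted my data as my_own_data_set, will passing it in the call network.train(my_own_data_set) work? What data type does the function train_networks accept?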

"""Entry point to evolving the neural network. Start here."""
import logging
from optimizer import Optimizer
from tqdm import tqdm

# Setup logging.
logging.basicConfig(
    format='%(asctime)s - %(levelname)s - %(message)s',
    datefmt='%m/%d/%Y %I:%M:%S %p',
    level=logging.DEBUG,
    filename='log.txt'
)

def train_networks(networks, dataset):
    """Train each network.

    Args:
        networks (list): Current population of networks
        dataset (str): Dataset to use for training/evaluating
    """
    pbar = tqdm(total=len(networks))
    for network in networks:
        network.train(dataset)
        pbar.update(1)
    pbar.close()

def get_average_accuracy(networks):
    """Get the average accuracy for a group of networks.

    Args:
        networks (list): List of networks

    Returns:
        float: The average accuracy of a population of networks.

    """
    total_accuracy = 0
    for network in networks:
        total_accuracy += network.accuracy

    return total_accuracy / len(networks)

def generate(generations, population, nn_param_choices, dataset):
    """Generate a network with the genetic algorithm.

    Args:
        generations (int): Number of times to evolve the population
        population (int): Number of networks in each generation
        nn_param_choices (dict): Parameter choices for networks
        dataset (str): Dataset to use for training/evaluating

    """
    optimizer = Optimizer(nn_param_choices)
    networks = optimizer.create_population(population)

    # Evolve the generation.
    for i in range(generations):
        logging.info("***Doing generation %d of %d***" %
                     (i + 1, generations))

        # Train and get accuracy for networks.
        train_networks(networks, dataset)

        # Get the average accuracy for this generation.
        average_accuracy = get_average_accuracy(networks)

        # Print out the average accuracy each generation.
        logging.info("Generation average: %.2f%%" % (average_accuracy * 100))
        logging.info('-'*80)

        # Evolve, except on the last iteration.
        if i != generations - 1:
            # Do the evolution.
            networks = optimizer.evolve(networks)

    # Sort our final population.
    networks = sorted(networks, key=lambda x: x.accuracy, reverse=True)

    # Print out the top 5 networks.
    print_networks(networks[:5])

def print_networks(networks):
    """Print a list of networks.

    Args:
         networks (list): The population of networks

    """
    logging.info('-'*80)
    for network in networks:
        network.print_network()

def main():
    """Evolve a network."""
    generations = 10  # Number of times to evolve the population.
    population = 20  # Number of networks in each generation.
    dataset = 'mnist'

    nn_param_choices = {
        'nb_neurons': [64, 128, 256, 512, 768, 1024],
        'nb_layers': [1, 2, 3, 4],
        'activation': ['relu', 'elu', 'tanh', 'sigmoid'],
        'optimizer': ['rmsprop', 'adam', 'sgd', 'adagrad',
                  'adadelta', 'adamax', 'nadam'],
    }

    logging.info("***Evolving %d generations with population %d***" %
             (generations, population))

    generate(generations, population, nn_param_choices, dataset)

if __name__ == '__main__':
    main()

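Each row is an array of strings: 785 strings, each enclosed in quotes and separated by commas. There are about 60,000 rows in the training set and about 10,000 rows in the test set.

Each row starts with the label; the rest of the row is one image.

"8","0","0","0","0","0","0","0","0","0","0","0","0","0","0",...,"255"

So the first string item is the digit that the image represents, and the remaining 784 strings, each a value from 0 to 255, represent the image. It is a 28x28 image whose rows are laid end to end to form one long array of strings.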
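To make the layout concrete, here is a minimal sketch that splits one such row back into its label and a 28x28 grid. The function name parse_row and the example row are illustrative assumptions, not part of the downloaded code; the only thing taken from the description above is the 1 + 784 layout with values from 0 to 255.

```python
def parse_row(row):
    """Split one 785-string row into (label, image).

    Assumes `row` is a list of 785 strings: the label first, then
    784 pixel values in the range 0-255, stored row by row.
    """
    if len(row) != 785:
        raise ValueError("expected 785 items, got %d" % len(row))
    label = int(row[0])
    pixels = [int(value) for value in row[1:]]
    if any(p < 0 or p > 255 for p in pixels):
        raise ValueError("pixel values must be in the range 0-255")
    # Cut the flat list of 784 pixels into 28 rows of 28 pixels each.
    image = [pixels[r * 28:(r + 1) * 28] for r in range(28)]
    return label, image


# Illustrative row: label "8", then 783 zeros and a final "255".
example_row = ["8"] + ["0"] * 783 + ["255"]
label, image = parse_row(example_row)
print(label, len(image), len(image[0]), image[27][27])  # 8 28 28 255
```

The last pixel of the flat row lands in the bottom-right corner of the grid, which is what "rows laid end to end" implies.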


In other words, each row is a string[785].
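Going the other way, your own data could be written in the same row format. The following is only a sketch under assumptions: to_row and write_rows are hypothetical helpers, the images are assumed to be already resized to 28x28 with integer values from 0 to 255, and nothing here confirms that network.train in the downloaded code reads this format.

```python
import csv
import io


def to_row(label, image):
    """Flatten a label and a 28x28 grid of 0-255 ints into 785 strings."""
    if len(image) != 28 or any(len(image_row) != 28 for image_row in image):
        raise ValueError("image must be 28x28")
    flat = [pixel for image_row in image for pixel in image_row]
    if any(not 0 <= pixel <= 255 for pixel in flat):
        raise ValueError("pixel values must be in the range 0-255")
    return [str(label)] + [str(pixel) for pixel in flat]


def write_rows(rows, handle):
    """Write rows as comma-separated values with every field quoted."""
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL)
    writer.writerows(rows)


# Illustrative sample: an all-black image with one white corner pixel.
sample = [[0] * 28 for _ in range(28)]
sample[27][27] = 255
row = to_row(8, sample)

# An in-memory buffer stands in for a real file here.
buffer = io.StringIO()
write_rows([row], buffer)
print(len(row), buffer.getvalue()[:20])
```

Using csv.QUOTE_ALL reproduces the quoted, comma-separated strings shown in the example row; replace the StringIO buffer with an opened file to write to disk.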

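What do you mean by "format"?

@cheersmate -> I mean that I want to preprocess my data so that it is structured the way MNIST is in the dataset. I want to know the MNIST dataset's dimensions, data types, and so on.

So, while running with MNIST, you could first inspect the array shapes, data types, etc., and then add some checks for your own dataset?

@chearsmate But the dataset variable is just a string, see the code above: dataset = 'mnist'. I don't know how network.train(dataset) extracts the data for training, so I can't check its dimensions.

Do you have the source code, so you can see what happens internally?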
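Since the thread never shows how network.train(dataset) turns the string 'mnist' into data, the sketch below is purely an illustration of one common pattern: a lookup from dataset name to loader function, plus the kind of structure check suggested in the comments. Every name in it (LOADERS, register, load_my_own_data_set, describe) is hypothetical and not taken from the downloaded code.

```python
# Hypothetical registry: maps a dataset name to a function that loads it.
LOADERS = {}


def register(name):
    """Decorator that stores a loader function under a dataset name."""
    def decorator(func):
        LOADERS[name] = func
        return func
    return decorator


@register("my_own_data_set")
def load_my_own_data_set():
    """Stand-in loader returning two tiny samples as (labels, flat images)."""
    labels = [8, 3]
    images = [[0] * 784, [255] * 784]
    return labels, images


def describe(name):
    """Load a dataset by name and report its basic structure."""
    if name not in LOADERS:
        raise KeyError("unknown dataset: %s" % name)
    labels, images = LOADERS[name]()
    return {
        "samples": len(images),
        "values_per_image": len(images[0]),
        "label_type": type(labels[0]).__name__,
        "min": min(min(image) for image in images),
        "max": max(max(image) for image in images),
    }


summary = describe("my_own_data_set")
print(summary)
```

If the real code follows a similar pattern, passing a new name only works once a matching loader exists, so the place to look is wherever train compares or looks up the dataset string.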