Python: How do I generate batch data from a list of nested tuples? (Python, Keras, Data Generation)

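I have implemented a custom Keras data generator that produces positive and negative examples from a list of nested tuple pairs of the form (files, test).

Data example: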

 [((0, 1, 2), 0), 
  ((3, 4, 5), 0), 
  ((12,), 1), 
  ((0, 1, 4, 7), 1)] 

Batch example:

{'files': (0, 1, 2), 'test': 0}, label=1
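Here label 1 marks a positive example (a pair that occurs in the data) and 0 marks a negative example.

I have the following function to generate the data: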

def data_generation(self, pairs):
    """Generate batches of samples for training"""
    batch = np.zeros((self.batch_size, 3)) # I KNOW THE PROBLEM CAN BE HERE

    # Adjust label based on task
    if self.classification:
        neg_label = 0
    else:
        neg_label = -1

    # This creates a generator
    while True:
        for idx, (file_id, test_id) in enumerate(random.sample(pairs, self.n_positive)):
            batch[idx, :] = (file_id, test_id, 1)

        # Increment idx by 1
        idx += 1

        # Add negative examples until reach batch size
        while idx < self.batch_size:

            # random selection
            random_test = random.randrange(self.nr_tests)

            # Check to make sure this is not a positive example
            if (file_id, random_test) not in self.pairs_set:
                # Add to batch and increment index
                batch[idx, :] = (file_id, random_test, neg_label)
                idx += 1

        np.random.shuffle(batch)
        yield {'file': batch[:, 0], 'test': batch[:, 1]}, batch[:, 2]

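Now, I know the batch array is the problem: each row has the form (tuple, int, int), and the tuple has variable length. Should I mask or pad the tuples? How can I do that?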
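
One possible way to pad is sketched below, using NumPy only. It right-pads each `files` tuple to a common length and returns a boolean mask marking the real entries. The helper name `pad_files` and the sentinel `PAD_VALUE = -1` are assumptions made for illustration, not part of the original generator. Since 0 is a valid file id in the sample data, 0 is deliberately not used as the padding value here.

```python
import numpy as np

PAD_VALUE = -1  # assumed sentinel; must not collide with a real file id


def pad_files(file_tuples, max_len=None, pad_value=PAD_VALUE):
    """Right-pad variable-length tuples into a 2-D int array plus a boolean mask."""
    if max_len is None:
        max_len = max(len(t) for t in file_tuples)
    padded = np.full((len(file_tuples), max_len), pad_value, dtype=np.int64)
    mask = np.zeros((len(file_tuples), max_len), dtype=bool)
    for row, files in enumerate(file_tuples):
        files = files[:max_len]  # truncate anything longer than max_len
        padded[row, :len(files)] = files
        mask[row, :len(files)] = True  # True marks real (non-padding) entries
    return padded, mask


# Sample data from the question: (files, test) pairs, all positive here.
pairs = [((0, 1, 2), 0), ((3, 4, 5), 0), ((12,), 1), ((0, 1, 4, 7), 1)]

files, mask = pad_files([f for f, _ in pairs])
tests = np.array([t for _, t in pairs], dtype=np.int64)
labels = np.ones(len(pairs), dtype=np.float32)

# Fixed-shape batch in the same dict-plus-labels layout the generator yields.
batch = ({'files': files, 'test': tests}, labels)
```

With fixed shapes, `files` is a `(batch_size, max_len)` array that can be fed to a model input or converted to a tensor. Keras also provides a `pad_sequences` utility that performs similar padding. If the file ids go into an `Embedding` layer with `mask_zero=True`, a common alternative is to shift every id up by one and pad with 0.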

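It's not entirely clear what output you expect. Could you add the expected output?

The expected output has the form of the batch example, but it could be converted to a tf.Tensor.
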
Traceback:
    File "/Users/DataGenerator.py", line 83, in data_generation
        batch[idx, :] = (file_id, test_id, 1)
    ValueError: setting an array element with a sequence.
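
The error can be reproduced in isolation, which suggests the cause is the nested tuple itself and not the generator logic. The sketch below assumes a float array shaped like `batch` in the question. A row has three scalar slots, but `file_id` is a tuple, so NumPy cannot store it in one slot.

```python
import numpy as np

batch = np.zeros((2, 3))  # same layout as in the question: 3 float slots per row

error = None
try:
    # file_id is itself a tuple, so the row is ragged: (tuple, int, int)
    batch[0, :] = ((0, 1, 2), 0, 1)
except ValueError as exc:
    error = exc

# A row made only of scalars fits the three slots and assigns without error.
batch[1, :] = (12, 1, 1)
```

Keeping `files` in its own 2-D padded array, separate from the `test` ids and the labels, avoids storing a sequence in a scalar slot.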