Python Keras image preprocessing: tuple index out of range

python numpy machine-learning deep-learning keras

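The goal of this script is to augment video data using Keras's existing image preprocessing module. In this prototype, a sample video is split into a sequence of frames and processed; the final steps apply a random rotation, shift, shear, and zoom: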

from keras import backend as K
from keras.preprocessing.image import random_rotation, random_shift, random_shear, random_zoom
K.set_image_dim_ordering("th")

import cv2
import numpy as np

video_file_path = "./training-data/yes/1.mov"
samples_generated_per_sample = 10
self_rows = 100
self_columns = 150
self_frames_per_sequence = 45

# haar cascades for localizing oral region
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
mouth_cascade = cv2.CascadeClassifier('haarcascade_mcs_mouth.xml')

video = cv2.VideoCapture(video_file_path)
success, frame = video.read()

frames = []
success = True

# convert to grayscale, localize oral region, equalize dimensions, 
# normalize pixels, equalize lengths, and accumulate valid frames 
while success:
  success, frame = video.read()
  if success:
    # convert to grayscale
    frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # localize single facial region
    faces_coords = face_cascade.detectMultiScale(frame, 1.3, 5)
    if len(faces_coords) == 1:
      face_x, face_y, face_w, face_h = faces_coords[0]
      frame = frame[face_y:face_y + face_h, face_x:face_x + face_w]

      # localize oral region
      mouth_coords = mouth_cascade.detectMultiScale(frame, 1.3, 5)
      threshold = 0
      for (mouth_x, mouth_y, mouth_w, mouth_h) in mouth_coords:
        if (mouth_y > threshold):
            threshold = mouth_y
            valid_mouth_coords = (mouth_x, mouth_y, mouth_w, mouth_h)
        else:
            pass
      mouth_x, mouth_y, mouth_w, mouth_h = valid_mouth_coords
      frame = frame[mouth_y:mouth_y + mouth_h, mouth_x:mouth_x + mouth_w]

      frames.append(frame)

    # ignore multiple facial region detections
    else:
        pass

# pre-pad short sequences and equalize sequence lengths
if len(frames) < self_frames_per_sequence:
    frames = [frames[0]]*(self_frames_per_sequence - len(frames)) + frames
frames = frames[0:self_frames_per_sequence]
frames = np.asarray(frames)

rotated_frames = random_rotation(frames, rg=45)
shifted_frames = random_shift(rotated_frames, wrg=0.25, hrg=0.25)
sheared_frames = random_shear(shifted_frames, intensity=0.79)
zoomed_frames = random_zoom(sheared_frames, zoom_range=(1.25, 1.25))

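Running the script fails with the error from the title: tuple index out of range.

Look at the parameters: you are passing frames as a flat array, but the function expects an array with at least three axes, so that by default it can use row_axis=1, col_axis=2. Either specify these parameters correctly or provide an array with the correct shape.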
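A minimal sketch of how such a "flat" array can arise, assuming the frames in the list do not all share one shape. The frame sizes and the helper name `stack_as_object_array` are hypothetical and chosen only for illustration. Older NumPy releases silently turned a list of unequal 2D arrays into a 1-D object array; newer releases raise an error unless `dtype=object` is requested, so the sketch builds the object array explicitly.

```python
import numpy as np

def stack_as_object_array(frames):
    """Build the 1-D object array that older NumPy produced for unequal frames."""
    out = np.empty(len(frames), dtype=object)
    for i, frame in enumerate(frames):
        out[i] = frame  # each element is itself a 2D array
    return out

# Hypothetical crops with different heights and widths.
ragged_frames = [np.zeros((85, 135)), np.zeros((100, 150)), np.zeros((110, 170))]
ragged = stack_as_object_array(ragged_frames)
print(ragged.shape)  # (3,): only one axis, so there is no axis 1 or 2

# Looking up the size of axis 1 on a one-element shape tuple fails.
try:
    ragged.shape[1]
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)  # tuple index out of range

# Frames of equal shape stack into a proper 3D array instead.
uniform = np.asarray([np.zeros((100, 150))] * 3)
print(uniform.shape)  # (3, 100, 150)
```

Any function that reads the size of axis 1 or axis 2 from the one-axis array will hit the same IndexError, which is consistent with the message reported in the question.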

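The problem was caused by unequal frame dimensions. The solution is to equalize the frame dimensions before applying the transformations: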

# pre-pad short sequences, equalize frame dimensions, and equalize sequence lengths
if len(frames) < self_frames_per_sequence:
    frames = [frames[0]]*(self_frames_per_sequence - len(frames)) + frames
frames = frames[0:self_frames_per_sequence]
frames = [cv2.resize(frame, (self_columns, self_rows)).astype('float32') for frame in frames]
frames = np.asarray(frames)

rotated_frames = random_rotation(frames, rg=45)
shifted_frames = random_shift(rotated_frames, wrg=0.25, hrg=0.25)
sheared_frames = random_shear(shifted_frames, intensity=0.79)
zoomed_frames = random_zoom(sheared_frames, zoom_range=(1.25, 1.25))
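The equalization step can also be sketched without OpenCV or Keras, which makes it easy to check the resulting shape in isolation. This is an illustration under stated assumptions, not the answer's method: `resize_nearest` is a hypothetical nearest-neighbour stand-in for `cv2.resize`, and `equalize_frames` is a hypothetical helper that wraps the pre-padding, truncation, and resizing shown above.

```python
import numpy as np

def resize_nearest(frame, rows, cols):
    """Nearest-neighbour resize of a 2D array (stand-in for cv2.resize)."""
    row_idx = np.arange(rows) * frame.shape[0] // rows
    col_idx = np.arange(cols) * frame.shape[1] // cols
    return frame[np.ix_(row_idx, col_idx)]

def equalize_frames(frames, rows, cols, length):
    """Pre-pad, truncate, and resize frames into one (length, rows, cols) array."""
    frames = list(frames)
    if not frames:
        raise ValueError("at least one frame is required")
    # Pre-pad short sequences by repeating the first frame.
    if len(frames) < length:
        frames = [frames[0]] * (length - len(frames)) + frames
    frames = frames[:length]
    # Give every frame the same shape so the list stacks into a 3D array.
    resized = [resize_nearest(f, rows, cols).astype("float32") for f in frames]
    return np.asarray(resized)

# Hypothetical crops of varying size, as produced by the mouth localization.
cropped = [np.ones((85, 135)), np.ones((110, 170)), np.ones((97, 150))]
batch = equalize_frames(cropped, rows=100, cols=150, length=45)
print(batch.shape)  # (45, 100, 150)
```

Note that `cv2.resize` takes its target size as (width, height), which is why the code above passes `(self_columns, self_rows)`, while the stand-in here takes rows first.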

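Hello @Benjamin, frames is a list of 2D numpy arrays before it is converted to a numpy array. Isn't that a 3D tensor?

Found the problem. It was due to differing frame dimensions. Equalizing the frame dimensions solves it. Thanks.

When you say "frame dimensions not equal", can you be more specific? What were the dimensions before and after your change?

@Benjamin Initially, all frames had the same dimensions (1980 x 1080), but after applying the cascade classifiers to localize and crop the oral region, the frames no longer had the same dimensions (for example, varying degrees of open versus closed mouth resulted in heights in the ~85-110 range and widths in the ~135-170 range). Resizing them to a standard size of 100x150 (a rough average of their heights and widths) solved the problem.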