How to preprocess video for a 3D model in Python

I have a 3D model in Keras:
model = Sequential([
    Conv3D(32, (3, 3, 3), activation='relu', input_shape=self.input_shape),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Conv3D(64, (3, 3, 3), activation='relu'),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Conv3D(128, (3, 3, 3), activation='relu'),
    Conv3D(128, (3, 3, 3), activation='relu'),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Conv3D(256, (2, 2, 2), activation='relu'),
    Conv3D(256, (2, 2, 2), activation='relu'),
    MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)),
    Flatten(),
    Dense(1024),
    Dropout(0.5),
    Dense(1024),
    Dropout(0.5),
    Dense(self.nb_classes, activation='softmax')
])
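As context for the question below: a Keras Conv3D layer with the default channels_last format expects 5D input of shape (batch, frames, height, width, channels), so self.input_shape here must be a 4-tuple. A minimal shape sketch with hypothetical dimensions (16 frames of 112x112 RGB), using plain numpy:

```python
import numpy as np

# Hypothetical clip dimensions: 16 frames of 112x112 RGB images.
frames, height, width, channels = 16, 112, 112, 3
input_shape = (frames, height, width, channels)  # what self.input_shape would hold

# A batch of 8 clips, shaped as Conv3D (channels_last) expects: (batch, frames, H, W, C)
batch = np.zeros((8, *input_shape), dtype=np.float32)
print(batch.shape)  # (8, 16, 112, 112, 3)
```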
The model is based on this article.
What is the best way to preprocess the video data I want to predict with Conv3D?
I wrote this function to extract the frames from every video in UCF-101:
import glob
import os

import cv2

def frame_writer(pathIn, pathOut, class_name):
    """
    Read the videos of one class and write their frames to a new dataset.
    args:
        pathIn -> base dataset of videos
        pathOut -> destination folder for the frames ('data/path')
        class_name -> name of the class subfolder to process
    """
    # create the output path if it does not exist
    os.makedirs(os.path.join(pathOut, class_name), exist_ok=True)
    # list all video files of this class
    pathIn_files = glob.glob(os.path.join(pathIn, class_name, '*.avi'))
    # iterate over all files
    for video_path in pathIn_files:
        # file name without the extension
        file_name = os.path.splitext(os.path.basename(video_path))[0]
        # read the video frame by frame and write each frame as a JPEG
        vidcap = cv2.VideoCapture(video_path)
        count = 0
        success, image = vidcap.read()
        while success:
            cv2.imwrite(os.path.join(pathOut, class_name,
                                     '%s_frame%d.jpg' % (file_name, count)), image)
            count += 1
            success, image = vidcap.read()
        vidcap.release()
    print('Done!')
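Since the file names written above encode the source video (file_name + '_frame' + index), the frames can later be regrouped into per-video sequences by parsing the names. A rough sketch of that step; the regex and helper name are mine, not part of the original code, and they assume the exact naming scheme used above:

```python
import re
from collections import defaultdict

def group_frames_by_video(frame_files):
    """Group frame file names like 'clipA_frame3.jpg' by source video,
    sorting each group by the numeric frame index."""
    pattern = re.compile(r'^(?P<video>.+)_frame(?P<idx>\d+)\.jpg$')
    groups = defaultdict(list)
    for name in frame_files:
        m = pattern.match(name)
        if m:
            groups[m.group('video')].append((int(m.group('idx')), name))
    # sort numerically so frame10 comes after frame9, not right after frame1
    return {video: [n for _, n in sorted(items)]
            for video, items in groups.items()}

files = ['clipA_frame10.jpg', 'clipA_frame2.jpg', 'clipB_frame0.jpg']
print(group_frames_by_video(files))
# {'clipA': ['clipA_frame2.jpg', 'clipA_frame10.jpg'], 'clipB': ['clipB_frame0.jpg']}
```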
Now I have a frame dataset like this:
folder: data
-subfolder: train
-subfolder: class1
--frame1_video1_class1.jpg
--frame2_video1_class1.jpg
--frame3_video1_class1.jpg
--frameN_videoN_class1.jpg
-subfolder: class2
--frame1_video1_class2.jpg
--frame2_video1_class2.jpg
--frame3_video1_class2.jpg
--frameN_videoN_class2.jpg
-subfolder: test
-subfolder: class1
--frame1_video1_class1.jpg
--frame2_video1_class1.jpg
--frame3_video1_class1.jpg
--frameN_videoN_class1.jpg
-subfolder: class2
--frame1_video1_class2.jpg
--frame2_video1_class2.jpg
--frame3_video1_class2.jpg
--frameN_videoN_class2.jpg
So I put all the frames from every video into the folder corresponding to their class.
Do I have to pass them to my Conv3D model with Keras's ImageDataGenerator,
in which case every frame of every video of every class is fed in one at a time?
Or do I have to do it another way?
I just need to predict videos with this model.
Thanks for your support!

One approach is to put all the frames into one big tensor, label them accordingly, and use it as the input to your Keras model. The number of frames in the tensor would be the batch size.

Great! I'll try this! Do you have a link to a tutorial about this?
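A sketch of that suggestion, adapted for Conv3D: stack consecutive frames of one video into fixed-length clips and batch the clips. The clip length and frame sizes are hypothetical, and random arrays stand in for the decoded JPEGs:

```python
import numpy as np

def frames_to_clips(frames, clip_len=16):
    """Split a (num_frames, H, W, C) array into non-overlapping clips
    of clip_len frames each; leftover frames at the end are dropped."""
    n_clips = len(frames) // clip_len
    return frames[:n_clips * clip_len].reshape(n_clips, clip_len, *frames.shape[1:])

# 40 fake frames of 112x112 RGB standing in for one decoded video
video = np.random.rand(40, 112, 112, 3).astype(np.float32)
clips = frames_to_clips(video, clip_len=16)
print(clips.shape)  # (2, 16, 112, 112, 3) -> a batch of clips, ready for Conv3D
```

Each clip then gets the label of the video it came from, so the label array has one entry per clip rather than per frame.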