Annotating video frames with state-based labels in Python


python matlab video computer-vision


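I have a set of videos and depth maps showing human poses, recorded with a Microsoft Kinect.

I can extract the person's skeleton from the video, but what I want to do is recognize a particular pose from the skeleton data.

To do that, I need to annotate every frame of the video with a 0 or a 1, corresponding to "bad pose" and "good pose"; that is, each frame has a binary state variable.

I would like to play the AVI file in MATLAB and press the space bar to toggle between the two states, while the current state is appended to an array that holds the state of every frame in the video.

Is there a tool in MATLAB that can do this? MATLAB is not a requirement, though: Python, C++, or any other language is fine. I have been searching on Google, and most of what I found annotates individual frames with polygons. I would like to do this at half the video's normal frame rate.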

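Edit: I used the solution provided by miindlek and decided to share a few things in case anyone else runs into this problem. I needed to see the annotation I was assigning to each frame, so I draw a small circle in the top-left corner of the video. I also capture the key returned by waitKey and then act on its value, which allows several different keys to be used during annotation. Hopefully this is useful to others later.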

import numpy as np
import cv2
import os
os.chdir('PathToVideo')

# Blue circle means that the annotation hasn't started (state 10)
# Green circle is a good pose (state 0)
# Red circle is a bad pose (state 1)
# White circle means we are done (state 11), press d for that

# Instructions on how to use!
# Press space to swap between states. You have to press space when the person
# starts doing poses.
# Press d when the person finishes.
# Press q to quit early. The annotations are then not saved, so you should only
# use this if you made a mistake and need to start over.

cap = cv2.VideoCapture('Video.avi')

# speed is the waitKey delay in milliseconds per frame.
# You can INCREASE the value of speed to make the video SLOWER
speed = 33

# Start with the beginning state as 10 to indicate that the procedure has not started
current_state = 10
saveAnnotations = True
annotation_list = []
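# Check whether the video capture has been opened
if not cap.isOpened():
    raise IOError("Could not open the video file")
# Circle color in BGR order; blue until the annotation starts
colCirc = (255,0,0)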
# Iterate while the capture is open, i.e. while we still get new frames.
while(cap.isOpened()):
    # Read one frame.
    ret, frame = cap.read()
    # Break the loop if we don't get a new frame.
    if not ret:
        break
    # Add the colored circle on the image to know the state
    cv2.circle(frame,(50,50), 50, colCirc, -1)
    # Show one frame.
    cv2.imshow('frame', frame)
    # Wait for a keypress and act on it (mask to the low byte for portability)
    k = cv2.waitKey(speed) & 0xFF
    if k == ord(' '):
        if current_state == 0:
            current_state = 1
            colCirc = (0,0,255)
        else:
            # Covers state 1 as well as the not-started (10) and done (11) states
            current_state = 0
            colCirc = (0,255,0)
    if k == ord('d'):
        current_state = 11
        colCirc = (255,255,255)

    # Press q to quit
    if k == ord('q'):
        print("You quit! Restart the annotations by running this script again!")
        saveAnnotations = False
        break

    annotation_list.append(current_state)

# Release the capture and close window
cap.release()
cv2.destroyAllWindows()

# Only save if you did not quit; one state per line
if saveAnnotations:
    with open('poseAnnot.txt', 'w') as f:
        for item in annotation_list:
            f.write('%d\n' % item)
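The script above writes one state per line to poseAnnot.txt. As a possible follow-up step, here is a minimal sketch (not part of the original post) that reads such a file back and collapses the per-frame states into runs, which makes it easier to check where each pose starts and ends. The function names `load_states` and `states_to_segments` are hypothetical, and the sketch assumes the file contains only integers, one per line.

```python
from itertools import groupby


def load_states(path):
    """Read one integer state per line, skipping blank lines."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


def states_to_segments(states):
    """Collapse per-frame states into (start_frame, end_frame, state) runs.

    Frame indices are zero-based and end_frame is inclusive.
    """
    segments = []
    frame = 0
    for state, run in groupby(states):
        length = sum(1 for _ in run)
        segments.append((frame, frame + length - 1, state))
        frame += length
    return segments
```

For example, `states_to_segments(load_states('poseAnnot.txt'))` would list each stretch of frames together with its state, including the 10 (not started) and 11 (done) markers used by the script.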

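One way to solve the task is to use the OpenCV library with Python, as described in this article.

The variable annotation_list contains the annotation for every frame. To switch between the two modes, press the space bar.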

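Do you want to play the video file in real time or at a slower speed?
Half the frame rate is fine; it does not need to be 30 fps.
This is perfect! Thanks :)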
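Regarding playback at half the frame rate: the delay passed to `cv2.waitKey` is given in milliseconds, so it can be derived from the video's frame rate. The helper below is a minimal sketch, not part of the original answer; the name `frame_delay_ms` and the 30 fps fallback are assumptions made for illustration. The result is only approximate, because the time spent decoding and displaying each frame is not accounted for.

```python
def frame_delay_ms(fps, speed_factor=0.5, fallback_fps=30.0):
    """Return a waitKey delay in ms for playback at speed_factor times fps.

    If the reported fps is missing or not positive, fallback_fps (an assumed
    default) is used instead. The delay is never less than 1 ms, because a
    delay of 0 makes cv2.waitKey block until a key is pressed.
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    if not fps or fps <= 0:
        fps = fallback_fps
    return max(1, int(round(1000.0 / (fps * speed_factor))))


# Possible use with OpenCV (requires cv2 and an opened capture `cap`):
#   delay = frame_delay_ms(cap.get(cv2.CAP_PROP_FPS), speed_factor=0.5)
#   key = cv2.waitKey(delay) & 0xFF
```

With a 30 fps video and `speed_factor=0.5`, each frame would be held for roughly twice as long as in real-time playback.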

import numpy as np
import cv2

cap = cv2.VideoCapture('video.avi')

current_state = False
annotation_list = []

while(True):
    # Read one frame.
    ret, frame = cap.read()
    if not ret:
        break

    # Show one frame.
    cv2.imshow('frame', frame)

    # Check, if the space bar is pressed to switch the mode.
    if cv2.waitKey(1) & 0xFF == ord(' '):
        current_state = not current_state

    annotation_list.append(current_state)

# Convert the list of boolean values to a list of int values.
annotation_list = [int(state) for state in annotation_list]
print(annotation_list)

cap.release()
cv2.destroyAllWindows()
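The toggling logic in the loop above can be checked without a video or a display window. The following is a minimal sketch, not part of the original answer: the function name `simulate_annotation` is hypothetical, and the list of key codes stands in for the values `cv2.waitKey` would return on each frame (-1 when no key was pressed).

```python
def simulate_annotation(keys_per_frame, toggle_key=ord(" ")):
    """Replay one key code per frame and return the per-frame 0/1 states.

    Mirrors the loop above: the state flips on the frame where the toggle
    key is seen, and the state for that frame is recorded after the flip.
    """
    state = False
    annotations = []
    for key in keys_per_frame:
        # -1 means "no key pressed"; mask to the low byte like the loop does.
        if key != -1 and (key & 0xFF) == toggle_key:
            state = not state
        annotations.append(int(state))
    return annotations
```

Feeding it a hand-written key sequence is a quick way to confirm which frame a key press is attributed to before annotating real footage.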