Preprocessing video for PyTorch on Android

android kotlin computer-vision pytorch


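What is the best way to preprocess video data in Android Kotlin so that it is ready for a PyTorch Android model? Specifically, I have a ready-made model in PyTorch, and I have already converted it for use with PyTorch Mobile.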

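During training, the model takes raw footage from a phone and preprocesses it by (1) converting it to greyscale, (2) downscaling it to a specific smaller resolution that I choose, and (3) converting it to a tensor to feed into the neural network (or possibly sending the compressed video to a remote server). I use OpenCV for this, but I would like to know the easiest way to do the same thing in Android Kotlin.

Python code for reference: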


import cv2
import numpy as np


def save_video(filename):
    """Read a video, convert each frame to greyscale, and downscale it."""
    frames = []

    cap = cv2.VideoCapture(filename)
    frameCount = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    # 9:16 ratio (cv2.resize expects the size as (width, height))
    width = 121
    height = 216
    dim = (width, height)

    fc = 0

    # Loop until the end of the video
    while fc < frameCount:
        ret, frame = cap.read()
        if not ret:
            # Stop on a failed read; the reported frame count can be inaccurate
            break

        # convert to greyscale
        grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # reduce resolution
        resized = cv2.resize(grey, dim, interpolation=cv2.INTER_AREA)

        frames.append(resized)
        fc += 1

    # release the video capture object
    cap.release()

    return frames
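
The function above stops at step (2); step (3), turning the frames into a tensor, is not shown. As a minimal sketch of that step, the frames can be flattened into a row-major float list with an explicit shape, which is the kind of flat data plus shape pair that tensor constructors such as PyTorch Android's `Tensor.fromBlob` accept. The helper name `flatten_frames`, the `(T, H, W)` layout, and the scaling to `[0, 1]` are assumptions for illustration, not something the question specifies.

```python
from typing import List, Sequence, Tuple


def flatten_frames(
    frames: Sequence[Sequence[Sequence[int]]],
) -> Tuple[List[float], Tuple[int, int, int]]:
    """Flatten T greyscale frames (H rows of W values in 0..255) into a
    row-major float list scaled to [0, 1], plus the (T, H, W) shape.

    Hypothetical helper: the layout and scaling are assumptions.
    """
    t = len(frames)
    h = len(frames[0]) if t else 0
    w = len(frames[0][0]) if h else 0
    flat = [px / 255.0 for frame in frames for row in frame for px in row]
    return flat, (t, h, w)


# Two tiny 2x3 frames standing in for the 216x121 frames from save_video
demo = [
    [[0, 255, 0], [255, 0, 255]],
    [[51, 51, 51], [102, 102, 102]],
]
data, shape = flatten_frames(demo)
print(shape)      # (2, 2, 3)
print(len(data))  # 12
```

With NumPy available, `np.stack(frames).astype(np.float32) / 255.0` gives the same values as an array of shape `(T, H, W)`.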



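You said your model has been converted for use with PyTorch Mobile, so I assume you scripted it with TorchScript.

With TorchScript, you can write the preprocessing logic using Torch operations and keep it inside the scripted model, like this: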

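import torch
import torch.nn.functional as F

@torch.jit.script_method
def preprocess(self,
               image: torch.Tensor,  # Shape should be HxWx3
               height: int,
               width: int) -> torch.Tensor:
    # Cast to float first so the channel sum below cannot overflow uint8
    img = image.to(self.device).float()

    # (1) Convert to grayscale
    img = ((img[:, :, 0] + img[:, :, 1] + img[:, :, 2]) / 3).unsqueeze(-1)

    # (2) Resize to the specified resolution
    # Simulate torchvision.transforms.ToTensor to use interpolate
    img = img.permute(2, 0, 1).unsqueeze(0)
    img = F.interpolate(img, size=(
        height, width), mode="bicubic", align_corners=False)
    img = img.squeeze(0).permute(1, 2, 0)
    # Then turn it back into a normal image tensor

    # (3) Other normalization such as mean subtraction, and conversion to BxCxHxW format
    img -= self.mean_tensor  # mean subtraction
    img = img.permute(2, 0, 1).unsqueeze(0)

    return img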
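
One thing that may be worth checking: the code above averages the three channels, while `cv2.COLOR_BGR2GRAY` in the question's code uses the weighted sum 0.299 R + 0.587 G + 0.114 B. If the model was trained on OpenCV output, the two can differ noticeably for saturated colours. A minimal pure-Python sketch for a single pixel (the function names are illustrative, not part of any library):

```python
# Illustrative helpers (hypothetical names) comparing two greyscale formulas
# for one pixel given in OpenCV's B, G, R channel order.

def grey_mean(b: int, g: int, r: int) -> float:
    """Plain channel average, as in the preprocessing method above."""
    return (b + g + r) / 3


def grey_luma(b: int, g: int, r: int) -> float:
    """Weighted sum documented for cv2.COLOR_BGR2GRAY."""
    return 0.299 * r + 0.587 * g + 0.114 * b


pure_blue = (255, 0, 0)  # B, G, R
print(grey_mean(*pure_blue))  # 85.0
print(grey_luma(*pure_blue))  # about 29.07
```

Inside the TorchScript method, the same weighting could be written as a weighted sum of the three channel slices, provided the channel order of the input tensor (RGB or BGR) is matched to the weights.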

As a result, all of the preprocessing is done by libtorch instead of opencv.