Python: How do I convert OpenCV face detection bounding box coordinates to dlib face detection bounding box coordinates?
I am running real-time face detection on a live stream using OpenCV's pretrained DNN model and dlib's HOG model. I get detections from several cameras, and the code prints the bounding box coordinates from both OpenCV and dlib. I expected the same results, but they are very different. Is there a way to convert the OpenCV coordinates to dlib coordinates? I tried to find a linear mathematical model relating the two, without success.
import numpy as np
import argparse
import imutils
import pickle
import time
import cv2
import os
import align
import dlib
import datetime

face_detector = dlib.get_frontal_face_detector()
predictor_model = "shape_predictor_68_face_landmarks.dat"
face_aligner = align.AlignDlib(predictor_model)

ap = argparse.ArgumentParser()
ap.add_argument("-d", "--detector", required=True,
    help="path to OpenCV's deep learning face detector")
ap.add_argument("-m", "--embedding-model", required=True,
    help="path to OpenCV's deep learning face embedding model")
ap.add_argument("-r", "--recognizer", required=True,
    help="path to model trained to recognize faces")
ap.add_argument("-l", "--le", required=True,
    help="path to label encoder")
ap.add_argument("-c", "--confidence", type=float, default=0.8,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

print("[INFO] loading face detector...")
protoPath = os.path.sep.join([args["detector"], "deploy.prototxt"])
modelPath = os.path.sep.join([args["detector"],
    "res10_300x300_ssd_iter_140000.caffemodel"])
detector = cv2.dnn.readNetFromCaffe(protoPath, modelPath)

print("[INFO] starting video stream...")
vs = cv2.VideoCapture(0)
time.sleep(2.0)

while True:
    ret, frame = vs.read()
    frame = imutils.resize(frame, width=600)
    (h, w) = frame.shape[:2]
    imageBlob = cv2.dnn.blobFromImage(
        cv2.resize(frame, (300, 300)), 1.0, (300, 300),
        (104.0, 177.0, 123.0), swapRB=False, crop=False)
    detector.setInput(imageBlob)
    detections = detector.forward()
    for i in range(0, detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > args["confidence"]:
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            face = frame[startY:endY, startX:endX]
            (fH, fW) = face.shape[:2]
            if fW < 20 or fH < 20:
                continue
            rgb = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
            detected_faces_dlib = face_detector(rgb, 1)
            detected_faces = dlib.rectangle(left=startX, top=startY, right=endX, bottom=endY)
            print(detected_faces)
            print(detected_faces_dlib)
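
One thing worth noting about the loop above: face_detector is called on the cropped face region, so the rectangles it returns are relative to that crop, while detected_faces is built from full-frame coordinates. That alone would make the two printouts differ, apart from any difference between the detectors. Below is a minimal sketch of bringing both into the same coordinate system. It uses plain (left, top, right, bottom) tuples as a stand-in for dlib.rectangle, and the helper name crop_rect_to_frame and the sample numbers are made up for illustration.

```python
def crop_rect_to_frame(rect, start_x, start_y):
    """Shift a (left, top, right, bottom) box from crop to frame coordinates.

    start_x, start_y is the top-left corner of the crop inside the frame,
    i.e. (startX, startY) of the DNN box in the question's code.
    """
    left, top, right, bottom = rect
    return (left + start_x, top + start_y, right + start_x, bottom + start_y)


# Hypothetical numbers, for illustration only.
dnn_box = (120, 80, 260, 270)        # startX, startY, endX, endY in the frame
dlib_in_crop = (14, 38, 118, 142)    # box found by dlib inside the crop

dlib_in_frame = crop_rect_to_frame(dlib_in_crop, dnn_box[0], dnn_box[1])
print(dlib_in_frame)
```

After this shift the two boxes can be compared directly. Alternatively, running the dlib detector on the whole frame instead of the crop avoids the offset altogether.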
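I just spent a lot of time on this problem. If your goal is to detect facial landmarks on faces found by the dnn detector, your best bet is to retrain shape_predictor_68_face_landmarks.dat using rectangles from the dnn detector. As a guide: I wrote a Python script that walks through the ibug300 training set, re-detects the face bounding boxes, rewrites the training set's XML file, and then runs the train_shape_predictor script to produce a new .dat file. The results are very good compared with trying to reshape the dnn rectangle to approximate the HOG box.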
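One tip before you retrain: the dnn face detector returns rectangles whose width and height vary a lot, which does not work well for shape predictor training. It is better to use a square with a side length of ~1.35 * the dnn rectangle width. That looks like a magic number, but it is the average height-to-width ratio of the dnn face detection rectangles.

So if I want to use OpenCV dnn face detection with the dlib shape predictor without retraining, can I turn each dnn rectangle into a square and pass that to the shape predictor? Is that feasible? Or could you share your trained .dat file?

For anyone else who finds this question: I found a lot of code here that modifies the dnn rectangle in different ways to get better landmarks: y1, x2 = int(y1 * 1.15), int(x2 * 1.05)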
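
The squaring tip above could look roughly like the sketch below. The answer only gives the side length (~1.35 * the dnn rectangle width); centring the square on the original box is my own assumption, and the function name dnn_box_to_square is hypothetical. No clamping to the frame borders is done here.

```python
def dnn_box_to_square(start_x, start_y, end_x, end_y, scale=1.35):
    """Turn a DNN box into a square with side = scale * box width.

    Assumption: the square is centred on the centre of the original box.
    Returns (left, top, right, bottom) as integers.
    """
    side = int(round((end_x - start_x) * scale))
    center_x = (start_x + end_x) // 2
    center_y = (start_y + end_y) // 2
    left = center_x - side // 2
    top = center_y - side // 2
    return (left, top, left + side, top + side)


# Hypothetical DNN box: 100 px wide, 140 px tall.
print(dnn_box_to_square(100, 100, 200, 240))
```

The resulting corners can be passed to dlib.rectangle(left=..., top=..., right=..., bottom=...), the same constructor the question's code already uses. Whether the stock landmark model behaves well on such squares is something to check on your own images.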
import numpy as np

# Take a bounding box predicted by OpenCV in (x, y, w, h) form
# and convert it to the dlib-style (top, right, bottom, left) order
def bb_to_rect(bb):
    top = bb[1]
    left = bb[0]
    right = bb[0] + bb[2]
    bottom = bb[1] + bb[3]
    return np.array([top, right, bottom, left])

# Take a bounding box predicted by dlib and convert it
# to the (x, y, w, h) format normally used with OpenCV
def rect_to_bb(rect):
    x = rect.left()
    y = rect.top()
    w = rect.right() - x
    h = rect.bottom() - y
    # return a tuple of (x, y, w, h)
    return (x, y, w, h)
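
Note that the helpers above expect an (x, y, w, h) box, whereas the DNN detector in the question yields corner coordinates (startX, startY, endX, endY). A small sketch of bridging the two formats follows. The function names are hypothetical and the sample box is invented.

```python
def corners_to_bb(start_x, start_y, end_x, end_y):
    """Corner format -> (x, y, w, h), the input format bb_to_rect expects."""
    return (start_x, start_y, end_x - start_x, end_y - start_y)


def corners_to_trbl(start_x, start_y, end_x, end_y):
    """Corner format -> (top, right, bottom, left) order, without going via w and h."""
    return (start_y, end_x, end_y, start_x)


# Hypothetical DNN detection in corner format.
box = (120, 80, 260, 270)
print(corners_to_bb(*box))
print(corners_to_trbl(*box))
```

These are pure reorderings of the same numbers. They change the format of a box, not its size or position, so they will not make a DNN box match what the HOG detector would have produced for the same face.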