Algorithm – Viola-Jones' face detection claims 180k features

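I've been implementing an adaptation of Viola-Jones' face detection algorithm. The technique relies upon placing a subframe of 24×24 pixels within an image, and subsequently placing rectangular features inside it in every position with every possible size.

These features can consist of two, three or four rectangles. An example is presented below.

They claim the exhaustive set is more than 180k (section 2):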

Given that the base resolution of the detector is 24×24, the exhaustive set of rectangle features is quite large, over 180,000. Note that unlike the Haar basis, the set of rectangle features is overcomplete.

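The following statements are not explicitly stated in the paper, so they are assumptions on my part:

1. There are only 2 two-rectangle features, 2 three-rectangle features and 1 four-rectangle feature. The logic behind this is that we are observing the difference between the highlighted rectangles, not explicitly the colour or luminance or anything of that sort.
2. We cannot define feature type A as a 1×1 pixel block; it must be at least 1×2 pixels. Also, type D must be at least 2×2 pixels, and this rule holds accordingly for the other features.
3. We cannot define feature type A as a 1×3 pixel block, as the middle pixel cannot be partitioned, and subtracting it from itself is identical to a 1×2 pixel block; this feature type is only defined for even widths. Also, the width of feature type C must be divisible by 3, and this rule holds accordingly for the other features.
4. We cannot define a feature with a width and/or height of 0. Therefore, we iterate x and y up to 24 minus the size of the feature.

Based on these assumptions, I've counted the exhaustive set: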

const int frameSize = 24;
const int features = 5;
// All five feature types:
const int feature[features][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};

int count = 0;
// Each feature:
for (int i = 0; i < features; i++) {
    int sizeX = feature[i][0];
    int sizeY = feature[i][1];
    // Each position:
    for (int x = 0; x <= frameSize-sizeX; x++) {
        for (int y = 0; y <= frameSize-sizeY; y++) {
            // Each size fitting within the frameSize:
            for (int width = sizeX; width <= frameSize-x; width+=sizeX) {
                for (int height = sizeY; height <= frameSize-y; height+=sizeY) {
                    count++;
                }
            }
        }
    }
}

The result is 162,336.
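As a cross-check on this figure, the nested loops can be factored per axis: the number of valid (offset, extent) pairs along x does not depend on y, so each shape contributes a horizontal count times a vertical count. The sketch below assumes the same four assumptions as above; the helper names `axis_count` and `count_factored` are made up for this illustration.

```c
#include <assert.h>

/* Number of (offset, extent) pairs along one axis: the extent runs over
 * multiples of `unit` up to `frame`, and an extent e fits at
 * frame - e + 1 different offsets. */
long axis_count(int frame, int unit)
{
    long total = 0;
    assert(frame > 0 && unit > 0);
    for (int extent = unit; extent <= frame; extent += unit) {
        total += frame - extent + 1;
    }
    return total;
}

/* Total over the five base shapes; each shape contributes the product
 * of its horizontal and vertical axis counts. */
long count_factored(int frame)
{
    const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long total = 0;
    for (int i = 0; i < 5; i++) {
        total += axis_count(frame, shapes[i][0]) * axis_count(frame, shapes[i][1]);
    }
    return total;
}
```

For a frame of 24 the axis counts are 300, 144 and 92 for units 1, 2 and 3, so the total is 2·(144·300) + 2·(92·300) + 144·144 = 162,336, which agrees with the loop count.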

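The only way I found to approximate the "over 180,000" Viola & Jones speak of is to drop assumption #4 and introduce bugs into the code. This involves changing the width and height loop headers to: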

for (int width = 0; width < frameSize-x; width+=sizeX)
for (int height = 0; height < frameSize-y; height+=sizeY)

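The result is then 180,625. (Note that this effectively prevents the features from ever touching the right and/or bottom edge of the subframe.)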
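To make the effect of that change concrete, here is a minimal sketch of the modified count that also tallies how many of the counted features are degenerate (zero width or zero height). The function name `count_with_zero_sizes` and the `zero_area` out-parameter are additions for illustration only and are not part of the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Counts features with the modified loops: sizes start at 0 and the upper
 * bound is strict, so zero-area features are included and no feature
 * reaches the right or bottom edge of the sub-frame.
 * If zero_area is not NULL, it receives the number of counted features
 * whose width or height is 0. */
long count_with_zero_sizes(int frame_size, long *zero_area)
{
    const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long count = 0;
    long zeros = 0;
    assert(frame_size > 0);

    for (int i = 0; i < 5; i++) {
        int sizeX = shapes[i][0];
        int sizeY = shapes[i][1];
        for (int x = 0; x <= frame_size - sizeX; x++) {
            for (int y = 0; y <= frame_size - sizeY; y++) {
                for (int width = 0; width < frame_size - x; width += sizeX) {
                    for (int height = 0; height < frame_size - y; height += sizeY) {
                        count++;
                        if (width == 0 || height == 0) {
                            zeros++;
                        }
                    }
                }
            }
        }
    }
    if (zero_area != NULL) {
        *zero_area = zeros;
    }
    return count;
}
```

With a frame size of 24 this returns 180,625, of which 43,969 have zero width or zero height, so the apparent match with "over 180,000" depends heavily on counting degenerate rectangles.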

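Now of course the question: have they made a mistake in their implementation? Does it make any sense to consider features with a surface of zero? Or am I looking at this the wrong way?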

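Best answer
On closer inspection, your code looks correct to me, which makes one wonder whether the original authors had an off-by-one bug. I guess someone ought to look at how OpenCV implements it!

Nonetheless, one suggestion to make it easier to understand is to flip the order of the for loops: iterate over all sizes first, then loop over the possible positions for the given size: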

#include <stdio.h>
int main()
{
    int i, x, y, sizeX, sizeY, width, height, count, c;

    /* All five shape types */
    const int features = 5;
    const int feature[][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    const int frameSize = 24;

    count = 0;
    /* Each shape */
    for (i = 0; i < features; i++) {
        sizeX = feature[i][0];
        sizeY = feature[i][1];
        printf("%dx%d shapes:\n", sizeX, sizeY);

        /* each size (multiples of basic shapes) */
        for (width = sizeX; width <= frameSize; width+=sizeX) {
            for (height = sizeY; height <= frameSize; height+=sizeY) {
                printf("\tsize: %dx%d => ", width, height);
                c=count;

                /* each possible position given size */
                for (x = 0; x <= frameSize-width; x++) {
                    for (y = 0; y <= frameSize-height; y++) {
                        count++;
                    }
                }
                printf("count: %d\n", count-c);
            }
        }
    }
    printf("%d\n", count);

    return 0;
}

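This gives the same result as before: 162336.

To verify it, I tested the case of a 4×4 window and manually checked all cases (easy to count, since the 1×2 / 2×1 and 1×3 / 3×1 shapes are the same, only rotated by 90 degrees):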

2x1 shapes:
        size: 2x1 => count: 12
        size: 2x2 => count: 9
        size: 2x3 => count: 6
        size: 2x4 => count: 3
        size: 4x1 => count: 4
        size: 4x2 => count: 3
        size: 4x3 => count: 2
        size: 4x4 => count: 1
1x2 shapes:
        size: 1x2 => count: 12             +-----------------------+
        size: 1x4 => count: 4              |     |     |     |     |
        size: 2x2 => count: 9              |     |     |     |     |
        size: 2x4 => count: 3              +-----+-----+-----+-----+
        size: 3x2 => count: 6              |     |     |     |     |
        size: 3x4 => count: 2              |     |     |     |     |
        size: 4x2 => count: 3              +-----+-----+-----+-----+
        size: 4x4 => count: 1              |     |     |     |     |
3x1 shapes:                                |     |     |     |     |
        size: 3x1 => count: 8              +-----+-----+-----+-----+
        size: 3x2 => count: 6              |     |     |     |     |
        size: 3x3 => count: 4              |     |     |     |     |
        size: 3x4 => count: 2              +-----------------------+
1x3 shapes:
        size: 1x3 => count: 8                  Total Count = 136
        size: 2x3 => count: 6
        size: 3x3 => count: 4
        size: 4x3 => count: 2
2x2 shapes:
        size: 2x2 => count: 9
        size: 2x4 => count: 3
        size: 4x2 => count: 3
        size: 4x4 => count: 1
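The per-shape subtotals in this listing can also be checked mechanically. The following is a small sketch that takes the frame size as a parameter and reports one subtotal per base shape; the function name `count_per_shape` and its signature are assumptions made for this example rather than part of the program above.

```c
#include <assert.h>

/* Fills per_shape[0..4] with the feature count for each base shape
 * (2x1, 1x2, 3x1, 1x3, 2x2) and returns the overall total.
 * Sizes are iterated first; a feature of size width x height fits at
 * (frame_size - width + 1) * (frame_size - height + 1) positions. */
long count_per_shape(int frame_size, long per_shape[5])
{
    const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long total = 0;
    assert(frame_size > 0);

    for (int i = 0; i < 5; i++) {
        per_shape[i] = 0;
        for (int width = shapes[i][0]; width <= frame_size; width += shapes[i][0]) {
            for (int height = shapes[i][1]; height <= frame_size; height += shapes[i][1]) {
                long positions = (long)(frame_size - width + 1)
                               * (frame_size - height + 1);
                per_shape[i] += positions;
            }
        }
        total += per_shape[i];
    }
    return total;
}
```

Calling `count_per_shape(4, per)` returns 136 with subtotals 40, 40, 20, 20 and 16, matching the listing; with a frame size of 24 it returns 162336.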
