Simplifying object region representations


2.3 Experimental results

2.3.2 Simplifying object region representations

Though chamfer matching is usually employed to match object contour representations, it can also be applied to region matching, e.g. for human body part matching [112]. After applying RCVT optimization to simplify regions, we obtained experimental results similar to those of the contour case described in section 2.3.1. Such an example for a human body silhouette is shown in Figure 2.16.

Figure 2.16: The result of the RCVT algorithm in 2D; (a) the object A to be simplified (walking human), (b) the initial generators chosen by naive equidistant sampling, (c) the generators obtained by the RCVT algorithm for A = B.

If the regions to be matched are of approximately the same size (whole shape matching), then it can be sufficient to execute the matching as discussed for contour objects. However, if we want to match object parts, as discussed in section 2.3.1, then we need to modify our approach: with the same setup, the object part would give very good matches anywhere inside the target object, as the distance values are 0 there (or close to 0 in case of simplification).

To overcome this difficulty, we can consider a subregion $A^b$ of the complement $A^c$ of the object part (its background). A natural selection for $A^b$ is

$$A^b = (A \oplus nC) \cap W \quad \text{for some } n \in \mathbb{N} \text{ and } W \subseteq A^c,$$

as shown in Figure 2.17(b). The role of $W$ is to define an area around the object part $A$ within its background. If desired, we can simplify $A$ and $A^b$ using the RCVT approach, obtaining $A'$ and $A^{b\prime}$, respectively.
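The construction of the background band can be sketched in a few lines. This is only an illustration under stated assumptions: pixel sets are stored as Python sets of (row, column) pairs, C is assumed to be the 3x3 square structuring element, nC is read as the n-fold Minkowski sum of C (so the dilation is simply repeated n times), and the helper names `dilate` and `background_band` are hypothetical.

```python
# Assumption: C is the 3x3 square structuring element centred at the origin.
C = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def dilate(points, element):
    """Minkowski sum of a pixel set with a structuring element."""
    return {(r + dr, c + dc) for (r, c) in points for (dr, dc) in element}


def background_band(A, n, W, element=C):
    """Return A^b = (A (+) nC) & W, dilating A by `element` n times."""
    grown = set(A)
    for _ in range(n):
        grown = dilate(grown, element)
    return grown & W


# Toy example: a 2x2 object part inside an 8x8 image window.
A = {(3, 3), (3, 4), (4, 3), (4, 4)}
window = {(r, c) for r in range(8) for c in range(8)}
W = window - A  # here W is the whole background A^c inside the window
A_b = background_band(A, n=2, W=W)
```

In this toy case the band is the 6x6 square around the object part with the part itself removed; a tighter W would restrict the band to a chosen side of the part.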

Furthermore, we determine the distance maps both for the target object $S$ (see Figure 2.17(a)) and for $A$ (see Figure 2.17(b)). To determine how good the fit is at a point $x \in \mathbb{Z}^2$, we calculate

$$\sum_{y \in A' \cup A^{b\prime}} |d_S(x+y) - d_A(y)|. \tag{2.20}$$

The best match is found when this sum is minimal. Notice that in this example we ignored all the geometric transformation issues discussed in section 2.1. To match body parts based on regions, another approach could consider signed distance maps [113].
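To make the role of the background samples in (2.20) concrete, the following minimal sketch evaluates the sum on a toy target. It rests on several assumptions: the distance maps are replaced by a brute-force Euclidean distance to the nearest region pixel (a real implementation would use precomputed chamfer distance maps), no simplification is applied (the samples are all pixels of the part and its band), and the names `dist`, `match_cost` and `best_match` are hypothetical.

```python
from math import hypot


def dist(p, region):
    """Euclidean distance from pixel p to the nearest pixel of `region` (0 inside)."""
    return min(hypot(p[0] - q[0], p[1] - q[1]) for q in region)


def match_cost(x, S, A, samples):
    """Sum over y in `samples` of |d_S(x + y) - d_A(y)|, following (2.20)."""
    return sum(
        abs(dist((x[0] + y[0], x[1] + y[1]), S) - dist(y, A)) for y in samples
    )


def best_match(S, A, samples, candidates):
    """Candidate offset with the minimal matching cost."""
    return min(candidates, key=lambda x: match_cost(x, S, A, samples))


# Template: a 2x2 "head" A at the origin with a one-pixel background band A_b.
A = {(0, 0), (0, 1), (1, 0), (1, 1)}
A_b = {(r, c) for r in range(-1, 3) for c in range(-1, 3)} - A

# Target S: the same head placed at offset (1, 1) plus a separate 4x4 "torso".
head = {(r + 1, c + 1) for r, c in A}
torso = {(r, c) for r in range(5, 9) for c in range(1, 5)}
S = head | torso

candidates = [(r, c) for r in range(9) for c in range(6)]
best = best_match(S, A, A | A_b, candidates)
```

With the band included, the only zero-cost offset is the true head position (1, 1). If the band is dropped (samples = A), an offset such as (6, 2) inside the torso also has cost 0, which is exactly the ambiguity the background samples are meant to remove.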

Figure 2.17: Chamfer matching of object parts; (a) target object and its distance map, (b) from top to bottom: head template region A (white) with a subset A^b of its complement (gray), the distance map of A, and the simplification of A and A^b, (c) best matching position for the simplified template.

Skeleton-based simplification

To realize a more adaptive simplification approach than the naive ones illustrated in Figure 2.16, it is also possible to focus more on some object regions during simplification, e.g. if we expect larger uncertainty closer to the boundary of the object in the matching procedure. In this case, we can define a weight function $\varrho_e$ according to problem (2.4), which concentrates on the morphological skeleton of the object [114]. Thus, for every $x \in B$, we define

$$\varrho_e(x) = \frac{d(x, B^c)}{d(x, \mathrm{sk}(A)) + d(x, B^c)}, \tag{2.21}$$

where $\mathrm{sk}(A)$ is the skeleton of $A$ [114], $B^c$ is the complement of $B$, and $d$ is some common distance function. Note that $0 \le \varrho_e(x) \le 1$ for all $x \in B$: $\varrho_e$ vanishes at the boundary of $B$, then increases monotonically until reaching $\mathrm{sk}(A)$, where it takes the value 1. In this way, the weight of the points within $B$ is adjusted according to their relative distance from the "center" of $A$ and from $B^c$. Now, to derive a weight (density) function over $B$, put

$$\varrho(x) = \frac{\varrho_e(x)}{\int_B \varrho_e(y)\,dy} \quad \text{for } x \in B. \tag{2.22}$$
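A discrete version of the weight in (2.21) and the density in (2.22) can be sketched as follows. The assumptions are: the skeleton is supplied as a ready-made pixel set (here hand-picked, not computed), d is the Euclidean distance evaluated by brute force rather than through distance maps, the integral over B is replaced by a sum over the pixels of B, and the function name `skeleton_weights` is hypothetical.

```python
from math import hypot


def dist(p, region):
    """Euclidean distance from pixel p to the nearest pixel of `region`."""
    return min(hypot(p[0] - q[0], p[1] - q[1]) for q in region)


def skeleton_weights(B, skeleton):
    """Return (rho_e, rho): the weight of (2.21) and its normalisation (2.22)."""
    rows = [r for r, _ in B]
    cols = [c for _, c in B]
    # The complement of B restricted to the bounding box padded by one pixel;
    # the nearest background pixel of any x in B always lies inside this frame.
    frame = {
        (r, c)
        for r in range(min(rows) - 1, max(rows) + 2)
        for c in range(min(cols) - 1, max(cols) + 2)
    }
    B_c = frame - B
    rho_e = {}
    for x in B:
        d_bg = dist(x, B_c)       # distance to the background, at least 1 on a grid
        d_sk = dist(x, skeleton)  # distance to the skeleton, 0 on the skeleton
        rho_e[x] = d_bg / (d_sk + d_bg)
    total = sum(rho_e.values())   # discrete counterpart of the integral over B
    rho = {x: w / total for x, w in rho_e.items()}
    return rho_e, rho


# Toy example: a 5x9 rectangle with a hand-picked horizontal skeleton segment.
B = {(r, c) for r in range(5) for c in range(9)}
sk = {(2, c) for c in range(2, 7)}
rho_e, rho = skeleton_weights(B, sk)
```

On a pixel grid the distance from a pixel of B to the background is at least 1, so the discrete weight is small but not exactly zero on the boundary pixels; it still equals 1 on the skeleton and the normalised values sum to 1.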

In practice, this weight function can be easily derived using the distance maps [90] of sk(A) and B^c to approximate d. Figure 2.18 depicts such an example, where A = B, i.e. the basic CVT case is used. We also found the pruning (removal of small branches) of the skeleton to be useful.

Figure 2.18: Simplification result of an object according to a weight function concentrating on its skeleton; (a) target object A = B (sk(A) is shown by dashed line), (b) result of simplification in case of uniform weighting, (c) distance map of sk(A), (d) distance map of B^c, (e) weight function $\varrho_e$ (higher intensities show larger weight values), (f) result of CVT simplification using $\varrho$ derived from $\varrho_e$.

Before the template matching can begin, we first need to extract the target object from the input image. In our system SHARE, one of the desired tasks is the detection of human (victim or firefighter) appearance within thermal videos from a rescue scenario. To this end, we considered a fuzzy segmentation technique [115], which is known to be robust also for medical (e.g. CT) images. The extracted binary region usually needs some simple postprocessing to smooth the boundary, eliminate gaps and holes, etc. These minor refinements can be achieved by some classic elements of mathematical morphology [114]. Some input test images together with the results of the fuzzy segmentation are shown in Figure 2.19.

Figure 2.19: The result of fuzzy segmentation; (a) standing human pose, (b) walking human pose.

To create the templates to be matched, we generated simulated (artificial) data using the realistic 3D human motion software Poser®². As it is not a crucial point in our present experiments, we ignored all the geometric transformation issues concerning the alignment of the target and template objects. As a guideline, we recommend consulting [101] for information on a robust search of the affine parameter space.

To demonstrate the main idea of our approach, we show the performance of templates that do not fit the target completely. As our main goal is to introduce the novel idea of skeleton-based simplification and matching, the corresponding test set of two objects is only meant to illustrate the possible improvements and relations. With the skeleton-based simplification, in the case of a regular human pose/motion (like standing/walking), we can expect better matching for more template elements, since the skeleton does not change as drastically as the boundary, e.g. for close phases of the same motion. The goodness of fit of a template against the target object is calculated as the percentage of the template pixels that match at the best matching position. Figures 2.20 and 2.21 show the best matching positions of a standing and a walking template, respectively, together with their discussed simplifications. The skeleton of the original template is also marked.
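The goodness-of-fit measure can be illustrated with a short sketch. It assumes that a template point "matches" when, after shifting the template to the given position, it lands on a pixel of the target region; the function name `goodness_of_fit` and the toy data are hypothetical.

```python
def goodness_of_fit(template, target, x):
    """Percentage of template points that land on target pixels at offset x."""
    shifted = {(r + x[0], c + x[1]) for r, c in template}
    return 100.0 * len(shifted & target) / len(template)


# Toy example: a 10-pixel template of which 8 pixels overlap the target.
target = {(0, c) for c in range(8)}
template = {(0, c) for c in range(10)}
fit = goodness_of_fit(template, target, (0, 0))
```

For a simplified template the same function would be applied to the retained generators only, so a simplification that keeps points near the skeleton tends to score higher when the boundaries differ but the skeletons agree.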

Figure 2.20: Best matching position (shown in white) for standing template using (a) original template (skeleton is marked), (b) trivial uniform simplification, (c) CVT-based uniform simplification, (d) skeleton-based simplification. Target object is shown in gray.

We can see that the skeleton-based simplification performs better than the uniform ones, since the main skeletons of the target objects and the templates do not differ much. The corresponding quantitative results are given in Table 2.1.

| Pose | No simplification (original) | Uniform (trivial) | Uniform (CVT) | Weighted (skeleton) |
|---|---|---|---|---|
| Standing pose | 80.9% | 81.2% | 81.1% | 94.0% |
| Walking pose | 85.8% | 86.2% | 86.2% | 94.6% |

Table 2.1: Goodness of fit of simplified templates given in terms of the percentage of matching template points.

This analysis also validates the natural assumption that the uniformly sampled template behaves similarly to the original one.

² Poser® is a registered trademark of Smith Micro Software, Inc.

Figure 2.21: Best matching position for walking template using (a) original template, (b) trivial uniform simplification, (c) CVT-based uniform simplification, (d) skeleton-based simplification.

Source: András Hajdu, Discrete Geometric and Fusion Based Techniques for Object Detection and Decision Support, Dissertation for the Doctoral Degree of the Hungarian Academy of Sciences, 2015 (pages 51-55).