
Basic concepts and notations

1.3 Information fusion

The results of the dissertation relating to information fusion are based on the consideration that merging the outputs of member components may lead to applications of higher accuracy. In our case, the member components are primarily image processing algorithms whose outputs are merged for object detection purposes. Thus, besides introducing the corresponding general formalism, we also specialize our notation to this scenario.

As a classic formulation [56], let D be a set (ensemble) of classifiers (voters) D1, D2, …, Dn, Di : Λ ⊆ Rm → RM≥0 (i = 1, …, n), and let Ω = {ω1, ω2, …, ωM} be a finite set of class labels.

The classifier Di assigns the support values Di(χ) = (di,1(χ), …, di,M(χ)) to a feature vector χ ∈ Λ, describing the degree to which, in the opinion of the classifier, χ should be labeled by ω1, …, ωM, respectively. Then, in a fusion-based scenario, the final class label for χ is determined by applying some rule to the individual labels supported by the classifiers D1, …, Dn. Namely, as a general formulation, for each j (j = 1, …, M) a discriminator function gj(χ) is calculated as

gj(χ) = F(d1,j(χ), …, dn,j(χ)),   (1.1)

where F is a combination function. According to the selection of the support values di,j(χ) and the combination function F, we can set up several decision rules and derive different ensemble classifiers, such as the classic algebraic ones, e.g., taking F as the minimum, maximum, average, or product of the supports.
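As an illustration, the following minimal Python sketch evaluates the discriminator functions gj(χ) under these algebraic combination rules and selects the label with maximal support; the support values are hypothetical, chosen only for demonstration:

```python
import numpy as np

# Hypothetical support matrix: row i holds D_i(x) = (d_i,1(x), ..., d_i,M(x)),
# the supports of classifier D_i for the M = 3 class labels.
supports = np.array([
    [0.7, 0.2, 0.1],   # D_1(x)
    [0.5, 0.4, 0.1],   # D_2(x)
    [0.3, 0.3, 0.4],   # D_3(x)
])

# Algebraic combination functions F, applied column-wise (per class label).
combiners = {
    "average": lambda d: d.mean(axis=0),
    "minimum": lambda d: d.min(axis=0),
    "maximum": lambda d: d.max(axis=0),
    "product": lambda d: d.prod(axis=0),
}

for name, F in combiners.items():
    g = F(supports)             # g_j(x) = F(d_1,j(x), ..., d_n,j(x))
    k = int(np.argmax(g))       # the label omega_k with maximal discriminator
    print(f"{name}: g = {np.round(g, 3)} -> omega_{k + 1}")
```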

Notice that these decision rules give constraints on which class label ωk should be selected for χ. Here, we apply the intuitive notation D(χ) = ωk instead of D(χ) = (0, …, 0, 1, 0, …, 0), where the only non-zero support 1 is put on the k-th label, since the overall aim of any ensemble classifier is to select a single class label as the final decision.

The classic ensemble classifier based on simple majority voting can be derived by restricting the support of the individual classifiers as di,j(χ) = 1, if the classifier Di labels χ in the class ωj, and di,j(χ) = 0, otherwise. The final labeling of the ensemble is based on determining the class that received the largest support in terms of the number of votes:

Dmaj(χ) = ωk ⇐⇒ gk(χ) = max_{1≤j≤M} gj(χ), where gj(χ) = ∑_{i=1}^{n} di,j(χ) is the number of votes cast for ωj.
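A minimal sketch of this rule, assuming illustrative crisp votes, may look as follows; each classifier casts exactly one vote, and gj(χ) simply counts the votes received by ωj:

```python
from collections import Counter

# Hypothetical crisp votes: the index of the label chosen by each classifier.
votes = [0, 2, 0, 1, 0]                    # votes of D_1, ..., D_5

tally = Counter(votes)                     # g_j(x): number of votes for omega_j
k, g_k = tally.most_common(1)[0]           # label with the maximal vote count
print(f"D_maj(x) = omega_{k + 1} ({g_k} of {len(votes)} votes)")
```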

From the simple majority voting model we can easily derive a weighted one by assigning weights wi ∈ R≥0 to the classifiers Di, implying the following final decision rule:

Dwmaj(χ) = ωk ⇐⇒ gk(χ) = max_{1≤j≤M} ∑_{i=1}^{n} wi di,j(χ).
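Continuing the previous sketch with hypothetical weights (e.g., the individual accuracies of the members), the weighted rule can be illustrated as:

```python
# Hypothetical weights, e.g. the individual accuracies of the classifiers.
weights = [0.9, 0.6, 0.7, 0.55, 0.8]
votes = [0, 2, 0, 1, 0]                    # same crisp votes as above
M = 3

g = [0.0] * M
for w_i, j in zip(weights, votes):
    g[j] += w_i                            # g_j(x): total weight voting for omega_j

k = max(range(M), key=lambda j: g[j])
print(f"D_wmaj(x) = omega_{k + 1}, g = {g}")
```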

In contrast to classic majority voting, here we consider each classifier output equipped with a different weight wi (0 ≤ wi ≤ 1, i = 1, …, n). It seems natural to give the classifiers with higher accuracies greater importance in making the final decision. Notice that the classic majority voting scheme can be considered as a special case of the weighted one, since in the majority rule the weight of each vote given by a classifier is constrained to 1, i.e., wi = 1 for all i = 1, …, n.

Several results of the dissertation correspond to ensembles A = {A1, A2, …, An} of object detector algorithms Ai (i = 1, …, n). Based on the nature of the problem, we separate the cases when a single object or multiple objects are to be detected in an image. Moreover, we focus on objects that can be represented by single pixels, e.g., by their centers. Thus, the object detector algorithms cast their votes in terms of pixels as candidates for the location of the object. According to these considerations, we introduce the related notation as follows.

Let I be a digital image of size r × c. A candidate extractor algorithm for a single object detection scenario is defined as Ȧi : I → P({1, …, c} × {1, …, r}) with i ∈ {1, …, n} and Ȧi(I) = {ċIi,1, ċIi,2, …, ċIi,k}, where k ∈ N and P(A) denotes the power set of a set A. Notice that this definition allows a candidate extractor algorithm to give more than one candidate within an image for the possible location of a single object. However, we will also investigate cases when each ensemble member can have only one candidate for the location of the object. Thus, when k = 1 and the algorithm Ȧi has a single candidate, we will write ċIi instead of {ċIi,1} for its candidate set.

If multiple objects may appear in the image, the definition of a candidate extractor algorithm is modified accordingly as Äi : I → P({1, …, c} × {1, …, r}) with Äi(I) = {c̈Ii,1, c̈Ii,2, …, c̈Ii,k}, where k ∈ N. Note the difference between the notations of the candidates corresponding to the single and multiple object detection scenarios. Namely, ċIi,j for some j ≤ k is the j-th guess of Ȧi for a single object, while c̈Ii,j (j ≤ k) predicts the appearance of a desired object at the corresponding location. Since the chapters of the dissertation separate the single and multiple object detection scenarios, we will use the simple notations Ai and cIi,· for both cases if it does not hurt clarity. As a further simplification of notation, we will omit the symbol I from the upper index of the candidates when only one image is concerned, and write ci,· for short.
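The following sketch shows one possible representation of these notions in Python; the type aliases, the toy_extractor member, and the 1-based (column, row) indexing are illustrative assumptions, not part of the formal definitions:

```python
from typing import Callable, Set, Tuple

import numpy as np

Pixel = Tuple[int, int]            # (column, row) in {1, ..., c} x {1, ..., r}
Image = np.ndarray                 # a digital image of size r x c

# Both the single- and the multiple-object extractors map an image to an
# element of P({1, ..., c} x {1, ..., r}); only the interpretation of the
# returned candidates differs between the two scenarios.
CandidateExtractor = Callable[[Image], Set[Pixel]]

def toy_extractor(image: Image) -> Set[Pixel]:
    """Hypothetical member algorithm: the brightest pixel as sole candidate."""
    row, col = np.unravel_index(np.argmax(image), image.shape)
    return {(int(col) + 1, int(row) + 1)}  # 1-based (column, row) indexing

image = np.zeros((5, 8))                   # r = 5, c = 8
image[2, 3] = 1.0
print(toy_extractor(image))                # {(4, 3)}
```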

Since ensemble candidates are composed from the candidate extractor algorithms A1, A2, …, An via the fusion of their candidate sets A(I) = A1(I) ∪ … ∪ An(I), we define a confidence measure to describe the rate of agreement of the members on the specific candidates. To do so, first we introduce a proximity relation ∼= to decide whether two candidates indicate the same object or not. With c1, c2 ∈ {1, …, c} × {1, …, r}, we say that c1 ∼= c2 if d(c1, c2) < Td with some distance threshold Td ∈ R≥0. As in our applications the objects to be detected are circular, Td can be selected as the diameter of the desired object. Now, the confidence of the ensemble in any of its candidates c ∈ A(I) is defined as

confA(c) = |{Ai ∈ A : ∃ c′ ∈ Ai(I) such that c ∼= c′}| / |A|.   (1.8)

Notice that confA(c) ∈ {k/|A| : k = 1, …, |A|}. We also classify the ensemble candidates based on their degree of confidence. Namely, the α–level candidates of A are defined as

A(I)α = {c ∈ A(I) : confA(c) ≥ α}, where 1/|A| ≤ α ≤ 1.   (1.9)

As specific cases, the 1–level, the α–level with α > 1/2, and the 1/|A|–level candidates are those selected by each of, the majority of, and at least one of the members of the ensemble, respectively. For the latter case, it should be noted that A(I) = A(I)1/|A|. Single algorithms are formally represented by ensembles consisting of one member, providing |A| = 1 and α = 1–level confidence for all the candidates.
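These notions can be illustrated by the following sketch, where the choice of the Euclidean distance for d, the helper names, and the sample member outputs are assumptions made for demonstration only:

```python
import math

def is_same_object(c1, c2, T_d):
    """Proximity relation c1 ~= c2: the Euclidean distance of the two
    candidates is below the threshold T_d (e.g. the object diameter)."""
    return math.dist(c1, c2) < T_d

def confidence(c, ensemble_outputs, T_d):
    """conf_A(c): the fraction of members having a candidate matching c."""
    agreeing = sum(
        1 for candidates in ensemble_outputs
        if any(is_same_object(c, c2, T_d) for c2 in candidates)
    )
    return agreeing / len(ensemble_outputs)

def alpha_level_candidates(ensemble_outputs, T_d, alpha):
    """A(I)_alpha: fused candidates reaching confidence at least alpha."""
    fused = set().union(*ensemble_outputs)     # A(I), the union of the A_i(I)
    return {c for c in fused
            if confidence(c, ensemble_outputs, T_d) >= alpha}

# Illustrative outputs of a three-member ensemble on one image:
outputs = [{(10, 12)}, {(11, 13), (40, 8)}, {(10, 11)}]
print(alpha_level_candidates(outputs, T_d=5.0, alpha=2 / 3))
```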

1.3.1 Error measurement

We must also set up a framework to measure the accuracy of the ensemble-based approaches discussed in the dissertation for object detection in practical problems. We start by deciding whether the candidates found by a specific member algorithm are true or false ones regarding some ground truth, usually provided in terms of manual annotations.

Let GT(I) ⊆ {1, …, c} × {1, …, r} be a so-called ground truth set of candidates for an image I. For the classification of each candidate c ∈ A(I) of an ensemble A in the same image regarding GT(I) and a confidence level 1/|A| ≤ α ≤ 1, we apply the following (see also the sketch after this list):

• c is an α–true positive (TPα), if c ∈ A(I)α and ∃ cgt ∈ GT(I) such that c ∼= cgt;

• c is an α–false positive (FPα), if c ∈ A(I)α and ∄ cgt ∈ GT(I) such that c ∼= cgt.

Regarding a candidate cgt ∈ GT(I), we apply:

• cgt is an α–false negative (FNα), if ∄ c ∈ A(I)α such that c ∼= cgt.

Finally,

• each point in {1, …, c} × {1, …, r} \ (GT(I) ∪ A(I)α) is an α–true negative (TNα).
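A possible implementation of this classification, reusing the hypothetical is_same_object helper of the previous sketch, may look as follows; only |TNα| is computed, since enumerating the α–true negative points is rarely needed in practice:

```python
def classify_candidates(fused_alpha, ground_truth, T_d, shape):
    """Split the alpha-level candidates and the ground truth points of one
    image into TP_alpha, FP_alpha and FN_alpha; TN_alpha is only counted."""
    tp = {c for c in fused_alpha
          if any(is_same_object(c, c_gt, T_d) for c_gt in ground_truth)}
    fp = fused_alpha - tp
    fn = {c_gt for c_gt in ground_truth
          if not any(is_same_object(c, c_gt, T_d) for c in fused_alpha)}
    r, c = shape
    tn_count = r * c - len(ground_truth | fused_alpha)   # |TN_alpha|
    return tp, fp, fn, tn_count
```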

The sets of all true positives, false positives, false negatives, and true negatives for a given image I will be denoted by TP(I)α, FP(I)α, FN(I)α, and TN(I)α, respectively. Notice that GT(I) is usually a manually annotated set created by experts of the application field. Moreover, since performance evaluation is usually expected to be given at database level, digital images are often organized into an image database DB. Now, to calculate the performance of an ensemble A regarding some ground truth, we introduce the following classic measures at both image and image database level [57, 58]:

• Competition Performance Metric:

CPM(DB) = (1/|G|) ∑_{g∈G} {SEN(DB)α : FPI(DB)α = g},

where G is a predefined set of false positives per image (FPI) levels, SEN(DB)α denotes the sensitivity of the ensemble on DB at confidence level α, and FPI(DB)α is the corresponding average number of false positives per image.
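Assuming that the sensitivities have already been measured at the FPI levels of G, the metric itself reduces to a simple average, as in the following sketch; the concrete levels of G below are only illustrative, similar sets being used in competition-style evaluations:

```python
def cpm(sen_at_fpi):
    """Competition Performance Metric: the mean sensitivity taken over the
    predefined false-positives-per-image levels g in G."""
    return sum(sen_at_fpi.values()) / len(sen_at_fpi)

# Hypothetical measurements: SEN(DB)_alpha at the alpha where FPI(DB)_alpha = g.
sen_at_fpi = {0.125: 0.61, 0.25: 0.68, 0.5: 0.73,
              1: 0.79, 2: 0.83, 4: 0.86, 8: 0.88}
print(f"CPM = {cpm(sen_at_fpi):.3f}")      # (0.61 + ... + 0.88) / 7
```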