
5.2.1 3D CAD program

5.3 Transformation decomposition

5.3.1 Problem formulation

The coordinates of the features of the recognized objects are known in two coordinate systems, see Figure 5.1:

• Common projective frame of the scene. In this coordinate system every feature is described by its projective coordinates. The frame is shared by every object in the scene but contains no Euclidean information. This information is the output of the projective reconstruction.

• Euclidean frame of an object (attached to the object) as stored in the object database. This information is object dependent and contains metric data. Every object has its own Euclidean frame.

This twofold description of the recognized features makes it possible to determine the relative Euclidean transformation (position, orientation) between the object frames as they occur in the scene. If one of the frames is absolute (known in the world reference frame), then the scene can be described in the absolute Euclidean coordinate frame.

The candidate collineations between the common projective space of the scene and the local frame of the recognized object are already determined as the output of the verification step of the object recognition, see Section 4.4. During consolidation (see Section 4.5) the collineation is updated from the candidates that belong to the given cluster, so it is the most accurate estimate available. However, no internal constraint can be applied to the elements of the collineation during projective reconstruction. The elements therefore depend only on the data from which they are estimated, and there is no interdependency between them.

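[Figure: Frame A and Frame B, each with X, Y, Z axes, are linked to the common projective frame by the collineations H_A and H_B and to each other by the displacement D.]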

Figure 5.1: Reference frames used in the displacement calculation

However, calculating the Euclidean transformation between objects allows additional constraints to be introduced. Let the collineations of two recognized objects be $H_A$ and $H_B$, respectively, and suppose that the collineations describe the mapping from the scene frame into the Euclidean object frames. In this case the displacement that describes the mapping from the metric frame of object A into the metric frame of object B can be calculated as $D = H_B^{-1} H_A$. But the matrix

\[
D_{4\times 4} = s \begin{bmatrix} \Omega & t \\ z^T & 1 \end{bmatrix}
\]

describes a metric (Euclidean) transformation, therefore there are some constraints that must be fulfilled:

• $\Omega$ should be a rotation matrix

– Orthogonality condition: $\Omega\Omega^T = \Omega^T\Omega = I$

– Non-reflection condition: $|\Omega| = +1$

• $\|z^T\|_2 = 0$

• The value of $s$ must not be too small or too large (valid scaling)

Due to noise and other disturbances the matrix $\Omega$ is in general not a rotation matrix. The aim is to determine the "closest" rotation matrix $R$ to the matrix $\Omega$ by minimizing $\|R - \Omega\|_F^2$ such that $R^T R - I = 0$ and $|R| = +1$, where $\|\cdot\|_F$ denotes the vector-compatible Frobenius norm.

5.3.2 Solution techniques

The solution can be found by factorizing the matrix $\Omega$. The most commonly used matrix factorization algorithms that yield an orthogonal matrix are the following [93]:

• Singular Value Decomposition (SVD) gives $\Omega = U \Lambda V^T$, where $U$ and $V$ are orthogonal matrices. The drawbacks of the algorithm are:

– A small perturbation can yield a very different orthogonal factorization (though the singular values remain stable).

– Theoretically there are infinitely many ways in which a rotation matrix can be composed from two other rotations.

• QR decomposition gives $\Omega = QR$, where $Q$ is an orthogonal and $R$ is an upper-triangular matrix. The drawback is that the resulting orthogonal matrix is basis-dependent.

• Polar decomposition gives $\Omega = RS$, where $R$ is an orthogonal and $S$ is a symmetric positive definite matrix (discussed in detail in Section 5.3.3). If $|R| = -1$ (reflection included), then the decomposition can be written in the form $\Omega = (-R)(-I)S$.

Using the polar decomposition, let the original displacement matrix be factorized into the following form:

\[
D_{4\times 4} = s\,
\underbrace{\begin{bmatrix} I & 0 \\ z^T & 1 \end{bmatrix}}_{P}
\underbrace{\begin{bmatrix} I & t \\ 0^T & 1 \end{bmatrix}}_{T}
\underbrace{\begin{bmatrix} R & 0 \\ 0^T & 1 \end{bmatrix}}_{\Theta}
\underbrace{\begin{bmatrix} V & 0 \\ 0^T & 1 \end{bmatrix}}_{\Delta}
\underbrace{\begin{bmatrix} S & 0 \\ 0^T & 1 \end{bmatrix}}_{\Sigma}
\]

where $s$ is a scale and the matrices are responsible for perspectivity ($P$), translation ($T$), rotation ($\Theta$), mirroring ($\Delta$) and stretch (transformed shear, $\Sigma$), respectively.

Using the above decomposition the constraints can be expressed as limits on physically meaningful quantities:

• The perspectivity must be an identity matrix, which means that $\|z^T\|_2 = 0$.

• The mirroring must be an identity matrix: $V = I$.

• The stretch must be an identity matrix, $S = I$, or in a more general case a diagonal matrix, yielding an isotropically or anisotropically scaled Euclidean transformation.

• The value of $s$ should not be too small or too large (valid scaling).

• The translational part $t$ is unconstrained.

The output of the algorithm is the Euclidean (metric) transformation between the two reference frames of the objects A and B that describes the relative position and orientation between two recognized objects. This information can directly be used in a robot control system.
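The constraint checks above can be illustrated with a minimal NumPy sketch. It reads the blocks $s$, $\Omega$, $t$ and $z$ directly from the form $D = s\,[\Omega\ t;\ z^T\ 1]$ of Section 5.3.1 and replaces $\Omega$ by the orthogonal factor of its polar decomposition. The function name, the returned report and all tolerance values are hypothetical and chosen only for illustration; the text does not specify what counts as a "valid" scale or stretch.

```python
import numpy as np

def euclidean_from_displacement(D, z_tol=1e-2, stretch_tol=1e-1, s_range=(0.5, 2.0)):
    """Project D = s [[Omega, t], [z^T, 1]] onto a rigid transformation.

    The tolerances are hypothetical placeholders, not values from the text.
    """
    s = D[3, 3]                      # overall scale (assumed nonzero)
    Dn = D / s
    Omega, t, z = Dn[:3, :3], Dn[:3, 3], Dn[3, :3]

    # Polar decomposition Omega = R S through the SVD Omega = U diag(sig) V^T.
    U, sig, Vt = np.linalg.svd(Omega)
    R = U @ Vt
    S = Vt.T @ np.diag(sig) @ Vt

    mirrored = np.linalg.det(R) < 0  # Omega = (-R)(-I)S, i.e. mirroring V = -I
    if mirrored:
        R = -R

    report = {
        "scale": float(s),
        "perspectivity": float(np.linalg.norm(z)),
        "mirrored": bool(mirrored),
        "stretch": float(np.linalg.norm(S - np.eye(3))),
    }
    report["valid"] = bool(
        report["perspectivity"] <= z_tol
        and not mirrored
        and report["stretch"] <= stretch_tol
        and s_range[0] <= s <= s_range[1]
    )

    E = np.eye(4)
    E[:3, :3] = R                    # constrained rotation
    E[:3, 3] = t                     # translation is left unconstrained
    return E, report

# Synthetic example: a rotation about z, slightly disturbed and scaled by 1.1.
c, sn = np.cos(0.5), np.sin(0.5)
R_true = np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])
D = np.eye(4)
D[:3, :3] = R_true + 1e-3 * np.array([[1.0, -2.0, 0.5], [0.3, 0.7, -1.0], [-0.4, 0.2, 0.9]])
D[:3, 3] = [0.1, -0.2, 0.3]
D[3, :3] = [1e-4, -2e-4, 5e-5]
D = 1.1 * D

E, report = euclidean_from_displacement(D)
```

The report keeps the individual quantities (scale, perspectivity norm, mirroring flag, stretch deviation) separate, so a caller can decide which violation should reject a recognized object pair.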

5.3.3 Polar decomposition

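The polar decomposition problem can be formulated as follows: given the matrix $Q \in \mathbb{R}^{N\times N}$, find $R \in \mathbb{R}^{N\times N}$ that solves

\[
\min_R \|R - Q\|_F^2 \quad \text{such that} \quad R^T R - I = 0 \tag{5.1}
\]

where $\|\cdot\|_F$ denotes the vector-compatible Frobenius norm, $A = [a_{ik}] \rightarrow \|A\|_F^2 = \sum_i \sum_k a_{ik}^2 = \mathrm{trace}(A^T A)$.

With the notation $R = (r_1\ r_2\ r_3)$ the constraint is equivalent to $r_1^T r_1 = 1$, $r_1^T r_2 = 0$, $r_1^T r_3 = 0$, $r_2^T r_2 = 1$, $r_2^T r_3 = 0$, $r_3^T r_3 = 1$, therefore six Lagrange multipliers $\lambda_{11}, \lambda_{12}, \lambda_{13}, \lambda_{22}, \lambda_{23}, \lambda_{33}$ are needed, which can be collected in a symmetric matrix $\Lambda$.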

The polar decomposition method factors the matrix Q as Q = RS, where R is an orthogonal matrix and S is symmetric positive definite matrix.

Simple computation shows that

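\[
(R^T R - I)\Lambda =
\begin{bmatrix}
r_1^T r_1 - 1 & r_1^T r_2 & r_1^T r_3 \\
r_1^T r_2 & r_2^T r_2 - 1 & r_2^T r_3 \\
r_1^T r_3 & r_2^T r_3 & r_3^T r_3 - 1
\end{bmatrix}
\begin{bmatrix}
\lambda_{11} & \lambda_{12} & \lambda_{13} \\
\lambda_{12} & \lambda_{22} & \lambda_{23} \\
\lambda_{13} & \lambda_{23} & \lambda_{33}
\end{bmatrix}
\]
\[
=
\begin{bmatrix}
(r_1^T r_1 - 1)\lambda_{11} + r_1^T r_2 \lambda_{12} + r_1^T r_3 \lambda_{13} & \star & \star \\
\star & r_1^T r_2 \lambda_{12} + (r_2^T r_2 - 1)\lambda_{22} + r_2^T r_3 \lambda_{23} & \star \\
\star & \star & r_1^T r_3 \lambda_{13} + r_2^T r_3 \lambda_{23} + (r_3^T r_3 - 1)\lambda_{33}
\end{bmatrix}
\]

\[
\mathrm{trace}\left[(R^T R - I)\Lambda\right] = (r_1^T r_1 - 1)\lambda_{11} + 2\lambda_{12} r_1^T r_2 + 2\lambda_{13} r_1^T r_3 + \lambda_{22}(r_2^T r_2 - 1) + 2\lambda_{23} r_2^T r_3 + \lambda_{33}(r_3^T r_3 - 1)
\]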

Since $r_i^T r_j = 0 \Leftrightarrow 2 r_i^T r_j = 0$ for all $i, j$, the constraints can be taken into account by the Lagrange multiplier rule through the term $\mathrm{trace}\left[(R^T R - I)\Lambda\right]$, where $\Lambda$ is symmetric.

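Writing $\|R - Q\|_F^2$ as $\mathrm{trace}((R - Q)^T (R - Q))$ and applying the Lagrange multiplier rule yields

\[
L = \mathrm{trace}\left((R - Q)^T (R - Q) + (R^T R - I)\Lambda\right) \tag{5.2}
\]

where $\Lambda$ is the Lagrange multiplier. Making the derivative operator $dL(\cdot)$ equal to the zero operator $O(\cdot)$ yields

\[
\begin{aligned}
O(\cdot) = dL(\cdot) &= \mathrm{trace}\left[(\cdot)^T R + R^T (\cdot) - Q^T (\cdot) - (\cdot)^T Q + (\cdot)^T R \Lambda + R^T (\cdot) \Lambda\right] \\
&= \mathrm{trace}\left[2 (R^T - Q^T + \Lambda R^T)(\cdot)\right] \\
&= \mathrm{trace}\left[2 A (\cdot)\right]
\end{aligned} \tag{5.3}
\]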

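Since

\[
dL(\cdot) = O(\cdot) \Leftrightarrow \forall X : \mathrm{trace}(AX) = 0 \tag{5.4}
\]

and

\[
\mathrm{trace}(AX) = \sum_i (AX)(i,i) = \sum_i e_i^T (AX) e_i = \sum_i e_i^T \sum_\mu \sum_\nu A(\mu,\nu)\, e_\mu e_\nu^T X e_i = \sum_\mu \sum_\nu A(\mu,\nu)\, e_\nu^T X e_\mu ,
\]

choosing $X = e_i e_j^T$ gives

\[
\sum_\mu \sum_\nu A(\mu,\nu)\, e_\nu^T e_i e_j^T e_\mu = A(j,i) = 0, \quad \forall i, j \ \Rightarrow\ A^T = 0
\]

and thus

\[
R - Q + R\Lambda = 0 .
\]

Rearranging the equation gives

\[
R \underbrace{(I + \Lambda)}_{\text{symmetric}} = RS = Q .
\]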

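In order to express the elements of $S$ with the elements of $Q$ (that is, to eliminate the unknowns in $\Lambda$), the following form is used:

\[
R^T R = I \quad \text{and} \quad R = Q S^{-1} \tag{5.5}
\]

A symmetric matrix has a symmetric inverse, therefore

\[
(Q S^{-1})^T (Q S^{-1}) = I \ \Rightarrow\ S^{-1} Q^T Q S^{-1} = I .
\]

This gives

\[
S^2 = Q^T Q
\]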

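which is always symmetric and positive definite (note: $Q$ is nonsingular). Using the SVD of $S^2$ and the fact that $S^2$ is symmetric yields

\[
Q^T Q = S^2 \xrightarrow{\ \mathrm{SVD}\ } S^2 = U_{S^2} \Sigma_{S^2} U_{S^2}^T \quad \text{and} \quad \Sigma_{S^2} = \langle \sigma_1\ \sigma_2\ \sigma_3 \rangle \ \text{with}\ \sigma_1 \geq \sigma_2 \geq \sigma_3 \geq 0 \tag{5.6}
\]

Since $S$ is symmetric and positive definite, $S$ is the positive definite square root of $S^2$:

\[
S = U_{S^2} \Sigma_{S^2}^{\frac{1}{2}} U_{S^2}^T \quad \text{and} \quad \Sigma_{S^2}^{\frac{1}{2}} = \langle +\sqrt{\sigma_1}\ \ +\sqrt{\sigma_2}\ \ +\sqrt{\sigma_3} \rangle \tag{5.7}
\]

On the other hand, the second derivative of the objective function $\|R - Q\|_F^2$ is twice the unit matrix, $2I > 0$ (positive definite Hessian matrix), hence the sufficient condition of the global optimum is also satisfied with the above $S$.

Therefore the global optimal solution is

\[
R = Q S^{-1} = Q U_{S^2} \Sigma_{S^2}^{-\frac{1}{2}} U_{S^2}^T \tag{5.8}
\]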

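Note: if the SVD of the matrix is already given, then the polar decomposition of $Q$ can be calculated as

\[
Q^T Q \xrightarrow{\ \mathrm{SVD}\ } U \Sigma U^T \ \Rightarrow\ S = U \Sigma^{\frac{1}{2}} U^T \ \Rightarrow\ R = Q S^{-1} = Q U \Sigma^{-\frac{1}{2}} U^T \tag{5.9}
\]

\[
Q \xrightarrow{\ \mathrm{SVD}\ } U \Sigma V^T \ \Rightarrow\ Q = (U V^T)(V \Sigma V^T) = RS \tag{5.10}
\]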
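The two routes (5.6)-(5.8) and (5.10) can be compared numerically with a short NumPy sketch. The function names are hypothetical. As an assumption of this sketch, the symmetric eigen-decomposition `numpy.linalg.eigh` is used in place of the SVD of $S^2$; for a symmetric positive definite matrix the two coincide up to the ordering of the factors.

```python
import numpy as np

def polar_via_gram(Q):
    """Polar factors Q = R S from S^2 = Q^T Q, following (5.6)-(5.8)."""
    sigma, U = np.linalg.eigh(Q.T @ Q)            # S^2 is symmetric
    S = U @ np.diag(np.sqrt(sigma)) @ U.T         # positive definite square root
    R = Q @ U @ np.diag(1.0 / np.sqrt(sigma)) @ U.T   # R = Q S^{-1}
    return R, S

def polar_via_svd(Q):
    """Polar factors from the SVD Q = U diag(sigma) V^T, following (5.10)."""
    U, sigma, Vt = np.linalg.svd(Q)
    R = U @ Vt
    S = Vt.T @ np.diag(sigma) @ Vt
    return R, S

# A rotation about the z axis, disturbed by a small fixed perturbation.
c, s = np.cos(0.3), np.sin(0.3)
Q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
Q = Q + 0.01 * np.array([[1.0, -2.0, 0.5], [0.3, 0.7, -1.0], [-0.4, 0.2, 0.9]])

R1, S1 = polar_via_gram(Q)
R2, S2 = polar_via_svd(Q)
```

For a nonsingular $Q$ the polar factors are unique, so both functions are expected to return the same $R$ and $S$ up to rounding. Neither function enforces $|R| = +1$; the reflection case has to be handled separately as described in Section 5.3.2.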

5.3.4 Alternative nonlinear method for transformation update

The original implementation of the system uses a nonlinear method to determine the best estimate of the Euclidean transformation that fulfills the constraints (in the case of points only). In the situation depicted in Figure 5.1, the Euclidean coordinates of a point of object B (with reference frame B) in the frame of object A can be expressed in two ways. The first is the direct application of the collineation $H_A$ that maps from the projective to the Euclidean frame. The second is to transform the coordinates into the frame of object B using $H_B$ and then apply $D$. Using the notation of Section 4.4, the relation can be written as

\[
H_A X_j \simeq D H_B X_j \tag{5.11}
\]

This equation can be rewritten in the form $V_A = D V_B$. After rescaling all of the $V_i$ so that they represent Euclidean coordinates, the displacement can be calculated in closed form using quaternions, see [61], [60].
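A closed-form quaternion solution of this kind can be sketched as follows. The sketch assumes the well-known construction in which the unit quaternion is the eigenvector belonging to the largest eigenvalue of a symmetric 4×4 matrix built from the cross-covariance of the centred point sets; whether this is exactly the variant used in [61], [60] is an assumption. The function names are hypothetical, and the inputs are assumed to be 3×n arrays of already rescaled Euclidean coordinates.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix of the unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def displacement_from_points(VA, VB):
    """Least-squares rigid D with VA ~ D VB for 3xn Euclidean point arrays."""
    ca = VA.mean(axis=1, keepdims=True)
    cb = VB.mean(axis=1, keepdims=True)
    M = (VB - cb) @ (VA - ca).T          # cross-covariance, frame B to frame A
    Sxx, Sxy, Sxz = M[0]
    Syx, Syy, Syz = M[1]
    Szx, Szy, Szz = M[2]
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
        [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
        [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz],
    ])
    _, vecs = np.linalg.eigh(N)          # eigenvalues in ascending order
    R = quat_to_rot(vecs[:, -1])         # eigenvector of the largest eigenvalue
    D = np.eye(4)
    D[:3, :3] = R
    D[:3, 3] = (ca - R @ cb).ravel()
    return D

# Synthetic check: four non-coplanar points moved by a known displacement.
cz, sz = np.cos(0.4), np.sin(0.4)
cx, sx = np.cos(-0.7), np.sin(-0.7)
Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
R_true = Rz @ Rx
t_true = np.array([0.5, -1.0, 2.0])
VB = np.array([[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]])
VA = R_true @ VB + t_true[:, None]

D_est = displacement_from_points(VA, VB)
```

Because the rotation is parameterized by a unit quaternion, the result is a proper rotation by construction, so no separate orthogonality or reflection check is needed in this step.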

In order to put all the results together, a refinement step has also been developed. The relation (5.11) can be written as the equality

\[
\sum_{j=1}^{n_P} \left( \lambda H_A X_j - D H_B X_j \right) = 0
\]

The unknowns are the elements of $D$, $H_A$, $H_B$ and $\lambda$. Constraints must be introduced so that $D$ keeps the desired form. Using the properties of the rotation matrix contained in $D$, the constraints are the following:

\[
\begin{array}{rcll}
\sum_{k=1}^{3} D(j,k)\, D(l,k) &=& 0 \quad j,l = 1,\dots,3,\ j \neq l & \text{orthogonality of the rows} \\
\sum_{k=1}^{3} D(k,j)\, D(k,l) &=& 0 \quad j,l = 1,\dots,3,\ j \neq l & \text{orthogonality of the columns} \\
\sum_{k=1}^{3} D^2(4,k) &=& 0 & \text{first three elements of the last row are zero} \\
D(4,4) - 1 &=& 0 & \text{scaling is one}
\end{array}
\]

This system of equations can be solved in the least-squares sense with the Levenberg-Marquardt method. The initial values of the unknown collineations are the output of the verification (Section 4.4), and the initial displacement is calculated with the quaternion-based method.
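The constraint equations on $D$ can be stacked into a residual vector, as the following minimal sketch shows. The function name is hypothetical and only the constraint part is shown; the data term from relation (5.11) is omitted. In a full refinement the combined residual could be passed to a Levenberg-Marquardt solver such as `scipy.optimize.least_squares` with `method="lm"`, which is mentioned only as one possible choice, not as the solver used in the original implementation.

```python
import numpy as np

def displacement_constraints(D):
    """Residuals of the constraints on a 4x4 matrix D (0-based indices)."""
    A = D[:3, :3]
    pairs = [(0, 1), (0, 2), (1, 2)]                 # index pairs with j != l
    rows = [A[j] @ A[l] for j, l in pairs]           # orthogonality of the rows
    cols = [A[:, j] @ A[:, l] for j, l in pairs]     # orthogonality of the columns
    last_row = np.sum(D[3, :3] ** 2)                 # first three elements of last row
    scale = D[3, 3] - 1.0                            # scaling is one
    return np.array(rows + cols + [last_row, scale])

# A rigid displacement satisfies every constraint.
c, s = np.cos(0.2), np.sin(0.2)
D_rigid = np.eye(4)
D_rigid[:3, :3] = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
D_rigid[:3, 3] = [1.0, 2.0, 3.0]
res_rigid = displacement_constraints(D_rigid)

# Disturbing the last row and the scale produces nonzero residuals.
D_bad = D_rigid.copy()
D_bad[3, 0] = 0.1
D_bad[3, 3] = 1.2
res_bad = displacement_constraints(D_bad)
```

Only the off-diagonal index pairs are used, matching the orthogonality conditions as listed; unit length of the rows and columns is not enforced by these equations.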

[Figure: transformation graph connecting the frames K_0, K_B, K_m, K_E, K_Conv, K_Block, K_Cam and K_Obj, with the displacement D_Block,Obj.]

Figure 5.2: Transformation graph of the robot