**3 | Basic concepts**


Figure 3.9: Sample feature distribution with observations belonging to two possible classes (indicated by blue and red).

(−∞, x] and (x, ∞), with x being either 1 or 3. As a first step, we need to calculate the contingency tables for the two potential locations of splitting X.

The contingency tables for the two cases are included in Table 3.6, and they inform us about the number of feature values that fall into a given range of X, broken down by the different class labels.

Table 3.6: Sample dataset for illustrating feature discretization using mutual information.

We can now calculate the different mutual information scores corresponding to the cases of discretizing X using the criteria X ≤ 1 or X ≤ 3. Denoting by X_t the discretized random variable that we derive from X by using the threshold t ∈ {1, 3} for creating the bins, we obtain the scores MI(X_1, Y) and MI(X_3, Y).

These results follow from applying Eq. (3.8) to the data derived from the contingency tables in Table 3.6. Based on the values obtained for MI(X_1, Y) and MI(X_3, Y), we can conclude that – according to our sample – performing discretization using the threshold t = 1 is the better choice. Can you find another data point in Figure 3.9 which, when used as a boundary for discretization, performs even better than the choice of t = 1?
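As a sketch of how such scores can be computed, the snippet below derives the mutual information between a discretized feature X_t and the class label Y directly from a 2×2 contingency table of counts. Note that the concrete tables used here are made-up stand-ins, not the actual counts of Table 3.6, and the snippet uses Python/NumPy rather than the book's Octave.

```python
import numpy as np

def mutual_information(table):
    """MI between a binarized feature and the class label,
    computed from a 2x2 contingency table of counts."""
    joint = table / table.sum()                # joint distribution p(x, y)
    px = joint.sum(axis=1, keepdims=True)      # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)      # marginal p(y)
    mask = joint > 0                           # skip zero cells (0 * log 0 = 0)
    return float((joint[mask] * np.log2(joint[mask] / (px * py)[mask])).sum())

# hypothetical contingency tables (rows: X <= t and X > t; columns: the two classes)
table_t1 = np.array([[4.0, 0.0], [1.0, 5.0]])
table_t3 = np.array([[4.0, 2.0], [1.0, 3.0]])

# for these made-up counts the threshold t = 1 yields the higher MI score
print(mutual_information(table_t1) > mutual_information(table_t3))  # True
```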

*3.4* Eigenvectors and eigenvalues

The concepts of **eigenvectors** and **eigenvalues** frequently recur during our discussion of various later topics, e.g. dimensionality reduction (Chapter 6) and graph-based data mining approaches (Chapter 8). In order to ease the understanding of those parts, we briefly revisit these important concepts here.

Given a square matrix M ∈ **R**^{n×n}, an eigenvalue–eigenvector pair for M satisfies the following equality

Mx = λx,    (3.9)


for some scalar λ and n-dimensional vector **x**. Note that any n × n matrix has n (not necessarily distinct) eigenvalue–eigenvector pairs. To this end, we shall index the different eigenvalues and eigenvectors of a matrix as λ_i, **x**_i (1 ≤ i ≤ n). In the general case, the λ_i values can be complex as well; however, the matrices that we consider in this book can always be assumed to have real eigenvalues.
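To illustrate the remark about complex eigenvalues, consider a 90-degree rotation matrix (chosen here purely for illustration, it is not an example from the book): all of its entries are real, yet its eigenvalues form the purely imaginary pair ±i.

```python
import numpy as np

# a 90-degree rotation matrix: real entries, complex eigenvalues
R = np.array([[0.0, -1.0], [1.0, 0.0]])

lams = np.linalg.eig(R)[0]
# both eigenvalues are purely imaginary: +1j and -1j
print(lams)
```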

Intuitively, an eigenvector of some matrix M is a vector whose direction does not get altered – except perhaps for a reflection – relative to its original orientation when multiplied by M. Although the direction of the eigenvectors remains intact, their magnitude can change. The rate at which an eigenvector gets scaled is exactly its corresponding eigenvalue. This means that if an eigenvector **x** has a corresponding eigenvalue of 2, then all the components of the matrix–vector product Mx are twice the original coordinates of the vector **x**.
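This scaling behaviour is easy to check numerically. The matrix below is a made-up example (not taken from the book) whose eigenvector (1, 1) has the eigenvalue 2:

```python
import numpy as np

# a made-up symmetric matrix with eigenpair (2, (1, 1))
A = np.array([[3.0, -1.0], [-1.0, 3.0]])
v = np.array([1.0, 1.0])

# multiplying the eigenvector by A doubles each of its components
print(A @ v)  # [2. 2.]
```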

It turns out that the eigenvalues of matrix M are exactly those values λ which satisfy det(M − λI) = 0, where I denotes the identity matrix and det refers to the determinant of its argument.

We can see that the definition of an eigenvalue according to Eq. (3.9) implies that the equation

(M − λI)x = **0**    (3.10)

also has to hold, with **0** marking the vector of all zeros. This kind of homogeneous system of linear equations can be trivially solved by **x** = **0**. This solution is nonetheless a trivial one, which works for any M. If we wish to avoid obtaining such a trivial – hence uninteresting – solution, the rows of (M − λI) should not be linearly independent.

Had (M − λI) been of full rank (meaning that it consisted of linearly independent rows), the only solution satisfying Eq. (3.10) would be the trivial one. The way to ensure that (M − λI) is not of full rank is to require det(M − λI) = 0 to hold. Determinants can be viewed as polynomials, hence the eigenvalues of M are the roots of the polynomial that we obtain from the determinant of the matrix M − λI. The polynomial that we construct from the determinant of (M − λI) is called the **characteristic polynomial** of M. Essentially, the λ eigenvalues of M are the roots of the characteristic polynomial derived from M.

**Example 3.5.** As a concrete example, let us determine the eigenpairs of the matrix

$$M = \begin{bmatrix} 5 & 1 \\ 4 & 8 \end{bmatrix}.$$

Based on the previous discussion, the eigenvalues of M need to be such that the determinant of the matrix

$$M - \lambda I = \begin{bmatrix} 5-\lambda & 1 \\ 4 & 8-\lambda \end{bmatrix}$$

equals zero. Calculating the determinant of a 2 × 2 matrix is easy, as all we need to do is multiply its elements across the diagonal and subtract the product of the elements in the off-diagonal, leaving us with the polynomial

$$p(\lambda) = (5-\lambda)(8-\lambda) - 1 \cdot 4 = \lambda^2 - 13\lambda + 36.$$

Finding the roots of the above quadratic equation, we get that λ_1 = 4 and λ_2 = 9. If we substitute back, we get that the eigenvector **x** accompanying the eigenvalue λ = 4 has to fulfil

$$\begin{bmatrix} 1 & 1 \\ 4 & 4 \end{bmatrix} \mathbf{x} = \mathbf{0},$$

which is a system of two linear equations with two unknowns.

Since we deliberately constructed our system of equations such that it consists of linearly dependent rows, we are free to set one of the variables in **x** to any value. Let us hence arbitrarily set x_2 to 1. What this means – because of the x_1 + x_2 = 0 and 4x_1 + 4x_2 = 0 requirements in our system of equations – is that x_1 = −x_2, meaning that the eigenvector corresponding to the eigenvalue 4 is **x** = [−1, 1]^⊤. Note that any scalar multiple of **x** would yield an equally valid solution to the above linear system of equations. In order to avoid this ambiguity, a common practice is to think of and report eigenvectors such that they have a unit norm. What this means is that the canonical way of reporting the eigenvector that we just found is to divide all of its components by its norm, i.e., √2 in the given example. One of the eigenvalue–eigenvector pairs for matrix M is hence

$$\left(4,\ \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}\right).$$

After a similar line of thought, we get for the other eigenvalue that

$$\begin{bmatrix} -4 & 1 \\ 4 & -1 \end{bmatrix} \mathbf{x} = \mathbf{0},$$

which means that the eigenvector corresponding to the eigenvalue 9 for matrix M is **x** = [1, 4]^⊤, or in its unit-normalized canonical form

$$\begin{bmatrix} 0.24254 \\ 0.97014 \end{bmatrix}.$$
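The eigenpairs derived in Example 3.5 can also be double-checked numerically. The sketch below uses Python/NumPy (rather than the book's Octave) to confirm both eigenvalues and the defining property Mx = λx:

```python
import numpy as np

M = np.array([[5.0, 1.0], [4.0, 8.0]])
eig_vals, eig_vecs = np.linalg.eig(M)

# sort the eigenvalues so the order matches Example 3.5
order = np.argsort(eig_vals)
print(eig_vals[order])  # [4. 9.]

# each column of eig_vecs is a unit-norm eigenvector satisfying M x = λ x
for lam, x in zip(eig_vals, eig_vecs.T):
    assert np.allclose(M @ x, lam * x)
```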

Let us now also see how we can obtain the eigenpairs of matrix M using Octave. Figure 3.10 and Figure 3.11 provide two alternative ways of doing so.


Figure 3.10 illustrates the usage of the convenient built-in Octave function for solving eigenproblems. The default behavior of the function eig is that it returns the eigenvalues of its argument in the form of a vector. In case we also want to know the eigenvectors corresponding to the particular eigenvalues, we can do so by using not only the default return value of the function but the tuple of matrices it can also return. In the latter case, the matrix returned second contains the eigenvalues of the argument in its main diagonal, and the matrix returned first contains one (unit-normalized) eigenvector in each of its columns. The eigenvalue in a given position along the diagonal of the second returned matrix corresponds to the eigenvector in the same column of the first returned matrix.

M = [5 1; 4 8];

# obtaining only the eigenvalues of M
eigenvals = eig(M)

>> eigenvals =

   4
   9

# obtaining both the eigenvectors and eigenvalues of M
[eigenvecs, eigenvals] = eig(M)

>> eigenvecs =

  -0.70711  -0.24254
   0.70711  -0.97014

eigenvals =

Diagonal Matrix

   4   0
   0   9

**CODE SNIPPET**

Figure 3.10: Eigencalculation using Octave

We provide an alternative way for calculating the eigenpairs of the same matrix M in Figure 3.11. Notice how the provided calculation connects to the calculation in Example 3.5. In Figure 3.11, the function roots is used for finding the roots of a polynomial determined by its coefficients in decreasing order of degree, and the function null finds (unit-normalized) vectors in the **null space** of its argument. Recall that the null space of some matrix A consists of those vectors **x** for which Ax = **0** holds. Can you anticipate the values of eig_vals, eig_vec1 and eig_vec2 in Figure 3.11? Hint: looking back at Example 3.5 and Figure 3.10 can help a lot in giving the right answer (potentially modulo a multiplier of −1 for eig_vec1 and eig_vec2).


M = [5 1; 4 8];

eig_vals = roots([1 -13 36]);

eig_vec1 = null(M - eig_vals(1)*eye(size(M)));

eig_vec2 = null(M - eig_vals(2)*eye(size(M)));

**C****ODE SNIPPET**

Figure3.11: An alternative way of determining the eigenpairs of matrixM without relying on the built-in Octave functioneig.
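The same roots-plus-null-space recipe can be sketched in Python/NumPy for readers who do not use Octave. The helper null_space_vector below is our own SVD-based stand-in for Octave's null function, not a NumPy built-in:

```python
import numpy as np

M = np.array([[5.0, 1.0], [4.0, 8.0]])

# roots of the characteristic polynomial p(λ) = λ^2 − 13λ + 36
eig_vals = np.roots([1.0, -13.0, 36.0])

def null_space_vector(A):
    """Return a unit-norm vector x with A @ x ≈ 0, via the SVD."""
    # the right singular vector of the smallest singular value spans the null space
    return np.linalg.svd(A)[2][-1]

for lam in eig_vals:
    x = null_space_vector(M - lam * np.eye(2))
    assert np.allclose(M @ x, lam * x)  # x is indeed an eigenvector for λ
```

The only difference from the Octave snippet is that null is replaced by an explicit SVD; both return a unit-norm basis of the (one-dimensional) null space, possibly up to a sign flip.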

*3.5* Summary of the chapter

This chapter introduced the basic concepts and fundamental techniques for data representation and transformation in data mining.

At the end of the chapter, we revisited the problem of eigencalculation for matrices, a technique that we will refer to multiple times throughout the remainder of the book.