

BASICS I.

Fegyverneki Sándor

University of Miskolc

Department of Applied Mathematics

matfs@uni-miskolc.hu

17th of February 2021.


1 Introduction

• Introduction

• Self-information

• Entropy

• Properties of entropy

• Notations

• References


Theory of statistical communication:

Information theory, signal detection, stochastic filtering.


Information theorists devote their efforts to quantitative examination of the following three questions:

1. What is information?

2. What are the fundamental limitations on the accuracy with which information can be transmitted?

3. What design methodologies and computational algorithms yield practical systems for communicating and storing information that perform close to the aforementioned fundamental limits?


Early work on statistical physics by L. Boltzmann (1896), L. Szilárd (1929), J. von Neumann (1932).

Communication theory: H. Nyquist (1924), R.V.L. Hartley (1928).

Mathematical model of communication theory: C.E. Shannon (1948); quick development of problems, methods and results.

A parallel theory developed independently by Norbert Wiener (1948) [160,161] also played a significant part in placing communication theory on a firm footing rooted in mathematical statistics and led to the development of the closely related discipline of cybernetics.


General Shannon's model (one-way, noiseless):

SOURCE → CODING → CHANNEL → DECODING → USER

• SOURCE: message
• CODING: translation (coding)
• CHANNEL: speed (capacity)
• DECODING: translation (decoding)


General Shannon's model (one-way, with noise):

SOURCE → CODING → CHANNEL → DECODING → USER
                     ↑
                   NOISE

• SOURCE: message
• CODING: translation (coding)
• CHANNEL: speed (capacity) + noise (errors?)
• DECODING: translation (decoding) – correction
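To make the noisy model concrete, here is a minimal sketch of the pipeline source → coding → channel (+ noise) → decoding. The 3-fold repetition code, the bit-flip probability, and all names are illustrative assumptions, not taken from the slides.

```python
import random

def encode(bits, r=3):
    """Repetition coding: repeat every source bit r times."""
    return [b for b in bits for _ in range(r)]

def channel(bits, p=0.1):
    """Binary symmetric channel: the noise flips each bit with probability p."""
    return [b ^ (random.random() < p) for b in bits]

def decode(bits, r=3):
    """Majority-vote decoding: corrects up to (r - 1) // 2 flipped bits per block."""
    return [int(sum(bits[i:i + r]) > r // 2) for i in range(0, len(bits), r)]

message = [1, 0, 1, 1, 0, 0, 1, 0]      # source: message
received = channel(encode(message))     # coding -> channel (+ noise)
print(decode(received) == message)      # decoding - correction (True with high probability)
```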


2 Self-information

Supplying information is equivalent to removing uncertainty.

That is,

information supplied = prior uncertainty − posterior uncertainty.

Let

$X = \{x_1, x_2, \dots, x_n\}$ be a finite set.

We choose one element. How much information does this choice provide?


Counterfeit coin

You are given 27 coins, 26 of which have the same weight, and one of which is lighter. You have a balance scale (a pan balance).

What is the minimum number of weighings it takes to determine which coin is the lighter one?


Note that there are three possible outcomes of each weighing: left side heavier, right side heavier, or both sides equal.

In order to do the given task in as few weighings as possible, we will need as much information from each weighing as possible. Hence, all three possibilities should be realizable for each weighing (except for the final weighing in some scenarios).

Answer: 3 weighings. In general, $\lceil \log_3 n \rceil$ weighings are needed for $n$ coins.
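A quick numeric check of this count (a sketch; the function name is ours): each weighing has at most three outcomes, so $W$ weighings can distinguish at most $3^W$ coins.

```python
def min_weighings(n):
    """Smallest W with 3**W >= n, i.e. the ceiling of log3(n):
    each weighing has 3 outcomes, so W weighings distinguish at most 3**W coins."""
    w = 0
    while 3 ** w < n:
        w += 1
    return w

print(min_weighings(27))   # 3 weighings, as in the example above
print(min_weighings(81))   # 4
```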


H.W.

(a) You are given twelve coins, eleven of which have the same weight, and one of which has a weight different from the others (either heavier or lighter, you do not know). You have a balance scale.

What is the minimum number of weighings it takes to determine which coin has the different weight, and also whether it is heavier or lighter than the rest?

(b) You are given $N$ coins, $N-1$ of which have the same weight, and one of which has a weight different from the others (either heavier or lighter, you do not know). You are allowed $W$ weighings on a balance scale.

What is the maximum value of $N$, as a function of $W$, for which you can determine which coin has the different weight, and also whether it is heavy or light?


Let

$X = \{x_1, x_2, \dots, x_n\}$ be a finite set.

How many binary digits are necessary to describe one element?

Hartley (1928):
$$I = \log_2 n.$$


Consider sequences of $m$ elements from the set $X$. The number of such sequences is $n^m$. If
$$2^{k-1} < n^m \le 2^k,$$
then $\dfrac{k}{m}$ binary digits are needed per element of the set $X$. Thus
$$\log_2 n \le \frac{k}{m} < \log_2 n + \frac{1}{m},$$
that is, by increasing $m$, the value $\log_2 n$ can be approximated arbitrarily closely.

Unit of information quantity: $1\ \text{bit} = \ln 2\ \text{nat}$.
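A short numeric illustration of the bound above (a sketch; the alphabet size $n$ is an arbitrary choice): $k$ is the smallest number of binary digits for a block of $m$ symbols, and $k/m$ approaches $\log_2 n$ from above as $m$ grows.

```python
from math import log2

n = 5                                  # alphabet size |X|, an arbitrary illustrative choice
print("log2(n) =", log2(n))
for m in (1, 10, 100, 1000):
    k = (n ** m - 1).bit_length()      # smallest k with n**m <= 2**k
    print(f"m = {m:4d}   k/m = {k / m:.4f}")   # log2(n) <= k/m < log2(n) + 1/m
```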


The elements need not be equally likely.

Shannon (1948) – probability. Let $A$ be an event and $P(A)$ the probability that $A$ occurs. His solution:
$$I = \log_2 \frac{1}{P(A)}.$$

Probability field: $(\Omega, \mathcal{F}, P)$.

Classical probability field: $\Omega = \{\omega_1, \omega_2, \dots, \omega_n\}$, $P(\{\omega_i\}) = \dfrac{1}{n}$ for all $i = 1, \dots, n$.
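A minimal sketch of Shannon's self-information in code (the function name is ours):

```python
from math import log2

def self_information(p):
    """Self-information I = log2(1/p), in bits, of an event with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return log2(1 / p)

print(self_information(1 / 2))   # 1.0 bit  (the normalization case below)
print(self_information(1 / 8))   # 3.0 bits (a rarer event carries more information)
# Additivity for independent events: I(P(A)P(B)) = I(P(A)) + I(P(B))
print(self_information(1 / 4 * 1 / 2) == self_information(1 / 4) + self_information(1 / 2))
```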


Required properties: additivity, monotonicity, normalization.

Additivity. Let $n = NM$ and
$$X = \bigcup_{i=1}^{N} E_i, \qquad E_i \cap E_j = \emptyset \ (i \ne j), \qquad |E_i| = M.$$
Two steps: first choose one subset, then choose one element of it.

Idea:
$$I(NM) = I(N) + I(M).$$


Monotonicity

Small probability ⇒ large quantity of information when the event occurs:
$$A \subset B \ \Rightarrow\ P(A) \le P(B) \ \Rightarrow\ I(A) \ge I(B).$$
In particular, since $P(A) = P(B)$ iff $P(A) \le P(B)$ and $P(A) \ge P(B)$, equal probabilities give equal information.

Therefore there exists a function $f$ such that
$$I(A) = f(P(A)).$$


Normalization

Let $I(A) = 1$ if $P(A) = \dfrac{1}{2}$ (e.g. $|X| = 2$ with equally likely elements).

$I(A) = 1\ \text{bit} = \ln 2\ \text{nats}$.


Theorem.

If $f : (0,1] \to \mathbb{R}$ and
(1) $f(p) \ge f(q)$ if $p \le q$,
(2) $f(pq) = f(p) + f(q)$,
(3) $f\!\left(\dfrac{1}{2}\right) = 1$,
then
$$f(p) = \log_2 \frac{1}{p}.$$

Proof.
Let $x := \log_2 \dfrac{1}{p}$, that is, $p = 2^{-x}$; the statement then reads $f(2^{-x}) = x$.




By condition (2),
$$f(p^n) = n f(p) \quad (\forall n \in \mathbb{N}),$$
if we apply mathematical induction. From this, if $p = \dfrac{1}{2}$, then
$$f(2^{-n}) = n.$$
Furthermore,
$$2^{-n} = \left(2^{-\frac{n}{m}}\right)^{m},$$
that is,


then
$$f\!\left(2^{-\frac{n}{m}}\right) = \frac{n}{m}.$$
Thus for all rational $x > 0$,
$$f(2^{-x}) = x.$$
If $x = 0$, then $1 = f\!\left(\tfrac{1}{2}\cdot 2^{0}\right) = f\!\left(\tfrac{1}{2}\right) + f(2^{0}) = 1 + f(1)$, that is, $f(1) = 0$.


For every irrational $x > 0$ there exist $m \in \mathbb{N}$ and $n \in \mathbb{N}$ such that
$$\frac{n}{m} \le x < \frac{n+1}{m}.$$
Then
$$\frac{n}{m} = f\!\left(2^{-\frac{n}{m}}\right) \le f(2^{-x}) \le f\!\left(2^{-\frac{n+1}{m}}\right) = \frac{n+1}{m}.$$
If $m \to \infty$, then $f(2^{-x}) = x$. Thus, for all $x \ge 0$,
$$f(p) = \log_2 \frac{1}{p}.$$


Definition.

The map $\xi : \Omega \to \mathbb{R}$ is called a random variable if
$$\{\xi < x\} = \{\omega \mid \omega \in \Omega,\ \xi(\omega) < x\} \in \mathcal{F} \qquad \forall x \in \mathbb{R}.$$

Definition.

Distribution function: $F(x) = P(\xi < x)$.

Definition.

The random variable $\xi$ is called discrete if the number of its possible values is at most countably infinite.

Sequence of values: $x_1, x_2, \dots$

Definition.

The quantity $I(\xi = x) = \log_2 \dfrac{1}{P(\xi = x)}$ is the self-information of the value $x$ of the random variable $\xi$.

Definition.


Discrete distribution: $p_i = P(\xi = x_i)$ $(i = 1, 2, \dots)$.

Theorem.

If $p_1, p_2, \dots$ is a discrete distribution, then
$$p_i \ge 0 \quad (i = 1, 2, \dots) \qquad \text{and} \qquad \sum_{i=1}^{\infty} p_i = 1.$$

Definition.

If the random variable is finite, that is, its values are $x_1, x_2, \dots, x_n$ with $p_i = P(\xi = x_i)$ $(i = 1, 2, \dots, n)$, then its expectation is
$$E(\xi) = \sum_{i=1}^{n} x_i p_i.$$
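To make the expectation formula concrete, here is a minimal sketch (the fair-die example and function name are ours, not from the slides):

```python
def expectation(values, probs):
    """E(xi) = sum_i x_i p_i for a finite discrete random variable."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(x * p for x, p in zip(values, probs))

# Fair six-sided die: values 1..6, each with probability 1/6.
print(expectation(range(1, 7), [1 / 6] * 6))   # 3.5 (up to floating-point rounding)
```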




3 Entropy

Definition.

Let
$$\mathcal{P} = \{p_1, p_2, \dots, p_n\}$$
be the discrete distribution of the random variable $\xi$. Then
$$H(\xi) = -\sum_{i=1}^{n} p_i \log_2 p_i$$
is called the entropy of $\xi$.
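A minimal sketch of the entropy formula in code (the function name is ours); terms with $p_i = 0$ are dropped, matching the convention $0 \log_2 0 = 0$ noted later:

```python
from math import log2

def entropy(probs):
    """H(xi) = -sum_i p_i log2 p_i, in bits; terms with p_i = 0 contribute 0."""
    return sum(-p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))     # 1.0 (binary case, maximum at p = 1/2)
print(entropy([1.0, 0.0]))     # 0.0 (a deterministic outcome carries no uncertainty)
print(entropy([0.25] * 4))     # 2.0 = log2(4), the uniform (classical) case
```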


Figure 3: Binary case


Figure 4: The function $x \ln x$


Note.

The probability $p = 0$: extension of the function $x \log_2 x$. By definition,
$$\lim_{x \to 0+0} x \log_2 x = 0, \quad \text{that is,} \quad 0 \log_2 0 = -0 \log_2 \frac{1}{0} = 0.$$

Note.

The entropy $H(\xi)$ is the expectation of the self-information:
$$H(\xi) = \sum_{i=1}^{n} \left(\log_2 \frac{1}{p_i}\right) p_i, \qquad x_i = \log_2 \frac{1}{p_i}.$$

Notations:
$$H(\xi) = H(\mathcal{P}) = H_n(p_1, p_2, \dots, p_n) = H(p_1, p_2, \dots, p_n).$$


Figure 5: Ternary case for entropy


4 Properties of entropy

1. $H_n(p_1, p_2, \dots, p_n) \ge 0$.

Proof.
$$H(\xi) = \sum_{i=1}^{n} \left(\log_2 \frac{1}{p_i}\right) p_i,$$
and every term satisfies $p_i \log_2 \dfrac{1}{p_i} \ge 0$. $\spadesuit$

2. If $p_k = 1$ and $p_i = 0$ $(1 \le i \le n,\ i \ne k)$, then $H_n(p_1, p_2, \dots, p_n) = 0$.

3. $H_{n+1}(p_1, p_2, \dots, p_n, 0) = H_n(p_1, p_2, \dots, p_n)$.


4. $H_n(p_1, p_2, \dots, p_n) \le H_n\!\left(\dfrac{1}{n}, \dfrac{1}{n}, \dots, \dfrac{1}{n}\right) = \log_2 n$.

Proof.
The function $-\log_2 x$ is convex. Apply Jensen's inequality,
$$E(f(\xi)) \ge f(E(\xi)).$$
Equality holds for the classical probability field, i.e., Hartley's case. $\spadesuit$

5. $H(\xi)$ is a continuous function.

6. $H_n(p_1, p_2, \dots, p_n)$ is symmetric in the probabilities.
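A quick numeric check of properties 4 and 6 (a sketch; the example distributions are arbitrary, and the entropy helper is redefined here so the snippet is self-contained):

```python
from math import isclose, log2

def entropy(probs):
    return sum(-p * log2(p) for p in probs if p > 0)

for probs in ([0.5, 0.25, 0.125, 0.125], [0.7, 0.1, 0.1, 0.1], [0.25] * 4):
    print(round(entropy(probs), 4), "<=", log2(len(probs)))   # property 4: H <= log2 n

# Property 6: entropy is invariant under permuting the probabilities.
print(isclose(entropy([0.7, 0.2, 0.1]), entropy([0.1, 0.7, 0.2])))   # True
```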


7. (Branching) If $q_n = p_1 + p_2 + \dots + p_m$, then
$$H_{n+m-1}(q_1, q_2, \dots, q_{n-1}, p_1, p_2, \dots, p_m) = H_n(q_1, q_2, \dots, q_n) + q_n H_m\!\left(\frac{p_1}{q_n}, \frac{p_2}{q_n}, \dots, \frac{p_m}{q_n}\right).$$

Proof.


$$
\begin{aligned}
H_n(q_1, q_2, \dots, q_n) &+ q_n H_m\!\left(\frac{p_1}{q_n}, \frac{p_2}{q_n}, \dots, \frac{p_m}{q_n}\right) = \\
&= -\sum_{i=1}^{n} q_i \log_2 q_i - q_n \sum_{i=1}^{m} \frac{p_i}{q_n} \log_2 \frac{p_i}{q_n} = \\
&= -\sum_{i=1}^{n} q_i \log_2 q_i - \sum_{i=1}^{m} p_i \left(\log_2 p_i - \log_2 q_n\right) = \\
&= -\sum_{i=1}^{n-1} q_i \log_2 q_i - q_n \log_2 q_n - \sum_{i=1}^{m} p_i \log_2 p_i + \log_2 q_n \sum_{i=1}^{m} p_i = \\
&= -\sum_{i=1}^{n-1} q_i \log_2 q_i - \sum_{i=1}^{m} p_i \log_2 p_i = H_{n+m-1}(q_1, \dots, q_{n-1}, p_1, \dots, p_m),
\end{aligned}
$$
since $\sum_{i=1}^{m} p_i = q_n$. $\spadesuit$
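A numeric sketch verifying the branching property for one concrete choice of distributions (the numbers are arbitrary illustrations, not from the slides):

```python
from math import isclose, log2

def entropy(probs):
    return sum(-p * log2(p) for p in probs if p > 0)

q = [0.2, 0.3, 0.5]              # q_n = 0.5 is the probability that gets split
p = [0.1, 0.15, 0.25]            # p_1 + p_2 + p_3 = q_n = 0.5

lhs = entropy(q[:-1] + p)        # H_{n+m-1}(q_1, ..., q_{n-1}, p_1, ..., p_m)
rhs = entropy(q) + q[-1] * entropy([pi / q[-1] for pi in p])
print(isclose(lhs, rhs))         # True
```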


Note.

AXIOMS:

(1) $H(\mathcal{P})$ is continuous in the distribution $\mathcal{P}$.

(2) If $p_i = \dfrac{1}{n}$ $(1 \le i \le n)$, then $H$ is monotone increasing in the value $n$.

(3) If $0 \le \lambda \le 1$, then
$$H_{n+1}(p_1, p_2, \dots, p_{n-1}, \lambda p_n, (1-\lambda) p_n) = H_n(p_1, p_2, \dots, p_n) + p_n H_2(\lambda, 1-\lambda).$$


5 Notations

$\mathbb{N}$ – set of natural numbers (positive integers)
$\mathbb{R}$ – set of real numbers
$\mathbb{R}^2$ – $\{(x, y) \mid x, y \in \mathbb{R}\}$
$A \subset B$ – $A$ is a subset of $B$
$A \cap B$ – intersection of $A$ and $B$ (common part)
$A \cup B$ – union of $A$ and $B$ (all elements in one set)
$\overline{A}$ – the elements of the base set outside $A$ (complement)
$A \setminus B$ – $A \cap \overline{B}$
$F(a+0)$ – right-side limit, that is, $\lim_{x \to a+0} F(x)$
$F(a-0)$ – left-side limit, that is, $\lim_{x \to a-0} F(x)$


$f(\cdot) : D \to \mathbb{R}$ – the map $f$ with domain $D$; the "dot" stands in place of the variable
$f(D)$ – the range (image) of the map $f$


Source alphabet: $X = \{x_1, \dots, x_n\}$ $(n \ge 2)$.

Set of source messages:
$$\mathcal{X} = \bigcup_{k=1}^{\infty} X^k.$$

Code alphabet: $Y = \{y_1, \dots, y_s\}$ $(s \ge 2)$.

Set of code messages:
$$\mathcal{Y} = \bigcup_{k=1}^{\infty} Y^k.$$


6 End

Thank you for your attention.


References

[1] J. Aczél, Z. Daróczy: On Measures of Information and Their Characterization, Academic Press, New York, 1975.

[2] S. Arimoto: An algorithm for calculating the capacity of an arbitrary discrete memoryless channel, IEEE Trans. Inform. Theory, IT-18, 1972, pp. 14-20.

[3] R.B. Ash: Information Theory, Interscience, New York, 1965.

[4] J. Berstel, D. Perrin: Theory of Codes, Academic Press, New York, 2002.

[5] R. Blahut: Computation of channel capacity and rate-distortion functions, IEEE Trans. Inform. Theory, IT-18, 1972.


[6] T.M. Cover, J.A. Thomas: Elements of Information Theory, Wiley, New York, 1991.

[7] S. Guiasu: Information Theory with Applications, McGraw-Hill, New York, 1977.

[8] M. Jimbo, K. Kunisawa: An Iteration Method for Calculating the Relative Capacity, Department of Information Sciences, Faculty of Science and Technology, Science University of Tokyo, Noda City, Chiba 278, Japan.

[KS13] M. Kelbert, Y. Suhov: Information Theory and Coding by Example, Cambridge University Press, 2013.

[9] C.E. Shannon, W. Weaver: A Mathematical Theory of Communication, The Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656, July, October, 1948.


[10] Xue-Bin Liang: An Algebraic, Analytic and Algorithmic Investigation on the Capacity and Capacity-Achieving Input Probability Distributions of Finite-Input Finite-Output Discrete Memoryless Channels, Department of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA 70803, 2004.
