application

M. Telek 1, S. Rácz 2

1 Department of Telecommunications
Technical University of Budapest, 1521 Budapest, Hungary
2 Ericsson Research, Hungary
telek@hit.bme.hu, Sandor.Racz@eth.ericsson.se
Abstract

This paper studies a class of Markov reward models where a portion of the accumulated reward is lost at the state transitions, and provides the analytical description of this model class in Laplace transform domain. Based on the transform domain description, a numerical method is introduced to evaluate the moments of the transient cumulative reward. For the special case when the underlying CTMC is stationary, a more effective analysis approach is proposed. The applicability of partial loss reward models and of the proposed numerical analysis methods is demonstrated via the performance analysis of a computer system executing long running batch programs with checkpointing.

Keywords: Markov reward models, partial reward loss, reverse Markov chain, checkpointing.
1 Introduction

Reward models have been effectively used for the performance analysis of real life computer and communication systems for a long time [11]. Reward models are composed of a discrete state continuous time stochastic process (referred to as system state process or background process) describing the behavior of the studied system, and an associated reward function describing the performance measure of interest. It is most frequently assumed that the background process is a continuous time Markov chain (CTMC), but analysis results are available for semi-Markov [9] and Markov regenerative [15] background processes as well. It is a common feature of the applied reward functions that during the sojourn of the background process in state i reward is accumulated at rate r_i (r_i ≥ 0). The difference between the reward functions of different reward models lies in the effect of the state transitions on the accumulated reward. Let T_n be the time of the nth state transition of the background process. The reward models studied in the literature so far can be classified according to the effect of a state transition on the accumulated reward B(t):

- preemptive resume (prs): the amount of accumulated reward is not affected by the state transition: B(T_n^-) = B(T_n^+) (Fig. 1a) [9];

- preemptive repeat (prt): the accumulated reward is completely lost: B(T_n^+) = 0 (Fig. 1b) [2];

- impulse reward model: the accumulated reward is increased by a random quantity: B(T_n^+) = B(T_n^-) + D_ij (Fig. 1c) [14];

- partial loss model: a portion of the accumulated reward is lost at the state transition: 0 ≤ B(T_n^+) ≤ B(T_n^-) (Fig. 1d).

The two subclasses of partial loss models are:

- partial total loss models: a fraction of the total accumulated reward is lost at state transitions, B(T_n^+) = α_i B(T_n^-), where 0 ≤ α_i ≤ 1 and i is the state of the system between T_{n-1} and T_n [1];

- partial incremental loss models: the amount of lost reward is a fraction of the reward accumulated during the sojourn in the last state, B(T_n^+) = B(T_{n-1}^+) + α_i [B(T_n^-) − B(T_{n-1}^+)], with 0 ≤ α_i ≤ 1 [1, 13].

From the reward loss point of view the prs and the prt reward models are the two extreme cases of rate-reward models, because the first one represents no reward loss at all and the second one represents complete loss of all previously accumulated reward. Analytical results are available for these two extreme cases. There was an evident need to handle intermediate cases as well, because real life systems often behave between these two extremes, but analytical results were not available. The classification and the analytical description of partial loss reward models were presented in [1, 13], and a computation method was proposed for the analysis of partial incremental loss models in [13]. In this paper we present two new computationally effective analysis methods for the evaluation of the accumulated reward of partial incremental loss models.

Several effective numerical procedures have been proposed for the analysis of reward models, mainly with a CTMC background process and with prs reward accumulation [5, 6, 12, 16]. It turned out that the analysis of the distribution of the reward measures [5, 6, 12] is computationally much harder than the analysis of the moments of the same measures [16]. The moments of reward measures of prs Markov reward models can be calculated at approximately the same computational cost as the transient analysis of the background CTMC. In general, the analysis of partial reward loss models is more complex than the analysis of prs reward models. Both numerical analysis methods presented in this paper use the analysis of prs reward models as an elementary step of the procedure. To keep the overall computational cost as low as possible, we calculate only the moments of the accumulated reward and apply the effective method presented in [16] for the embedded calculation of prs models.

There are two main classes of reward measures [8]. From the system oriented point of view the most significant measure is the total amount of reward accumulated by the system in a finite interval. This measure is often referred to as performability. From the user oriented (or task oriented) point of view the system is regarded as a server, and the emphasis of the analysis is on the time the system needs to accomplish an assigned task. Consequently, the most characteristic measure becomes the completion time. The numerical analysis of the first measure is considered in this paper.

The aim of this paper is to present numerical methods to evaluate partial loss models and to demonstrate the applicability of partial loss models in the analysis of computer systems with checkpointing.
[Figure 1 about here: plots of Z(t) and B(t) against t for a sample path visiting states i, j, k, with reward rates r_i, r_j, r_k and, in panel c, impulse rewards D_ik, D_kj, D_jk.]

Figure 1: Change of accumulated reward at state transitions: a) preemptive resume (prs), b) preemptive repeat (prt), c) impulse reward, d) partial loss
[Figure 2 about here: Z(t) and B(t) against t with transitions at T_1, T_2, T_3 through states i, j, k; at T_3 the amount α_j [B(T_3^-) − B(T_2)] of the reward accumulated since T_2 is retained.]

Figure 2: Reward accumulation in the partial incremental loss model
The rest of the paper is organized as follows. Section 2 provides the analytical description of the accumulated reward of partial incremental loss models in double transform domain. Based on this double transform domain expression, a numerical method is proposed in Section 3 for the analysis of the accumulated reward. There is a computationally hard step in the proposed numerical method, a numerical integration. A special numerical analysis method, which is free of this computationally hard numerical integration, is proposed in Section 4; this special method is applicable only when the background stochastic process is stationary. An application example of partial loss reward models is presented in Section 5. The numerical properties of the proposed numerical methods are also demonstrated there. Finally, the paper is concluded in Section 6.
2 Partial incremental loss in Markov reward models

Let {Z(t), t ≥ 0} be a continuous time Markov chain (CTMC) on state space S = {1, 2, ..., N} with generator Q = {q_ij} and initial probability vector γ. Whenever the CTMC stays in state i, reward is accumulated at rate r_i, where r_i is a non-negative real number. When the CTMC undergoes a transition from state i to another state, the 1 − α_i fraction of the reward obtained during the last sojourn in state i is lost and only the α_i fraction of the reward obtained during the last sojourn in i remains. α_i is a real number such that 0 ≤ α_i ≤ 1. B(t) denotes the amount of accumulated reward at time t. B(t) is right continuous, i.e., B(t) = B(t^+). Let T_n be the time of the nth transition in the CTMC. Then the dynamics of the right continuous process {B(t), t ≥ 0} can be described as follows (see Figure 2):

dB(t)/dt = r_{Z(t)}   for T_n < t < T_{n+1},   (1)

B(T_n) = B(T_{n−1}) + α_{Z(T_n^−)} [B(T_n^−) − B(T_{n−1})].   (2)
The state dependent distribution of the accumulated reward is defined as

P_ij(t, w) = Pr(B(t) ≤ w, Z(t) = j | Z(0) = i)

and P(t, w) = {P_ij(t, w)}.
Theorem 1 The following double transform domain equation holds for P(t, w):

P~*(s, v) = (sI + vR_α − Q)^{−1} D(s, v),   (3)

where I is the identity matrix, ~ denotes the Laplace transform with respect to t (t → s), * denotes the Laplace-Stieltjes transform with respect to w (w → v), and the diagonal matrices R_α and D(s, v) are defined as

R_α = diag⟨α_i r_i⟩   and   D(s, v) = diag⟨ (s + v α_i r_i + q_i) / (s + v r_i + q_i) ⟩,

with q_i = −q_ii.
The proof of the theorem is provided in Appendix A.

Theorem 1 is a general result that provides, as special cases, the previously known results for the prs and for the prt case, obtained by setting α_i, ∀i ∈ S, to 1 and to 0, respectively. E.g., when α_i = 1, ∀i ∈ S, R_α becomes R = diag⟨r_i⟩ and D(s, v) vanishes (reduces to the identity matrix) in (3).

The partial loss models are the transition between the prs (no reward loss) and the prt (complete reward loss) reward models. The numerical methods that are commonly used for the analysis of the prs and the prt reward models utilize the special features of those models and cannot be applied to the analysis of partial loss models.
The behavior of the partial incremental loss model can be interpreted as follows. The reward accumulation between 0 and t* follows a traditional prs model with reduced reward rates (α_i r_i), and from time t* the prs reward accumulation goes on with the original reward rates (r_i), where t* (0 ≤ t* < t) is the instant of the last state transition before t. If there is no state transition till time t, then t* = 0. Unfortunately, t* is a complex quantity (since it depends on the evolution of the CTMC over the whole (0, t) interval), and it is hard to evaluate the partial loss models with effective numerical methods. The transform domain expression in eq. (3) reflects this model interpretation. The matrix (sI + vR_α − Q)^{−1} describes the distribution of the reward accumulated by a prs Markov reward model with generator Q and reward rates α_i r_i, and the diagonal matrix D(s, v) captures the effect of the "different" reward accumulation during the (t*, t) interval.

As a consequence of this complex behavior, the mean accumulated reward at time t cannot be evaluated based on the cumulative transient probabilities of the CTMC, as it was possible for prs reward models.
3 Numerical evaluation of the accumulated reward

In this section we propose a numerical method to evaluate the accumulated reward of partial incremental loss models. The proposed method is based on Theorem 1 and on the effective numerical method published in [16], which provides the moments of the accumulated reward of prs Markov reward models with a low computational cost and memory requirement.

To obtain a numerical procedure to evaluate the accumulated reward at time t, we inverse Laplace transform (3) with respect to the time variable (s → t). First we introduce

F*(s, v) = diag⟨ v r_i (1 − α_i) / (s + v r_i + q_i) ⟩,

whose inverse Laplace transform with respect to the time variable is

F*(t, v) = diag⟨ v r_i (1 − α_i) e^{−(v r_i + q_i) t} ⟩.

Using these matrices, and noting that D(s, v) = I − F*(s, v), we can perform a symbolic inverse Laplace transformation of (3), which results in

P*(t, v) = e^{(−vR_α + Q) t} − ∫_{τ=0}^{t} e^{(−vR_α + Q) τ} F*(t − τ, v) dτ.   (4)
The moments of the accumulated reward are obtained from (4) as

E(B^n(t)) = (−1)^n γ [ d^n/dv^n P*(t, v) |_{v=0} ] h^T,

where γ is the initial probability vector and h is the row vector of ones. The nth derivative of P*(t, v) at v = 0 can be calculated as

d^n/dv^n P*(t, v) |_{v=0} = d^n/dv^n e^{(−vR_α + Q) t} |_{v=0}
− ∫_{τ=0}^{t} Σ_{ℓ=0}^{n} (n choose ℓ) [ d^ℓ/dv^ℓ e^{(−vR_α + Q) τ} |_{v=0} ] [ d^{n−ℓ}/dv^{n−ℓ} F*(t − τ, v) |_{v=0} ] dτ,   (5)

where the 0th derivative is the function itself. Since F*(τ, v) is a diagonal matrix, the ℓth derivative of F*(τ, v) at v = 0 can be calculated in a computationally cheap way as

d^ℓ/dv^ℓ F*(τ, v) |_{v=0} = diag⟨ r_i (1 − α_i) ℓ (−r_i τ)^{ℓ−1} e^{−q_i τ} ⟩.
Two computationally expensive steps have to be performed to evaluate the nth derivative of P*(t, v) at v = 0 based on (5). The first one is the calculation of the first n derivatives of e^{(−vR_α + Q) τ} at v = 0 at some time points τ ∈ (0, t], and the second one is the numerical integration with respect to τ. The numerical integration is not expensive itself, but it requires the calculation of the first step several times. The numerical method presented in [16] is an effective way of calculating the first n derivatives of e^{(−vR_α + Q) τ} at v = 0, hence we use it for the calculation of the first step.

The complexity of the proposed numerical procedure is much higher than that of the analysis of the same Markov reward model without reward loss, for two reasons. The first one is the mentioned numerical integration, and the second one is related to the complexity of the elementary steps of the computation of d^n/dv^n e^{(−vR_α + Q) t}. Basically, the first term in (5) provides the moments of the Markov reward model of the same CTMC with reduced reward rates (α_i r_i) and without reward loss. For the calculation of the moments it is enough to calculate only the row sum of the first term, e^{(−vR_α + Q) t}, since it is multiplied by h^T from the right. It is much faster to calculate the row sum of e^{(−vR_α + Q) t} instead of calculating the whole matrix, because the row sum can be obtained by vector-matrix multiplications, while the calculation of the whole matrix requires matrix-matrix multiplications in each elementary step of the computation [16]. Unfortunately, the second term in (5) requires the calculation of the whole matrix (using matrix-matrix multiplications), because of the multiplication by the diagonal matrix F*(t − τ, v) from the right. This is also why the model is described in transform domain: a direct description would result in double convolutions in the original (t, w) domain. In our approach one convolution is avoided due to the calculation of the moments of the accumulated reward. Since the calculation of the distribution of a prs Markov reward model is very expensive itself (it is much more expensive than calculating its moments), a direct method calculating the distribution of the accumulated reward by double numerical convolution becomes infeasible even for small models (~10 states). Instead, the numerical method for the analysis of the moments of the accumulated reward is applicable for models of ~100 states.
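For the first moment (n = 1), the row-sum observation above simplifies (5) considerably: differentiating (4) once at v = 0 and multiplying by γ and h^T leaves only the transient state probabilities γe^{Qτ}, the reduced rates α_i r_i, and the diagonal last-sojourn correction. The sketch below implements this first-moment special case; the uniformization and trapezoidal-rule discretizations and the helper names are our choices, not the paper's:

```python
import numpy as np

def transient_probs(gamma, Q, times, eps=1e-12):
    """Rows gamma * e^{Q u} for u in `times`, computed by uniformization."""
    lam = max(-Q.diagonal().min(), 1e-30)
    P = np.eye(len(gamma)) + Q / lam
    rows = []
    for u in times:
        weight = np.exp(-lam * u)          # Poisson(k=0) weight
        vec, acc, mass, k = gamma.copy(), weight * gamma, weight, 0
        while mass < 1.0 - eps and k < 10_000:
            k += 1
            vec = vec @ P
            weight *= lam * u / k
            acc = acc + weight * vec
            mass += weight
        rows.append(acc)
    return np.array(rows)

def mean_reward(gamma, Q, r, alpha, t, steps=400):
    """E(B(t)) of the partial incremental loss model: prs accumulation at
    the reduced rates alpha_i r_i plus the extra reward of the last sojourn."""
    taus = np.linspace(0.0, t, steps + 1)
    pr = transient_probs(gamma, Q, taus)               # gamma e^{Q tau}
    q = -Q.diagonal()
    reduced = pr @ (alpha * r)                         # reduced-rate prs part
    last = (pr * (r * (1 - alpha)) * np.exp(-np.outer(t - taus, q))).sum(axis=1)
    y = reduced + last
    dt = taus[1] - taus[0]                             # trapezoidal rule
    return dt * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])
```

For α_i ≡ 1 the correction term vanishes and the formula reduces to the classical prs mean ∫_0^t γ e^{Qu} R h^T du; for a chain that never leaves its initial state i the result is r_i t, since nothing is ever lost.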
4 Stationary analysis of accumulated reward

The previous sections provide a numerical method to calculate the moments of the accumulated reward of partial incremental loss models. Using that method, the evaluation of partial loss reward models is computationally much more expensive than the calculation of the prs reward models of the same size.

In this section we provide an effective computational approach that makes it possible to evaluate much larger partial incremental loss models (~10^6 states). This numerical approach allows the analysis of a special class of partial loss models where the background process is in stationary state. Note that the reward accumulation of partial incremental loss models with stationary background process has non-stationary increments on the (0, t) interval (e.g., E(B(t)) ≠ 2 E(B(t/2))), because the reward accumulated in the last state may have different effects on the overall accumulated reward.

The main idea of the proposed method is to define an equivalent prs reward model, whose accumulated reward equals the reward accumulated by the original partial loss reward model, and to evaluate the accumulated reward of the equivalent model.

The reward accumulation process of a partial loss reward model can be divided into two main parts, as mentioned above. During the (0, t*) interval the system accumulates reward at the reduced reward rates (α_i r_i) (without reward loss), and during the (t*, t) interval it accumulates at the original reward rates (r_i). If t* (and Z(t*)) were known, it would be straightforward to calculate the accumulated reward, but t* depends in a complex way on the CTMC behavior over the whole (0, t) interval; t* is not a stopping time.

To overcome this difficulty one can interpret the reward accumulation from time t towards time 0. In this case t* is simply the time instant of the first state transition in the reverse CTMC, and the reverse reward model is such that it accumulates reward at the original rates (r_i) in its first state and accumulates reward at the reduced rates (α_i r_i) after leaving the first state. To apply this approach we need the generator of the reverse CTMC.
The probability that the process is in state i at time t and in state j (j ≠ i) at t + Δ, i.e., Pr(Z(t) = i, Z(t + Δ) = j), can be calculated as

Pr(Z(t) = i) Pr(Z(t + Δ) = j | Z(t) = i) = Pr(Z(t + Δ) = j) Pr(Z(t) = i | Z(t + Δ) = j).

Dividing both sides by Δ and letting Δ → 0, we have

Pr(Z(t) = i) q_ij = Pr(Z(t) = j) q̃_ji(t),

where q̃_ji(t) is the generator of the reverse CTMC. In general this generator is time dependent, but when the CTMC is stationary (Pr(Z(t) = i) = π_i), the generator of the reverse CTMC becomes time homogeneous:

q̃_ji = (π_i / π_j) q_ij,   (6)

where π_i is the stationary probability of state i in the original (as well as the reverse) CTMC. The stationary probabilities can be obtained by solving Σ_{i∈S} π_i q_ij = 0, ∀j ∈ S, with the normalizing condition Σ_{i∈S} π_i = 1. The diagonal elements of the generator of the stationary reverse CTMC are the same as the original diagonal elements (since the reverse process spends the same time in each state as the original one). It is easy to check that the matrix Q̃ = {q̃_ij} defined by (6) is a proper generator matrix.
In case the original partial loss model starts from the stationary state, we can define an equivalent prs Markov reward model that accumulates the same amount of reward during the (0, t) interval as our original partial loss model, using the reverse interpretation of the reward accumulation. The original partial loss model is defined by (π, Q, R, R_α): the initial probability vector, which is the stationary distribution of the CTMC; the generator matrix; the diagonal matrix of the reward rates; and the diagonal matrix of the reduced reward rates. Based on this description we define an equivalent prs Markov reward model with a state space of 2|S| states by initial probability vector γ', generator matrix Q', and reward rate matrix R' as follows:

γ' = {π, 0},   Q' = [ Q̃_D   Q̃ − Q̃_D ; 0   Q̃ ],   R' = [ R   0 ; 0   R_α ],   (7)

where Q̃_D = diag⟨q̃_ii⟩ = diag⟨q_ii⟩ = Q_D is the diagonal matrix composed of the diagonal elements of Q̃ (which are the same as those of Q). Each state of the original CTMC is represented by two states in the equivalent prs Markov reward model. States 1 to |S| represent the reward accumulation with the original reward rates (r_i). The equivalent model starts from this set of states according to the stationary distribution. States |S| + 1 to 2|S| represent the reward accumulation after the first state transition, with the reduced reward rates. The structure of the Q' matrix is such that the equivalent process moves from the first set of states (states 1 to |S|) to the second one (states |S| + 1 to 2|S|) at the first state transition and remains there. The distribution of the reward accumulated during the (0, t) interval by a prs Markov reward model with initial probability vector γ', generator matrix Q', and reward rate matrix R' is (see e.g. [16])

γ' (sI' + vR' − Q')^{−1} h'^T,   (8)

where the cardinality of the identity matrix I' and of the summing vector h' is 2|S|.
The formal relation of the original partial loss model and the reverse prs Markov reward model is presented in the following theorem.

Theorem 2 The distribution of the reward accumulated by the prs Markov reward model (γ', Q', R') is identical with the distribution of the reward accumulated by the partial incremental loss Markov reward model (π, Q, R, R_α), that is (from eq. (3) and (8)):

π (sI + vR_α − Q)^{−1} D(s, v) h^T = γ' (sI' + vR' − Q')^{−1} h'^T.   (9)

The proof of the theorem is provided in Appendix B.

The equivalent reward model is a prs Markov reward model. Its analysis can be performed with effective numerical methods available in the literature. E.g., the distribution of the accumulated reward can be calculated using [12, 5, 6] and its moments using [16].
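Theorem 2 can be checked numerically at individual transform points, since both sides of (9) are just linear solves. A sketch (function names ours) that evaluates the left hand side directly and the right hand side through the 2|S|-state model of (7):

```python
import numpy as np

def lhs_transform(pi, Q, r, alpha, s, v):
    """pi (sI + v R_alpha - Q)^{-1} D(s, v) h^T, the left hand side of (9)."""
    n = len(r)
    q = -Q.diagonal()
    D = (s + v * alpha * r + q) / (s + v * r + q)   # diagonal of D(s, v)
    M = s * np.eye(n) + v * np.diag(alpha * r) - Q
    return pi @ np.linalg.solve(M, D)               # D as a column is D(s,v) h^T

def rhs_transform(pi, Q, r, alpha, s, v):
    """gamma' (sI' + v R' - Q')^{-1} h'^T with the equivalent prs model (7)."""
    n = len(r)
    Qrev = (Q.T * pi[None, :]) / pi[:, None]        # reverse generator, eq. (6)
    QD = np.diag(Qrev.diagonal())
    Qp = np.block([[QD, Qrev - QD], [np.zeros((n, n)), Qrev]])
    Rp = np.diag(np.concatenate([r, alpha * r]))
    gp = np.concatenate([pi, np.zeros(n)])
    Mp = s * np.eye(2 * n) + v * Rp - Qp
    return gp @ np.linalg.solve(Mp, np.ones(2 * n))
```

Note that the identity holds only when the model is started from π; with any other initial vector the two sides differ in general.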
It is easy to evaluate the limiting behavior of a partial loss model with stationary background CTMC. We use the following notation. B(t) is the reward accumulated by a stationary partial incremental loss model defined by (Q, R, R_α). B'(t) and B''(t) are the rewards accumulated by stationary prs reward models defined by (Q, R) and (Q, R_α), respectively. The stationary distribution of the CTMC with generator Q is π. For short intervals the loss at the first transition does not play a role, hence

lim_{t→0} B(t)/t = lim_{t→0} B'(t)/t,

and for very long intervals the reward accumulated from the last state transition to the end of the interval is negligible with respect to the total accumulated reward:

lim_{t→∞} B(t)/t = lim_{t→∞} B''(t)/t.

E.g., the limiting behavior of the mean accumulated reward can be calculated as

lim_{t→0} E(B(t))/t = lim_{t→0} E(B'(t))/t = Σ_{i∈S} π_i r_i,

lim_{t→∞} E(B(t))/t = lim_{t→∞} E(B''(t))/t = Σ_{i∈S} π_i α_i r_i.   (10)
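The two limits in (10) require only the stationary distribution; a short sketch (the helper name is ours):

```python
import numpy as np

def accumulation_rate_limits(Q, r, alpha):
    """Short-run and long-run mean accumulation rates of eq. (10):
    sum_i pi_i r_i  and  sum_i pi_i alpha_i r_i."""
    n = Q.shape[0]
    # stationary distribution: pi Q = 0 with sum(pi) = 1
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(pi @ r), float(pi @ (alpha * r))
```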
5 Performance analysis of computer systems with checkpointing

Checkpointing is a widely applied technique to improve the performance of computing servers executing long running batch programs in the presence of failures [4, 3, 7, 10]. Long running batch programs need to be re-executed in case of a system failure before the completion of the program. To reduce the extra re-execution work of the system, the actual state of the program is saved occasionally during the operational time of the system. This saved program state is used when a failure occurs. After a failure and the subsequent repair, the saved program state is reloaded and the program is re-executed from its saved state. The operation of saving the current state of the program is referred to as checkpointing, and the reload of the saved program state is called rollback.

It is a common feature of all checkpointing models that a portion of the work executed since the last system failure is lost at the next system failure, hence the amount of executed work can be analyzed using partial loss models. Finding the relation between the applied checkpointing policy and the parameters of the partial loss reward model is out of the scope of this paper. Here we follow a system level approach, which means that the parameters of the partial loss model of the analyzed computing server are assumed to be known. However, some considerations on the behavior of the analyzed system are provided below.

It is important to note that our analysis approach contains a simplifying assumption. The portion of work lost at a system failure is a random quantity. The analysis of partial loss reward models with random loss ratio is studied in [1], but unfortunately there is no effective numerical method available for their analysis. This is the reason for using a (state dependent) deterministic loss ratio here. The proposed analysis approach is composed of two major steps:

I. Generation of the partial loss Markov reward model based on the system behavior:

- characterize the state space of the model based on the system load and the failure process;
- evaluate the failure rate and the computing power assigned to the jobs under execution in each system state (r_i);
- calculate the (optimal) checkpointing rate in each system state;
- calculate the state dependent loss ratio (the portion of work that needs to be re-executed), based on the failure rate and the checkpointing rate.

II. Solution of the obtained partial loss Markov reward model.
In the following numerical example we utilize the result of step I and perform step II.

Consider a computing server executing long running batch programs. Jobs of two classes arrive to the server. Class 1 (class 2) jobs arrive according to a Poisson process with rate λ_1 (λ_2). Each of these jobs requires an exponentially distributed execution time with parameter μ_1 (μ_2) with the full computing capacity of the server. The server has finite capacity (N_MAX) and the number of class 1 (class 2) jobs cannot exceed N_1 (N_2), i.e., n_1 ≤ N_1, n_2 ≤ N_2, n_1 + n_2 ≤ N_MAX, where n_1 (n_2) is the number of class 1 (class 2) jobs in the system. The failure rate is load dependent, ω_a + ω_b (n_1 + n_2), where ω_a and ω_b are the parameters of the load independent and load dependent parts of the failure rate, respectively. The repair time, including the rollback time, is exponentially distributed as well. We use a state independent repair rate μ. (Note that the applied modeling approach can handle state dependent repair rates with the same computational complexity.) Job arrival is also allowed during repair. The computing performance of the server slightly decreases with the number of jobs under execution (e.g., due to the swapping of jobs). r_a (0 ≤ r_a ≤ 1, r_a ≈ 1) is the portion of the computing power that is utilized for job execution when there is only one job in the server. Suppose the presence of class 1 jobs increases the checkpointing rate; then the portion of useful work maintained at a system failure increases with the number of class 1 jobs. δ_a and δ_b are used to represent the load independent and load dependent parts of the useful work ratio, respectively.
With these Markovian assumptions one can easily model a wide range of service discipline schemes. We consider weighted processor sharing with state dependent weights. Our service discipline assigns a predefined portion of the computing power, φ_1 (0 < φ_1 < 1) and φ_2 = 1 − φ_1, to jobs of class 1 and class 2, respectively. Jobs of the same class are executed at the same speed. If there are only jobs of one class in the system, the whole computing capacity is utilized by that class. As a special case of this service discipline we obtain the preemptive priority service discipline when φ_2 tends to 0. In this case class 1 jobs are executed with the whole computing power of the server as long as there are class 1 jobs in the system.
Based on this system behavior, the performance of the considered computing system is analyzed using the partial loss Markov reward model defined in Table 1. The state space of the CTMC is characterized by the number of class 1 and class 2 jobs in the system and the operational condition of the system. The operational condition can be one of the following three: Good, To fail and Repair. We need to distinguish between the operational states because there is no work loss at a state transition out of a Good state, while there is some work loss at the departure from a To fail state.

State space
n_1: 0 to N_1                                      number of class 1 jobs
n_2: 0 to N_2                                      number of class 2 jobs
{Good, To fail, Repair}                            operational condition
n_1 + n_2 ≤ N_MAX

Underlying CTMC
(n_1, n_2, Good) → (n_1+1, n_2, Good)      p λ_1                                  class 1 job arrival
(n_1, n_2, Good) → (n_1+1, n_2, To fail)   q λ_1
(n_1, n_2, Repair) → (n_1+1, n_2, Repair)  λ_1
(n_1, n_2, Good) → (n_1, n_2+1, Good)      p λ_2                                  class 2 job arrival
(n_1, n_2, Good) → (n_1, n_2+1, To fail)   q λ_2
(n_1, n_2, Repair) → (n_1, n_2+1, Repair)  λ_2
(n_1, n_2, Good) → (n_1−1, n_2, Good)      p μ_1 n_1 φ_1 / (φ_1 n_1 + φ_2 n_2)    class 1 job departure
(n_1, n_2, Good) → (n_1−1, n_2, To fail)   q μ_1 n_1 φ_1 / (φ_1 n_1 + φ_2 n_2)
(n_1, n_2, Good) → (n_1, n_2−1, Good)      p μ_2 n_2 φ_2 / (φ_1 n_1 + φ_2 n_2)    class 2 job departure
(n_1, n_2, Good) → (n_1, n_2−1, To fail)   q μ_2 n_2 φ_2 / (φ_1 n_1 + φ_2 n_2)
(n_1, n_2, To fail) → (n_1, n_2, Repair)   ω_a + ω_b (n_1 + n_2)                  failure
(n_1, n_2, Repair) → (n_1, n_2, Good)      p μ                                    repair
(n_1, n_2, Repair) → (n_1, n_2, To fail)   q μ

Reward and loss structure
r(n_1, n_2, Good) = r_a^{n_1+n_2} if n_1 + n_2 > 0,   r(0, 0, Good) = 0           reward rate
r(n_1, n_2, To fail) = r_a^{n_1+n_2} if n_1 + n_2 > 0,   r(0, 0, To fail) = 0
r(n_1, n_2, Repair) = 0
α(n_1, n_2, Good) = 1                                                             useful work ratio
α(n_1, n_2, To fail) = δ_a + δ_b n_1 / (n_1 + n_2) if n_1 + n_2 > 0,   α(0, 0, To fail) = 0
α(n_1, n_2, Repair) = 0

Table 1: The partial loss Markov reward model of the computing system

The probabilities of moving to the Good and the To fail condition (i.e., p and q, respectively) are calculated based on the number of jobs in the destination state. For 0 < n_1 + n_2 < N_MAX and n_1 < N_1 and n_2 < N_2:

q = 1 − p = (ω_a + ω_b (n_1 + n_2)) / (λ_1 + λ_2 + μ_1 n_1 φ_1 / (φ_1 n_1 + φ_2 n_2) + μ_2 n_2 φ_2 / (φ_1 n_1 + φ_2 n_2) + ω_a + ω_b (n_1 + n_2));

for n_1 = n_2 = 0:

q = 1 − p = ω_a / (λ_1 + λ_2 + ω_a);

for n_1 + n_2 = N_MAX, or n_1 = N_1 and n_2 = N_2:

q = 1 − p = (ω_a + ω_b (n_1 + n_2)) / (μ_1 n_1 φ_1 / (φ_1 n_1 + φ_2 n_2) + μ_2 n_2 φ_2 / (φ_1 n_1 + φ_2 n_2) + ω_a + ω_b (n_1 + n_2));

for n_1 + n_2 < N_MAX and n_1 = N_1 and n_2 < N_2:

q = 1 − p = (ω_a + ω_b (n_1 + n_2)) / (λ_2 + μ_1 n_1 φ_1 / (φ_1 n_1 + φ_2 n_2) + μ_2 n_2 φ_2 / (φ_1 n_1 + φ_2 n_2) + ω_a + ω_b (n_1 + n_2));

and for n_1 + n_2 < N_MAX and n_1 < N_1 and n_2 = N_2:

q = 1 − p = (ω_a + ω_b (n_1 + n_2)) / (λ_1 + μ_1 n_1 φ_1 / (φ_1 n_1 + φ_2 n_2) + μ_2 n_2 φ_2 / (φ_1 n_1 + φ_2 n_2) + ω_a + ω_b (n_1 + n_2)).
The following set of system parameters was used for the numerical evaluation:

- state space: N_1 = 3, N_2 = 4, N_MAX = 6;
- job arrival and computing requirement [1/hour]: λ_1 = 0.4, λ_2 = 0.4, μ_1 = 2, μ_2 = 1;
- resource sharing between class 1 and class 2 jobs: φ_1 = 2/3, φ_2 = 1/3;
- failure and repair parameters [1/hour]: ω_a = 0.3, ω_b = 0.03, μ = 2;
- overhead parameter: r_a = 0.98;
- work loss parameters: δ_a = 0.6, δ_b = 0.05.
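Step I for these parameters can be scripted. The sketch below builds the generator of Table 1 (the state encoding, helper names and iteration order are our choices, not the paper's); with N_1 = 3, N_2 = 4, N_MAX = 6 and three operational conditions it yields the 57 state CTMC used in Figure 3:

```python
import numpy as np

L1, L2 = 0.4, 0.4            # lambda_1, lambda_2 [1/hour]
M1, M2 = 2.0, 1.0            # mu_1, mu_2
PHI1, PHI2 = 2.0 / 3.0, 1.0 / 3.0
WA, WB, MU = 0.3, 0.03, 2.0  # omega_a, omega_b, repair rate mu
N1, N2, NMAX = 3, 4, 6
GOOD, TOFAIL, REPAIR = 0, 1, 2

states = [(n1, n2, c) for n1 in range(N1 + 1) for n2 in range(N2 + 1)
          if n1 + n2 <= NMAX for c in (GOOD, TOFAIL, REPAIR)]
idx = {s: k for k, s in enumerate(states)}
Q = np.zeros((len(states), len(states)))

def q_fail(n1, n2):
    """Probability q of landing in the To fail condition, computed from the
    rates of the destination state (n1, n2) as in the four formulas above."""
    fail = WA + WB * (n1 + n2)
    den = fail
    if n1 + n2 > 0:
        share = PHI1 * n1 + PHI2 * n2
        den += M1 * n1 * PHI1 / share + M2 * n2 * PHI2 / share
    if n1 < N1 and n1 + n2 < NMAX:
        den += L1
    if n2 < N2 and n1 + n2 < NMAX:
        den += L2
    return fail / den

def add(src, dst, rate):
    Q[idx[src], idx[dst]] += rate
    Q[idx[src], idx[src]] -= rate

for (n1, n2, c) in states:
    can1 = n1 < N1 and n1 + n2 < NMAX       # class 1 arrival accepted?
    can2 = n2 < N2 and n1 + n2 < NMAX       # class 2 arrival accepted?
    if c == GOOD:
        moves = []
        if can1: moves.append(((n1 + 1, n2), L1))
        if can2: moves.append(((n1, n2 + 1), L2))
        if n1 + n2 > 0:
            share = PHI1 * n1 + PHI2 * n2
            if n1 > 0: moves.append(((n1 - 1, n2), M1 * n1 * PHI1 / share))
            if n2 > 0: moves.append(((n1, n2 - 1), M2 * n2 * PHI2 / share))
        for (m1, m2), rate in moves:        # split on the destination's q
            qd = q_fail(m1, m2)
            add((n1, n2, c), (m1, m2, GOOD), (1 - qd) * rate)
            add((n1, n2, c), (m1, m2, TOFAIL), qd * rate)
    elif c == TOFAIL:
        add((n1, n2, c), (n1, n2, REPAIR), WA + WB * (n1 + n2))
    else:                                    # REPAIR: arrivals still accepted
        if can1: add((n1, n2, c), (n1 + 1, n2, REPAIR), L1)
        if can2: add((n1, n2, c), (n1, n2 + 1, REPAIR), L2)
        qd = q_fail(n1, n2)
        add((n1, n2, c), (n1, n2, GOOD), (1 - qd) * MU)
        add((n1, n2, c), (n1, n2, TOFAIL), qd * MU)
```

The reward rate and useful work ratio vectors of Table 1 can be filled in over the same `states` list in the same fashion.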
The system performance was evaluated with two initial probability distributions (Figure 3). In the first case the system starts from the stationary state, and in the second case the system starts from state (0, 0, Good) with probability 1. The case when the system starts from state (0, 0, Good) was evaluated by the method presented in Section 3, and the case of the stationary background CTMC was evaluated with both methods (Sections 3 and 4). The accuracy of the prs reward analysis method, which is applied in both cases, was 10^{-6}. The numerical integration of the first method was computed over 100 equidistant points. The numerical results obtained for the stationary case were practically identical, hence a single stationary curve is shown. From (10), with the prs Markov reward model with reduced reward rates, (Q, R_α), we have lim_{t→∞} E(B(t))/t = 0.4718 and lim_{t→∞} Var(B(t))/t = 0.0548. Each pair of mean and variance curves in Figure 3 tends to the respective limit. The mean curve associated with the stationary background process starts from the stationary accumulation rate of the prs Markov reward model with the original reward rates, (Q, R), according to (10).

The detailed analysis of a slightly larger partial loss Markov reward model of the same example with stationary initial distribution and with N_1 = 10, N_2 = 20, N_MAX = ∞, λ_1 = 0.5, λ_2 = 0.5 results in the curves in Figure 4. It can be seen that the transition from the initial to the final E(B(t))/t value takes place between 0.1 and 10 hours, and the Var(B(t))/t curve has a peak in this range. That is the range where the effect of the reward loss at the first state transition turns up. The peak of the Var(B(t))/t curve is sharper for the small system.
[Figure 3 about here: E(B(t))/t (range 0 to 0.8) and Var(B(t))/t (range 10^{-5} to 1) against t ∈ [0.01, 1000] hours, each with a "Stationary" and a "(0,0,Good)" curve.]

Figure 3: Moments of computing system performance (57 state model)
[Figure 4 about here: E(B(t))/t (range 0.25 to 0.4) and Var(B(t))/t (range 0.0001 to 1) against t ∈ [0.01, 1000] hours.]

Figure 4: Moments of computing system performance (1386 state model)
6 Conclusion

The paper presents two new numerical analysis methods for the analysis of partial incremental loss Markov reward models. The first one is applicable with any general initial probability distribution, but it is computationally more intensive; it can be applied to models with ~100 states. The second one is applicable only for partial loss models with a stationary background CTMC, but it is computationally more effective; it can be applied to models with ~10^6 states. To demonstrate the applicability of partial loss reward models and the numerical properties of the proposed analysis methods, a computing system executing long running batch programs is analyzed. Numerical results show that the proposed methods are applicable in practice, and with a refined implementation of the same methods (e.g., with intelligent numerical integration) one can further enhance the applicability of the approach.

Acknowledgement

The authors thank the referees for their careful review. They helped to improve the presentation of the paper and to remove an annoying numerical error from the example.
References

[1] A. Bobbio, V. G. Kulkarni, and M. Telek. Partial loss in reward models. In 2nd Int. Conf. on Mathematical Methods in Reliability, pages 207-210, Bordeaux, France, July 2000.

[2] A. Bobbio and M. Telek. Task completion time. In Proceedings 2nd International Workshop on Performability Modelling of Computer and Communication Systems (PMCCS2), 1993.

[3] A. Brock. An analysis of checkpointing. ICL Technical Journal, 1(3), 1979.

[4] K. M. Chandy, J. C. Browne, C. W. Dissly, and W. R. Uhrig. Analytic models for rollback and recovery strategies in data base systems. IEEE Trans. on Software Engineering, SE-1(1):100-110, 1975.

[5] E. de Souza e Silva, H. R. Gail, and R. Vallejos Campos. Calculating transient distributions of cumulative reward. In Proceedings ACM/SIGMETRICS Conference, Ottawa, 1995.

[6] L. Donatiello and V. Grassi. On evaluating the cumulative performance distribution of fault-tolerant computer systems. IEEE Trans. on Computers, 1991.

[7] E. Gelenbe and D. Derochette. Performance of rollback recovery systems under intermittent failures. Commun. ACM, 21(6):493-499, 1978.

[8] V. G. Kulkarni, V. F. Nicola, and K. Trivedi. On modeling the performance and reliability of multi-mode computer systems. The Journal of Systems and Software, 6:175-183, 1986.

[9] V. G. Kulkarni, V. F. Nicola, and K. Trivedi. The completion time of a job on a multi-mode system. Advances in Applied Probability, 19:932-954, 1987.

[10] V. G. Kulkarni, V. F. Nicola, and K. Trivedi. Effects of checkpointing and queueing on program performance. Stochastic Models, 4(6):615-648, 1990.

[11] J. F. Meyer. On evaluating the performability of degradable systems. IEEE Trans. on Computers, C-29:720-731, 1980.

[12] H. Nabli and B. Sericola. Performability analysis: a new algorithm. IEEE Trans. on Computers.

[13] … environment and partial loss of work. In 2nd Int. Conf. on Mathematical Methods in Reliability (MMR'2000), pages 813-816, Bordeaux, France, July 2000.

[14] S. Rácz and M. Telek. Performability analysis of Markov reward models with rate and impulse reward. In Int. Conf. on Numerical Solution of Markov Chains, pages 169-187, Zaragoza, Spain, 1999.

[15] M. Telek and A. Pfening. Performance analysis of Markov Regenerative Reward Models. Performance Evaluation, 27&28:1-18, 1996.

[16] M. Telek and S. Rácz. Numerical analysis of large Markovian reward models. Performance Evaluation, 36&37:95-114, Aug 1999.
A Proof of Theorem 1

Conditioning on H, the sojourn time in the initial state i, we have:

P_ij(t, w | H = τ) =
  δ_ij U_w(w − r_i t)                                       if τ > t,
  Σ_{k∈S, k≠i} (q_ik / q_i) P_kj(t − τ, w − α_i r_i τ)      if τ < t,   (11)

where q_i = −q_ii, U(·) is the unit step function and δ_ij is the Kronecker delta (if i = j then δ_ij = 1, otherwise δ_ij = 0). Taking the Laplace-Stieltjes transform with respect to w (w → v), Re(v) ≥ 0:

P*_ij(t, v | H = τ) =
  δ_ij e^{−v r_i t}                                          if τ > t,
  Σ_{k∈S, k≠i} (q_ik / q_i) e^{−v α_i r_i τ} P*_kj(t − τ, v)  if τ < t.   (12)

Unconditioning with respect to H, based on the sojourn time distribution in state i, (1 − e^{−q_i t}), results in:

P*_ij(t, v) = δ_ij e^{−v r_i t} e^{−q_i t} + Σ_{k∈S, k≠i} ∫_{τ=0}^{t} q_ik e^{−v α_i r_i τ} e^{−q_i τ} P*_kj(t − τ, v) dτ.   (13)

Taking the Laplace transform with respect to t (t → s), Re(s) ≥ 0, results in:

P~*_ij(s, v) = ∫_{t=0}^{∞} e^{−st} P*_ij(t, v) dt
= ∫_{t=0}^{∞} e^{−st} δ_ij e^{−(v r_i + q_i) t} dt + Σ_{k∈S, k≠i} ∫_{t=0}^{∞} e^{−st} ∫_{τ=0}^{t} q_ik e^{−(v α_i r_i + q_i) τ} P*_kj(t − τ, v) dτ dt
= δ_ij / (s + v r_i + q_i) + Σ_{k∈S, k≠i} ∫_{τ=0}^{∞} e^{−sτ} q_ik e^{−(v α_i r_i + q_i) τ} ∫_{t=τ}^{∞} e^{−s(t−τ)} P*_kj(t − τ, v) dt dτ
= δ_ij / (s + v r_i + q_i) + Σ_{k∈S, k≠i} ∫_{τ=0}^{∞} q_ik e^{−(s + v α_i r_i + q_i) τ} dτ P~*_kj(s, v)
= δ_ij / (s + v r_i + q_i) + Σ_{k∈S, k≠i} q_ik / (s + v α_i r_i + q_i) P~*_kj(s, v).   (14)

Rearranging (14) into matrix form, (sI + vR_α − Q) P~*(s, v) = D(s, v), yields (3).
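The fixed point equation at the end of the proof offers a direct numerical check of Theorem 1: build P~*(s, v) from (3) and verify the element-wise recursion (14). A sketch (all names ours):

```python
import numpy as np

def recursion_residual(Q, r, alpha, s, v):
    """Max deviation of P~*(s,v) = (sI + vR_alpha - Q)^{-1} D(s,v), eq. (3),
    from the element-wise fixed point equation (14)."""
    n = len(r)
    q = -Q.diagonal()
    D = np.diag((s + v * alpha * r + q) / (s + v * r + q))
    Pst = np.linalg.solve(s * np.eye(n) + v * np.diag(alpha * r) - Q, D)
    res = 0.0
    for i in range(n):
        for j in range(n):
            rhs = (1.0 if i == j else 0.0) / (s + v * r[i] + q[i])
            rhs += sum(Q[i, k] * Pst[k, j] for k in range(n) if k != i) \
                   / (s + v * alpha[i] * r[i] + q[i])
            res = max(res, abs(Pst[i, j] - rhs))
    return res
```

The residual is at the level of floating point round-off for any proper generator and any 0 ≤ α_i ≤ 1, r_i ≥ 0, Re(s), Re(v) ≥ 0.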
B Proof of Theorem 2

The left hand side of eq. (9) can be rewritten as

π (sI + vR_α − Q)^{−1} D(s, v) h^T = π (sI + vR_α − Q)^{−1} (sI + vR_α − Q_D) (sI + vR − Q_D)^{−1} h^T,   (15)

where Q_D = Q̃_D. For the evaluation of the right hand side of eq. (9) we use the partitioned form of the matrices I', R', Q'. That is,

(sI' + vR' − Q') = [ sI + vR − Q_D   −Q̃ + Q_D ; 0   sI + vR_α − Q̃ ],   (16)

and

(sI' + vR' − Q')^{−1} = [ (sI + vR − Q_D)^{−1}   (sI + vR − Q_D)^{−1} (Q̃ − Q_D)(sI + vR_α − Q̃)^{−1} ; 0   (sI + vR_α − Q̃)^{−1} ].   (17)

Using the special structure of the initial vector γ' we have:

γ' (sI' + vR' − Q')^{−1} h'^T
= π (sI + vR − Q_D)^{−1} [ I + (Q̃ − Q_D)(sI + vR_α − Q̃)^{−1} ] h^T
= π (sI + vR − Q_D)^{−1} [ (sI + vR_α − Q̃)(sI + vR_α − Q̃)^{−1} + (Q̃ − Q_D)(sI + vR_α − Q̃)^{−1} ] h^T
= π (sI + vR − Q_D)^{−1} (sI + vR_α − Q_D)(sI + vR_α − Q̃)^{−1} h^T.   (18)

Let Π be the diagonal matrix of the stationary probabilities, i.e., Π = diag⟨π_i⟩. Using this diagonal matrix, π = h Π, and from eq. (6), Q̃ = Π^{−1} Q^T Π. In the following steps the diagonal matrices Π, R, R_α, (sI + vR − Q_D) and (sI + vR_α − Q_D) are commuted where necessary:

h Π (sI + vR − Q_D)^{−1} (sI + vR_α − Q_D)(sI + vR_α − Π^{−1} Q^T Π)^{−1} h^T
= h [ ((sI + vR_α − Π^{−1} Q^T Π)^{−1})^T ( Π (sI + vR − Q_D)^{−1} (sI + vR_α − Q_D) )^T ]^T h^T = ...

The external transpose vanishes due to the multiplication by h from the left and h^T from the right, and the second internal transpose also vanishes because it contains a diagonal matrix. In the first internal transpose we interchange the order of transpose and inversion:

... = h ( (sI + vR_α − Π^{−1} Q^T Π)^T )^{−1} Π (sI + vR − Q_D)^{−1} (sI + vR_α − Q_D) h^T
= h (sI + vR_α − Π Q Π^{−1})^{−1} Π (sI + vR − Q_D)^{−1} (sI + vR_α − Q_D) h^T
= h Π (sI + vR_α − Q)^{−1} Π^{−1} Π (sI + vR − Q_D)^{−1} (sI + vR_α − Q_D) h^T
= π (sI + vR_α − Q)^{−1} (sI + vR_α − Q_D)(sI + vR − Q_D)^{−1} h^T,   (19)

which is identical with the right hand side of (15), completing the proof.