# Markov decision model

### First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function

Already in the 1990s, Müller (1997a) pointed out that the impact of the transition probabilities of a Markov decision process (MDP) on the optimal value of a corresponding Markov decision model (MDM) cannot be ignored in practice. For instance, in most cases the transition probabilities are unknown and have to be estimated by statistical methods. Moreover, in many applications the ‘true’ model is replaced by an approximate version or by a simplified, less complex variant. As a result, in practical applications the optimal strategy, and thus the optimal value, is most often computed on the basis of transition probabilities that differ from the underlying true transition probabilities. The sensitivity of the optimal value w.r.t. deviations in the transition probabilities is therefore of obvious interest.
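This sensitivity can be illustrated numerically: solve a small MDP by value iteration under the "true" transition probabilities, then again under a perturbed version, and compare the optimal values. A minimal sketch in Python; the two-state MDP and the perturbation size are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def optimal_value(P, r, gamma=0.9, tol=1e-10):
    """Value iteration: P[a] is the S x S transition matrix of action a,
    r[a] the reward vector of action a."""
    v = np.zeros(r.shape[1])
    while True:
        q = np.array([r[a] + gamma * P[a] @ v for a in range(len(P))])
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new

# "true" model: two states, two actions
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.5, 0.5], [0.1, 0.9]]])
r = np.array([[1.0, 0.0], [0.5, 0.8]])

# perturbed model: mix each transition row with the uniform distribution
eps = 0.05
P_eps = (1 - eps) * P + eps * 0.5

gap = np.max(np.abs(optimal_value(P, r) - optimal_value(P_eps, r)))
```

For small `eps` the gap shrinks roughly linearly in `eps`, which is the first-order behaviour the paper studies.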
### A Markov Decision Model for a Surveillance Application and Risk-Sensitive Markov Decision Processes

A trivial way to protect a critical infrastructure is to increase the number of sensors and the size of the security staff until the infrastructure is under complete control at any point in time. But this approach is very expensive and therefore not practicable. Increasing the number of sensors alone would lead to information overload, since all information has to be evaluated by a rather small security staff, and the view of the essentials might be obscured. Decision support is therefore desirable so that attention is focused on the essentials. In this text, a mathematical model is presented from which decision support can easily be provided. The background of the present work is that the sensors and the security staff should be deployed in an intelligent manner in order to provide insight into the current threat situation or to prevent arising threats in the most efficient way. Here, “intelligence” comes into play through the framework of finite-state finite-action continuous-time Markov decision processes (CMDPs). The solution of a CMDP provides optimal actions which take into account the evolution of the threat situation over the whole time horizon, which is assumed to be infinite in our case.

### Dynamic risk management with Markov decision processes

The overview of the literature has shown that there is still a lack of reasonable and practicable dynamic risk measures. Hence, in Chapter 4 a risk measure for processes by Pflug and Ruszczyński (2001) is generalized by constructing a dynamic version of this measure. First, we show that under certain conditions the risk measure fulfills most of the properties introduced in the previous chapter. Since the dynamic risk measure is defined as an optimization problem, we aim to solve this problem and give a closed formula for the risk measure. We are able to do so in a Markovian environment by defining a related Markov decision model and applying value iteration. In a further step, we consider the relationship of the dynamic risk measure with the stable representation results stated in Chapter 3. It turns out that such a representation is only possible for income processes which are final values, but not for general income processes. However, one can derive a different stable representation for our dynamic risk measure. Concluding the chapter, we prove some simple martingale properties which allow for a standard interpretation of the monetary values that the components of the risk measure represent.

### Doping in teams: A simple decision theoretic model

teams. In reality there can be several prizes, making the model more complicated without changing the main results. Team member j gets g, a fraction of G, in case of a win. His (or her, but most, and the best-paid, professionals in team sports are male) market value m_j is part of M_i. It consists of his given

### The firm relocation decision: a logit model

Hypothesis 3: Firms in the manufacturing sector move more frequently than those in the service sector. Mobile firms are more export oriented and exhibit significantly wider spatial patterns of customer linkage (Keeble, 1978). A multinational network has a positive impact on the relocation decision. When a firm is part of a global network, production can easily be shifted within the network without incurring sunk costs when the situation proves unfavourable to one of its firms. In contrast, a uni-national firm without a network always needs to incur the sunk costs, and is therefore less likely to relocate part of its activity (Pennings and Sleuwaegen, 2000). Sunk costs are known as a barrier to the international relocation of a firm (Caves and Porter, 1976; Motta and Thisse, 1994). Generally speaking, a firm that is less dependent on a localized market is more likely to move. It follows:
### Assessing Brazilian macroeconomic dynamics using a Markov-switching DSGE model

introduction of regime shifts is related to agents’ behavior, as agents are aware of the possible regime shifts and take this information into account in their expectations. Thus, the law of motion of the variables of interest depends not only on microfounded parameters, but also on beliefs about alternative regimes (Bianchi, 2013). Based on the theoretical model introduced by Liu and Mumtaz (2011), which builds on the open-economy model proposed by Justiniano and Preston (2010), and adopting the method formulated by Farmer et al. (2008), this paper allows for regime shifts in certain parameters, such as those of the monetary policy rule, of inflation persistence, and of shock volatilities, for the Brazilian economy after the Real Plan, between 1996 and 2012. Presumably, these parameters were not constant over the analyzed period, given the changes in the Brazilian economy such as the adoption of the inflation targeting system, the replacement of the Central Bank of Brazil’s president, and the swearing-in of a new Brazilian president.
### Assisting persons with dementia during handwashing using a partially observable Markov decision process

The size of our model puts it well beyond the reach of any exact solution techniques. We used a point-based approximate solution technique based on the Perseus algorithm, which solves the POMDP only for a specific set of belief points. Our approach reconstructs the Perseus algorithm, taking into account the structure in the system dynamics and rewards. That is, there are many conditional independencies between variables over time, and in the reward function, that a solution technique can leverage by representing the dynamics and rewards as algebraic decision diagrams (ADDs). Our approach makes three additional approximations. First, we put a cap on the complexity of the value function (the number of alpha vectors). Typically, this bound causes only a minimal decrease in the quality of a solution. Second, we merge states with values that differ by less than the Bellman error. This error shrinks to zero as the computation converges, preserving optimality. Third, we only compute over observations with a significant probability of occurrence. This fast technique is related to a general method for dealing with large observation spaces. We solved the POMDP using 150 alpha vectors and 65 iterations in 42 hours on a dual Intel 2.40 GHz Xeon CPU with 4 GB of RAM.
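The first approximation, capping the number of alpha vectors, can be sketched as a simple pruning step: at a set of belief points, keep only vectors that are maximal at some belief, up to a fixed cap. The function name and the dense-array representation below are illustrative assumptions; the actual implementation operates on ADDs:

```python
import numpy as np

def prune_alpha_vectors(alphas, beliefs, cap):
    """Keep at most `cap` alpha vectors, preferring those that attain the
    maximum value at at least one belief point."""
    values = beliefs @ alphas.T            # value of each vector at each belief
    winners = np.unique(np.argmax(values, axis=1))
    return alphas[winners[:cap]]

alphas = np.array([[1.0, 0.0], [0.0, 1.0], [0.4, 0.4]])   # |S| = 2
beliefs = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
pruned = prune_alpha_vectors(alphas, beliefs, cap=2)
```

The dominated vector `[0.4, 0.4]` is never maximal at any of the three beliefs and is dropped, which is how the cap trades a small value-function error for bounded complexity.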
### Risk sensitive Markov decision processes

However, the same amount of payoff might have different subjective values for different individuals. For instance, the same amount of money might have a higher subjective value for the poor than for the rich. This subjective value is usually called utility in neoclassical economics (von Neumann and Morgenstern, 1944). In other cases, the outcome might not be easily quantified, e.g., happiness, satisfaction, sadness, pain, etc. It requires, therefore, a subjective valuation function that differs between individuals. In such cases, we may assume that there exists a real-valued F-measurable function that “calculates” the subjective outcome for each state. In fact, for decision makers like human beings, we can reasonably assume that there exists a neurobiological mechanism that calculates the subjective value for each outcome represented as different kinds of perceptual stimuli, though this mechanism is still a “black box” that so far has not been fully understood. Therefore, in (behavioral) economic experiments, to avoid this ambiguity and uncertainty, outcomes are always represented as quantifiable monetary payoffs. In this thesis, we follow the same convention. Throughout this thesis, by outcome we mean the objective payoff, whereas by utility we mean the subjective outcome.

### Large-scale empirical tests of the Markov Tree model

For the overconstrained least-squares MT method, we analyze the regression residuals and show that they fit a generalized hyperbolic (GH) distribution with heavier-than-Gaussian tails. We then develop computational bootstrap procedures to better understand how the MT model is likely to perform in the future. Here we add noise to market option prices, either by sampling from the best-fit GH distribution or by resampling regression residuals, and we rerun our hedging tests using this simulated option data. Using both bootstrapping approaches, we compute the distribution of the difference between the absolute errors committed by the Heston and MT models. The results show that the MT model leads to more accurate and less risky hedges.
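The residual-resampling variant of the bootstrap described above can be sketched as follows; `prices` and `fitted` stand in for market option prices and model fits, and the names and sizes are illustrative:

```python
import numpy as np

def residual_bootstrap(prices, fitted, n_boot=500, seed=0):
    """Resample regression residuals with replacement and rebuild
    simulated option prices around the fitted values."""
    rng = np.random.default_rng(seed)
    resid = prices - fitted
    draws = rng.choice(resid, size=(n_boot, len(resid)), replace=True)
    return fitted + draws            # one simulated price vector per row

prices = np.array([10.2, 11.1, 9.8, 10.5, 10.9])
fitted = np.array([10.0, 11.0, 10.0, 10.4, 11.0])
sims = residual_bootstrap(prices, fitted)
```

Each simulated data set is then fed back through the hedging tests, and the distribution of the difference in absolute hedging errors between the two models is read off the bootstrap replications.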
### Stochastic model specification in Markov switching vector error correction models

This paper proposes a hierarchical modeling approach to perform stochastic model specification in Markov switching vector error correction models. We assume that a common distribution gives rise to the regime-specific regression coefficients. The mean as well as the variances of this distribution are treated as fully stochastic, and suitable shrinkage priors are used. These shrinkage priors make it possible to assess in a flexible manner which coefficients differ across regimes. In the case of similar coefficients, our model pushes the respective regions of the parameter space towards the common distribution. This allows for selecting a parsimonious model while still maintaining sufficient flexibility to control for sudden shifts in the parameters, if necessary. In the empirical application, we apply our modeling approach to Euro area data and assume that transition probabilities between expansion and recession regimes are driven by the cointegration errors. Our findings suggest that lagged cointegration errors have predictive power for regime shifts and that these movements between business cycle stages are mostly driven by differences in error variances.
### Accelerating Peak Dating in a Dynamic Factor Markov-Switching Model

Here ω_t denotes an (N × 1) i.i.d. innovation vector which we assume to follow a multivariate standard normal distribution. As the DFMS model contains both latent regimes and a latent factor, estimation makes use of both the Hamilton filter and the Kalman filter. Furthermore, parameter estimation requires either the use of an approximation of the likelihood or Bayesian methods, see Kim and Nelson (1999). This is because the calculation of the exact likelihood quickly becomes computationally infeasible, as the value of the factor at time t depends on all previous Markov states, a problem known as path dependence. We follow the approach of Chauvet (1998), which makes use of the filter proposed by Kim (1994), and approximate the likelihood. Specifically, this method uses a collapsing step to avoid the need to track an ever-increasing number of past states, such that only a modest history of states needs to be considered. To maintain sufficient accuracy this history must be at least one longer than the highest lag order contained in the model, see Kim (1994) for further details. To obtain the parameter estimates we maximize the associated approximate log-likelihood obtained via the prediction-error decomposition.
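The Hamilton-filter recursion that produces the prediction-error decomposition can be sketched for a simple Gaussian switching-mean special case; the full DFMS model also carries the latent factor and the Kim collapsing step, which this sketch omits, and the parameter values are illustrative:

```python
import numpy as np

def hamilton_filter(y, mu, sigma, P, xi0):
    """Hamilton filter for a K-regime switching-mean model.
    P[i, j] = Pr(S_t = j | S_{t-1} = i); xi0 = initial regime probabilities."""
    xi, loglik = xi0, 0.0
    for yt in y:
        pred = P.T @ xi                      # one-step-ahead regime probabilities
        dens = np.exp(-0.5 * ((yt - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        joint = pred * dens
        f = joint.sum()                      # prediction-error density of y_t
        loglik += np.log(f)                  # log-likelihood contribution
        xi = joint / f                       # filtered regime probabilities
    return loglik, xi

loglik, xi = hamilton_filter(
    y=np.array([0.5, 0.7, -1.2, -0.9]),
    mu=np.array([0.6, -1.0]), sigma=np.array([0.5, 0.5]),
    P=np.array([[0.95, 0.05], [0.10, 0.90]]),
    xi0=np.array([0.5, 0.5]))
```

Summing the per-period `np.log(f)` terms is exactly the prediction-error decomposition of the (approximate) log-likelihood mentioned above.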
### On nonlinear expectations and Markov chains under model uncertainty

On the other side, there is a large community working on similar questions related to model uncertainty in the field of imprecise probability. Here, the central objects are upper and lower previsions introduced by Walley [37]. In the sublinear case, there is a one-to-one relation between sublinear expectations and coherent upper previsions, which creates a huge intersection between the communities working on nonlinear expectations and upper/lower previsions. Within the field of imprecise probability, much work has been done in the direction of defining, axiomatizing, and computing transition operators of both discrete-time and continuous-time imprecise Markov chains, see e.g. De Bock [11], de Cooman et al. [12], Krak et al. [23], and Škulj [34, 35]. Concepts that are related to imprecise Markov chains include Markov set-chains, see Hartfiel [22], and, in the field of mathematical finance, BSDEs on Markov chains, see Cohen and Elliott [8, 9], and Markov chains under nonlinear expectations, see Nendel [24] and Peng [29]. The aim of this paper is to link and compare the concepts and results obtained in the fields of imprecise probability and mathematical finance. Since Markov chains under model uncertainty form the largest intersection between both communities, we put a special focus on them. The paper is organized as follows: In Section
### Constructing a sub-annual Markov chain from an annual Markov chain

In economics, the data required for valuations are normally available as annual values, e.g. interest rates or mortality rates in financial and actuarial mathematics. Markov chains with an annual time grid can be constructed on this basis. The payments to be valued, however, usually occur within the year. This article addresses the question of how a Markov chain with a sub-annual time grid can be constructed from a Markov chain with an annual time grid. The focus is on Markov chains whose transition matrices are given as upper triangular matrices. Two approaches and their application are presented. The first approach is based on the T-th root of the transition matrices, the second on a linearization of the transition matrices.
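Both approaches can be sketched directly. For the root approach the sketch assumes the annual transition matrix is diagonalizable, which holds for an upper triangular matrix with distinct diagonal entries; the example matrix is illustrative:

```python
import numpy as np

def root_transition(P, T):
    """Sub-annual chain via the T-th root of the annual transition matrix,
    computed through an eigendecomposition."""
    w, V = np.linalg.eig(P)
    return (V @ np.diag(w ** (1.0 / T)) @ np.linalg.inv(V)).real

def linearized_transition(P, T):
    """Linearization Q = I + (P - I)/T: spread the annual movement
    evenly over T sub-periods."""
    I = np.eye(P.shape[0])
    return I + (P - I) / T

# annual chain with an upper triangular transition matrix (e.g. active -> absorbed)
P = np.array([[0.9, 0.1],
              [0.0, 1.0]])
Q_root = root_transition(P, T=12)
Q_lin = linearized_transition(P, T=12)
```

Composing `Q_root` twelve times recovers `P` exactly, while `Q_lin` matches `P` only to first order; in exchange, the linearization can never produce negative entries when `P` itself is a proper stochastic matrix.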
### Growth Miracles and Failures in a Markov Switching Classification Model of Growth

The model identifies four distinct growth regimes and three clusters of countries. In the stable growth regime countries grow steadily with a long-run growth rate of around two percent. The miracle growth state features persistently fast growth of around five to six percent per year and allows countries to embark on a catch-up process. The stagnation regime is characterized by zero growth on average and low persistence of growth rates. Finally, the crisis state captures very large shocks, both positive and negative, although on average negative shocks are more frequent. The derived clusters of countries imply distinct patterns of growth for each of the groups. Countries in the first cluster are successful countries such as the industrialized and newly industrialized countries and some steadily growing developing countries. Their growth process is characterized by steady growth and prolonged growth spurts. The second and the third cluster consist mainly of developing countries. Countries in the second cluster are moderately successful growers that spend the majority of their time in the stagnation and the stable growth regime with approximately equal weight. Growth spurts in this cluster do occur, but they are short-lived compared to those of cluster one. Finally, countries in the third cluster are characterized by disappointing growth experiences. They usually find themselves either in stagnation or in crisis and exhibit a very large volatility of growth rates. The income gap between these countries and the industrial leaders continues to widen, so that the countries may be called laggards or growth failures. Conforming to conventional wisdom, the majority of African countries are associated with this cluster.
### Model Checking of Continuous-Time Markov Chains Against Timed Automata Specifications

1. Introduction

Continuous-time Markov chains (CTMCs) are one of the most prominent models in performance and dependability analysis. They are exploited in a broad range of applications, and constitute the underlying semantical model of a plethora of modeling formalisms for real-time probabilistic systems such as Markovian queueing networks, stochastic Petri nets, stochastic variants of process algebras, and calculi for systems biology. CTMC model checking has mainly focused on the branching-time temporal logic CSL (Continuous Stochastic Logic [3, 7]), a variant of timed CTL where the CTL universal and existential path quantifiers are replaced by a probabilistic operator. Like CTL model checking, CSL model checking of finite CTMCs proceeds by a recursive descent over the parse tree of the CSL formula. One of the key ingredients is that time-bounded reachability probabilities can be approximated arbitrarily closely by a reduction to transient analysis in CTMCs. This results in an efficient polynomial-time algorithm that has been realized in model-checking tools such as PRISM and MRMC and has been successfully applied to various case studies from diverse application areas.
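The reduction to transient analysis is typically carried out by uniformization: the transient distribution is a Poisson-weighted sum over powers of the uniformized DTMC. A minimal sketch for a CTMC with generator `Q` and initial distribution `p0` (the truncation rule and the example chain are illustrative):

```python
import numpy as np

def transient_probs(Q, p0, t, tol=1e-12):
    """Transient distribution p(t) of a CTMC via uniformization:
    p(t) = sum_k Poisson(k; lam*t) * p0 @ P^k, with P = I + Q/lam."""
    lam = max(-np.diag(Q))               # uniformization rate
    P = np.eye(Q.shape[0]) + Q / lam     # DTMC of the uniformized chain
    weight = np.exp(-lam * t)            # Poisson weight for k = 0
    pk, p, acc, k = p0.astype(float), np.exp(-lam * t) * p0, weight, 0
    while 1.0 - acc > tol:               # accumulate until Poisson mass ~ 1
        k += 1
        pk = pk @ P
        weight *= lam * t / k
        p += weight * pk
        acc += weight
    return p

Q = np.array([[-1.0, 1.0], [1.0, -1.0]])
p = transient_probs(Q, np.array([1.0, 0.0]), t=1.0)
```

For time-bounded reachability one additionally makes the target states absorbing before running the same transient computation.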
### Model Checking of Continuous-Time Markov Chains Against Timed Automata Specifications

6. Conclusion

This paper addressed the quantitative (and qualitative) verification of a finite CTMC C against a linear real-time specification given as a deterministic timed automaton (DTA). We studied DTA with finite and Muller acceptance criteria. The key result (for finite acceptance) is that the probability of C |= A equals the reachability probability in the embedded discrete-time Markov process of a PDP. This PDP is obtained via a standard region construction. Reachability probabilities in the thus obtained PDPs are characterized by a system of Volterra integral equations of the second type and are shown to be approximated by a system of PDEs. For Muller acceptance criteria, the probability of C |= A equals the reachability probability of the accepting terminal SCCs in the embedded PDP. These results apply to DTA with arbitrarily (but finitely) many clocks. For single-clock DTA with finite acceptance, Pr(C |= A) is obtained by solving a system of linear equations whose coefficients are solutions of a system of ODEs. As the coefficients are in fact transient probabilities in CTMCs, this result implies that standard algorithms for CTMC analysis suffice to verify single-clock DTA specifications.
### Detecting unemployment hysteresis: A simultaneous unobserved components model with Markov switching

In what follows, we describe the data and the model, explain the results and conclude.

2 Data

We use seasonally adjusted unemployment rates ranging from January 1960 to June 2015. For Germany, this is register data from the Federal Employment Agency. A worker counts as unemployed if he or she is temporarily not employed, is searching for a new job subject to social security, is available for the job placement efforts of the employment agency (thus not yet in measures) and is registered as unemployed. U.S. unemployment, provided by the Bureau of Labor Statistics, is based on the self-assessment of participants in the Current Population Survey regarding their labour-market-related activities in the reference week. Due to these different concepts, the absolute levels of the two series should not be compared.
### Model Averaging in Markov-Switching Models: Predicting National Recessions with Regional Data

a single Markov chain S_t drives the changes in both the y_t and x_t variables. The results show that, first, in both univariate and bivariate cases, the lowest QPS values are obtained when using the QPS-based model-averaging scheme. This holds true for both DMA and BMA in the univariate case, and the differences are most noticeable in the BMA context. Second, in the context of DMA, the combined weighting scheme that relies on both the QPS and the marginal likelihood is a very close second-best weighting scheme, which further emphasizes the value of the QPS for calculating model weights. Third, as the volatility of the series increases (i.e., for higher values of σ), the differences in terms of QPS across the weighting schemes tend to soften. This is relatively intuitive: given the DGPs we consider, as the volatility of the series increases, regime shifts become less apparent and it is therefore more difficult to make inference on the regimes, which translates into higher QPS and lower value added from QPS-based weighting schemes. Overall, this simulation exercise underlines the relevance of our model-averaging scheme based on past predictive performance in classifying the regimes (i.e., QPS-based). The next section evaluates the relevance of this framework from an empirical point of view, forecasting national U.S. recessions based on a set of regional indicators.
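The QPS itself is straightforward to compute: QPS = (2/T) Σ (P_t − R_t)², where P_t is a model's recession probability and R_t the realized regime indicator. The sketch below turns past QPS into model weights via a softmax-style rule; that particular weighting rule is an illustrative assumption, not necessarily the paper's exact scheme:

```python
import numpy as np

def qps(probs, realized):
    """Quadratic probability score: 0 is perfect, 2 is worst possible."""
    return 2.0 * np.mean((probs - realized) ** 2)

def qps_weights(prob_matrix, realized):
    """Illustrative weighting: models with lower past QPS get larger weights."""
    scores = np.array([qps(p, realized) for p in prob_matrix])
    raw = np.exp(-scores)
    return raw / raw.sum()

realized = np.array([0, 0, 1, 1, 0])
models = np.array([[0.1, 0.2, 0.8, 0.9, 0.1],    # well-calibrated model
                   [0.5, 0.5, 0.5, 0.5, 0.5]])   # uninformative model
w = qps_weights(models, realized)
```

The calibrated model receives the larger weight, which is the mechanism behind the QPS-based averaging scheme's advantage in regime classification.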
### Realities beyond the Grey-Markov model for forecasting international tourism demand

This study applied a forecasting method that combines the first-order one-variable grey differential equation model from grey system theory with the Markov chain model from stochastic process theory to forecast Korean tourism demand from three major source countries (Japan, China and the United States). It then tested the real performance of the model on international tourism demand from 2013 through 2017. Even though all relative errors of the forecast values are less than 5%, meaning the Grey-Markov chain model gives high precision in forecasting tourism demand, this study revealed that real international tourism demand is not close to the values forecast by the mathematical model. Other factors played a role in breaking the mathematical forecasting rules, which are based essentially on historical records. This study shows that international tourist movements, in terms of Korean tourism demand, are heavily influenced by political issues associated with China and Japan.
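The grey part of the Grey-Markov model, the first-order one-variable GM(1,1) equation, can be sketched as follows; the Markov-chain correction of the residuals is omitted and the arrival figures are illustrative, not the paper's data:

```python
import numpy as np

def gm11(x, steps):
    """GM(1,1): fit dx1/dt + a*x1 = b on the accumulated series x1,
    then forecast and difference back to the original scale."""
    x = np.asarray(x, dtype=float)
    x1 = np.cumsum(x)                         # accumulated generating operation
    z = 0.5 * (x1[1:] + x1[:-1])              # background (mean) values
    B = np.column_stack([-z, np.ones(len(z))])
    a, b = np.linalg.lstsq(B, x[1:], rcond=None)[0]
    k = np.arange(1, len(x) + steps)
    x1_hat = (x[0] - b / a) * np.exp(-a * k) + b / a
    return np.diff(np.concatenate([[x[0]], x1_hat]))  # fitted + forecast values

arrivals = [120.0, 130.0, 145.0, 158.0, 170.0]   # illustrative annual arrivals
forecast = gm11(arrivals, steps=3)
```

The Markov-chain step then classifies the relative residuals of this fit into states and corrects each forecast by the expected residual of its state; the paper's point is that even sub-5% in-sample errors do not guarantee out-of-sample accuracy when political shocks intervene.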