Taken together, we provide a comprehensive QML framework for the estimation of income distributions using grouped data. Combining the QML functions with corresponding prior distributions then allows for Bayesian inference using basic MCMC techniques. Our approach is efficient in the sense that all available data information is included in the likelihood, whose characteristics depend on the specific DGP at hand. We further find that the QML estimation is simple and fast to implement, with standard asymptotic properties corresponding to asymptotically efficient ML under the usual regularity conditions. With regard to Bayesian inference, the proposed independent MH algorithm for sampling the posterior distribution shows a high degree of efficiency, as reflected by high acceptance probabilities and accordingly low numerical standard errors.
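The independence MH sampler referred to above can be illustrated with a minimal, self-contained sketch: a candidate is drawn from a fixed proposal density q and accepted with probability min(1, [pi(y)q(x)]/[pi(x)q(y)]), a ratio of importance weights. The toy Gaussian target, Student-t proposal, and all tuning values below are illustrative stand-ins, not the paper's actual QML posterior:

```python
import numpy as np
from scipy import stats

def independence_mh(log_target, proposal, n_iter, rng):
    """Independence Metropolis-Hastings: candidates are drawn from a fixed
    proposal q and accepted with probability
        min(1, [pi(y) q(x)] / [pi(x) q(y)]),
    i.e. a ratio of importance weights w = pi/q."""
    ys = proposal.rvs(size=n_iter, random_state=rng)   # all candidates at once
    log_w = log_target(ys) - proposal.logpdf(ys)       # log importance weights
    log_u = np.log(rng.uniform(size=n_iter))
    x, lw, chain, acc = ys[0], log_w[0], [], 0
    for y, lwy, lu in zip(ys[1:], log_w[1:], log_u[1:]):
        if lu < lwy - lw:                              # accept candidate
            x, lw = y, lwy
            acc += 1
        chain.append(x)
    return np.array(chain), acc / (n_iter - 1)

rng = np.random.default_rng(0)
log_target = stats.norm(loc=1.0, scale=0.5).logpdf     # toy "posterior"
proposal = stats.t(df=5, loc=1.0, scale=0.6)           # heavier-tailed approximation
chain, acc_rate = independence_mh(log_target, proposal, 20000, rng)
print(acc_rate, chain.mean())
```

When the proposal approximates the target well, the acceptance rate is high and the chain's autocorrelation is low, which is exactly what drives low numerical standard errors.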
to reduce the total number of required model evaluations for generating a sufficiently large sample. For Bayesian inference of computationally expensive ODE-based transmission models, the adaptive algorithm thus offers a practical tool for efficient posterior sampling. In order to identify necessary components of the transmission model with vaccination, model selection was performed by estimating the marginal likelihood of the data for each considered model (Chapter 3.3). Since the marginal likelihood cannot be computed analytically for models based on nonlinear ODEs, it is assessed via a newly enhanced algorithm for pointwise estimation of the normalized posterior density based on an existing MCMC sample output. This estimation approach – inspired by Chib and Jeliazkov (2001) – utilizes an expectation-ratio identity derived from the detailed-balance condition of the preceding MCMC run, which can then be estimated using Monte Carlo methods. The methodological innovation presented in this thesis is to estimate an expectation ratio from a detailed-balance condition associated with an alternative proposal distribution, even though the corresponding Metropolis-Hastings algorithm may differ from the one used for generating the available posterior sample. Within a Gaussian posterior setting we have shown that proposal distributions which are optimal with respect to MCMC sampling do not necessarily yield optimal results for the marginal likelihood estimation algorithm. We thus derived that minimal variance and bias of the marginal likelihood estimate can be achieved by choosing a state-independent proposal density which approximately mimics the posterior.
Results from a simulation-based analysis (Section 3.3.4) suggest that the modified algorithm, using proposal functions defined via an existing posterior sample, can considerably improve the accuracy of the marginal likelihood estimation while also requiring fewer additional model evaluations. The improved performance was stable over a wide range of model dimensions and possible shapes of the posterior density.
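The Chib-Jeliazkov-type identity described above can be sketched in a toy setting where the exact marginal likelihood is available as a check: the posterior ordinate at a point theta* is estimated as a ratio of two expectations implied by detailed balance, one over the existing posterior draws and one over fresh draws from the proposal. The conjugate normal model, random-walk proposal, and all constants below are illustrative stand-ins for the thesis's ODE-based transmission models:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy conjugate model (theta ~ N(mu0, tau02), y_i | theta ~ N(theta, sig2)),
# chosen because the exact marginal likelihood is available for comparison.
y = rng.normal(0.5, 1.0, size=50)
n, sig2, mu0, tau02 = y.size, 1.0, 0.0, 4.0

def log_lik(th):
    return -0.5 * n * np.log(2 * np.pi * sig2) - 0.5 * np.sum((y - th) ** 2) / sig2

def log_prior(th):
    return -0.5 * np.log(2 * np.pi * tau02) - 0.5 * (th - mu0) ** 2 / tau02

def log_post(th):          # unnormalised log posterior
    return log_lik(th) + log_prior(th)

# Random-walk Metropolis sample from the posterior
s, th, draws = 0.3, y.mean(), []
for _ in range(12000):
    prop = th + s * rng.normal()
    if np.log(rng.uniform()) < log_post(prop) - log_post(th):
        th = prop
    draws.append(th)
draws = np.array(draws[2000:])

# Chib-Jeliazkov estimator of the normalised posterior ordinate at th_star,
# via the detailed-balance (expectation-ratio) identity of the RW chain
th_star = draws.mean()

def alpha(a, b):           # acceptance probability of the symmetric proposal
    return min(1.0, np.exp(log_post(b) - log_post(a)))

q = lambda a, b: np.exp(-0.5 * ((b - a) / s) ** 2) / (s * np.sqrt(2 * np.pi))
num = np.mean([alpha(d, th_star) * q(d, th_star) for d in draws])
props = th_star + s * rng.normal(size=5000)
den = np.mean([alpha(th_star, p) for p in props])

log_ml_est = log_lik(th_star) + log_prior(th_star) - np.log(num / den)

# Exact marginal likelihood from conjugacy, for comparison
tau_n2 = 1.0 / (n / sig2 + 1.0 / tau02)
mu_n = tau_n2 * (n * y.mean() / sig2 + mu0 / tau02)
log_post_exact = (-0.5 * np.log(2 * np.pi * tau_n2)
                  - 0.5 * (th_star - mu_n) ** 2 / tau_n2)
log_ml_exact = log_lik(th_star) + log_prior(th_star) - log_post_exact
print(log_ml_est, log_ml_exact)
```

The thesis's variant replaces the chain's own proposal in this identity with a state-independent density mimicking the posterior; the sketch uses the original symmetric proposal for brevity.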
In recent years, the inverse Ising problem, i.e., the reconstruction of couplings and external fields of an Ising model from samples of spin configurations, has attracted considerable interest in the physics community [1]. This is due to the fact that Ising models play an important role in data modeling, with applications to neural spike data [2, 3], protein structure determination [4], and gene expression analysis [5]. Much effort has been devoted to the development of algorithms for the static inverse Ising problem. This is a nontrivial task, because statistically efficient, likelihood-based methods become computationally infeasible owing to the intractability of the partition function of the model. Hence one has to resort either to approximate inference methods or to other statistical estimators such as pseudolikelihood methods [6] or the interaction screening algorithm [7]. The situation is somewhat simpler for the dynamical inverse Ising problem, which recently attracted attention [8–13]. If one assumes a Markovian dynamics, the exact normalization of the spin transition probabilities allows for an explicit computation of the likelihood if one has a complete set of observed data over time. Nevertheless, the model parameters enter the likelihood in a fairly complex way, and the application of more advanced statistical approaches such as Bayesian inference again becomes a nontrivial task. This is especially true for the continuous-time kinetic Ising model, where the spins are governed by Glauber dynamics [14]. With this dynamics the likelihood contains an exponential function related to the “nonflipping” times, which makes analytical manipulations of the posterior distribution of parameters intractable. However, it is possible to compute the likelihood gradient to find the maximum likelihood estimate (MLE) [15].
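As a sketch of the continuous-time Glauber dynamics underlying this likelihood, the following simulates trajectories with a Gillespie scheme: spin i flips with rate w_i = 0.5*(1 - s_i*tanh(h_i)), and the exponentially distributed waiting ("nonflipping") times are exactly the quantities that enter the likelihood discussed above. Couplings, fields, and system sizes are illustrative:

```python
import numpy as np

def glauber_ct(J, theta, s0, T, rng):
    """Continuous-time Glauber dynamics via the Gillespie algorithm.
    Spin i flips with rate w_i = 0.5 * (1 - s_i * tanh(h_i)),
    where h_i = sum_j J_ij s_j + theta_i is the local field."""
    s, t = s0.copy(), 0.0
    times, flips = [], []
    while True:
        h = J @ s + theta
        w = 0.5 * (1.0 - s * np.tanh(h))     # flip rate of each spin
        total = w.sum()
        t += rng.exponential(1.0 / total)    # waiting ("nonflipping") time
        if t > T:
            break
        i = rng.choice(len(s), p=w / total)  # choose which spin flips
        s[i] = -s[i]
        times.append(t)
        flips.append(i)
    return np.array(times), np.array(flips), s

rng = np.random.default_rng(2)
N = 10
J = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))   # illustrative couplings
np.fill_diagonal(J, 0.0)
theta = np.zeros(N)                             # illustrative external fields
s0 = rng.choice([-1.0, 1.0], size=N)
times, flips, s_final = glauber_ct(J, theta, s0, T=50.0, rng=rng)
```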
In this paper we will show how the likelihood for the continuous time problem can be remarkably simplified by introducing a combination of two sets of auxiliary random
November 29, 2017
In order to identify structural shocks that affect economic variables, restrictions need to be imposed on the parameters of structural vector autoregressive (SVAR) models. Economic theory is the primary source of such restrictions. However, only over-identifying restrictions can be tested with statistical methods, which limits the statistical validation of many just-identified SVAR models. In this study, Bayesian inference is developed for SVAR models in which the structural parameters are identified via Markov-switching heteroskedasticity. In such a model, restrictions that are just-identifying in the homoskedastic case become over-identifying and can be tested. A set of parametric restrictions is derived under which the structural matrix is globally identified, and a Savage-Dickey density ratio is used to assess the validity of the identification conditions. For that purpose, a new probability distribution is defined that generalizes the beta, F, and compound gamma distributions. As an empirical example, monetary models are compared using heteroskedasticity as an additional device for identification. The empirical results support models with money in the interest rate reaction function.
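The Savage-Dickey device used above for testing restrictions can be illustrated in a conjugate toy model, where the Bayes factor for a point restriction equals the ratio of the posterior to the prior density at the restricted value. The normal model and all constants below are illustrative, not the SVAR setting of the paper (where the new generalized beta/F/compound-gamma distribution enters):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Conjugate toy model: y_i | theta ~ N(theta, 1), prior theta ~ N(0, tau2).
# Savage-Dickey: the Bayes factor for H0: theta = 0 versus the unrestricted
# model is the posterior-to-prior density ratio at the restriction point.
y = rng.normal(0.1, 1.0, size=40)
n, tau2 = y.size, 2.0
tau_n2 = 1.0 / (n + 1.0 / tau2)        # posterior variance (conjugacy)
mu_n = tau_n2 * n * y.mean()           # posterior mean

bf01 = (stats.norm(mu_n, np.sqrt(tau_n2)).pdf(0.0)
        / stats.norm(0.0, np.sqrt(tau2)).pdf(0.0))

# Cross-check against the brute-force Bayes factor m0(y) / m1(y),
# with m1 computed by midpoint quadrature over the prior
loglik = lambda th: np.sum(stats.norm(th, 1.0).logpdf(y))
h = 6.0 / 4000
grid = -3.0 + (np.arange(4000) + 0.5) * h
m1 = sum(np.exp(loglik(t)) * stats.norm(0.0, np.sqrt(tau2)).pdf(t)
         for t in grid) * h
bf01_check = np.exp(loglik(0.0)) / m1
print(bf01, bf01_check)
```

In practice the posterior ordinate is typically not available in closed form and is estimated from MCMC output, but the density-ratio identity is the same.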
Received: 25 November 2016; Accepted: 8 May 2017; Published: 23 May 2017
Abstract: Factor modeling is a popular strategy to induce sparsity in multivariate models as they scale
to higher dimensions. We develop Bayesian inference for a recently proposed latent factor copula model, which utilizes a pair copula construction to couple the variables with the latent factor. We use adaptive rejection Metropolis sampling (ARMS) within Gibbs sampling for posterior simulation: Gibbs sampling enables application to Bayesian problems, while ARMS is an adaptive strategy that replaces traditional Metropolis-Hastings updates, which typically require careful tuning. Our simulation study shows favorable performance of our proposed approach both in terms of sampling efficiency and accuracy. We provide an extensive application example using historical data on European financial stocks that forecasts portfolio Value at Risk (VaR) and Expected Shortfall (ES).
Two problems arise in practical applications of compressed sensing: how to reconstruct an image from a fixed undersampling design, and how to choose the design in the first place. While a large amount of work was done for the former, we are not aware of much progress for the latter. Although there is substantial prior work on k-space optimisation [Greiser and von Kienlin, 2003, von Kienlin and Mejia, 1991, Spielman et al., 1995], this has been done for linear reconstruction (section 6.2.1), neglecting image sparsity (section 6.2.2). As we demonstrate here, it pays off to match the k-space trajectory to the sparse reconstruction technique. Established concepts such as the point spread function (section 6.2.3), tailored to linear reconstruction, do not capture the inherent dependence of sparse (nonlinear) estimation algorithms on the acquired signal. The latter cannot improve upon the Nyquist limit uniformly, but only for statistically sparse signals, and successful nonlinear k-space optimisation has to take this dependence into account. We phrase k-space optimisation as a problem of experimental design, and propose an algorithm based on Bayesian inference, where statistical sparsity characteristics of images are incorporated by way of a prior distribution. The application of this procedure to high resolution MR images becomes feasible only with the scalable inference algorithm of chapter 3.
Until the mid 1990s, there was a dearth of research on Bayesian inference for stable distribution parameter estimation. Buckle (1995) is one of the earliest to implement a Markov chain Monte Carlo (MCMC) algorithm (specifically, the Gibbs sampler), to make parametric and predictive Bayesian inference for stable distributions, and to generate Bayesian posterior samples from the parameters of a stable distribution with any prior distribution. The work by Buckle was followed by a slew of articles across disparate fields—computational statistics, finance/economics, signal processing, acoustics/speech, astronomy/astrophysics, pattern recognition, pharmacology, and genetics/biostatistics (gene expression profiling), among others. Qiou (1996) and Qiou and Ravishanker (1997, 1999) develop a sampling-based conditional Bayesian approach that simultaneously estimates the stable-law parameters and the parameters of a linear ARMA model, thus extending Buckle’s approach to time series and multivariate sub-Gaussian ARMA problems. Ravishanker and Qiou (1998) further refine this research using Monte Carlo Expectation Maximization (MCEM). Godsill and Kuruoglu (1999) employ a hybrid rejection sampling and importance sampling scheme to implement MCMC and MCEM using a general framework involving scale mixtures of normals (SMiN). They claim their approach improves upon straightforward rejection sampling and Metropolis-Hastings approaches for symmetric stable models, and find use for this technique in the field of audio signal noise reduction. Tsionas (1999) likewise uses a SMiN representation limited to symmetric stable distributions with applications to econometric time series. Casarin (2004) generalized existing techniques to include Bayesian inference for mixtures of stable distributions, arguing that in some cases financial data exhibit not only heavy tails and skewness but also multimodality.
Salas-Gonzalez, Kuruoglu, and Ruiz (2006a,b) employ a reversible-jump MCMC algorithm for parameter estimation of stable distributions involving impulsive, asymmetric, and multimodal data from the field of digital signal processing. Lombardi (2007) develops a random walk Metropolis sampler using a Fast Fourier Transform of the stable-law characteristic function to approximate the likelihood function, as explained in Rachev and Mittnik (2000).
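A Lombardi-style likelihood approximation can be sketched by inverting the stable characteristic function numerically (a midpoint quadrature stands in here for the FFT) and plugging it into a random-walk Metropolis sampler. The S1 parameterisation, grid sizes, trimming of extreme draws, and the single-parameter sampler below are illustrative simplifications:

```python
import numpy as np
from scipy import stats

def stable_logpdf(x, alpha, beta=0.0, gamma=1.0, delta=0.0,
                  t_max=50.0, n_t=3000):
    """Stable log-density via numerical Fourier inversion of the
    characteristic function (alpha != 1, S1 parameterisation):
      f(x) = (1/pi) * Int_0^inf Re[ phi(t) * exp(-i t x) ] dt,
      phi(t) = exp(i delta t - (gamma t)^alpha (1 - i beta tan(pi alpha/2))).
    A midpoint rule on a truncated t-grid stands in for the FFT of
    Lombardi (2007); grid sizes are illustrative."""
    dt = t_max / n_t
    t = (np.arange(n_t) + 0.5) * dt
    phi = np.exp(1j * delta * t - (gamma * t) ** alpha
                 * (1 - 1j * beta * np.tan(np.pi * alpha / 2)))
    integrand = np.real(phi[None, :] * np.exp(-1j * np.outer(x, t)))
    pdf = integrand.sum(axis=1) * dt / np.pi
    return np.log(np.maximum(pdf, 1e-300))

# Sanity check: alpha = 2 is Gaussian with variance 2*gamma^2
f0 = np.exp(stable_logpdf(np.array([0.0]), alpha=2.0))[0]
print(f0, 1 / np.sqrt(4 * np.pi))

# Tiny random-walk Metropolis for alpha (beta known), in the spirit of
# Lombardi's sampler; data come from scipy's stable sampler, with extreme
# draws trimmed to keep this illustrative quadrature accurate
rng = np.random.default_rng(5)
data = stats.levy_stable.rvs(1.7, 0.0, size=150, random_state=rng)
data = data[np.abs(data) < 30.0]
loglik = lambda a: stable_logpdf(data, a).sum()
a_cur, ll_cur, chain = 1.5, loglik(1.5), []
for _ in range(250):
    a_prop = a_cur + 0.05 * rng.normal()   # flat prior on (1.1, 1.99) implied
    if 1.1 < a_prop < 1.99:
        ll_prop = loglik(a_prop)
        if np.log(rng.uniform()) < ll_prop - ll_cur:
            a_cur, ll_cur = a_prop, ll_prop
    chain.append(a_cur)
a_hat = np.mean(chain[80:])
```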
NPI is a framework of statistical theory and methods that use these A(n)-based lower and upper probabilities, and also considers several variations of A(n) which are suitable for different inferences. For example, NPI has been presented for Bernoulli data, multinomial data, and lifetime data with right-censored observations; NPI also enables inference for multiple future observations, with their interdependence explicitly taken into account. NPI provides a solution to some central goals formulated for objective (Bayesian) inference, which cannot be obtained when using precise probabilities (Coolen 2006). NPI is also exactly calibrated (Lawless and Fredette 2005), which is a strong consistency property, and it never leads to results that are in conflict with inferences based on empirical probabilities. Inferential problems for which NPI solutions have recently been presented or are being developed include aspects of medical diagnosis with the use of ROC curves, robust classification, inference on competing risks, quality control, and acceptance sampling. To pick out an important application, a generalisation of the Kaplan-Meier estimator for survival functions (Kaplan and Meier 1958) was developed by Coolen and Yan (2004), which effectively expresses the uncertainty inherent in the estimation by lower and upper bounds. We think this is a striking example of the potential of imprecise probability methods, allowing conclusions to be drawn from data without the need to add overreaching assumptions.
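For the Bernoulli case mentioned above, the NPI lower and upper probabilities for the next observation take a simple closed form (as presented by Coolen): after s successes in n trials, the next trial is a success with lower probability s/(n+1) and upper probability (s+1)/(n+1); the gap 1/(n+1) quantifies the imprecision and shrinks as data accumulate:

```python
def npi_bernoulli_next(n, s):
    """NPI lower and upper probability that the next observation is a
    success, after observing s successes in n Bernoulli trials:
      [s/(n+1), (s+1)/(n+1)]."""
    assert 0 <= s <= n
    return s / (n + 1), (s + 1) / (n + 1)

low, up = npi_bernoulli_next(10, 7)   # 7 successes in 10 trials
print(low, up)
```

With no data at all (n = 0) the interval is the vacuous [0, 1], reflecting complete prior ignorance, which is one of the objective-inference goals referred to above.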
The importance of coherence of PS points has been justified to this point; still implicit is the effect of the density of these points on the final products of PSI. The PS points are regarded as the samples for the final PSI products, e.g. the APS and the deformation map; thus the higher the spatial frequency of these samples, the more reliable the recovered signals will be. A trade-off appears between the density of the PSs and their coherence, rendering the detection of the PSs a critical task. On the one hand, the false detection of low-coherence points imposes a vast computational burden on post-processing outlier detection methods to find and discard them; on the other hand, discarding coherent points as a result of missed detection decreases the density of the PSs and leads to information loss. It is the task of the detection criterion to set a balance between these two. Improving this criterion can therefore lead to the integration of more stable points on the one hand, or to a reduction of the post-processing effort for removal of misdetected points on the other. As discussed in the previous section, the SCR is the base criterion in detection. However, the two introduced methods of SCR estimation, i.e. the amplitude dispersion index and the spatial SCR estimator, are both biased and underestimate the phase error. In the latter, the bias stems from the fact that the surrounding pixels in the spatial estimation window are not perfect representatives of the clutter, while in the former it is due to simplification of the statistics of the amplitude stack [3, 34]. As an alternative to these two methods, the exact signal statistics can be considered to estimate the temporal SCR. Formulation and evaluation of this alternative approach is expanded in section 3.3 as the first application of the Bayesian inference software.
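The amplitude dispersion index mentioned above is simple to compute: D_A = sigma_A / mu_A over the temporal amplitude stack, with pixels below a threshold (0.25 is the value commonly used following Ferretti et al.) taken as PS candidates, since at high SCR D_A approximates the phase standard deviation. The synthetic stack below is purely illustrative:

```python
import numpy as np

def amplitude_dispersion(stack):
    """Amplitude dispersion index D_A = sigma_A / mu_A per pixel, computed
    over the temporal axis of an amplitude stack (n_images, rows, cols).
    For high SCR, D_A approximates the phase standard deviation, which
    motivates its use as a PS detection criterion."""
    mu = stack.mean(axis=0)
    sigma = stack.std(axis=0)
    return sigma / mu

rng = np.random.default_rng(3)
n_img, shape = 30, (50, 50)
# Hypothetical stack: constant scatterer amplitude plus circular Gaussian clutter
signal = 5.0
clutter = (rng.normal(0.0, 1.0, (n_img, *shape))
           + 1j * rng.normal(0.0, 1.0, (n_img, *shape)))
amp = np.abs(signal + clutter)
d_a = amplitude_dispersion(amp)
candidates = d_a < 0.25      # common PS-candidate threshold (Ferretti et al.)
```

As the text notes, this estimator is biased: the approximation D_A ≈ phase standard deviation rests on a high-SCR simplification of the amplitude statistics.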
Uncertainties can also be calculated for the network inversion. We made use of a state-of-the-art training method to approximate the network weight distribution with variational inference and calculate the uncertainties in the prediction. Compared to other existing methods, this method has the advantage of requiring essentially the same evaluation time as a standard network evaluation. It can therefore be particularly useful when the network is used in real-time systems, which benefit from the uncertainty information when using the network prediction to take further actions or decisions. The network has been tested on data collected during several pulses at the JET tokamak, considering a wide range of plasma features and scenarios. A comparison of the network-inferred profiles and those found with the conventional Bayesian inference shows a discrepancy in the uncertainties reconstructed by the two methods. This should not be surprising, as they arise from two very different models with different free parameters, observed quantities, and different limitations, and therefore they are not expected to match. This discrepancy is a price that has to be paid to achieve the several orders of magnitude of acceleration provided by the network. As we trained the network on a Bayesian model, we could use the same model to simulate the observations, given the network-reconstructed profiles, and compare them against the measurements. We included in the comparison the full Bayesian inference reconstruction, which was carried out making use of the same model. The comparison was therefore fully consistent: the network inversion being a fast approximation of the full model one. The error in the prediction of the measurements is consistently larger when using the network-predicted density profiles, as might be expected from an approximated inversion.
Still, the error is consistently below approximately 20% in all considered experimental cases, suggesting that the network inversion can be a reliable tool for fast analysis.
In econometrics, there is a large literature on estimation and inference in set-identified models from the frequentist perspective, including Horowitz and Manski (2000), Imbens and Manski (2004), Chernozhukov et al. (2007), Stoye (2009), and Romano and Shaikh (2010), to list a few. See Canay and Shaikh (2017) for a survey of the literature. There is also a growing literature on Bayesian inference for partially identified models. Some propose posterior inference based on a single prior irrespective of the posterior sensitivity introduced by the lack of identification (Epstein and Seo (2014); Baumeister and Hamilton (2015); Gustafson (2015)). Our paper does not intend to provide any normative argument as to whether one should adopt a single prior or multiple priors in set-identified models. Our main goal is to offer new tools for inference in set-identified models, and to show that these tools have a well-defined posterior interpretation in finite samples and yield asymptotically valid frequentist inference. In parallel work, Norets and Tang (2014) and Kline and Tamer (2016) consider Bayesian inference about the identified set. Norets and Tang (2014) focus on the specific setting of dynamic discrete choice models and Kline and Tamer (2016) consider a non-standard environment where a well-defined Bayesian analysis is not possible because the likelihood for the structural parameters of interest is not available (e.g., incomplete structural models or models defined by moment inequalities). In contrast, we consider a general setting where a likelihood is available and we obtain well-defined (robust) Bayesian inferential statements by introducing the notion of ambiguity over the identified set through multiple priors. Liao and Jiang (2010), Wan (2013), and Chen et al.
(forthcoming) propose using Bayesian Markov Chain Monte Carlo methods to overcome some computational challenges of the frequentist approach to inference about the identified set (e.g., for the criterion-function approach considered in Chernozhukov et al. (2007)).
A stylized fact that needs to be captured by the models is that the conditional variance can react asymmetrically to positive versus negative shocks, or to large versus small shocks. That is, the conditional variance may follow different regimes according to the size and sign of the shock. News has an asymmetric impact on the economy; for example, a large negative return might affect future volatility in a different way than a positive return of the same size. Engle and Ng (1993) present a review of this issue. Awartani and Corradi (2005) discuss the importance of asymmetries in the prediction of an economic index. This may be accommodated by smooth transition models based on an asymmetric specification of the conditional variance model. Thus, different specifications of the skedastic functions will take into account size and sign effects in the volatility. In this work we consider smooth transition GARCH models and discuss inferential issues related to the smooth transition function and Bayesian inference.
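One common logistic specification of such a smooth transition skedastic function (the exact model of the paper may differ) makes the ARCH coefficient a smooth function of the lagged shock, so that negative and positive shocks of the same size produce different conditional variances:

```python
import numpy as np

def st_garch_vol(eps, omega, a1, a2, beta, gamma, c):
    """Volatility recursion of a logistic smooth-transition GARCH(1,1):
      sigma2_t = omega + (a1 + a2 * F(eps_{t-1})) * eps_{t-1}^2
                 + beta * sigma2_{t-1},
    with transition F(e) = 1 / (1 + exp(gamma * (e - c))), so that large
    negative shocks (F -> 1) raise the ARCH coefficient by a2, capturing
    the sign asymmetry. Parameter values here are illustrative."""
    T = len(eps)
    sigma2 = np.empty(T)
    sigma2[0] = omega / (1 - a1 - a2 / 2 - beta)   # rough unconditional level
    for t in range(1, T):
        F = 1.0 / (1.0 + np.exp(gamma * (eps[t - 1] - c)))
        sigma2[t] = omega + (a1 + a2 * F) * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# A negative shock yields higher next-period variance than a positive
# shock of the same size
v_neg = st_garch_vol(np.array([0.0, -2.0, 0.0]), 0.1, 0.05, 0.1, 0.8, 5.0, 0.0)
v_pos = st_garch_vol(np.array([0.0, 2.0, 0.0]), 0.1, 0.05, 0.1, 0.8, 5.0, 0.0)
print(v_neg[2], v_pos[2])
```

The smoothness parameter gamma governs how sharply the regimes change: as gamma grows the model approaches a threshold GARCH, which is one reason inference on the transition function is delicate.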
In this paper, we develop and apply Bayesian inference for an extended Nelson-Siegel (1987) term structure model capturing interest rate risk. The so-called Stochastic Volatility Nelson-Siegel (SVNS) model allows for stochastic volatility in the underlying yield factors. We propose a Markov chain Monte Carlo (MCMC) algorithm to efficiently estimate the SVNS model using simulation-based inference. Applying the SVNS model to monthly U.S. zero-coupon yields, we find significant evidence for time-varying volatility in the yield factors. This is mostly true for the level and slope volatility, which also reveal the highest persistence. It turns out that the inclusion of stochastic volatility improves the model’s goodness-of-fit and clearly reduces the forecasting uncertainty, particularly in low-volatility periods. The proposed approach is shown to work efficiently and is easily adapted to alternative specifications of dynamic factor models revealing (multivariate) stochastic volatility.
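For reference, the Nelson-Siegel factorisation underlying the SVNS model maps the level, slope, and curvature factors into the yield curve; the SVNS extension (not sketched here) lets the innovation variances of these factors follow stochastic volatility processes. The lambda value and inputs below are illustrative (Diebold-Li convention):

```python
import numpy as np

def nelson_siegel(tau, level, slope, curv, lam=0.0609):
    """Nelson-Siegel yield curve (Diebold-Li factorisation):
      y(tau) = L + S * (1 - exp(-lam*tau)) / (lam*tau)
                 + C * ((1 - exp(-lam*tau)) / (lam*tau) - exp(-lam*tau)).
    lam = 0.0609 is the Diebold-Li value for maturities in months."""
    x = lam * np.asarray(tau, dtype=float)
    g = (1 - np.exp(-x)) / x
    return level + slope * g + curv * (g - np.exp(-x))

taus = np.array([3, 12, 36, 60, 120])   # maturities in months
y = nelson_siegel(taus, level=5.0, slope=-2.0, curv=1.0)
# As tau -> infinity the curve approaches the level factor;
# as tau -> 0 it approaches level + slope
print(y)
```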
We present an approximate Bayesian inference approach for estimating the intensity of an inhomogeneous Poisson process, where the intensity function is modelled using a Gaussian process (GP) prior via a sigmoid link function. Augmenting the model using a latent marked Poisson process and Pólya-Gamma random variables, we obtain a representation of the likelihood which is conjugate to the GP prior. We estimate the posterior using a variational free-form mean field optimisation together with the framework of sparse GPs. Furthermore, as an alternative approximation we suggest a sparse Laplace’s method for the posterior, for which an efficient expectation-maximisation algorithm is derived to find the posterior’s mode. Both algorithms compare well against exact inference obtained by a Markov chain Monte Carlo sampler and a standard variational Gauss approach solving the same model, while being one order of magnitude faster. Furthermore, the performance and speed of our method is competitive with that of another recently proposed Poisson process model based on a quadratic link function, while not being limited to GPs with squared exponential kernels and rectangular domains.
This paper presents the R package MitISEM (mixture of t by importance sampling weighted expectation maximization), which provides an automatic and flexible two-stage method to approximate a non-elliptical target density kernel – typically a posterior density kernel – using an adaptive mixture of Student-t densities as approximating density. In the first stage a mixture of Student-t densities is fitted to the target using an expectation maximization algorithm where each step of the optimization procedure is weighted using importance sampling. In the second stage this mixture density is a candidate density for efficient and robust application of importance sampling or the Metropolis-Hastings (MH) method to estimate properties of the target distribution. The package enables Bayesian inference and prediction on model parameters and probabilities, in particular for models where densities have multi-modal or other non-elliptical shapes like curved ridges. These shapes occur in research topics in several scientific fields: for instance, analysis of DNA data in bio-informatics, obtaining loans in the banking sector by heterogeneous groups in financial economics, and analysis of education’s effect on earned income in labor economics. The package MitISEM also provides an extended algorithm, ‘sequential MitISEM’, which substantially decreases computation time when the target density has to be approximated for increasing data samples. This occurs when the posterior or predictive density is updated with new observations and/or when one computes model probabilities using predictive likelihoods. We illustrate the MitISEM algorithm using three canonical statistical and econometric models that are characterized by several types of non-elliptical posterior shapes and that describe well-known data patterns in econometrics and finance.
We show that MH using the candidate density obtained by MitISEM outperforms, in terms of numerical efficiency, MH using a simpler candidate, as well as the Gibbs sampler. The MitISEM approach is also used for Bayesian model comparison using predictive likelihoods.
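The second stage described above, importance sampling from a fitted Student-t candidate, can be sketched as follows; a single Student-t stands in for the MitISEM mixture, and a skewed toy target replaces a real posterior kernel. The effective sample size plays the role of the numerical-efficiency diagnostic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Skewed toy target, known only up to a normalising constant in general
log_kernel = stats.skewnorm(a=4, loc=0, scale=1).logpdf

# Stage-2 use of a fitted candidate: a single Student-t stands in here
# for the MitISEM mixture of Student-t densities
cand = stats.t(df=5, loc=0.5, scale=1.0)
x = cand.rvs(size=20000, random_state=rng)
log_w = log_kernel(x) - cand.logpdf(x)
w = np.exp(log_w - log_w.max())
w /= w.sum()                    # self-normalised importance weights

post_mean = np.sum(w * x)       # IS estimate of the posterior mean
ess = 1.0 / np.sum(w ** 2)      # effective sample size diagnostic
print(post_mean, ess)
```

A candidate with heavier tails than the target keeps the weights bounded; a poor candidate would show up immediately as a collapsed effective sample size.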
inference and failures in the models upon which (optimal) inference is based (cf., ). We argue that patients showing pathological behaviour can still perform ‘optimal’ inference based on a particular generative model of the environment, which causes their behaviour to be less well adapted compared to control subjects. Importantly, the inference machinery itself is not broken; instead, a maladaptively constituted generative model causes pathological behaviour – based on (optimal) Bayesian inference. This implies a focus not on what patients perform in place of Bayesian inference but rather on how their particular generative model – upon which inference is based – is constituted. Put simply, if the brain truly is a hierarchical Bayesian prediction machine, one can explain both normal and pathological behaviour. Identifying specific parameters or the specific constitution of the generative model in a patient group would have substantial impact on improving diagnosis, prediction of relapse, or therapeutic success in clinical practice. Note that this marks a crucial advantage over approaches that focus on the deviation from optimality: investigating the subject-specific generative model of a task – as opposed to noting a subject’s deviation from ‘optimal’ behaviour – allows us (in principle) to identify the origins of aberrant inference in the cerebral hierarchy. In addition, this approach allows for a more nuanced perspective on the causes underlying observed behaviour. It is possible that some psychiatric conditions share characteristic aberrancies in expressed behaviour but that their particular causes are different. While noting the characteristic deviation from ‘optimality’ does not enable us to differentiate between different mechanisms that might cause this deviation, investigating the generative model underlying observed behaviour allows us to differentiate possible mechanisms that induce similar behaviour.
and Bayesian credible intervals do not asymptotically coincide with frequentist confidence intervals (Moon and Schorfheide 2012). Moreover, in the context of SVARs, Baumeister and Hamilton (2015) show that even priors that are ‘uniform’ over a set-identified parameter may be informative about the objects of interest, such as impulse responses. To address these issues, Giacomini and Kitagawa (2018) (GK18) propose an approach to Bayesian inference in set-identified models that is robust to the choice of prior over the set-identified parameters. The approach considers the class of all priors over the model’s set-identified parameters that are consistent with the identifying restrictions. This generates a class of posteriors, which can be summarised by reporting the set of posterior means (an estimator of the identified set) and the associated robust credible region. GK18 provide conditions under which these quantities have valid frequentist interpretations, and they apply their approach to SVARs in which the impulse responses are set-identified by imposing sign and zero restrictions.
The cosmic microwave background radiation (CMB) is presently one of the most informative data sets for cosmologists to study the physics of the early Universe. Of particular interest are the verification of the existence of an inflationary phase of the Universe and investigations of the physical properties of the involved inflaton field(s). An essential quantity here is the primordial adiabatic scalar potential Φ. Its statistics, especially the two-point function, were determined during inflation, when the quantum fluctuations of the inflationary field were frozen during their exit of the Hubble horizon. These statistics are conserved on super-horizon scales during the epoch of reheating until the individual perturbed modes re-enter the horizon. Therefore, significant information on the inflationary phase is encoded in the observable quantity Φ. The processes translating the initial modes after their horizon re-entry into the observed CMB fluctuations are described by the so-called radiation transfer functions; see Refs. [4, 64]. As a consequence, many inference methods aiming to constrain parameters of the early Universe involve Φ or its statistics. The CMB fluctuations therefore provide a highly processed view of the primordial scalar potential. In this work, we attempt, however, its direct reconstruction and visualization via Bayesian inference. Once it is reconstructed, a direct investigation of its statistics is possible, e.g., the inference of the primordial power spectrum, its connection to large-scale structure, or primordial magnetic fields [95, 96].
Abstract. Bayesian motion control and planning is based on the idea of fusing motion objectives (constraints, goals, priors, etc.) using probabilistic inference techniques in a way similar to Bayesian sensor fusion. This approach seems promising for tackling two fundamental problems in robotic control and planning: (1) Bayesian inference methods are an ideal candidate for fusing many sources of information or constraints, as usually employed in the sensor processing context. Complex motion is characterised by such a multitude of concurrent constraints and tasks, and the Bayesian approach provides a solution of which classical solutions (e.g., prioritised inverse kinematics) are a special case. (2) In the future we will require planning methods that are not based on representing the system state as one high-dimensional state variable but rather cope with structured state representations (distributed, hierarchical, hybrid discrete-continuous) that more directly reflect and exploit the natural structure of the environment. Probabilistic inference offers methods that can in principle handle such representations. Our approach will, for the first time, allow the transfer of these methods to the realm of motion control and planning.