


2.12.16 Generic arithmetic evaluation, regression and data analysis – lfit

Modeling of data is a prominent step in the analysis and interpretation of astronomical observations. In this section, a standalone command line driven tool named lfit is introduced, designed for both interactive and batch processed regression analysis as well as generic arithmetic evaluation.

This tool is built on top of the libpsn library48, a collection of functions managing symbolic arithmetic expressions. This library provides both the back-end for function evaluation and the analytical calculation of partial derivatives. Partial derivatives are required by most regression methods (e.g. linear and non-linear least squares fitting) and uncertainty estimations (e.g. Fisher analysis). The program features many built-in functions related to special astrophysical problems. Moreover, it allows the end-user to extend its capabilities at run-time using dynamically loaded libraries.
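The idea of a symbolic back-end that both evaluates expressions and derives their partial derivatives analytically can be sketched as follows. This illustration uses sympy as a stand-in, since libpsn itself is a C library; the snippet shows the concept, not libpsn's actual API:

```python
# Sketch of what a symbolic back-end like libpsn provides: parsing a model
# expression and computing its partial derivatives analytically, then
# compiling it for fast numerical evaluation.  sympy is used here as a
# stand-in; this does not reflect libpsn's actual C API.
import sympy as sp

a, b, x = sp.symbols('a b x')
model = a * sp.sin(x) + b           # model function f(x; a, b)

# Analytic partial derivatives w.r.t. the adjusted parameters, as needed
# by e.g. Levenberg-Marquardt fitting or Fisher analysis:
df_da = sp.diff(model, a)           # sin(x)
df_db = sp.diff(model, b)           # 1

f = sp.lambdify((x, a, b), model)   # fast numerical evaluator
print(f(0.0, 2.0, 3.0))
```

The key point is that the derivatives are exact symbolic objects, not finite-difference approximations, which is what makes the Fisher analysis and gradient-based minimizers discussed below reliable.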

In general, lfit is used extensively in the data reduction steps of the HATNet project. The program acts both in the main “discovery” pipeline and in the characterization of follow-up data, including photometric and radial velocity measurements. Currently, lfit exclusively implements the EPD algorithm (including the normal, the reconstructive and the simultaneous modes) as well as the simultaneous TFA algorithm (see e.g. Bakos,

48 http://libpsn.sf.net, developed by the author

Table 2.7: Basic functions found in the built-in astronomical extension library. These functions cover the fields of simple radial velocity analysis, some aspects of light curve modelling and data reduction. These functions are a kind of “common denominator”, i.e. they are not directly applicable on their own, but complex functions can be built on top of them for any particular usage. All of the functions below, with the exception of hjd() and bjd(), have partial derivatives that can be evaluated analytically by lfit.

Function            Description

hjd(JD, α, δ)       Calculates the heliocentric Julian date from the Julian day JD and the celestial coordinates α (right ascension) and δ (declination).

bjd(JD, α, δ)       Calculates the barycentric Julian date from the Julian day JD and the celestial coordinates α (right ascension) and δ (declination).

ellipticK(k)        Complete elliptic integral of the first kind.

ellipticE(k)        Complete elliptic integral of the second kind.

ellipticPi(k, n)    Complete elliptic integral of the third kind.

eoq(λ, k, h)        Eccentric offset function, ‘q’ component. The arguments are the mean longitude λ (in radians) and the Lagrangian orbital elements k = e cos ϖ and h = e sin ϖ.

eop(λ, k, h)        Eccentric offset function, ‘p’ component.

ntiu(p, z)          Normalized occultation flux decrease. This function calculates the flux decrease during the eclipse of two spheres when one of the spheres has a uniform flux distribution and the other one, by which the former is eclipsed, is totally dark. The bright source is assumed to have unit radius while the occulting disk has a radius of p; the distance between the centers of the two disks is z.

ntiq(p, z, γ1, γ2)  Normalized occultation flux decrease when the eclipsed sphere has a non-uniform flux distribution modelled by a quadratic limb darkening law. The limb darkening is characterized by the coefficients γ1 and γ2.

Torres, Pál et al., 2009).
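As an illustration of the base functions in Table 2.7, the uniform-source occultation function ntiu(p, z) amounts to the normalized overlap area of two disks, which can be written down in a few lines. The sketch below follows the standard occultation geometry (cf. Mandel & Agol, 2002) and is an independent reimplementation, not lfit's own code:

```python
# Illustrative reimplementation of the uniform-source occultation
# function ntiu(p, z): the flux decrease when a totally dark disk of
# radius p occults a uniform disk of unit radius, with center-to-center
# distance z (cf. Mandel & Agol 2002).  Not lfit's own code.
import math

def ntiu(p, z):
    z = abs(z)
    if z >= 1.0 + p:          # disks do not overlap: no flux decrease
        return 0.0
    if z <= 1.0 - p:          # occulter entirely inside the bright disk
        return p * p
    if z <= p - 1.0:          # bright disk entirely covered (p > 1)
        return 1.0
    # Partial overlap: area of the lens-shaped intersection over pi.
    k1 = math.acos((1.0 - p * p + z * z) / (2.0 * z))
    k0 = math.acos((p * p + z * z - 1.0) / (2.0 * p * z))
    area = p * p * k0 + k1 - 0.5 * math.sqrt(
        4.0 * z * z - (1.0 + z * z - p * p) ** 2)
    return area / math.pi
```

The limb-darkened counterpart ntiq() involves the elliptic integrals ellipticK, ellipticE and ellipticPi from the same table and is considerably longer; the uniform case above already shows the piecewise geometry on which both are built.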

User interface and built-in regression methods

Due to its high degree of modularization and the freedom offered by its user interface, the program lfit allows the user to compare the results of different regression analysis techniques. The program currently features nine built-in algorithms, including classic linear least squares minimization (Press et al., 1992), non-linear methods (Levenberg-Marquardt, downhill simplex; see also Press et al., 1992) and various methods providing an a posteriori distribution for the adjusted parameters, such as Markov Chain Monte-Carlo (Ford, 2004) or the method of refitting to synthetic data sets (Press et al., 1992). The program is also capable of deriving the covariance or correlation matrix of the parameters using Fisher information analysis (Finn, 1992). The complete list of the supported algorithms can be found in Table 2.6.

The basic concepts of lfit are illustrated in Fig. 2.27, in the form of a complete example of linear regression.
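For comparison, the same kind of linear regression that the lfit example demonstrates can be written in a few lines of numpy; the data values below are invented purely for illustration:

```python
# The same kind of linear regression as in the lfit example, done with
# numpy's least-squares solver.  The data points are made-up values
# roughly following y = 2*x, purely for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit y ~ a*x + b: the design matrix has one column per fitted parameter.
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)    # best-fit slope and intercept
```

What lfit adds on top of such a solver is the symbolic layer: the user writes the model as a formula, and the design matrix is assembled automatically from the analytic parametric derivatives.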


Built-in functions related to astronomical data analysis

The program lfit provides various built-in functions related to astronomical data analysis, especially ones required by exoplanetary research. All of these functions are some sort of “base functions” with only a few parameters, from which one can easily form more useful ones using the capabilities of lfit. Good examples are the eccentric offset functions p(λ, k, h) and q(λ, k, h) (Sec. 4.3): although each has only three parameters, the functions related to radial velocity analysis can easily be defined using these two functions. The full list of these special functions can be found in Table 2.7. The actual implementation of the above mentioned radial velocity model functions can be found in Chapter 4, in Fig. 4.3.
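As a sketch of how such a radial velocity model can be composed from simple base functions, the snippet below uses the standard Keplerian parametrization (solving Kepler's equation directly) rather than the eccentric offset functions of Table 2.7; K, gamma, e, varpi are the usual orbital quantities and M is the mean anomaly. This is an illustration, not lfit's implementation:

```python
# Sketch of a radial-velocity model of the kind that can be composed from
# lfit's base functions.  This version uses the standard Keplerian
# parametrization (solving Kepler's equation by Newton iteration) instead
# of the eccentric offset functions; K, gamma, e, varpi are the usual
# orbital quantities and M is the mean anomaly.  Illustrative only.
import math

def kepler_E(M, e, tol=1e-12):
    """Solve Kepler's equation E - e*sin(E) = M for the eccentric anomaly."""
    E = M if e < 0.8 else math.pi
    for _ in range(50):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def rv(M, K, gamma, e, varpi):
    """Radial velocity gamma + K*(cos(v + varpi) + e*cos(varpi))."""
    E = kepler_E(M, e)
    v = 2.0 * math.atan2(math.sqrt(1.0 + e) * math.sin(0.5 * E),
                         math.sqrt(1.0 - e) * math.cos(0.5 * E))
    return gamma + K * (math.cos(v + varpi) + e * math.cos(varpi))
```

The eccentric offset parametrization used by eop() and eoq() expresses the same orbit through the Lagrangian elements k = e cos ϖ and h = e sin ϖ, which behave better in least-squares fits at low eccentricity.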

Extended Markov Chain Monte-Carlo

In this section we discuss in more detail one of the built-in methods, which combines a Markov Chain Monte-Carlo algorithm with the parametric derivatives of the model functions in order to yield faster convergence and more reliable results, especially in the case of highly correlated parameters.

The main concept of the MCMC algorithm (see e.g. Ford, 2004) is to generate an a posteriori probability distribution of the adjusted parameters. It is based on random walks in the parameter space, as follows. In each step, one draws an alternate parameter vector from an a priori distribution and then evaluates the merit function χ2. If the value of χ2 decreases, the transition is accepted (since the newly drawn parameter vector represents a better fit); otherwise, the transition is accepted with a certain probability derived from the increment in χ2. The final distribution of the parameters depends on both the a priori distribution and the probability function used when ∆χ2 is positive. The main problem of the MCMC method is that the a posteriori probability distribution can only be estimated if the a priori distribution is chosen well, but initially we do not have a hint for either distribution. The idea behind MCMC is to derive multiple chains, taking the a posteriori distribution of the previous chain as the input (a priori) distribution of the next one. In regular cases, the chains converge to a final distribution after some iterations, and the last one can therefore be accepted as the final result. In the literature, several attempts are known to define an a priori transition function (see also Ford, 2004).
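The acceptance rule described above can be sketched with a minimal one-parameter Metropolis walk; the toy problem (fitting the mean of noisy data) and all numerical choices below are invented for illustration:

```python
# A minimal Metropolis walk implementing the acceptance rule above:
# accept if chi^2 decreases, otherwise accept with probability
# exp(-delta_chi2 / 2).  Toy problem: fit the mean of noisy data.
# All names and numbers here are illustrative, not lfit's code.
import math
import random

random.seed(42)
data = [random.gauss(3.0, 1.0) for _ in range(200)]

def chi2(mu):
    # Merit function for unit per-point uncertainties.
    return sum((d - mu) ** 2 for d in data)

chain = []
mu = 0.0                  # deliberately poor starting point
c2 = chi2(mu)
for _ in range(5000):
    mu_new = mu + random.gauss(0.0, 0.1)   # a priori transition function
    c2_new = chi2(mu_new)
    if c2_new < c2 or random.random() < math.exp(-0.5 * (c2_new - c2)):
        mu, c2 = mu_new, c2_new            # accept the transition
    chain.append(mu)

estimate = sum(chain[1000:]) / len(chain[1000:])
print(estimate)           # a posteriori estimate of the mean
```

The fixed proposal width of 0.1 here plays the role of the a priori transition function; choosing it badly is exactly the convergence problem that the iterated chains, and the XMMC variant below, are designed to address.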

Here we give a simple method that not only provides a good hint for the a priori distribution but also yields several independent sanity checks that can then be used to verify the convergence of the chain. The transition function used by this extended Markov Chain Monte-Carlo algorithm (XMMC) is a Gaussian distribution whose covariance matrix is derived from the Fisher covariance matrix (Finn, 1992). The sanity checks are then the following:

• The resulting parameter distribution should have nearly the same statistical covariance as the analytical covariance49.

• The autocorrelation lengths of the chain parameters have to be small, i.e. only ∼1−2 steps. Chains that fail to converge have significantly larger autocorrelation lengths.

• The transition probability has to be consistent with the theoretical probability, which depends only on the number of adjusted parameters.

• The statistical centroid (mode) of the distribution must agree both with the best fit parameters derived from alternate methods (such as downhill simplex) and with the chain element with the smallest χ2.
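The core of the XMMC transition function, a Gaussian proposal whose covariance is the analytic Fisher covariance, can be sketched as follows for a simple linear model whose parametric derivatives are known in closed form; the model and all numbers are illustrative only:

```python
# Sketch of the XMMC transition function: a Gaussian proposal whose
# covariance equals the Fisher covariance matrix C = (J^T J / sigma^2)^-1.
# Model: y = a*x + b, whose parametric derivatives (the columns of J)
# are known analytically: d/da = x, d/db = 1.  Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)     # observation epochs
sigma = 0.1                       # assumed per-point uncertainty

J = np.column_stack([x, np.ones_like(x)])   # analytic parametric derivatives
fisher = J.T @ J / sigma**2                 # Fisher information matrix
cov = np.linalg.inv(fisher)                 # analytic covariance matrix

L = np.linalg.cholesky(cov)
def propose(theta):
    """Draw the next trial parameter vector around theta."""
    return theta + L @ rng.standard_normal(2)
```

Because the proposal already has the shape of the posterior for strongly correlated parameters, the first sanity check above reduces to comparing the empirical covariance of the chain against `cov`.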

The method of XMMC has some disadvantages. First, the transition probabilities decrease exponentially as the number of adjusted parameters increases; therefore, the required computational time can be exceptionally high in some cases. (The Gibbs sampler, used in the classic MCMC, provides a roughly constant transition probability.) Second, the derivation of the Fisher covariance matrix requires the knowledge of the parametric derivatives of the merit function. In the actual implementation of lfit, the method of XMMC can therefore be used only if the parametric derivatives are known in advance in an analytical form; otherwise, the XMMC algorithm cannot be applied at all.

However, in the case of HATNet data analysis, we found the method of XMMC to be highly efficient and we used it in several analyses related to the discoveries. Moreover, the most important functions involved in this analysis, such as the light curve and radial velocity model functions, have known analytic partial derivatives. These derivatives for transit light curve model functions can be found in Pál (2008). An analytic formalism for radial velocity modelling is discussed in Sec. 4.3, and some additional related details and applications are presented in Pál (2009). In this thesis (in Chapter 3), a detailed example is given of the application of the XMMC algorithm in the analysis of the HAT-P-7(b) planetary system.