Modelling local GPS/levelling geoid undulations using Support Vector Machines

Periodica Polytechnica

Civil Engineering 52/1 (2008) 39–43, doi: 10.3311/pp.ci.2008-1.06, web: http://www.pp.bme.hu/ci, © Periodica Polytechnica 2008, RESEARCH ARTICLE

Piroska Zaletnyik / Lajos Völgyesi / Béla Paláncz

Received 2007-12-05

Abstract

A support vector machine (SVM) with a wavelet kernel has been applied to correcting the gravimetric geoid using GPS/levelling data. These data were divided into a training and a validation set in order to ensure the extendability of the approximation of the corrector surface. The optimal parameters of the SVM were chosen as a trade-off between accuracy and extendability of the solution in order to avoid overlearning. Employing 194 training points and 110 validation points, the SVM provided an approximation with less than 3 cm standard deviation of the error and nearly perfect extendability.

Keywords

geoid · corrector surface · GPS · support vector regression · wavelet kernel

Acknowledgement

The authors wish to thank A. Kenyeres and the Hungarian Institute of Geodesy, Cartography and Remote Sensing for providing GPS/levelling data of Hungary and also the Hungarian Research Fund OTKA project T046718.

Piroska Zaletnyik

Department of Geodesy and Surveying, Research Group of Physical Geodesy and Geodynamics of the Hungarian Academy of Sciences, BME, PO Box 91, H-1521, Hungary

e-mail: zaletnyikp@hotmail.com

Lajos Völgyesi

Department of Geodesy and Surveying, Research Group of Physical Geodesy and Geodynamics of the Hungarian Academy of Sciences, BME, PO Box 91, H-1521, Hungary

e-mail: volgyesi@eik.bme.hu

Béla Paláncz

Department of Photogrammetry and Geoinformatics, BME, PO Box 91, H-1521, Hungary

e-mail: palancz@epito.bme.hu

1 Introduction

The accuracy of the gravimetric geoid can be significantly improved using GPS/levelling measurements. The new, adjusted geoid can be constructed as the gravimetric one plus the so-called corrector surface, the difference between the gravimetric and the GPS/levelling geoid.

Recently, a wide variety of higher-order parametric and non-parametric surfaces have been used as corrector surfaces, such as polynomial models by Fotopoulos and Sideris 2005 [4], spline interpolation by Featherstone 2000 [2] and Zaletnyik et al. 2007 [13], least squares collocation (LSC) by Iliffe et al. 2003 [5], kriging by Nahavandchi and Soltanpour 2004 [8], combined least squares adjustments by Fotopoulos 2005 [3], and various other surfaces. Most recently, Zaletnyik et al. 2007 [14] employed a thin plate spline (TPS) surface, solving the problem via the finite element method. Suffice it to say, there are numerous surface-fitting options, each with its own advantages and disadvantages, which will not be discussed or debated here.

Concerning the application of soft computing techniques, Kavzoglu and Saka 2005 [6] and Lao-Sheng Lin 2006 [7] employed an artificial neural network (ANN) for approximating the GPS/levelling geoid instead of the corrector surface itself. Both applied a feed-forward ANN with standard sigmoid activation functions and different numbers of hidden layers. Zaletnyik et al. 2007 [13] also used an ANN, but with radial basis function (RBF) activation and regularization in the training phase. Soltanpour et al. 2006 [11] used second generation wavelets to approximate the corrector surface directly. This technique extends the classical wavelet approximation, which requires regularly spaced (sampled) data, to irregularly spaced datasets, see Sweldens 1997 [12].

Another soft computing technique is represented by support vector machines (SVM), which are learning algorithms with many applications in pattern recognition and nonlinear regression. In this study we propose to apply a support vector machine with a wavelet kernel for modelling the corrector surface.

2 Support Vector Machines for Regression

The problem of regression is that of finding a function which approximates the mapping from an input domain to the real numbers based on a training sample. We refer to the difference between the hypothesis output and its training value as the residual of the output, an indication of the accuracy of the fit at this point.

We must decide how to measure the importance of this accuracy, as small residuals may be inevitable while we wish to avoid large ones. The loss function determines this measure. Each choice of loss function will result in a different overall strategy for performing regression. For example, least squares regression uses the sum of the squares of the residuals.

Although several different approaches are possible, we will provide an analysis for generalization of regression by introducing a threshold test accuracy, beyond which we consider a mistake to have been made. We therefore aim to provide a bound on the probability that a randomly drawn validation point will have accuracy less than ε. One way of visualizing this method of assessing performance is to consider a band of size ±ε around the hypothesis function; any training points lying outside this band are considered to be training mistakes, see Fig. 1.

Fig. 1. Linear ε-insensitive loss function $L^{\varepsilon}(x, y, f)$

Therefore we can define a so-called ε-insensitive loss function. The linear ε-insensitive loss function $L^{\varepsilon}(x, y, f)$ is defined by

$$L^{\varepsilon}(x, y, f) = |y - f(x)|_{\varepsilon} = \max\left(0,\; |y - f(x)| - \varepsilon\right), \tag{1}$$

where $f$ is a real-valued function on a domain $X \subset \mathbb{R}^n$, $x \in X$ and $y \in \mathbb{R}$. Similarly, the quadratic ε-insensitive loss is given by

$$L_2^{\varepsilon}(x, y, f) = |y - f(x)|_{\varepsilon}^{2}. \tag{2}$$

Support vector regression (SVR) uses an admissible kernel, i.e. one satisfying Mercer's condition, to map the data from the input space to a high-dimensional feature space in which the regression problem can be treated in linear form. Let $x \in \mathbb{R}^n$ and $y \in \mathbb{R}$, where $\mathbb{R}^n$ represents the input space, see Cristianini and Shawe-Taylor 2003 [1]. By some nonlinear mapping $\Phi$, the vector $x$ is mapped into a feature space in which a linear regressor function is defined,

$$y = f(x, w) = \langle w, \Phi(x) \rangle + b. \tag{3}$$
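As an illustration of the loss functions in Eqs. (1) and (2), the following minimal Python sketch may help. Python is used here only for illustration (the paper's computations were done in Mathematica), and the function names are hypothetical.

```python
def linear_eps_insensitive(y, f_x, eps):
    """Linear eps-insensitive loss, Eq. (1): max(0, |y - f(x)| - eps)."""
    return max(0.0, abs(y - f_x) - eps)


def quadratic_eps_insensitive(y, f_x, eps):
    """Quadratic eps-insensitive loss, Eq. (2): square of the linear loss."""
    return linear_eps_insensitive(y, f_x, eps) ** 2


# Illustrative values only (not data from the paper):
# a residual inside the eps-band costs nothing, one outside is penalized.
eps = 0.1
inside = linear_eps_insensitive(1.0, 1.05, eps)      # residual 0.05 < eps
outside = quadratic_eps_insensitive(1.0, 1.5, eps)   # residual 0.5 > eps
print(inside, outside)
```

Residuals smaller than ε contribute no loss at all, which is what later allows samples inside the band to drop out of the solution.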

We seek to estimate this function $f$ from independent, uniformly distributed data $\{(x_1, y_1), \ldots, (x_m, y_m)\}$ by finding the $w$ that minimizes the quadratic ε-insensitive losses; namely, the following function should be minimized:

$$c \sum_{i=1}^{m} L_2^{\varepsilon}(x_i, y_i, f) + \frac{1}{2}\|w\|^2 \rightarrow \min, \tag{4}$$

where $w$ is the weight vector and $c$ is a dimensionless constant parameter. Considering the dual representation of the linear regressor in (3), $f(x)$ can be expressed as

$$f(x) = \sum_{i=1}^{m} \beta_i y_i \langle \Phi(x_i), \Phi(x) \rangle + b, \tag{5}$$

which means that the regressor can be expressed as a linear combination of the training points. Consequently, using an admissible kernel, i.e. a kernel satisfying Mercer's condition (Paláncz et al. 2005 [10]), we get

$$f(x) = \sum_{i=1}^{m} \beta_i y_i K(x_i, x) + b = \sum_{i=1}^{m} \alpha_i K(x_i, x) + b. \tag{6}$$

By using Lagrange multiplier techniques, the minimization problem of (4) leads to the following dual optimization problem:

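$$\text{maximize} \quad W(\alpha) = \sum_{i=1}^{m} y_i \alpha_i - \varepsilon \sum_{i=1}^{m} |\alpha_i| - \frac{1}{2} \sum_{i,j=1}^{m} \alpha_i \alpha_j \left( K(x_i, x_j) + \frac{1}{c}\,\delta_{ij} \right) \tag{7}$$

$$\text{subject to} \quad \sum_{i=1}^{m} \alpha_i = 0.$$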
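To make the structure of the dual problem (7) concrete, the sketch below evaluates the objective W(α) and checks the equality constraint for a given coefficient vector. It is only an evaluation routine, not a quadratic programming solver, and the function names are hypothetical; `K` is assumed to be the precomputed m × m kernel matrix given as nested lists.

```python
def dual_objective(alpha, y, K, c, eps):
    """Evaluate W(alpha) of Eq. (7) for a precomputed kernel matrix K."""
    m = len(alpha)
    linear_term = sum(y[i] * alpha[i] for i in range(m))
    l1_term = eps * sum(abs(a) for a in alpha)
    quad_term = 0.0
    for i in range(m):
        for j in range(m):
            delta_ij = 1.0 if i == j else 0.0  # Kronecker delta
            quad_term += alpha[i] * alpha[j] * (K[i][j] + delta_ij / c)
    return linear_term - l1_term - 0.5 * quad_term


def satisfies_constraint(alpha, tol=1e-9):
    """Check the equality constraint sum_i alpha_i = 0 up to a tolerance."""
    return abs(sum(alpha)) <= tol


# Illustrative two-point example (values are not from the paper).
alpha = [1.0, -1.0]
y = [2.0, 0.0]
K = [[1.0, 0.5], [0.5, 1.0]]
print(dual_objective(alpha, y, K, c=2.0, eps=0.1), satisfies_constraint(alpha))
```

The term δ_ij / c only adds 1/c to the diagonal of the kernel matrix, so a smaller c acts as a stronger regularization of the quadratic form.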

Let

$$f(x) = \sum_{i=1}^{m} \alpha_i^{*} K(x_i, x) + b^{*}, \tag{8}$$

where $\alpha^{*}$ is the solution of the quadratic optimization problem and $b^{*}$ is chosen so that $f(x_i) = y_i - \varepsilon - \alpha_i^{*}/c$ for any $\alpha_i^{*} > 0$.

For samples inside the ε-tube, $\{x_i : |f(x_i) - y_i| < \varepsilon\}$, the corresponding $\alpha_i^{*}$ is zero. This means that these samples are not needed to describe the weight vector $w$. Consequently,

$$f(x) = \sum_{i \in SV} \alpha_i^{*} K(x_i, x) + b^{*}, \tag{9}$$

where

$$SV = \{ i : |f(x_i) - y_i| \geq \varepsilon \}. \tag{10}$$

The sample vectors $x_i$, $\{x_i : i \in SV\}$, that come with nonvanishing coefficients $\alpha_i^{*}$ are called support vectors.
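Once the coefficients are known, the regressor of Eqs. (8) and (9) is straightforward to evaluate. The following sketch assumes the optimal coefficients have already been obtained from some solver; all function names are hypothetical, and the kernel is passed in as an arbitrary Python callable.

```python
def keep_support_vectors(points, alphas, tol=1e-12):
    """Keep only samples with nonvanishing coefficient (the support vectors)."""
    kept = [(p, a) for p, a in zip(points, alphas) if abs(a) > tol]
    return [p for p, _ in kept], [a for _, a in kept]


def bias_from_sample(i, points, targets, alphas, kernel, c, eps):
    """Choose b* so that f(x_i) = y_i - eps - alpha_i/c for a sample with alpha_i > 0."""
    if alphas[i] <= 0:
        raise ValueError("sample i must have a positive coefficient")
    kernel_sum = sum(a * kernel(p, points[i]) for p, a in zip(points, alphas))
    return targets[i] - eps - alphas[i] / c - kernel_sum


def predict(x, support_vectors, alphas, b, kernel):
    """Eq. (9): sum over support vectors of alpha_i* K(x_i, x), plus b*."""
    return sum(a * kernel(x_i, x) for x_i, a in zip(support_vectors, alphas)) + b
```

Because samples with zero coefficient are dropped, the cost of evaluating the model grows with the number of support vectors and not with the size of the training set.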

3 Wavelet Kernel

In our case we select the wavelet kernel for n = 2, which provides better local characterization than other kernels, see Zhang et al. 2004 [15], and has proved to be very efficient in many regression problems, e.g. Paláncz et al. 2005 [10].

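The wavelet kernel with $a \in \mathbb{R}^1$ and all compact $X \subset \mathbb{R}^n$ is

$$K(x, z) = \prod_{i=1}^{n} \left( \cos\left[ 1.75\, \frac{x_i - z_i}{a} \right] \exp\left[ -\frac{(x_i - z_i)^2}{2a^2} \right] \right). \tag{11}$$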
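The kernel of Eq. (11) can be written in a few lines. The sketch below is one possible Python transcription (the function name is hypothetical); points are given as equal-length sequences, for example (latitude, longitude) pairs when n = 2.

```python
import math


def wavelet_kernel(x, z, a):
    """Wavelet kernel of Eq. (11): product over coordinates of
    cos(1.75 (x_i - z_i) / a) * exp(-(x_i - z_i)^2 / (2 a^2))."""
    if len(x) != len(z):
        raise ValueError("x and z must have the same dimension")
    value = 1.0
    for x_i, z_i in zip(x, z):
        d = x_i - z_i
        value *= math.cos(1.75 * d / a) * math.exp(-d * d / (2.0 * a * a))
    return value
```

The kernel equals 1 for identical points and is symmetric in its arguments; the parameter a sets both the oscillation period of the cosine factor and the width of the Gaussian envelope.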

4 Dataset for the numerical computations

The original gravimetric geoid was modelled via third order spline approximation, which provides a fairly good approximation, a fit with 1–2 cm error in height, see Zaletnyik et al. 2007 [13].

Fig. 2. The Hungarian gravimetric geoid

For modelling the corrector surface, 304 GPS/levelling data points are available. One part of these data was employed as the training set (194 measurements), the other part was used as the validation set (110 measurements), see Fig. 3.

Fig. 3. GPS/levelling data - the training set (circles), and the validation set (triangles)

5 Parameter Study

In order to achieve an efficient fit, one should find the optimal parameter of the applied kernel function (a) as well as proper values of c and ε. Parameter investigations showed that with increasing values of c and ε, the regression error (root mean square error, RMSE) decreases on both the training and the validation set. In our case c = 400 and ε = 10⁻³ proved to be reasonable values, while for greater values the changes in the RMSEs are negligible.

However, the value of the parameter a has a strong influence on the quality of the approximation. Table 1 shows the change of the total RMSE of the training and validation sets (RMSE_total; the tabulated values equal the arithmetic mean of the two RMSEs) as well as the ratio of these RMSEs, namely

$$\eta = \frac{RMSE_V}{RMSE_t}, \tag{12}$$

a ratio that indicates how reliably the regression model can be extended to unmeasured data. The ideal value is 1. If η ≫ 1, the so-called overlearning effect takes place.

Tab. 1. The result of the parameter study in case of c = 400 and ε = 10⁻³

| a | RMSE_t [cm] | RMSE_V [cm] | RMSE_total [cm] | η |
|---|---|---|---|---|
| 0.50 | 0.50 | 2.06 | 1.28 | 4.12 |
| 1.00 | 1.64 | 1.94 | 1.79 | 1.18 |
| 1.50 | 2.15 | 2.37 | 2.26 | 1.10 |
| 2.00 | 2.31 | 2.57 | 2.44 | 1.11 |
| 2.50 | 2.59 | 2.74 | 2.66 | 1.06 |
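The last two columns of Table 1 can be recomputed from the tabulated RMSE_t and RMSE_V values. The sketch below does this under the assumption that RMSE_total is the arithmetic mean of the two RMSEs, which reproduces the tabulated figures to within rounding; the function and variable names are hypothetical.

```python
# (a, RMSE_t [cm], RMSE_V [cm]) as tabulated in Table 1
TABLE_1 = [
    (0.50, 0.50, 2.06),
    (1.00, 1.64, 1.94),
    (1.50, 2.15, 2.37),
    (2.00, 2.31, 2.57),
    (2.50, 2.59, 2.74),
]


def extendability(rmse_t, rmse_v):
    """Extendability ratio eta of Eq. (12): validation RMSE over training RMSE."""
    return rmse_v / rmse_t


def mean_rmse(rmse_t, rmse_v):
    """Assumed definition of RMSE_total: arithmetic mean of the two RMSEs."""
    return 0.5 * (rmse_t + rmse_v)


for a, rmse_t, rmse_v in TABLE_1:
    print(f"a={a:.2f}  RMSE_total={mean_rmse(rmse_t, rmse_v):.3f}  "
          f"eta={extendability(rmse_t, rmse_v):.3f}")
```

Read this way, the table shows the two criteria pulling in opposite directions: the smallest a gives the lowest mean error but the worst ratio η, while larger a values bring η close to 1 at the price of a higher RMSE.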

Fig. 4 shows the corrector surface in the case of a = 0.5, when the total error is small (RMSE_total = 1.28 cm) but η = 4.12 is high.

Fig. 4. Corrector surface in case of typical overlearning (a = 0.5)

This result indicates that one should make a trade-off between extendability (η) and regression error (RMSE_total). In our case we selected a = 2.5, which ensures a small η and an acceptable RMSE as well as a smooth regression surface, see Fig. 5.

6 The model for the corrector surface

Using these parameter values (c = 400, ε = 10⁻³ m and a = 2.5), the computation with the wavelet kernel of Eq. (11) was carried out. We used the Mathematica implementation of SVM regression, see Paláncz 2005 [9].

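Tab. 2. Corrector surface approximated by SVM

| Method | Training SD [cm] | Training Min [cm] | Training Max [cm] | RMSE_t [cm] | Validation SD [cm] | Validation Min [cm] | Validation Max [cm] | RMSE_V [cm] | η |
|---|---|---|---|---|---|---|---|---|---|
| SVM regression | 2.60 | -7.74 | 6.47 | 2.59 | 2.75 | -7.68 | 5.59 | 2.74 | 1.06 |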

Fig. 5. Smooth and extendable corrector surface

The analytical form of the corrector surface is

$$\begin{aligned}
\Delta H = {} & -0.0370989 \\
& - 8.52042\, e^{-0.08(47.2429 - \varphi)^2 - 0.08(16.4483 - \lambda)^2} \cos[0.7(47.2429 - \varphi)] \cos[0.7(16.4483 - \lambda)] \\
& - 11.4187\, e^{-0.08(47.0987 - \varphi)^2 - 0.08(16.5562 - \lambda)^2} \cos[0.7(47.0987 - \varphi)] \cos[0.7(16.5562 - \lambda)] \\
& + \ldots \\
& - 6.64333\, e^{-0.08(48.0101 - \varphi)^2 - 0.08(22.5098 - \lambda)^2} \cos[0.7(48.0101 - \varphi)] \cos[0.7(22.5098 - \lambda)] \\
& - 4.17322\, e^{-0.08(48.1296 - \varphi)^2 - 0.08(22.5491 - \lambda)^2} \cos[0.7(48.1296 - \varphi)] \cos[0.7(22.5491 - \lambda)] \\
& - 1.67954\, e^{-0.08(47.9176 - \varphi)^2 - 0.08(22.762 - \lambda)^2} \cos[0.7(47.9176 - \varphi)] \cos[0.7(22.762 - \lambda)].
\end{aligned} \tag{13}$$

Table 2 shows that the standard deviation (SD) on both the training and the validation set is about 2.6–2.8 cm, which is acceptable from a practical point of view. Equally important, the extendability coefficient is very good, close to unity. In the table, RMSE (root mean squared error) is the square root of the mean of the squared errors at the measurement points,

$$\mathrm{RMSE} = \sqrt{\operatorname*{Mean}_{i} \left( \Delta H(\varphi_i, \lambda_i) - \Delta H_i \right)^2}. \tag{14}$$

The resulting corrector surface is now very smooth, see Fig. 5.
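With a = 2.5, the constants in Eq. (13) follow directly from the kernel of Eq. (11): 1.75/a = 0.7 and 1/(2a²) = 0.08. The sketch below shows how such an analytical form and the RMSE of Eq. (14) might be evaluated. It includes only the five terms printed in Eq. (13), so its result is a partial sum and not the full corrector surface; Python and all names used here are illustrative assumptions, not the authors' implementation.

```python
import math

A = 2.5                        # kernel parameter chosen in the parameter study
FREQ = 1.75 / A                # cosine factor, 0.7 in Eq. (13)
DECAY = 1.0 / (2.0 * A * A)    # exponent factor, 0.08 in Eq. (13)

BIAS = -0.0370989
# (coefficient, phi_i, lambda_i): ONLY the five terms printed in Eq. (13);
# the terms hidden behind "..." are not available here.
PRINTED_TERMS = [
    (-8.52042, 47.2429, 16.4483),
    (-11.4187, 47.0987, 16.5562),
    (-6.64333, 48.0101, 22.5098),
    (-4.17322, 48.1296, 22.5491),
    (-1.67954, 47.9176, 22.762),
]


def term(coef, phi_i, lam_i, phi, lam):
    """One summand of Eq. (13): coefficient times the wavelet kernel."""
    d_phi = phi_i - phi
    d_lam = lam_i - lam
    envelope = math.exp(-DECAY * d_phi ** 2 - DECAY * d_lam ** 2)
    return coef * envelope * math.cos(FREQ * d_phi) * math.cos(FREQ * d_lam)


def corrector_partial(phi, lam, terms=PRINTED_TERMS, bias=BIAS):
    """Partial sum of Eq. (13) over the supplied terms (not the full surface)."""
    return bias + sum(term(c, p, l, phi, lam) for c, p, l in terms)


def rmse(predicted, observed):
    """Eq. (14): square root of the mean squared difference."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

Evaluating the full surface would only require extending the term list with the remaining support vectors and their coefficients.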

7 Adjusted Geoid

In order to get the corrected geoid, the corrector surface should be added to the original geoid surface, see Fig. 6.

Fig. 6. Adjusted geoid with the training and validation points

8 Conclusions

An SVM with a quadratic ε-insensitive loss function was applied to construct a corrector surface for the gravimetric geoid, using GPS/levelling data. With the wavelet kernel, it turned out that only the kernel parameter a has considerable influence on the quality of the approximation, while the SVM parameters c and ε do not play an important role in this case. The optimal parameters of the SVM were chosen as a trade-off between accuracy and extendability of the solution in order to avoid overlearning.

Employing 194 training points and 110 validation points, the SVM provided an approximation with less than 3 cm standard deviation of the error and nearly perfect extendability. The corrector surface can be described in analytical form and directly implemented in a high-level language, like C, in order to achieve high-performance evaluation.

In the future, investigating the application of other types of kernel may be worthwhile.

References

1 Cristianini N, Shawe-Taylor J, An introduction to Support Vector Machines and other kernel-based learning methods, Cambridge University Press, Cambridge, 2003.

2 Featherstone WE, Refinement of a gravimetric geoid using GPS and levelling data, J. Surv. Eng. 126 (2000), no. 2, 27-56.

3 Fotopoulos G, Calibration of geoid error models via a combined adjustment of ellipsoidal, orthometric and gravimetric geoid height data, J. of Geod. 79 (2005), no. 1-3, 111-123.

4 Fotopoulos G, Sideris MG, Spatial Modelling and Analysis of Adjusted Residuals over a network of GPS-levelling Benchmarks, GEOMATICA 59 (2005), no. 3, 251-262.

5 Iliffe JC, Ziebart M, Cross PA, Forsberg R, Strykowski G, Tscherning CC, OGSM02: a new model for converting GPS-derived heights to local height datums in Great Britain and Ireland, Surv. Rev. 37 (2003), no. 290, 276-293.

6 Kavzoglu T, Saka MH, Modelling local GPS/levelling geoid undulations using artificial neural networks, J. Geod. 78 (2005), 520-527.

7 Lao-Sheng Lin, Application of a Back-Propagation Artificial Neural Network to Regional Grid-Based Geoid Model Generation Using GPS and Levelling Data, J. of Surv. Eng. 133 (2007), no. 2, 81-89.

8 Nahavandchi H, Soltanpour A, An attempt to define a new height datum in Norway, The Geodesy and Hydrography Days 2004, 4-5 Nov., Sandnes, Norway.

9 Paláncz B, Support Vector Regression via Mathematica (2005), available at http://library.wolfram.com/infocenter/Mathsource/5270/.

10 Paláncz B, Völgyesi L, Popper Gy, Support Vector Regression via Mathematica, Periodica Polytechnica, Civil Eng. 49 (2005), no. 1, 57-84.

11 Soltanpour A, Nahavandchi H, Featherstone WE, Geoid-type surface determination using wavelet-based combination of gravimetric quasi/geoid and GPS/levelling data, Geophysical Research Abstract 8 (2006), 04612.

12 Sweldens W, The lifting scheme: a construction of second generation wavelets, SIAM J. Math. Anal. 29 (1997), no. 2, 511-546.

13 Zaletnyik P, Paláncz B, Völgyesi L, Kenyeres A, Correction of the gravimetric geoid using GPS/levelling data, Geomatikai Közlemények X (2007), 231-240 (Hungarian).

14 Zaletnyik P, Völgyesi L, Kirchner I, Paláncz B, Combination of GPS/levelling and gravimetric geoid by using the thin plate spline interpolation technique via finite element method, J. of Appl. Geod. 1 (2007), issue 4, 233-239.

15 Zhang L, Zhou W, Jiao L, Wavelet Support Vector Machine, IEEE Trans. Systems, Man and Cybernetics - Part B: Cybernetics, 2004, pp. 34-39.