
4.2 Fuzzy Association Rule Mining for Model Structure Identification

4.2.2 Experimental results

In this section, four examples are considered to demonstrate the performance of the new method. During the experiments, the rule mining conditions are set as follows:

\[ \sigma(\text{minimum support}) = \frac{n}{N}, \tag{4.28} \]

where $n$ is the number of features (input variables) and $N$ denotes the number of

Figure 4.11: Selected model structures for a simple nonlinear example (axes: Ag. FCORR and Ag. FCO). The structures are presented by the variables between the stars, e.g. 1267 represents the right structure with variables 1, 2, 6 and 7.

Feature selection in linear/nonlinear systems

First, consider a simple nonlinear system with inputs $u = [u_1, u_2, \ldots, u_7]$, where the output is determined by the following equation:

\[ y = u_1 \cdot u_2 + u_6 \cdot u_7^2 \tag{4.29} \]

The values of the input variables are generated randomly, then the output is calculated for each sample. The number of clusters was set to five for the output and three for each input variable. The most interesting structures are determined by MOSSFARM. The results of the selection process are presented in Fig. 4.11. The new method selects the structure {1 2 6 7} with high aggregated FCORR and FCO values. This structure consists of exactly the right variables, namely $u_1$, $u_2$, $u_6$ and $u_7$, which determine the value of $y$.
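The data generation for this example can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: the sample count, the uniform U[0, 1] input distribution, the seed and the function name `generate_nonlinear_data` are assumptions, and the squared term follows the reading of Eq. (4.29) as $y = u_1 u_2 + u_6 u_7^2$.

```python
import numpy as np


def generate_nonlinear_data(n_samples=500, n_inputs=7, seed=0):
    """Draw random inputs and compute y = u1*u2 + u6*u7**2 (Eq. 4.29 as read here).

    Assumptions (not stated in the text): inputs are uniform on [0, 1]
    and 500 samples are drawn.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, size=(n_samples, n_inputs))
    # Columns are 0-based: u1 -> u[:, 0], u2 -> u[:, 1], u6 -> u[:, 5], u7 -> u[:, 6].
    y = u[:, 0] * u[:, 1] + u[:, 5] * u[:, 6] ** 2
    return u, y


u, y = generate_nonlinear_data()
```

Only columns 1, 2, 6 and 7 influence `y`; the inputs $u_3$, $u_4$ and $u_5$ are irrelevant candidates that a structure selection method should discard.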

Feature selection in mixed continuous and discrete data

Figure 4.12: Selected model structures for a system with mixed continuous and discrete data (axes: Ag. FCORR and Ag. FCO). The structures are presented by the variables between the stars.

The second example demonstrates how the developed algorithm can handle data sets containing both continuous and discrete inputs. Consider the following function:

\[ y = \begin{cases} x_1^2 + \varepsilon & \text{for } x_2 = 1 \\ x_4^2 + x_3 + \varepsilon & \text{for } x_2 = 0 \end{cases} \tag{4.30} \]

in which the output switches between two nonlinear functions depending on the value of the random discrete regressor $x_2 \in \{0, 1\}$. The remaining variables are randomly generated according to the uniform distribution $U[0, 1]$. Two additional random dummy inputs, $x_5$ and $x_6$, are included in the set of candidate inputs. The noise term $\varepsilon$ is a normally distributed random variable: $\varepsilon \sim N(0, 0.1)$. When a regression-tree induction method is applied, it selects the correct variables $x_1$ through $x_4$, but the two dummy variables ($x_5$ and $x_6$) are also selected (see the MATLAB code in Appendix B, which is used to generate the regression tree).
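A data set of this kind can be generated with a few lines of NumPy. The sketch below is illustrative only: the function name `generate_mixed_data`, the sample count and the seed are assumptions, and the value 0.1 in $N(0, 0.1)$ is treated as the standard deviation because the text does not say whether it denotes the standard deviation or the variance.

```python
import numpy as np


def generate_mixed_data(n_samples=1000, noise_std=0.1, seed=0):
    """Generate the switching function of Eq. (4.30) with two dummy inputs.

    Assumptions: the sample count is arbitrary, and 0.1 in N(0, 0.1) is
    interpreted as the standard deviation of the noise.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(n_samples, 6))  # x1..x6 drawn from U[0, 1]
    x[:, 1] = rng.integers(0, 2, size=n_samples)     # x2 is a random binary switch
    eps = rng.normal(0.0, noise_std, size=n_samples)
    y = np.where(
        x[:, 1] == 1,
        x[:, 0] ** 2 + eps,            # branch for x2 = 1
        x[:, 3] ** 2 + x[:, 2] + eps,  # branch for x2 = 0
    )
    return x, y


x, y = generate_mixed_data()
```

Columns 5 and 6 of `x` never enter the computation of `y`, so they play the role of the dummy inputs $x_5$ and $x_6$.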

In the developed algorithm, the number of clusters for each continuous variable is set to five, while the discrete variable clearly has two distinct values. For the output variable, the number of clusters is set to three. MOSSFARM selects the correct variables $x_1$ through $x_4$ with high values of both selection measures; see the structure {1,2,3,4} at the top of Fig. 4.12.

Model order selection for a continuous polymerization reactor

Figure 4.13: Selected model orders for the continuous polymer process. The model orders are presented by the orders between the stars, e.g. 145 represents the following order: $y_{k+1} = f(y_k, u_k, u_{k-1})$

The third example concerns a dynamic system, based on data generated by a simulation model of a continuous polymerization reactor. The reaction takes place in a jacketed CSTR (Continuous Stirred Tank Reactor). Under some simplifying assumptions, the first-principles model is given by:

˙

The dimensionless state variable $x_1$ is the monomer concentration, and $x_4/x_3$ is the number-average molecular weight (the output $y$). The process input $u$ is the dimensionless volumetric flow rate of the initiator. For further information on this model and its derivation, see [32]. According to [88], a uniformly distributed random input is applied over the range 0.007 to 0.015 with a sampling time of 0.2 s. With four states, a sufficient condition for representing the dynamics is a regression vector that includes three delayed inputs and outputs (six variables altogether). When a regression-tree induction method (see the Appendix for more information) is applied to all 941 data points, it selects all six candidate variables, $x_1$ through $x_6$, i.e. $y_{k+1} = f(y_k, y_{k-1}, y_{k-2}, u_k, u_{k-1}, u_{k-2})$. In MOSSFARM, the number of clusters is set to three for all regressors and five for the output. The first ten of the selected structures, ranked by the FCO measure, are presented in Fig. 4.13. As can be seen, the method was able to select the correct model structure {1 4 5}, namely $y_{k+1} = f(y_k, u_k, u_{k-1})$ [88].

Figure 4.14: Input-output data of the van der Vusse reactor (panels: y(k) and u(k) versus time [s])
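The correspondence between a structure such as {1 4 5} and the lagged regressors can be made concrete with a small helper. The sketch below is an illustration under assumptions: the function names `build_regressors` and `select_structure` are hypothetical, and the column order (outputs first, then inputs, most recent first) is inferred from the example in which {1 4 5} stands for $f(y_k, u_k, u_{k-1})$ with six candidates.

```python
import numpy as np


def build_regressors(y, u, n_lags=3):
    """Return (X, target) for a one-step-ahead model with n_lags lags each.

    Column order (1-based, as in the structures of the text):
    1..n_lags            -> y_k, y_{k-1}, ..., y_{k-n_lags+1}
    n_lags+1..2*n_lags   -> u_k, u_{k-1}, ..., u_{k-n_lags+1}
    """
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    # Time indices k that have a full lag history and a successor y_{k+1}.
    ks = np.arange(n_lags - 1, len(y) - 1)
    cols = [y[ks - d] for d in range(n_lags)] + [u[ks - d] for d in range(n_lags)]
    return np.column_stack(cols), y[ks + 1]


def select_structure(X, structure):
    """Pick the columns named by a 1-based structure such as (1, 4, 5)."""
    return X[:, [i - 1 for i in structure]]


# Toy signals, chosen only so that the lag pattern is easy to read off.
y = np.arange(10.0)
u = 100.0 + np.arange(10.0)
X, target = build_regressors(y, u)
X145 = select_structure(X, (1, 4, 5))  # columns y_k, u_k, u_{k-1}
```

With this layout, every candidate structure is simply a subset of the six columns of `X`, which is the search space the selection method ranks.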

Model order selection for van der Vusse Reaction

The van der Vusse reaction is a third-order exothermic process in a cooled CSTR. It is strongly nonlinear, with non-minimum-phase behavior and input multiplicity. To estimate the model order, 960 data points (Fig. 4.14) were used, and the numbers of lagged inputs and outputs in the regressor were set to three. The candidate model orders are determined by the new method (the number of clusters is set to three for the regressors and eight for the output) and presented in Fig. 4.15. The best model order is given by the structure {1 2 4 5}, namely $y_{k+1} = f(y_k, y_{k-1}, u_k, u_{k-1})$. Similar model structures also have high values for the applied interestingness measures.

Figure 4.15: Selected model orders for the van der Vusse reaction (axes: Ag. FCORR and Ag. FCO)

To verify the result of the new model order selection strategy, the selected structures ({1 2 4 5}, {1 2 4 5 6} and {1 2 3 4 5 6}) were used for linear and nonlinear model identification. The linear models were identified by the least-squares method, while in the nonlinear case a neural network (NN) based model was applied. The number of input neurons of the NN was set to the length of the structure, with three neurons in the hidden layer and one output neuron. In both the linear and the nonlinear case, free-run simulations were used to assess the real performance of the models. The results of the identifications are depicted in Fig. 4.16 (samples between 300 and 600 seconds are shown). The NN model gives better results than the linear model for all three structures, and the best-fitting NN model is obtained with the structure {1 2 4 5}. Moreover, mean square errors (MSE) were calculated to determine the most adequate model (Table 4.6). The calculations verify that the developed method selects the correct model order for the van der Vusse reaction.

Table 4.6: Summary of results of van der Vusse model identification

Structure        MSE of linear model   MSE of NN model
{1 2 4 5}        4.45e-002             1.65e-005
{1 2 4 5 6}      9.59e-002             2.95e-004
{1 2 3 4 5 6}    1.04e-001             5.25e-004
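For the linear case, the combination of least-squares identification and free-run simulation can be sketched as below. This is a minimal illustration under assumptions, not the code behind Table 4.6: the function name `fit_and_free_run` and the way lags are passed are hypothetical, and the data in the usage lines come from a made-up linear system rather than from the van der Vusse reactor.

```python
import numpy as np


def fit_and_free_run(y, u, y_lags, u_lags):
    """Fit y_{k+1} = sum_i a_i*y_{k-i} + sum_j b_j*u_{k-j} by least squares,
    then simulate in free run (model outputs are fed back, not measurements).

    y_lags, u_lags: delays used, e.g. (0, 1) selects the k and k-1 terms.
    Returns (theta, y_sim, mse).
    """
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(max(y_lags), max(u_lags))  # first k with a full lag history
    ks = np.arange(start, len(y) - 1)
    X = np.column_stack([y[ks - d] for d in y_lags] + [u[ks - d] for d in u_lags])
    theta, *_ = np.linalg.lstsq(X, y[ks + 1], rcond=None)

    y_sim = y.copy()  # measured values are used only to initialise the run
    for k in ks:
        phi = np.array([y_sim[k - d] for d in y_lags] + [u[k - d] for d in u_lags])
        y_sim[k + 1] = phi @ theta
    mse = np.mean((y[start + 1:] - y_sim[start + 1:]) ** 2)
    return theta, y_sim, mse


# Usage on synthetic data from an assumed linear system (illustration only).
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 1.0, 300)
y = np.zeros(300)
for k in range(1, 299):
    y[k + 1] = 0.5 * y[k] + 0.2 * y[k - 1] + 1.0 * u[k] + 0.3 * u[k - 1]

# Structure {1 2 4 5} corresponds to y_lags=(0, 1) and u_lags=(0, 1).
theta, y_sim, mse = fit_and_free_run(y, u, (0, 1), (0, 1))
```

Free-run simulation is a stricter test than one-step-ahead prediction because prediction errors accumulate through the fed-back outputs, which is why it is a reasonable basis for comparing candidate structures.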

Figure 4.16: Results of linear and nonlinear model identifications. Panels show the linear and nonlinear models for the structures {1 2 4 5}, {1 2 4 5 6} and {1 2 3 4 5 6} between 300 and 600 s (y: output, ym: model output).