A Case Study: Product transition qualification

3.3 Evaluation and application


DABCGDABC-DABCDABCGDABCGAGDA    (3.10)

Eq. 3.10 shows that 4 of 28 episodes are identical with respect to the shape of the segments (originally 27; a gap was injected to obtain sequences of equal length). Further analysis showed that 9 episodes differ in at least one fuzzy attribute with the following parameters: the magnitude thresholds are 0.1 for medium and 0.3 for large changes; episodes longer than 10 data points are medium-sized, and episodes longer than 30 data points are marked as long.
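The fuzzy attribute thresholds quoted above can be illustrated with a minimal sketch. The function names, the labels "small" and "short", the assumption that magnitudes are normalized, and the handling of values lying exactly on a threshold are illustrative assumptions, not taken from the thesis.

```python
def magnitude_label(change, medium=0.1, large=0.3):
    """Label the absolute change of an episode (assumed to be normalized)."""
    size = abs(change)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    return "small"


def duration_label(n_points, medium=10, long=30):
    """Label the duration of an episode given in data points."""
    if n_points > long:
        return "long"
    if n_points > medium:
        return "medium"
    return "short"


def fuzzy_attributes(change, n_points):
    """Return the (magnitude, duration) labels of one episode."""
    return magnitude_label(change), duration_label(n_points)
```

Two episodes with the same shape symbol would then count as different whenever `fuzzy_attributes` returns different label pairs for them.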

As one can see, with an equal number of segments, episode sequences can be aligned with fewer transformation steps: four gaps are injected, while DTW warps 8 segments. The converted episode-based time warping works identically to Fig. 3.10, which suggests not using PLA for segmentation, since segmentation is a cornerstone of the whole algorithm. Although PLA is indexable, a symbolic, higher-order representation of a trend is more understandable for the user.
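Gap injection during pairwise alignment can be sketched with a textbook global alignment (Needleman-Wunsch) over episode symbols. The scoring values (match +1, mismatch -1, gap -1) are placeholder assumptions; the thesis uses its own similarity measure, which is not reproduced here.

```python
def align(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Globally align two episode strings; return (score, aligned_a, aligned_b)."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j]: best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, injecting '-' for gaps.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = None
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if sub is not None and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `align("DABCG", "DABC")` pads the shorter sequence with one gap so that both aligned strings have the same length.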

These results are only partially comparable, but they show that incorporating a priori knowledge into the algorithm can improve accuracy compared with a generally applied technique.

3.3.3 A Case Study: Product transition qualification

While the previous sections presented the numeric performance and classification ability of symbolic sequence alignment, we also want to characterize its applicability and interpretability on industrial data, as the basic algorithm was originally inspired by technological data analysis. In this section, the presented algorithm is tested on the product changing strategies of the Hungarian propylene polymer plant introduced in Chapter 2 and Section A.1.1.

Our case study analyzes historical product transitions in order to assess whether current practice fulfills expectations, i.e. whether the same transitions are performed in the same way and, moreover, whether the transitions can be grouped.

To answer these questions, a total of 31 transitions were compared based on the alignment score of their 2-hour trends. These 31 transitions cover 2 months of production with two homo- and copolymer production cycles (15 and 16 transitions, respectively).

While our algorithm segments only 1-D trends, there are several ways to handle 5-D product transitions:

1. Apply PCA to project transition data into 1-D signals in an unsupervised way;

2. Apply PCA and a priori knowledge to select the most informative process variable and neglect the others;

3. Apply our algorithm to each dimension and calculate overall average similarity.

Note that the implemented algorithm covers all three cases; which one to apply depends on the user's a priori knowledge and intuition or on additional analysis of the process data.
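The PCA-based projection of a multivariate transition onto a 1-D signal, together with its reconstruction error, can be sketched as follows. This is a dependency-free illustration: the power iteration, the all-ones start vector and the use of the summed squared residual as "quadratic reconstruction error" are assumptions, not the implementation used in the thesis.

```python
import math


def first_component(rows, iterations=200):
    """Return (mean, unit direction) of the first principal component."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - mean[k] for k in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration; the all-ones start vector is an arbitrary choice and
    # would fail if it were orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def project_and_error(rows):
    """Project rows to 1-D scores; return (scores, squared reconstruction error)."""
    mean, v = first_component(rows)
    d = len(mean)
    scores, error = [], 0.0
    for r in rows:
        t = sum((r[k] - mean[k]) * v[k] for k in range(d))
        scores.append(t)
        error += sum((r[k] - (mean[k] + t * v[k])) ** 2 for k in range(d))
    return scores, error
```

Here each row would hold the process variables at one time instant; a large error indicates that a single component loses much of the information in the original variables.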

As will be detailed later, we chose Case 2 and Case 3 and applied them simultaneously in order to utilize system knowledge, and we applied the same classification method as in Section 3.3.1 to verify our decisions.

Two groups were set:

1. Rising MI group: H-H transitions (with C-H op. mode change)

2. Decreasing MI group: C-C transitions (with H-C op. mode change)

Operational mode changes were grouped as well, based on the MI change during the operational mode change. Once the group labels were set up, the data set was divided into two sets: the first production cycle (15 transitions) was used for training and the second one (16 transitions) for testing.

As for the parameters of the algorithm, the bound levels were left at their default value of [5% 15% 5% 30%], while σ was tuned on the training set to achieve the minimal misclassification rate. In the figures, transitions are denoted by a 4-character code: {H, C}x{H, C}y, i.e. starting product type and number followed by resulting product type and number, e.g. 'H6C1'.

The following steps were taken in order to select an appropriate basis for transition qualification:

1. The quadratic reconstruction error ($\sum_{i=1}^{5} \lVert X - \hat{X} \rVert$) of the 5-D to 1-D projection was calculated for each dimension and summed: the average was 1030 and 5 out of 16 transitions were misclassified (σ = 7). These results meant significant information loss and a relatively poor classification result, thus it was clear that further information was needed to choose informative variables on product transitions.

2. As new prior information, it is known that the reactor temperature is kept as constant as possible during transitions (the shape of the temperature trends does not carry product-specific information), thus these process variables could be left out of further analysis.

3. Mainly the catalyst inlet to the reactors is responsible for productivity. The average quadratic reconstruction error of the 3-D to 1-D projection was 326, which is satisfactory, but still 5 out of 16 transitions were misclassified (σ = 6). This originates from the fact that operators tune this variable in order to control production speed while minimizing off-spec products during transitions, thus there is no real connection between transition group and catalyst feed. To prove this, catalyst feed was separated from the hydrogen inlets, and in the following it is analyzed simultaneously with them. Figure 3.11 presents a good example of two transitions of the same type and the differences in the process variables during the transitions.

4. Although the average reconstruction error of the 2-D to 1-D projection was only 65.4 and 3 out of 16 transitions were misclassified (σ = 7), the sum of the hydrogen feeds to the reactors was calculated and used as a 1-D signal for transition qualification instead of the projection. In this case no PCA was needed, and the 3/16 misclassification rate was retained.

5. Applying the classification test to the catalyst feed as a stand-alone signal gave a misclassification rate of 6/16. It is clear that the hydrogen feed to the reactors is indispensable for product transition qualification, but catalyst feed is another key factor in the operators' transition strategy that cannot be neglected.
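The two operations used in the steps above, summing the hydrogen feeds into one signal and counting misclassified test transitions, can be sketched as follows. The nearest-neighbour rule is an assumption made for illustration; the classification method of Section 3.3.1 is not reproduced here, and `similarity` stands for any score where higher means more alike (for instance an alignment score).

```python
def summed_feed(feed_r201, feed_r202):
    """Point-wise sum of the hydrogen feeds to the two reactors."""
    return [a + b for a, b in zip(feed_r201, feed_r202)]


def misclassified(train, test, similarity):
    """Count test items whose most similar training item has a different label.

    train, test: lists of (item, label) pairs.
    similarity(a, b): hypothetical scoring function, higher means more alike.
    """
    errors = 0
    for item, label in test:
        _, predicted = max(train, key=lambda pair: similarity(item, pair[0]))
        if predicted != label:
            errors += 1
    return errors
```

With the 15 training and 16 test transitions, a result of 3 from `misclassified` would correspond to the 3/16 rate quoted above.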

At the end of the preprocessing step, two separate but important variables were selected from the 5 process variables for qualification: (i) sum of hydrogen feed and (ii) catalyst feed.

Figure 3.11: Two transitions of type H3H4. The plotted variables are: hydrogen feed to reactor R201, hydrogen feed to reactor R202, catalyst feed to reactor R201 and projected 1-D signal, respectively.

All 31 transitions were compared pairwise for both variables based on their alignment score. The results are shown as dendrograms in Figure 3.12.

In Fig. 3.12(a), a complete linkage clustering of the 31 transitions is shown based only on the sum of hydrogen feeds, while Fig. 3.12(b) presents the case when the similarity scores of the two signals are summed. The two groups of transitions can be unambiguously separated when only the hydrogen feed is considered, while taking the catalyst feed into account clearly disturbs the separation.
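Complete linkage clustering on a pairwise distance matrix can be sketched as below. How alignment scores are converted into distances (for example, subtracting each score from the largest score) is an assumption left to the caller; the sketch only shows the merging rule.

```python
def complete_linkage(dist, n_clusters):
    """Merge items until n_clusters remain; dist is a symmetric distance matrix."""
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the distance of two clusters is the
                # largest distance between any pair of their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Calling it with `n_clusters=2` on the 31 x 31 distance matrix would cut the dendrogram into two groups that can be compared with the H-H and C-C labels.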

To visualize these results more clearly, all 31 transitions were mapped into 2-D by metric MDS, considering the scores of the hydrogen feed (Fig. 3.13(a)) and of both variables (Fig. 3.13(b)).

The conclusion of these visualizations is that the operators currently have no consistent transition strategy: hydrogen feeds are increased and decreased according to the target product's MI value, but the way this is done differs for each transition (the shapes of the trends differ even for transitions of the same type). Moreover, as another degree of freedom, catalyst feed seems to be outside the operators' transition scope, as it is not changed according to transition type.


Figure 3.12: Hierarchical linkage clustering of product transitions visualized in a dendrogram considering (a) only hydrogen feed and (b) simultaneously hydrogen and catalyst feeds.

[Figure 3.13 panel titles: (a) Sammon stress 0.14171; (b) Sammon stress 0.12031]

Figure 3.13: Two-dimensional multidimensional scaling for visualization of product transitions in case of (a) only hydrogen feed and (b) simultaneously hydrogen and catalyst feeds. Group labels are noted by color code: blue for H-H type and green for C-C type transitions.
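The Sammon stress values reported with Fig. 3.13 measure how well the 2-D map preserves the original pairwise distances. A minimal sketch of the standard Sammon stress formula is given below; it is assumed, not confirmed by the text, that the reported values follow this definition and that all original distances are strictly positive.

```python
import math


def sammon_stress(original_dist, points):
    """Sammon stress between original distances and distances of mapped points."""
    n = len(points)
    total, error = 0.0, 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_star = original_dist[i][j]          # distance in the original space
            d = math.dist(points[i], points[j])   # distance in the 2-D map
            total += d_star
            error += (d_star - d) ** 2 / d_star
    return error / total
```

A value of 0 means that the mapping reproduces every pairwise distance exactly; larger values indicate stronger distortion.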

3.4 Conclusion

This chapter presented a practical tool that is able to extract useful information from quantitative historical process time series in a qualitative way. To reduce the large amount of data and to obtain a qualitative description of trends, it applies triangular episode segmentation. Episode sequences are optimally aligned by a powerful technique, pairwise sequence alignment.

The algorithm uses a non-metric measure to estimate time series similarity. It can handle multidimensional and noisy input data as well by applying principal component analysis and Gaussian filtering techniques. It has been shown that this tool is able to find similarities in two trends and compare them based on their alignment score. In this way, one can search for similarities in trends and hence classify process trends based on their shape encoded in the episode sequence. The presented algorithm can help plant engineers and operators to manage production in a reproducible way and to qualify the results of their transition strategy.

Future work will concentrate on fine-tuning the similarity measure and on modifying the algorithm so that it can work online using streaming data.

Chapter 4 Model parameter estimation, experiment design

As was emphasized in Chapter 1, process models play an important role in computer aided process engineering. Although the structure of these models is known a priori, the model parameters should be estimated based on experiments. The accuracy of the estimated parameters largely depends on the information content of the experimental data presented to the parameter identification algorithm. Optimal experiment design (OED) can maximize the confidence in the model parameters.

This chapter proposes a new additive sequential evolutionary experiment design approach to maximize the information content of experiments. The main idea is to use the identified models to design new experiments that gradually improve the model accuracy while keeping the information collected from previous experiments (which can be stored in the process data warehouse). This scheme requires an effective optimization algorithm, hence the other contribution of the approach is the incorporation of Evolutionary Strategy (ES) into the iterative scheme of optimal experiment design (AS-OED).

4.1 Optimal Experiment Design

Consider the following general class of process models:

$$\frac{dx(t)}{dt} = f(x(t), u(t), p) \tag{4.1}$$

$$y(t) = g(x(t)), \tag{4.2}$$

where u is the vector of the manipulated inputs, y is the output (vector), x represents the state of the system and p denotes the model parameters. The parameters p are unknown and need to be estimated based on data taken from experiments.
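To make the model class concrete, a scalar-state model of this form can be simulated with explicit Euler steps. The integrator, the scalar state and the function name are assumptions chosen for brevity; they are not the solver or the process model used in the thesis.

```python
def simulate(f, g, x0, u, p, t_end, dt):
    """Integrate dx/dt = f(x, u(t), p) with explicit Euler steps.

    Returns a list of (t, y) pairs with y = g(x); the state x is a scalar here.
    """
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [(0.0, g(x))]
    for k in range(steps):
        t = k * dt
        x = x + dt * f(x, u(t), p)
        trajectory.append((t + dt, g(x)))
    return trajectory
```

For instance, `simulate(lambda x, u, p: -p * x + u, lambda x: x, 1.0, lambda t: 0.0, 1.0, 1.0, 0.001)` approximates the exponential decay of a hypothetical first-order system with zero input.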

Before any further step, the structural identifiability of the parameters needs to be verified. It is a minimal, necessary condition for successfully estimating the model parameters from perfect input-output data. Global identifiability for a system with a transfer function G(s,p) means that G(s,p) ≡ G(s,p∗) is only solvable if p = p∗. The model is locally identifiable when this holds for p constrained to the neighborhood of p∗ [172]. The structural identifiability of models with the structure above has already been investigated in the literature, and it was shown that global identifiability can be proven by the differential algebraic method, i.e. by rearranging the model into a linear regression problem [172, 173, 174].

The estimation of these parameters is based on the minimization of the square error between the output of the system and the output of the model:

$$\min_{p} J_{mse}(u(t), p) = \frac{1}{t_{exp}} \int_{t=0}^{t_{exp}} e^{T}(t) \cdot Q(t) \cdot e(t) \, dt \tag{4.3}$$

$$e(t) = \tilde{y}(u(t)) - y(u(t), p) \tag{4.4}$$

in which ỹ(u(t)) is the output of the system for a certain u(t) input profile, y(u(t), p) is the output of the model for the same u(t) input profile with parameters p, and Q is a user-supplied square weighting matrix that represents the variance of the measurement error.
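For sampled data, the cost of Eq. 4.3 can be approximated numerically. The sketch below assumes a scalar output, a constant scalar weight Q and trapezoidal integration over the sample times; these simplifications are illustrative assumptions.

```python
def mse_cost(times, measured, modeled, weight=1.0):
    """Approximate (1/t_exp) * integral of e(t) * Q * e(t) dt (trapezoidal rule).

    Scalar outputs and a constant scalar weight Q are assumed;
    t_exp is taken as times[-1] - times[0].
    """
    integrand = [weight * (ym - y) ** 2 for ym, y in zip(measured, modeled)]
    total = 0.0
    for k in range(1, len(times)):
        total += 0.5 * (integrand[k] + integrand[k - 1]) * (times[k] - times[k - 1])
    return total / (times[-1] - times[0])
```

A parameter estimator would evaluate `modeled` for a candidate p and minimize the returned value over p.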

The basic element of the experiment design methodology is the Fisher information matrix F, which combines information on (i) the output measurement error and (ii) the sensitivity of the model outputs y with respect to the model parameters:

$$F(p^0, u(t)) = \frac{1}{t_{exp}} \int_{t=0}^{t_{exp}} \left( \frac{\partial y}{\partial p}(u(t), p)\Big|_{p=p^0} \right)^{T} \cdot Q(t) \cdot \left( \frac{\partial y}{\partial p}(u(t), p)\Big|_{p=p^0} \right) dt \tag{4.5}$$

The sensitivities are the partial derivatives of the model outputs with respect to the model parameters. As the true parameters p∗ are unknown during experiment design, the derivatives are calculated at the so-called nominal parameters p⁰, which can be given by an initial guess, taken from the literature or estimated from previous experiments. The optimal design criterion aims at optimizing a scalar function of the matrix F. Several optimality criteria exist, of which the D-optimal and E-optimal criteria, suggested by Bernaerts et al. [144], are presented here:

- The D-optimal criterion maximizes the determinant of the Fisher information matrix and thus minimizes the volume of the joint confidence region:

$$J_D = \max_{u(t)} \det(F) \tag{4.6}$$

- The E-optimal criterion minimizes the condition number of F, i.e. the ratio of the largest to the smallest eigenvalue of the Fisher matrix:

$$J_E = \min_{u(t)} \frac{\lambda_{max}(F)}{\lambda_{min}(F)} \tag{4.7}$$
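The Fisher information matrix of Eq. 4.5 and the two criteria of Eqs. 4.6 and 4.7 can be evaluated numerically for a given input profile. The sketch below assumes a scalar output, a constant scalar weight Q, central finite-difference sensitivities and exactly two parameters for the criteria; `model(t, p)` is a hypothetical function that returns the model output at time t with the input profile already built in.

```python
import math


def sensitivities(model, t, p, rel_step=1e-6):
    """Central finite-difference estimate of dy/dp at time t (scalar output)."""
    grads = []
    for k in range(len(p)):
        h = rel_step * max(abs(p[k]), 1.0)
        up = list(p)
        up[k] += h
        down = list(p)
        down[k] -= h
        grads.append((model(t, up) - model(t, down)) / (2.0 * h))
    return grads


def fisher_matrix(model, p_nominal, t_exp, n_steps=1000, weight=1.0):
    """Trapezoidal approximation of F for a scalar output and constant weight Q."""
    n = len(p_nominal)
    F = [[0.0] * n for _ in range(n)]
    dt = t_exp / n_steps
    for k in range(n_steps + 1):
        s = sensitivities(model, k * dt, p_nominal)
        w = 0.5 if k in (0, n_steps) else 1.0  # trapezoidal end-point weights
        for a in range(n):
            for b in range(n):
                F[a][b] += w * s[a] * weight * s[b] * dt / t_exp
    return F


def d_criterion(F):
    """Determinant of a 2x2 Fisher matrix (to be maximized over u(t))."""
    return F[0][0] * F[1][1] - F[0][1] * F[1][0]


def e_criterion(F):
    """Ratio of largest to smallest eigenvalue of a symmetric 2x2 matrix
    (to be minimized over u(t)); undefined if F is singular."""
    half_trace = 0.5 * (F[0][0] + F[1][1])
    root = math.sqrt(max(half_trace ** 2 - d_criterion(F), 0.0))
    return (half_trace + root) / (half_trace - root)
```

An experiment design step would wrap these functions in an optimizer that varies the input profile inside `model` and keeps the profile with the best criterion value.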

These values correspond to the uncertainty of the parameter estimation problem. Figs. 4.1 and 4.2 illustrate the effect of the input profile on the model output for the case study presented later in Section 4.3. The contour plots show the square error of the model output with respect to its parameters around the nominal parameters p⁰: when an E-optimal input profile is used, the parameter uncertainty region is smaller. This means that, if the nominal parameters p⁰ are close to the true parameters p∗, parameter estimation based on this profile (Fig. 4.1) most likely results in more accurate parameters than estimation based on a manually selected profile (Fig. 4.2).

[Figure 4.1 plot; axis labels: µ_m/µ_m⁰ and Ks/Ks]

Figure 4.1: Contour plots of the identification cost with respect to parameters for an E-optimized feeding profile.

[Figure 4.2 plot; axis labels: µ_m/µ_m⁰ and Ks/Ks]

Figure 4.2: Contour plots of the identification cost with respect to parameters for a manually selected feeding profile.

These figures suggest that when one has only a rough estimate of the parameters of a complex dynamical model, one should use it to design a u(t) input profile for the parameter estimation procedure rather than identify the model from data obtained with a non-optimized input profile.

However, if the nominal parameters p⁰ are far from the true parameters p∗, convergence cannot be guaranteed after the first optimal design. Hence, an iterative design schema is needed to obtain convergence from p⁰ to p∗ (Fig. 4.3).

Both the parameter estimation and the experiment design steps of this iterative schema represent a complex nonlinear optimization problem, hence the effectiveness of the applied optimization algorithms has a great influence on the performance of the whole procedure. The classical solution is to use a non-linear least squares (NLS) algorithm for parameter estimation (Eq. 4.3) and sequential quadratic programming (SQP) for the experiment design (Eq. 4.7).
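The iterative schema can be written as a short loop around three user-supplied steps. The callables and the fixed number of experiments used as a stopping rule are hypothetical placeholders; the schema itself only asks whether more experiments are needed.

```python
def iterative_oed(p_initial, design_experiment, run_experiment,
                  estimate_parameters, n_experiments):
    """Classical iterative OED loop with hypothetical, user-supplied steps.

    design_experiment(p)        -> input profile u designed for nominal p
    run_experiment(u)           -> measured output data for profile u
    estimate_parameters(u, y, p) -> updated parameter estimate
    """
    p_nominal = p_initial
    for _ in range(n_experiments):
        u = design_experiment(p_nominal)                   # e.g. solve Eq. 4.7
        data = run_experiment(u)                           # perform the experiment
        p_nominal = estimate_parameters(u, data, p_nominal)  # e.g. solve Eq. 4.3
    return p_nominal
```

Each pass uses the latest estimate as the nominal parameter set for the next design, which is how the nominal parameters are expected to approach the true ones.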

Here, the application of evolutionary strategy (ES) is proposed for this purpose, as it can be used for non-linear optimization problems. Evolutionary Strategy is a stochastic optimization algorithm based on the model of natural selection. Its advantage is that it has proved particularly successful in problems that are highly nonlinear, stochastic and/or poorly understood [175]. A description of the ES algorithm applied in my work can be found in the Appendix (Section A.3.1).

The Evolutionary Strategy function in the Matlab environment has three important parameters that have to be adjusted carefully in order to find the most reliable solution with the least computation time:

- number of generations,

- size of a population in a generation and

- number of individuals with the best fit values, which appear unchanged in the next generation.

Figure 4.3: Design schema of the classical OED method. [Flowchart blocks: Initial parameters, Experiment design, Experiment, Parameter estimation, More experiment? (yes: repeat with p⁰; otherwise end). Signals: p⁰, u(t), ỹ(t).]
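The role of the three parameters can be seen in a minimal elitist evolution strategy. This is an illustrative sketch, not the Matlab function referred to above: the fixed mutation width `sigma`, the choice of parents from the elite only and the seeded random generator are assumptions.

```python
import random


def evolution_strategy(cost, x0, generations=50, population=20, elite=2,
                       sigma=0.3, seed=0):
    """Minimize cost(x) with Gaussian mutation and elitist selection.

    Requires 1 <= elite <= population. Returns (best individual, history),
    where history holds the best cost after initialization and each generation.
    """
    rng = random.Random(seed)
    # Initial population: the start point plus mutated copies of it.
    pop = [list(x0)] + [[xi + rng.gauss(0.0, sigma) for xi in x0]
                        for _ in range(population - 1)]
    pop.sort(key=cost)
    history = [cost(pop[0])]
    for _ in range(generations):
        parents = pop[:elite]  # the best individuals survive unchanged
        children = []
        while len(parents) + len(children) < population:
            base = rng.choice(parents)
            children.append([xi + rng.gauss(0.0, sigma) for xi in base])
        pop = sorted(parents + children, key=cost)
        history.append(cost(pop[0]))
    return pop[0], history
```

More generations and a larger population increase the chance of finding a good solution at the price of more cost evaluations, while the elite count guarantees that the best cost never gets worse from one generation to the next.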

It is shown later on in Section 4.3.1 that the application of ES results in more reliable parameter estimation.

The other main drawback of the classical iterative approach is that the Fisher information matrix F contains information only about the current experiment, regardless of the information content of the previous experiments, and the parameters are identified from this experiment alone. It can be beneficial to include the information from previous experiments in the parameter estimation and the experiment design steps as well, thus an additive sequential version of the above iterative schema has been developed, which is introduced in the following section.

4.2 Additive Sequential Optimal Experiment
