Theses - Analysis of process data based on rule mining

1. Building a process data warehouse effectively supports the monitoring and analysis of complex multi-product technologies. (Chapter 2)

(a) I determined the main steps and conditions for developing a new information system that provides analysis and monitoring functions for operating, complex, multi-product technologies.

I developed exploratory statistical and data mining tools to extract the relevant information.

(b) For building such information systems I defined a two-level structure. The first (operational) level contains the elements related to operation and the processes running between them. At the second (analysis) level a process data warehouse is the central element of the information system. The data warehouse operates independently of the databases of the operational level and contains only processed, reliable process data. The other elements of the information system are the analysis tools and applications developed for analysing the data in the data warehouse.

(c) I implemented the general scheme of the information system as a prototype system for a multi-product polypropylene technology. The resulting system enables various statistical and data mining analyses of the polypropylene production runs. (Related publications: 5, 14, 24)

2. Decision tree induction based on supervised clustering yields compact fuzzy classifiers. (Chapter 3)

I developed a new procedure that discretizes continuous attributes by supervised clustering in such a way that effective decision-tree-based classifiers can be built from them. The developed discretization, rule induction and rule pruning steps yield a practical classification rule structure. Fuzzy classifiers containing such rules are very small, yet their classification accuracy is competitive with the values published in the literature. (Related publications: 4, 17, 18)

3. Fuzzy association rule mining can be applied effectively to the analysis of process data. (Chapter 4)

(a) Association rule mining can be used to build compact and accurate fuzzy classifiers.

Both interpretability and accuracy are critical in many classification tasks. Known associative classifiers classify with high accuracy, but their performance relies on excessively large rule bases. In contrast, I developed a new procedure that produces highly accurate fuzzy classifiers from very small rule bases. For this reason the method also helps in understanding the relationships among the data and the prediction mechanism. (Related publications: 3, 4, 8, 9, 10, 11, 13, 20, 21)

(b) Fuzzy association rule mining can be used to select the important attributes and to determine the model structure.

An effective model structure selection procedure is indispensable for data-driven model identification. For this task I developed a method that is suitable for selecting the important variables for building nonlinear (dynamic) models, for both continuous and discrete-valued input-output data. The method is based on association rule mining; the important variables (and the model structure) are determined by taking two rule interestingness measures into account. For nonlinear systems the method can be used to determine which input variables truly determine the output, and for strongly nonlinear dynamic systems it also identifies the correct model structure, which makes it an effective tool for process modelling. (Related publications: 1, 6, 7, 15, 16, 23)

(c) I developed a new procedure for the interactive data mining and visual analysis of the discovered association rules.

Although fuzzy association rule mining is a widely used and effective data mining tool for extracting useful information hidden in data, most methods generate a huge number of rules, which creates a new data mining problem for the user. Because of the large number of rules, the resulting rule base cannot be surveyed with traditional analyses (e.g. a simple rule list). I developed a new procedure in which the discovered frequent itemsets (and rules) are displayed in two dimensions so that the relationships [...]

For an informative display of the relationships I defined a new distance measure, which determines the relative information content of two rules. Using this measure, the discovered rules can easily be displayed in two dimensions with Sammon mapping or multidimensional scaling, so the new method is the basis of an effective tool for the visual analysis of rules. (Related publications: 2, 12)

Publications related to the theses

Articles in international journals

1. F. P. Pach, A. Gyenesei and J. Abonyi, MOSSFARM: Model Structure Selection by Fuzzy Association Rule Mining, Journal of Intelligent and Fuzzy Systems, Under review, impact: 0.283

2. F. P. Pach, A. Gyenesei and J. Abonyi, Visualization of Fuzzy Association Rules, Journal of Visual Languages and Computing, Manuscript needs major revisions, impact: 0.619

3. F. P. Pach, A. Gyenesei and J. Abonyi, Compact fuzzy association rule based classifier, Expert Systems with Applications, In press, Available online 16 April 2007, impact: 0.957

4. F. P. Pach and J. Abonyi, Association rule and decision tree based methods for fuzzy rule base generation, Enformatika (Transactions on Engineering, Computing and Technology), Volume 13, 2006, 45-50

5. F. P. Pach, B. Feil, S. Nemeth, P. Arva and J. Abonyi, Process data warehousing based operator support system for complex production technologies, IEEE Transactions on Systems Man and Cybernetics, Part A: Systems and Humans, Special Issue on Emerging Approaches to Integrating Distributed, Heterogenous and Complex Systems, 2006, Volume 36, Number 1, 136-153, impact: 0.555

6. F. P. Pach, F. Szeifert, S. Nemeth, P. Arva, J. Abonyi, Fuzzy association rule mining for data-driven analysis of dynamical systems, Hungarian Journal of Industrial Chemistry, Special Issue on Recent advances in Computer Aided Process Engineering, 2005, Volume 33 (1-2), 57-67, impact: 0.12

Foreign-language book chapters

7. F. P. Pach, A. Gyenesei, P. Arva, and J. Abonyi, Fuzzy association rule mining for model structure identification, Applications of Soft Computing: Recent Trends, Part VI Identification and Forecasting, Springer, 2006, 261-271

8. Pach F. P., Gyenesei A., Adatbányászat, a hatékonyság eszköze (Gyakorlati útmutató kezdőknek és haladóknak), Chapter 6: Gyakori elemhalmazok és asszociációs szabályok feltárása, Computerbooks, 2006, 237-272

Articles in Hungarian journals

9. Pach F.P., Gyenesei A., Németh S., Árva P., Abonyi J. Fuzzy association rule mining for the analysis of historical process data, Acta Agraria Kaposváriensis, Volume 10 (3), 2006, 89-107

10. Pach F.P., Gyenesei A., Németh S., Árva P., Abonyi J. A partícionálás hatása fuzzy asszociatív osztályozók teljesítményére, Acta Agraria Kaposváriensis, Volume 10 (3), 2006, 109-120

Conference presentations

11. F. D. Tamás, F. P. Pach, J. Abonyi, Analysis of trace elements in clinker based on supervised clustering and fuzzy association rule mining, 12th International Congress on the Chemistry of Cement, M3-01.2, Montreal, Canada, 2007

12. Peter Matyus, F. P. Pach, Janos Abonyi, Attila Gyenesei, Visualization of fuzzy association rules representing high-dimensional problems, 11th IPMU International Conference, Paris, 2006

13. F. P. Pach, B. Balasko, S. Nemeth, P. Arva and J. Abonyi, Black-Box and first principle model based optimization of operating technologies, 5th Symposium on Mathematical Modelling (Mathmod), Vienna, 2006, vol. 1: abstract in p. 284, vol. 2: CD

14. F. P. Pach, A. Gyenesei, P. Arva, and J. Abonyi, Fuzzy association rule mining for model structure identification, 10th Online World Conference on Soft Computing in Industrial Applications, Sept, 2005

15. F. D. Tamas, F. Peter Pach, J. Abonyi and A. M. Esteves, Analysis of trace elements in clinker based on supervised clustering and fuzzy decision tree induction, 6th International Congress, Global Construction: Ultimate Concrete Opportunities, Dundee, Scotland, 2005

16. F. P. Pach, J. Abonyi, S. Nemeth, P. Arva, Supervised clustering and fuzzy decision tree induction for the identification of compact classifiers, 5th International Symposium of Hungarian Researchers on Computational Intelligence, November, 2004, Budapest, 267-277

Other

17. F. P. Pach and J. Abonyi, Association rule and decision tree based methods for fuzzy rule base generation, Invited plenary talk, 3rd International Conference on Computer Science (ICCS), Budapest, 2006

18. F. P. Pach, F. Szeifert, S. Nemeth, P. Arva, and J. Abonyi, Association rule mining for data-driven analysis of dynamical systems, 10th International Workshop on Chemical Engineering Mathematics, Budapest, 2005

19. F. P. Pach, J. Abonyi, Fuzzy association rule based modeling for control, 35. Műszaki Kémiai Napok, Veszprém, 2007

20. Pach F.P., Gyenesei A., Németh S., Árva P., Abonyi J, A partícionálás hatása fuzzy asszociatív osztályozók teljesítményére, V. Alkalmazott Informatika Konferencia, Kaposvár, 2006

21. Pach F.P., Németh S., Árva P., Abonyi J. Szabálybázis alapú módszerek polimerizációs technológiák elemzésére, 34. Műszaki Kémiai Napok, Veszprém, 2006, 291-294

22. F. P. Pach, A. Gyenesei, S. Nemeth, P. Arva, and J. Abonyi. Asszociációs szabályok alkalmazása hisztorikus folyamat adatok elemzésére, IV. Alkalmazott Informatika Konferencia, Kaposvár, 2005

23. F. P. Pach, Sz. Nagy, S. Németh, P. Árva, J. Abonyi, Association rule mining in process engineering, 33. Days of Chemical Engineering, Veszprém, 2005, 225-230

24. F. P. Pach, J. Abonyi, M. Németh, S. Németh, P. Árva, Web access of process data warehouse, 32. Days of Chemical Engineering, Veszprém, 2004

Appendix A

The proposed supervised fuzzy clustering algorithm [2]

To represent the fuzzy set $A_{i,j}(x_{j,k})$, Gaussian membership functions are proposed:

\[ A_{i,j}(x_{j,k}) = \exp\left( -\frac{1}{2} \frac{(x_{j,k} - v_{i,j})^2}{\sigma_{i,j}^2} \right) \]

where $v_{i,j}$ represents the center and $\sigma_{i,j}^2$ stands for the variance of the Gaussian function. The parameters of the fuzzy model can be obtained by the following algorithm:
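The membership function can be illustrated with a few lines of plain Python. This is a minimal sketch, assuming the exponent has the same form as the Gath-Geva term of the distance measure in (6.5); the function names `gaussian_membership` and `rule_activation` are hypothetical and do not come from the thesis.

```python
import math


def gaussian_membership(x, center, variance):
    """Membership degree of a scalar x in a Gaussian fuzzy set.

    center plays the role of v_{i,j}, variance the role of sigma_{i,j}^2.
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    return math.exp(-0.5 * (x - center) ** 2 / variance)


def rule_activation(x, centers, variances):
    """Product of the per-feature memberships of one rule (cluster)."""
    degree = 1.0
    for x_j, v_ij, s2_ij in zip(x, centers, variances):
        degree *= gaussian_membership(x_j, v_ij, s2_ij)
    return degree
```

A point that coincides with the center has membership 1, and the membership decays with the squared distance scaled by the variance.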

Initialization Given a set of data, specify R, choose a termination tolerance $\epsilon > 0$, and a fuzzy exponent m. Initialize the partition matrix $U = [\mu_{i,k}]_{R \times N}$ randomly, where $\mu_{i,k}$ denotes the membership that the data point $z_k = \{x_k, y_k\}$ is generated by the i-th cluster.

Repeat for $l = 1, 2, \dots$

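Step 1 Calculate the parameters of the clusters

• Calculate the centers and standard deviation of the Gaussian membership functions:

\[ v_i^{(l)} = \frac{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m x_k}{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m}, \qquad \sigma_{i,j}^{2} = \frac{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m (x_{j,k} - v_{i,j})^2}{\sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m} \]

• Estimate the consequent probability parameters,

\[ p(c_i \mid r_j) = \frac{\sum_{k \mid y_k = c_i} \left(\mu_{j,k}^{(l-1)}\right)^m}{\sum_{k=1}^{N} \left(\mu_{j,k}^{(l-1)}\right)^m} \]

• A priori probability of the cluster and the weight (impact) of the rules are expressed as:

\[ P(r_i) = \frac{1}{N} \sum_{k=1}^{N} \left(\mu_{i,k}^{(l-1)}\right)^m, \qquad w_i = P(r_i) \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi\sigma_{i,j}^2}} \tag{6.4} \]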

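Step 2 Compute the distance measure $D_{i,k}^2(z_k, r_i)$ by

\[ \frac{1}{D_{i,k}^2(z_k, r_i)} = \underbrace{P(r_i) \prod_{j=1}^{n} \exp\left( -\frac{1}{2} \frac{(x_{j,k} - v_{i,j})^2}{\sigma_{i,j}^2} \right)}_{\text{Gath-Geva clustering}} \; P(c_j = y_k \mid r_i) \tag{6.5} \]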

This distance measure consists of two terms. The first term is based on the geometrical distance between the cluster centers $v_i$ and the observation vector $x_k$, while the second is based on the probability that the $r_i$-th cluster describes the density of the class of the k-th data point, $P(c_j = y_k \mid r_i)$. It is interesting to note that this distance measure differs only slightly from that of the unsupervised Gath-Geva clustering algorithm, which can also be interpreted in a probabilistic framework [39]. The novelty of the proposed approach is the second term, which allows the use of class labels.

Step 3 Update the partition matrix

\[ \mu_{i,k}^{(l)} = \frac{1}{\sum_{j=1}^{R} \left( D_{i,k}(z_k, r_i) / D_{j,k}(z_k, r_j) \right)^{2/(m-1)}}, \quad 1 \le i \le R, \; 1 \le k \le N \tag{6.6} \]

until $\|U^{(l)} - U^{(l-1)}\| < \epsilon$.
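One pass through Steps 1 to 3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the centers and variances are taken to be membership-weighted means and variances, the class probabilities are membership-weighted class frequencies, and the function names as well as the `var_floor` and `tiny` guards are hypothetical additions.

```python
import math


def cluster_parameters(X, y, U, m, classes, var_floor=1e-9):
    """Step 1: per cluster (center, variance, class probabilities, prior)."""
    N, n = len(X), len(X[0])
    params = []
    for mu_i in U:                              # one row of U per cluster
        w = [mu ** m for mu in mu_i]            # memberships raised to m
        total = sum(w)
        center = [sum(w[k] * X[k][j] for k in range(N)) / total
                  for j in range(n)]
        var = [max(sum(w[k] * (X[k][j] - center[j]) ** 2
                       for k in range(N)) / total, var_floor)  # assumed floor
               for j in range(n)]
        p_class = {c: sum(w[k] for k in range(N) if y[k] == c) / total
                   for c in classes}
        params.append((center, var, p_class, total / N))
    return params


def inverse_sq_distance(x, label, center, var, p_class, prior):
    """Step 2: the value 1 / D^2 of (6.5) for one data point and one cluster."""
    geo = 1.0
    for x_j, v_j, s2_j in zip(x, center, var):
        geo *= math.exp(-0.5 * (x_j - v_j) ** 2 / s2_j)
    return prior * geo * p_class[label]


def update_partition(X, y, params, m, tiny=1e-12):
    """Step 3: new membership matrix according to (6.6)."""
    R, N = len(params), len(X)
    U = [[0.0] * N for _ in range(R)]
    for k in range(N):
        # squared distances; tiny (an assumption) avoids division by zero
        d2 = [1.0 / max(inverse_sq_distance(X[k], y[k], *p), tiny)
              for p in params]
        for i in range(R):
            # (D_i / D_j)^(2/(m-1)) equals (D_i^2 / D_j^2)^(1/(m-1))
            U[i][k] = 1.0 / sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0))
                                for j in range(R))
    return U
```

Calling `cluster_parameters` and then `update_partition` repeatedly, and stopping when the change in `U` falls below the tolerance, reproduces the loop structure of the algorithm. Each column of the updated matrix sums to one by construction.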

Appendix B

Regression tree

The regression tree is a binary tree in which the splits in non-terminal nodes are based on the values of the individual input variables. This method can handle both continuous and unordered categorical variables. 10-fold cross validation is applied to estimate the best level of pruning (i.e., the level producing the smallest tree that is within one standard error of the minimum-cost tree). The following MATLAB code is used to generate the tree:

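T = treefit(X, y);                             % grow the full regression tree
[c, s, n, best] = treetest(T, 'cross', X, y);  % cross-validated cost for each pruning level
T = treeprune(T, 'level', best);               % prune the tree to the best level

All of the above functions are available in the MATLAB Statistics Toolbox.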
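The selection rule described above (the smallest tree within one standard error of the minimum-cost tree) can be sketched independently of MATLAB. The following Python function is a hypothetical illustration, not part of the toolbox: it assumes that the cross-validated cost, its standard error and the number of terminal nodes are already available for every pruning level.

```python
def best_pruning_level(costs, std_errors, sizes):
    """Index of the smallest tree whose cost is within one SE of the minimum.

    costs[i], std_errors[i] and sizes[i] describe pruning level i.
    """
    i_min = min(range(len(costs)), key=lambda i: costs[i])
    threshold = costs[i_min] + std_errors[i_min]
    candidates = [i for i in range(len(costs)) if costs[i] <= threshold]
    return min(candidates, key=lambda i: sizes[i])
```

For example, with costs `[0.30, 0.25, 0.27, 0.40]`, standard errors `[0.03, 0.03, 0.03, 0.04]` and sizes `[15, 9, 5, 1]`, the minimum-cost tree is at level 1, but level 2 is selected because its cost lies within one standard error of the minimum and it has fewer terminal nodes.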

Appendix C

Sammon Mapping

The Sammon mapping is a multi-dimensional scaling method [92, 80]. It is a well-known procedure for mapping data from a high n-dimensional space onto a lower q-dimensional space by finding N points in the q-dimensional data space in such a way that the interpoint distances $d^*_{ij} = d^*(y_i, y_j)$ in the q-dimensional space approximate the corresponding interpoint distances $d_{ij} = d(x_i, x_j)$ in the n-dimensional space. This is achieved by minimizing an error criterion called Sammon's stress, E:

\[ E = \frac{1}{\lambda} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{(d_{ij} - d^*_{ij})^2}{d_{ij}}, \qquad \lambda = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij} \]

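The minimization of E is an optimization problem in Nq variables $y_{il}$, $i = 1, 2, \dots, N$, $l = 1, 2, \dots, q$, as $y_i = [y_{i1}, \dots, y_{iq}]^T$. Sammon applied the method of steepest descent to minimize this function. Introduce the estimate of $y_{il}$ at the t-th iteration:

\[ y_{il}(t+1) = y_{il}(t) - \alpha \left[ \frac{\partial E(t) / \partial y_{il}(t)}{\partial^2 E(t) / \partial y_{il}^2(t)} \right] \]

where $\alpha$ is a nonnegative scalar constant (recommended $\alpha \approx 0.3$ to $0.4$), i.e., the step size for gradient search in the direction of

\[ \frac{\partial E(t)}{\partial y_{il}(t)} = -\frac{2}{\lambda} \sum_{k=1, k \ne i}^{N} \left[ \frac{d_{ki} - d^*_{ki}}{d_{ki} \, d^*_{ki}} \right] (y_{il} - y_{kl}) \]

\[ \frac{\partial^2 E(t)}{\partial y_{il}^2(t)} = -\frac{2}{\lambda} \sum_{k=1, k \ne i}^{N} \frac{1}{d_{ki} \, d^*_{ki}} \left[ (d_{ki} - d^*_{ki}) - \frac{(y_{il} - y_{kl})^2}{d^*_{ki}} \left( 1 + \frac{d_{ki} - d^*_{ki}}{d^*_{ki}} \right) \right] \tag{6.9} \]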

It is not necessary to maintain $\lambda$ for a successful solution of the optimization problem, since the minimization of $\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} (d_{ij} - d^*_{ij})^2 / d_{ij}$ gives the same result.
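The stress criterion can be evaluated directly from two point sets. The sketch below is a minimal illustration in plain Python, assuming Euclidean distances in both spaces and assuming that $\lambda$ is the sum of the original interpoint distances, as in Sammon's original formulation; the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def sammon_stress(X, Y):
    """Sammon's stress between original points X and projected points Y."""
    N = len(X)
    lam = 0.0   # normalizing constant: sum of original distances
    err = 0.0   # weighted squared distance errors
    for i in range(N - 1):
        for j in range(i + 1, N):
            d = euclidean(X[i], X[j])        # distance in the original space
            d_star = euclidean(Y[i], Y[j])   # distance in the projection
            if d == 0.0:
                raise ValueError("coincident points are not supported")
            lam += d
            err += (d - d_star) ** 2 / d
    return err / lam
```

A projection that preserves every interpoint distance has zero stress. Dropping the division by `lam` rescales the criterion by a constant and therefore leaves the minimizer unchanged, which is the point made in the text.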

When the gradient-descent method is applied to search for the minimum of Sammon's stress, a local minimum in the error surface can be reached. Therefore a significant number of runs with different random initializations may be necessary.

Nevertheless, the initialization of y can be based on information which is obtained from the data, such as the first and second norms of the feature vectors or the principal axes of the covariance matrix of the data [68].

For the analysis of large data sets, the application of Sammon mapping becomes impractical because in each iteration $N(N-1)/2$ distances as well as the error derivatives must be calculated [29]. To avoid this problem, re-sampled data or cluster centers are often mapped to get some insight into the hidden structure of the data. These approaches can also be considered as the application of Sammon mapping as a second-order data mining tool.

Multidimensional scaling

Multidimensional scaling (MDS) refers to a group of widely used unsupervised data visualization techniques. Classical MDS discovers the underlying structure of a data set by preserving similarity information (pairwise distances) among the data objects. Given a set of data in a high-dimensional feature space, MDS maps them into a low-dimensional (generally 2-dimensional) data space in such a way that objects that are very similar to each other in the original space are placed near each other on the map, and objects that are very different from each other are placed far away from each other. There are two types of MDS: (i) metric MDS and (ii) non-metric MDS. Multidimensional scaling based on measured proximities is called metric multidimensional scaling. While metric MDS preserves the distances among the objects, non-metric MDS methods attempt to preserve the rank order among the dissimilarities.

The main steps of the proposed MDS algorithm ([77]) are the following:

1. Let the searched coordinates of N points in a D-dimensional Euclidean space be given by $x_i$ ($i = 1, \dots, N$), where $x_i = (x_{i1}, \dots, x_{iD})^T$. Matrix $X = [x_1, \dots, x_N]^T$ is the $N \times D$ coordinates matrix. The squared Euclidean distances $\{d_{ij}^2 = (x_i - x_j)^T (x_i - x_j)\}$ are known. The inner product matrix of X is denoted $B = XX^T$. Find matrix B from the known distances $\{d_{ij}\}$ using the Young-Householder process [115]:

(a) Define matrix $A = [a_{ij}]$, where $a_{ij} = -\frac{1}{2} d_{ij}^2$,

(b) Deduce matrix B from $B = HAH$, where $H = I - \frac{1}{N} \mathbf{1}\mathbf{1}^T$ is the centering matrix.

2. Recover the coordinates matrix X from B using the spectral decomposition of B:

(a) The inner product matrix B is expressed as $B = XX^T$. The rank of B is $r(B) = r(XX^T) = r(X) = D$. B is symmetric, positive semi-definite and of rank D, and hence has D non-negative eigenvalues and $N - D$ zero eigenvalues.

(b) Matrix B is now written in terms of its spectral decomposition, $B = V \Lambda V^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_N)$ is the diagonal matrix of eigenvalues $\lambda_i$ of B, and $V = [v_1, \dots, v_N]$ is the matrix of corresponding eigenvectors, normalized such that $v_i^T v_i = 1$,

(c) Because of the $N - D$ zero eigenvalues, B can now be rewritten as $B = V_1 \Lambda_1 V_1^T$, where $\Lambda_1 = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_D)$ and $V_1 = [v_1, \dots, v_D]$,

(d) Finally the coordinates matrix is given by $X = V_1 \Lambda_1^{1/2}$, where $\Lambda_1^{1/2} = \mathrm{diag}\left(\lambda_1^{1/2}, \dots, \lambda_D^{1/2}\right)$.
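The two steps above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the algorithm of [77] itself: the input is the matrix of squared distances, and the leading eigenpairs are obtained by power iteration with deflation, which is a simple substitute for a full spectral decomposition and assumes distinct leading eigenvalues. All function names are hypothetical.

```python
import math


def double_center(sq_dist):
    """Step 1: B = H A H with a_ij = -1/2 d_ij^2 and H = I - (1/N) 1 1^T."""
    N = len(sq_dist)
    A = [[-0.5 * sq_dist[i][j] for j in range(N)] for i in range(N)]
    row_mean = [sum(row) / N for row in A]
    col_mean = [sum(A[i][j] for i in range(N)) / N for j in range(N)]
    total_mean = sum(row_mean) / N
    return [[A[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(N)] for i in range(N)]


def top_eigenpairs(B, D, iters=500):
    """Leading D eigenpairs of a symmetric PSD matrix by power iteration."""
    N = len(B)
    M = [row[:] for row in B]
    pairs = []
    for _ in range(D):
        v = [float(i + 1) for i in range(N)]      # deterministic start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(M[i][j] * v[j] for j in range(N)) for i in range(N)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-15:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * M[i][j] * v[j] for i in range(N) for j in range(N))
        pairs.append((lam, v))
        for i in range(N):                        # deflate the found component
            for j in range(N):
                M[i][j] -= lam * v[i] * v[j]
    return pairs


def classical_mds(sq_dist, D):
    """Step 2: coordinates X = V_1 Lambda_1^(1/2), one row per point."""
    pairs = top_eigenpairs(double_center(sq_dist), D)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(sq_dist))]
```

The recovered coordinates are unique only up to rotation, reflection and translation, so the natural check is that they reproduce the original pairwise distances.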

Bibliography

[1] J. Abonyi, J.A. Roubos, and F. Szeifert. Data-driven generation of compact, accurate, and linguistically sound fuzzy classifiers based on a decision-tree initialization. International Journal of Approximate Reasoning, 32(1):1–21, 2003.

[2] J. Abonyi and F. Szeifert. Supervised fuzzy clustering for the identification of fuzzy classifiers. Pattern Recognition Letters, 24(14):2195–2207, 2003.

[3] J.M. Adamo. Fuzzy decision trees. Fuzzy Sets and Systems, 4(3):207–219, 1980.

[4] R. Agrawal, T. Imielinski, and A. Swami. Database mining: A performance perspective. IEEE Transactions on Knowledge and Data Engineering, 5(6):914–925, 1993.

[5] R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, pages 487–499, 1994.

[6] L. A. Aguirre and S. A. Billings. Improved structure selection for nonlinear models based on term clustering. Int. J. Control, 62:569–587, 1995.

[7] L. A. Aguirre and E. M. A. M. Mendes. Global nonlinear polynomial models: Structure, term clusters and fixed points. Int. J. Bifurcation Chaos, 6:279–294, 1996.

[8] I. Ajtonyi and A. Ballagi. Integration of DCS in the complex producing system with Wonderware FactorySuite 2000 MMI software package. Distributed Control Systems 7th Meeting, Miskolc, Hungary, 2001.

[9] H. Akaike. A new look at the statistical model identification. IEEE Trans. Autom. Control, 19:716–723, 1974.

[10] M.-L. Antonie and O. R. Zaiane. An associative classifier based on positive and negative rules. In Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, pages 64–69, 2004.

[11] Y. Bastide, R. Taouil, N. Pasquier, G. Stumme, and L. Lakhal. Mining frequent patterns with counting inference. SIGKDD Explorations, 2(2):66–75, 2000.

[12] R.J. Bayardo and R. Agrawal. Mining the most interesting rules. In Proceedings of the 1999 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 145–154, 1999.

[13] R.J. Bayardo, R. Agrawal, and D. Gunopulos. Constraint-based rule mining in large, dense databases. In Proceedings of the 1999 IEEE International Conference on Data Engineering, pages 188–197, 1999.

[14] S. Bergamaschi, S. Castano, M. Vincini, and D. Beneventano. Semantic integration of heterogeneous information sources. Data and Knowledge Engineering, 36:215–249, 2001.

[15] X. Boyen and L. Wehenkel. Automatic induction of fuzzy decision trees and its application to power system security assessment. Fuzzy Sets and Systems, 102(1):3–19, 1999.

[16] L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. The Wadsworth and Brooks/Cole Statistics/Probability Series. Wadsworth, 1984.

[17] D. Bruzzese and C. Davino. Visual post-analysis of association rules. In Journal of Visual Languages and Computing, volume 14, pages 621–635, 2003.

[18] R. Burghaus, D. Leineweber, and J. Lippert. Data mining analysis of a complex polymer process. Chemical Engineering and Technology, 27(9):950–953, 2004.

[19] S.E. Buttrey and C. Karo. Using k-nearest-neighbor classification in the leaves of a tree. Computational Statistics & Data Analysis, 40(1):27–37, 2002.

[20] J. Camacho and J. Picó. Online monitoring of batch processes using multi-phase principal component analysis. Journal of Process Control, 16(10):1021–1035, 2006.

[21] G. Capocaccia. Intellution production is the heart of manufacturing ebusiness, ihistorian. Distributed Control Systems 7th Meeting, Miskolc, Hungary, 2001.

[22] J. Casillas, O. Cordón, F. Herrera, and L. Magdalena. Interpretability Issues in Fuzzy Modeling. Springer, 2003.

[23] K.C.C. Chan and W-H. Au. Mining fuzzy association rules. In Proceedings of the 1997 International Conference on Information and Knowledge Management, pages 209–215, 1997.

[24] G. Chen, H. Liu, L. Yu, Q. Wei, and X. Zhang. A new approach to classification based on association rule mining. Decision Support Systems, 42(2):674–689, 2006.

[25] F. Coenen and P. Leng. The effect of threshold values on association rule based classification accuracy. Data and Knowledge Engineering, In Press, Corrected Proof, Available online 20 March 2006.

[26] W. W. Cohen. Fast effective rule induction. In ICML, pages 115–123, 1995.

[27] S.R. Collins, S. Navathe, and L. Mark. XML schema mappings for heterogeneous database access. Information and Software Technology, 44:251–257, 2002.

[28] J.V. de Oliveira. Semantic constraints for membership function optimization. IEEE Transactions on Systems, Man and Cybernetics, Part A, 29(1):128–138, 1999.

[29] D. de Ridder and R.P.W. Duin. Sammon's mapping using neural networks: A comparison. Pattern Recognition Letters, 18:1307–1316, 1997.

[30] G. Dong, X. Zhang, L. Wong, and J. Li. CAEP: Classification by aggregating emerging patterns. In Discovery Science, pages 30–42, 1999.

[31] J. Dougherty, R. Kohavi, and M. Sahami. Supervised and unsupervised discretization of continuous features. In International Conference on Machine Learning, pages 194–202, 1995.

[32] F.J. Doyle, B.A. Ogunnaike, and R. K. Pearson. Nonlinear model-based control using second-order Volterra models. Automatica, 31:697–714, 1995.

[33] W. Duch, R. Setiono, and J. M. Zurada. Computational intelligence methods for rule-based data understanding. In Proceedings of IEEE, volume 92, pages 769–805, 2004.

[34] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley & Sons, 1999.

[35] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining to knowledge discovery in databases. AI Magazine, 17:37–54, 1996.

[36] S. Füle. Integration of distributed - and enterprise control systems. Distributed Control Systems 5th Meeting, Miskolc, Hungary, 1999.

[37] M. Friedman, T. Bar-Noy, M. Blau, and A. Kandel. Certain computational aspects of fuzzy decision trees. Fuzzy Sets and Systems, 28(2):163–170, 1988.

[38] R. Gallion, D.C. St.Clair, C. Sabharwal, and W.E. Bond. Dynamic ID3: A symbolic learning algorithm for many-valued attribute domains. In Proceedings of the 1993 Symposium on Applied Computing, pages 14–20. ACM Press, 1993.

[39] I. Gath and A.B. Geva. Unsupervised optimal fuzzy clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7):773–781, 1989.

[40] A. K. Ghosh. On optimum choice of k in nearest neighbor classification. Computational Statistics & Data Analysis, 50(11):3113–3123, 2006.

[41] N. Giannadakis, A. Rowe, M. Ghanem, and Y.-K. Guo. Infogrid: providing information integration for knowledge discovery. Information Sciences, 155:199–226, 2003.

[42] G. Grahne and J. Zhu. High performance mining of maximal frequent item sets. In Proceedings of 6th International Workshop on High Performance Data Mining, 2003.

[43] D.E. Gustafson and W.C. Kessel. Fuzzy clustering with fuzzy covariance

[44] H. Hu and J. Li. Using association rules to make rule-based classifiers robust. In Proceedings of the sixteenth Australasian database conference, pages 47–54, 2005.

[45] Y.C. Hu. Determining membership functions and minimum fuzzy support in finding fuzzy association rules for classification problems. Knowledge-Based Systems, 19:57–66, 2006.

[46] Y.C. Hu, R. S. Chen, and G. H. Tzeng. Mining fuzzy association rules for classification problems. Computers and Industrial Engineering, 43(4):735–750, 2002.

[47] Y.C. Hu and G. H. Tzeng. Elicitation of classification rules by fuzzy data mining. Engineering Applications of Artificial Intelligence, 16:709–716, 2003.

[48] H. Ichihashi, T. Shirai, K. Nagasaka, and T. Miyoshi. Neuro-fuzzy ID3: a method of inducing fuzzy decision trees with linear programming for maximizing entropy and an algebraic method for incremental learning. Fuzzy Sets
