
Learning in Logic

Tamás Horváth

University of Bonn & Fraunhofer IAIS, Sankt Augustin, Germany

tamas.horvath@iais.fraunhofer.de

PhD Course, Szeged, 2012 - © T.Horváth 2

Learning in Logic - Inductive Generalization of Clauses


Learning in Logic

inductive generalization of first-order clauses

1. preliminaries

2. generalization of words (terms and literals)

3. generalization of clauses

First-Order Logic – Syntax (Alphabet)

First-Order Logic – Syntax (Terms)

First-Order Logic – Syntax (Formulas)

First-Order Logic – Syntax (First-Order Language)

First-Order Logic – Semantics (Pre-Interpretation)

First-Order Logic – Semantics (Term Assignment)

First-Order Logic – Semantics (Interpretation, Truth Value)

First-Order Logic – Semantics (Model, Implication)

Clauses

Substitutions

Learning in Logic

inductive generalization of first-order clauses

1. preliminaries

2. generalization of words (terms and literals)

3. generalization of clauses

Generalization of Words: Notions

Inductive Generalization of Words

The Algorithm

Example

Example (cont'd)

Proof of Plotkin's Theorem

Proof of Plotkin's Theorem – Definitions and Notions

Main Steps of the Proof of Plotkin's Theorem

Lemma 2

Lemma 3

Proof of Lemma 3 (cont'd)

Lemma 4

Lemma 5

Proof of Lemma 5 (cont'd)

Learning in Logic

inductive generalization of first-order clauses

1. generalization of words (terms and literals)

2. generalization of clauses

3. example

Generalization of Clauses

Subsumption vs. Implication

Least General Generalization of Clauses

Proof of (ii): The Algorithm

Example

Lemma A

Proof of (ii) on Slide 44

Proof of (ii) on Slide 46

Reduced Clauses

Clause Reduction

Clause Reduction Algorithm

Example

Proof of the Theorem on Slide 53

Lemma B

Proof of (ii) in the Theorem on Slide 53

Proof of (ii) in the Theorem on Slide 53 (cont'd)

Putting All Together

Learning in Logic

notions and notations

inductive generalization of first-order clauses

1. generalization of words (terms and literals)

2. generalization of clauses

3. example

Application Example: Relation Extraction from Texts

Problem: Automatic extraction of semantic relations between entities from natural language texts.

Example:

Training Data:

- Fraunhofer was a German optician.

- Schrödinger was an Austrian-Irish physicist.

- Planck was a German physicist.

- Heisenberg was a celebrated German physicist and Nobel laureate.

Learning: definition of the unknown target predicate Is_Physicist(X, Y)

Prediction: Einstein was a German theoretical physicist.

Data Preprocessing

sentences → dependency trees

• labeled rooted directed trees representing grammatical dependencies among the words in a sentence

• capture a low-level syntactic structure of sentences

• bijective map between words in a sentence and nodes in the tree

• generated by the Stanford Parser

• nodes defining the same entity are merged into a single node

- e.g., Ludwig Wittgenstein → Ludwig_Wittgenstein

Heisenberg was a celebrated German physicist and Nobel laureate.

Fraunhofer was a German optician.

Brecht was a German poet, playwright, and theatre director.

unknown target relation: Is_Physicist (unary)

POS: { Is_Physicist(Fraunhofer), Is_Physicist(Heisenberg) }

NEG: { Is_Physicist(Brecht) }


Heisenberg was a celebrated German physicist and Nobel laureate.

Fraunhofer was a German optician.

POS: { Is_Physicist(Fraunhofer), Is_Physicist(Heisenberg) }

We want to generalize these two structures!

Consider them as ground clauses and use Plotkin's LGG algorithm!

Dependency Trees as Relational Structures

• labeled trees are considered as relational structures

- unique constant for each vertex

- unary and binary predicates only

- ground

• training examples: m-tuples of vertices of the dependency trees

- P: m-ary target relation to be learned

• POS: set of instances (m-tuples) of the target relation P

• NEG: set of non-instances (m-tuples) of the target relation P

- ground atoms of the target predicate P
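The encoding described above can be sketched in a few lines. This is only an illustration: the function name `tree_to_facts`, the vertex constants `v0` to `v4`, and the dependency labels (`nsubj`, `cop`, `det`, `amod`) are assumptions and are not taken from the slides.

```python
def tree_to_facts(labels, edges):
    """Encode a labeled rooted tree as a set of ground atoms.

    labels: dict mapping each vertex constant to its word label
    edges:  list of (parent, child, dependency_label) triples
    Atoms are tuples: (predicate, arg) or (predicate, arg1, arg2).
    """
    facts = set()
    for vertex, word in labels.items():
        facts.add((word, vertex))             # unary atom word(vertex)
    for parent, child, dep in edges:
        facts.add((dep, parent, child))       # binary atom dep(parent, child)
    return facts


# Hypothetical dependency tree for "Fraunhofer was a German optician."
labels = {"v0": "fraunhofer", "v1": "was", "v2": "optician",
          "v3": "a", "v4": "german"}
edges = [("v2", "v0", "nsubj"), ("v2", "v1", "cop"),
         ("v2", "v3", "det"), ("v2", "v4", "amod")]

for atom in sorted(tree_to_facts(labels, edges)):
    print(atom)
```

Each vertex receives its own constant, so two occurrences of the same word in a sentence stay distinguishable.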

Fraunhofer was a German optician.

comes from semantic hierarchy (e.g., WordNet)

Heisenberg was a celebrated German physicist and Nobel laureate.

Example – LGG of the Two Ground Clauses

the two structures correspond to labeled directed trees

• LGG: direct product of labeled trees

• introduce a product vertex for each pair $(u,v)$ of constants

• each such product vertex corresponds to a new variable $x_{(u,v)}$ in the LGG

• add an edge from $(u_1,v_1)$ to $(u_2,v_2)$ in the product iff there is an edge from $u_1$ to $u_2$ and there is an edge from $v_1$ to $v_2$

• add the literal $\neg R(x_{(u_1,v_1)}, x_{(u_2,v_2)})$ to the LGG

• "color" the product vertex $(u,v)$ by the unary predicate $Q$ iff $u$ and $v$ are both colored by $Q$

• add the literal $\neg Q(x_{(u,v)})$ to the LGG
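The product construction above can be sketched as follows, assuming each structure is given as a set of ground atoms written as tuples `(predicate, arg1, ...)`. The negation sign is left implicit because all literals are body literals. The function name `lgg_ground` and the generated variable names `X0, X1, ...` are illustrative choices.

```python
from itertools import product


def lgg_ground(facts1, facts2):
    """Pair up atoms with the same predicate and arity.

    Every pair (u, v) of constants is replaced by one variable,
    and the same pair always maps to the same variable.
    """
    variables = {}

    def var(u, v):
        # Allocate a fresh variable the first time the pair (u, v) is seen.
        return variables.setdefault((u, v), f"X{len(variables)}")

    result = set()
    for a, b in product(sorted(facts1), sorted(facts2)):
        if a[0] == b[0] and len(a) == len(b):
            args = tuple(var(u, v) for u, v in zip(a[1:], b[1:]))
            result.add((a[0],) + args)
    return result


# Toy fragments of two trees (constants and labels are hypothetical).
t1 = {("german", "a1"), ("physicist", "a2"), ("amod", "a2", "a1")}
t2 = {("german", "b1"), ("optician", "b2"), ("amod", "b2", "b1")}
print(lgg_ground(t1, t2))
```

The output of this construction is usually not reduced, so a clause reduction step would normally follow.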

[Figure: the dependency trees of "Fraunhofer was a German optician." and "Heisenberg was a celebrated German physicist and Nobel laureate." together with their direct product, whose vertices are the variables X0, ..., X8]

Heisenberg was a celebrated German physicist and Nobel laureate.

Fraunhofer was a German optician.

target relation: Is_Physicist (unary)

POS: { Is_Physicist(Fraunhofer), Is_Physicist(Heisenberg) }

the pattern representing the concept generated by POS

[Figure: the pattern, with variables x, y, z and the label German]

Example – The Reduced Pattern as First-Order Clause

Example (cont'd)

the pattern representing the concept generated by {Is_Physicist(Fraunhofer), Is_Physicist(Heisenberg)}

Einstein was a German theoretical physicist.

Is_Physicist(Einstein)

[Figure: the pattern, with variables x, y, z and the label German, and the dependency tree of the Einstein sentence]

Example (cont'd)

the pattern representing the concept generated by {Is_Physicist(Fraunhofer), Is_Physicist(Heisenberg)}

NOT Is_Physicist(Brecht)

Brecht was a German poet, playwright, and theatre director.
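Testing whether a pattern applies to a sentence amounts to a subsumption test: is there a substitution that maps every literal of the pattern onto a ground atom of the sentence? A minimal backtracking sketch follows. The pattern, the dependency labels, and the unary predicate `scientist` (standing in for a label from the semantic hierarchy) are assumptions made for illustration; the actual pattern on the slides is not recoverable from this text.

```python
def is_var(term):
    # Convention for this sketch: variables start with an uppercase letter.
    return term[:1].isupper()


def subsumes(pattern, facts, theta=None):
    """Return a substitution mapping all pattern atoms into facts, or None."""
    theta = dict(theta or {})
    if not pattern:
        return theta
    first, rest = pattern[0], pattern[1:]
    for fact in facts:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        trial = dict(theta)
        consistent = True
        for p, c in zip(first[1:], fact[1:]):
            if is_var(p):
                if trial.setdefault(p, c) != c:   # variable already bound elsewhere
                    consistent = False
                    break
            elif p != c:                          # constant mismatch
                consistent = False
                break
        if consistent:
            result = subsumes(rest, facts, trial)
            if result is not None:
                return result
    return None


pattern = [("nsubj", "Y", "X"), ("scientist", "Y"),
           ("amod", "Y", "Z"), ("german", "Z")]
einstein = [("nsubj", "n_physicist", "n_einstein"),
            ("amod", "n_physicist", "n_theoretical"),
            ("amod", "n_physicist", "n_german"),
            ("german", "n_german"), ("scientist", "n_physicist")]
brecht = [("nsubj", "n_poet", "n_brecht"),
          ("amod", "n_poet", "n_german"), ("german", "n_german")]

print(subsumes(pattern, einstein))
print(subsumes(pattern, brecht))
```

The search is exponential in the worst case, which is consistent with subsumption being NP-complete.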

[Figure: the pattern, with variables X, Y, Z and the label German, does not match the dependency tree of the Brecht sentence]

Summary

• LGG is a natural notion for the generalization of first-order clauses with respect to subsumption

• computing the LGG of clauses is reduced to computing the LGG of words

• a reduced non-empty LGG, if it exists, is unique up to variable renaming

• problems with the LGG

- the size of the reduced LGG can grow exponentially with the number of clauses

- as subsumption is NP-complete, deciding whether the LGG generalizes (i.e., subsumes) a clause is NP-complete
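The reduction from clauses to words can be illustrated with a small anti-unification sketch in the style of Plotkin's algorithm. The term representation (strings for constants and variables, tuples `(functor, arg1, ...)` for compound terms), the function names, and the fresh variable names `V0, V1, ...` are assumptions of this sketch; the inputs must not already contain variables with those names.

```python
def lgg_terms(s, t, table):
    """Least general generalization of two terms or atoms.

    table remembers which variable replaces each pair of differing
    subterms, so that equal pairs are replaced consistently.
    """
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        args = tuple(lgg_terms(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    return table.setdefault((s, t), f"V{len(table)}")


def lgg_clauses(c1, c2):
    """LGG of two clauses given as lists of (sign, atom) literals.

    Only literals with the same sign, predicate, and arity are paired,
    and all pairs share one variable table.
    """
    table = {}
    result = set()
    for sign1, atom1 in c1:
        for sign2, atom2 in c2:
            if sign1 == sign2 and atom1[0] == atom2[0] and len(atom1) == len(atom2):
                result.add((sign1, lgg_terms(atom1, atom2, table)))
    return result


print(lgg_terms(("p", ("f", "a"), "a"), ("p", ("f", "b"), "b"), {}))
```

Because every compatible pair of literals contributes one literal, the unreduced result can have up to |C1| · |C2| literals, which is the source of the size blow-up noted above.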


Outline

complexity of learning function-free definite Horn clauses

bottom-up induction of clauses

the relative least general generalization (RLGG)

a generic algorithm

on the length of the reduced RLGG

top-down induction of clauses

the FOIL algorithm

Horn Clauses

Finding a Consistent Clause

Learning Function-Free Definite Horn Clauses

complexity of learning function-free definite Horn clauses

bottom-up induction of clauses

the relative least general generalization (RLGG)

a generic algorithm

on the length of the reduced RLGG

top-down induction of clauses

the FOIL algorithm

Learning Non-Recursive Definite Horn Clauses

Example

Finding Consistent Clauses wrt. Background Knowledge

Relative Least General Generalization

Problem Reformulation: Notions

Reformulation of the Problem on Slide 87

Example

Bottom-Up Induction of First-Order Clauses

Problem 3: The Length of the Reduced RLGG

Summary

consistent hypothesis finding problem: computationally intractable

bottom-up induction: using the relative LGG, it iteratively generalizes the current clause as long as the result remains consistent with the negative examples

- system based on this approach: Golem [Muggleton and Feng, 1993]

- problems with this approach:

(1) the size of the reduced LGG can grow exponentially with the number of positive examples

(2) as subsumption is NP-complete, deciding whether the LGG implies an example with respect to the background knowledge is NP-complete
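The control flow of such a bottom-up learner can be sketched generically. This is not Golem and not the algorithm from the slides: the generalization operator and the coverage test are passed in as parameters, and the toy run below substitutes set intersection for the relative LGG and a subset test for implication, purely as stand-ins.

```python
def bottom_up(positives, negatives, generalize, covers):
    """Greedily generalize one clause over the positive examples.

    A generalization step is kept only if the resulting clause
    covers none of the negative examples.
    """
    clause = positives[0]
    for example in positives[1:]:
        candidate = generalize(clause, example)
        if not any(covers(candidate, neg) for neg in negatives):
            clause = candidate
    return clause


# Toy stand-ins: a "clause" is a set of conditions, generalization is
# intersection, and a clause covers an example if it is a subset of it.
positives = [frozenset({"german", "physicist", "nobel"}),
             frozenset({"german", "physicist"}),
             frozenset({"austrian", "physicist"})]
negatives = [frozenset({"german", "poet"})]

learned = bottom_up(positives, negatives,
                    generalize=lambda c, e: c & e,
                    covers=lambda c, e: c <= e)
print(sorted(learned))
```

With a real RLGG operator, both problems listed above show up in this loop: the candidate clause can grow quickly, and every call to `covers` is a subsumption test.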


Outline

complexity of learning function-free definite Horn clauses

bottom-up induction of clauses

the relative least general generalization (RLGG)

a generic algorithm

on the length of the reduced RLGG

top-down induction of clauses

the FOIL algorithm

FOIL: First-Order Inductive Learner

• Quinlan (1990-1993)

• combines the divide-and-conquer method designed for propositional TDIDT (top-down induction of decision trees) systems with the covering method developed for disjunctive logical expressions

- information-based heuristics in the divide-and-conquer method

• hypothesis space is searched top-down in a heuristic fashion, looking for maximally general rules consistent with the negative examples

• usually fast running times, no parameters, easy to use

• may miss good solutions

• implementations: FOIL 6 (Quinlan; publicly available), mFoil (Dzeroski), Grendel (Cohen)
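As an illustration of an information-based heuristic, the sketch below implements the FOIL gain in its commonly cited form, Gain = t · (log2(p1/(p1+n1)) − log2(p0/(p0+n0))). Whether the slides use exactly this formula cannot be read off this text, so treat it as an assumption; the parameter names are also chosen here.

```python
import math


def foil_gain(p0, n0, p1, n1, t):
    """FOIL gain of adding a literal to the current clause.

    p0, n0: positive / negative bindings covered before adding the literal
    p1, n1: positive / negative bindings covered after adding the literal
    t:      positive bindings covered both before and after
    """
    if p1 == 0 or t == 0:
        # Convention of this sketch: a literal that keeps no positive
        # binding is scored as giving no gain.
        return 0.0
    info_before = -math.log2(p0 / (p0 + n0))
    info_after = -math.log2(p1 / (p1 + n1))
    return t * (info_before - info_after)


# A literal that keeps all 4 positive bindings and removes all 4 negative ones.
print(foil_gain(p0=4, n0=4, p1=4, n1=0, t=4))
```

A literal that leaves the ratio of positive to negative bindings unchanged gets gain zero, so it is never preferred over a discriminating literal.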

1. The Divide-and-Conquer Method

• Hunt et al. (1966), Quinlan (1979, 1986), Breiman et al. (1984), Cestnik et al. (1987)

• the method below yields a decision tree

1. if all training objects belong to a single class, the tree is a leaf labelled with that class

2. otherwise

   1. select a test based on one attribute

   2. divide the training set into subsets, each corresponding to one of the possible (mutually exclusive) outcomes of the test, and

   3. apply the same procedure to each subset
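The recursion above can be sketched as follows. To keep the example short, the test in step 2.1 is simply the next attribute in a given list; real TDIDT systems would pick it with an information-based heuristic. The function name, the data layout, and the majority-class fallback when attributes run out are assumptions of this sketch.

```python
from collections import Counter


def build_tree(rows, attributes):
    """rows: list of (features_dict, class_label) pairs."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1:
        return labels[0]                                # step 1: leaf
    if not attributes:
        return Counter(labels).most_common(1)[0][0]     # fallback: majority class
    attr, remaining = attributes[0], attributes[1:]     # step 2.1 (naive choice)
    subsets = {}
    for features, label in rows:                        # step 2.2: split on outcomes
        subsets.setdefault(features[attr], []).append((features, label))
    # step 2.3: recurse on every subset
    return (attr, {value: build_tree(subset, remaining)
                   for value, subset in subsets.items()})


rows = [({"outlook": "sunny", "windy": "no"}, "play"),
        ({"outlook": "sunny", "windy": "yes"}, "stay"),
        ({"outlook": "rain", "windy": "no"}, "stay")]
print(build_tree(rows, ["outlook", "windy"]))
```

An inner node is returned as a pair of the tested attribute and a dictionary from outcomes to subtrees; a leaf is just the class label.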

2. The Covering Method

• Michalski (1989), Michalski et al. (1986)

• target class is represented by a disjunctive logical expression

1. find a conjunction of conditions that is satisfied by some objects in the target class, but by no objects from another class

2. append this conjunction as one disjunct of the logical expression being developed

3. remove all objects that satisfy this conjunction and, if there are still some remaining objects of the target class, repeat this procedure
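The three steps above can be sketched for attribute-value data. Step 1 is realized here by a simple greedy search that specializes towards a seed object; this particular search is an assumption of the sketch and not necessarily what the cited systems do. All function names are illustrative.

```python
def satisfies(obj, conjunction):
    return all(obj.get(attr) == value for attr, value in conjunction)


def find_conjunction(pos, neg):
    """Step 1: add conditions of a seed object until no negative object is covered."""
    seed = pos[0]
    conjunction, covered_neg = [], list(neg)
    while covered_neg:
        candidates = [c for c in seed.items() if c not in conjunction]
        if not candidates:
            raise ValueError("seed cannot be separated from the negatives")
        # Pick the condition that leaves the fewest negatives covered.
        best = min(candidates,
                   key=lambda c: sum(satisfies(n, conjunction + [c])
                                     for n in covered_neg))
        conjunction.append(best)
        covered_neg = [n for n in covered_neg if satisfies(n, conjunction)]
    return conjunction


def covering(pos, neg):
    rules, remaining = [], list(pos)
    while remaining:
        conjunction = find_conjunction(remaining, neg)   # step 1
        rules.append(conjunction)                        # step 2
        remaining = [p for p in remaining
                     if not satisfies(p, conjunction)]   # step 3
    return rules


pos = [{"shape": "round", "color": "red"},
       {"shape": "square", "color": "red"},
       {"shape": "round", "color": "green"}]
neg = [{"shape": "square", "color": "green"}]
print(covering(pos, neg))
```

The returned list is read as a disjunction of conjunctions; each pass removes at least the seed object, so the outer loop terminates.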

FOIL: The Outer Loop

FOIL: How to Perform Step 3? (The Inner Loop of FOIL)

Inner Loop of FOIL: Which Literals Are Considered in Step 5?

Inner Loop of FOIL: Which Literal Is Selected in Steps 5-6?

FOIL: Example

FOIL: Next Iteration of the Inner Loop

Summary

Bottom-up induction: using the relative LGG, it iteratively generalizes the current clause as long as the result remains consistent with the negative examples

- system based on this approach: Golem [Muggleton and Feng, 1993]

- problems with this approach:

(1) the size of the reduced LGG can grow exponentially with the number of positive examples

(2) as subsumption is NP-complete, deciding whether the LGG implies an example with respect to the background knowledge is NP-complete

Top-down induction: iteratively specializes the current clause by extending it with a literal until it is consistent with the negative examples

- system based on this approach: FOIL [Quinlan, 1990] and its variants

- problems with this approach: same as (2) above

- the approach can be extended to learning recursive Horn clauses, as well as to allowing negated literals in the clause's body

Appendix: Proof of Kietz's Theorem

Reduction Lemma

Reduction Lemma (cont'd)

Reduction Lemma: Example

Proof of the Reduction Lemma: "IF" Direction

Proof of the Reduction Lemma: "ONLY IF" Direction
