Advanced Programming Languages

Nyéky-Gaizler, Judit; Ásványi, Tibor; Balaskó, Ákos; Balázs, Iván József; Csizmazia, Balázs; Csontos, Péter; Fodor, Szabina; Góbi, Attila; Hegedűs, Hajnalka; Horváth, Zoltán; Juhász, András; Kispitye, Attila; Kozsik, Tamás; Kovács D., Lehel István; Legéndi, Richárd; Marcinkovics, Tamás; Nohl, Attila Rajmund; Páli, Gábor; Porkoláb, Zoltán; Pécsy, Gábor; Tejfel, Máté; Sergyán, Szabolcs; Zaicsek, Balázs; Zsók, Viktória

Written by Nyéky-Gaizler, Judit; Ásványi, Tibor; Balaskó, Ákos; Balázs, Iván József; Csizmazia, Balázs; Csontos, Péter; Fodor, Szabina; Góbi, Attila; Hegedűs, Hajnalka; Horváth, Zoltán; Juhász, András; Kispitye, Attila; Kozsik, Tamás; Kovács D., Lehel István; Legéndi, Richárd; Marcinkovics, Tamás; Nohl, Attila Rajmund; Páli, Gábor; Porkoláb, Zoltán; Pécsy, Gábor; Tejfel, Máté; Sergyán, Szabolcs; Zaicsek, Balázs; and Zsók, Viktória

Publication date 2014

Copyright © 2014 Nyéky-Gaizler Judit, Ásványi Tibor, Balaskó Ákos, Balázs Iván József, Csizmazia Balázs, Csontos Péter, Fodor Szabina, Góbi Attila, Hegedűs Hajnalka, Horváth Zoltán, Juhász András, Kispitye Attila, Kozsik Tamás, Kovács D. Lehel István, Legéndi Richárd, Marcinkovics Tamás, Nohl Attila Rajmund, Páli Gábor, Porkoláb Zoltán, Pécsy Gábor, Tejfel Máté, Sergyán Szabolcs, Zaicsek Balázs, Zsók Viktória

Contents

Advanced Programming Languages ... 1

1. Introduction ... 1

1.1. Aspects of software quality ... 1

1.1.1. Correctness ... 1

1.1.2. Reliability ... 1

1.1.3. Maintainability ... 2

1.1.4. Reusability ... 2

1.1.5. Compatibility ... 2

1.1.6. Other characteristics ... 2

1.2. Aspects of software design ... 3

1.3. Study of the tools of programming languages ... 3

1.3.1. Increase of the expressive power ... 3

1.3.2. Choosing the appropriate programming language ... 3

1.3.3. Better attainment of new tools ... 3

1.4. Acknowledgements ... 4

2. 1 Language Design (Szabina Fodor) ... 4

2.1. 1.1 Programming languages: syntax, semantics, and pragmatics ... 5

2.1.1. 1.1.1 Syntax ... 5

2.1.2. 1.1.2 Semantics ... 6

2.1.3. 1.1.3 Pragmatics ... 6

2.2. 1.2 Implementation of computer programs ... 6

2.2.1. Compiler implementation ... 7

2.2.2. Pure interpretation ... 9

2.2.3. Hybrid implementation systems ... 9

2.3. 1.3 The evolution of programming languages ... 10

2.3.1. 1.3.1 The early years ... 10

2.3.2. 1.3.2 The move to higher-level languages ... 11

2.3.3. 1.3.3 The future of programming languages ... 13

2.4. 1.4 Programming language categories ... 14

2.4.1. 1.4.1 Imperative or procedural languages ... 14

2.4.2. 1.4.2 Applicative or functional languages ... 15

2.4.3. 1.4.3 Rule-based or logical languages ... 15

2.4.4. 1.4.4 Object-oriented languages ... 16

2.4.5. 1.4.5 Concurrent programming languages ... 16

2.4.6. 1.4.6 Scripting languages ... 16

2.5. 1.5 Influences on language design ... 17

2.6. 1.6 Principles of programming language design ... 19

2.6.1. 1.6.1 Features of a good programming language ... 19

2.6.2. 1.6.2 Language design ... 25

2.7. 1.7 The standardization process ... 26

2.8. 1.8 Summary ... 27

2.9. 1.9 Exercises ... 27

2.10. 1.10 Useful tips ... 27

2.11. 1.11 Solutions ... 27

3. 2 Lexical elements (Judit Nyéky-Gaizler, Attila Kispitye) ... 29

3.1. 2.1 Symbol sets ... 29

3.1.1. 2.1.1 The ASCII code ... 30

3.1.2. 2.1.2 The EBCDIC code ... 30

3.1.3. 2.1.3 The ISO 8859 family ... 31

3.1.4. 2.1.4 The Unicode standard ... 31

3.2. 2.2 Symbol sets of programming languages ... 32

3.2.1. Pascal ... 32

3.2.2. Ada ... 32

3.2.3. C++ ... 32

3.2.4. Java and C# ... 33

3.2.5. Delimiters ... 33

3.3. 2.3 Identifiers ... 33

3.3.1. 2.3.1 Allowed syntax ... 34

3.3.2. 2.3.2 Distinction between lower and upper case letters ... 34

3.3.3. 2.3.3 Length restrictions ... 35

3.3.4. 2.3.4 Reserved words ... 35

3.4. 2.4 Literals ... 36

3.4.1. 2.4.1 Numeric literals ... 36

3.4.2. 2.4.2 Characters and strings ... 38

3.5. 2.5 Comments ... 39

3.5.1. From a mark in a special column till the end of the line ... 39

3.5.2. Special marks at the beginning and end of the comment ... 39

3.5.3. Special mark at the beginning of the comment - comment ends at the end of the line ... 39

3.6. 2.6 Summary ... 40

3.7. 2.7 Exercises ... 40

3.8. 2.8 Useful tips ... 40

3.9. 2.9 Solutions ... 41

4. 3 Control structures, statements (Balázs Csizmazia, Attila Kispitye, Judit Nyéky-Gaizler) ... 41

4.1. 3.1 The job of a programmer ... 42

4.1.1. 3.1.1 Sentence-like description ... 42

4.1.2. 3.1.2 Flow diagrams ... 43

4.1.3. 3.1.3 D-diagrams ... 44

4.1.4. 3.1.4 Block diagrams ... 45

4.1.5. 3.1.5 Structograms ... 46

4.2. 3.2 Implementation in assembly ... 46

4.2.1. 3.2.1 The solution in Pascal ... 47

4.2.2. 3.2.2 LMC ... 47

4.2.3. 3.2.3 Comparison of the solutions in LMC and Pascal ... 50

4.3. 3.3 An elementary approach ... 50

4.3.1. 3.3.1 Elements of the while-programs ... 51

4.3.2. 3.3.2 Higher level operations ... 51

4.3.3. 3.3.3 Considerations ... 52

4.4. 3.4 Control approaches ... 52

4.4.1. 3.4.1 Imperative programming languages ... 53

4.4.2. 3.4.2 Declarative and functional languages ... 53

4.4.3. 3.4.3 Parallel execution ... 53

4.4.4. 3.4.4 Event driven programming ... 53

4.5. 3.5 Programming languages examined ... 54

4.5.1. 3.5.1 Sentence-like algorithm description: COBOL ... 54

4.5.2. 3.5.2 Structured programming: the Pascal language ... 56

4.5.3. 3.5.3 Portable assembly: the C language ... 56

4.5.4. 3.5.4 Everything is an object: the Smalltalk language ... 57

4.5.5. 3.5.5 Other examined programming languages ... 57

4.6. 3.6 Assignment, arithmetic statements ... 58

4.6.1. 3.6.1 Features of COBOL ... 58

4.6.2. 3.6.2 Simple assignment: the Pascal language ... 59

4.6.3. 3.6.3 Assignment in C ... 59

4.6.4. 3.6.4 Solution in Smalltalk ... 59

4.6.5. 3.6.5 Multiple assignment and the CLU language ... 60

4.6.6. 3.6.6 The role of assignment in programs ... 60

4.6.7. 3.6.7 The empty statement ... 60

4.7. 3.7 Sequence and the block statement ... 61

4.7.1. 3.7.1 Block statement in Pascal ... 61

4.7.2. 3.7.2 Break with the tradition of Pascal: the Ada language ... 62

4.7.3. 3.7.3 Characteristics of the C language family ... 62

4.7.4. 3.7.4 Block statement in Smalltalk ... 63

4.8. 3.8 Unconditional transfer of control ... 63

4.8.1. 3.8.1 The features of COBOL ... 64

4.8.2. 3.8.2 Unconditional transfer of control in Pascal ... 64

4.8.3. 3.8.3 Modula-3: end of GOTO ... 65

4.8.4. 3.8.4 Special control statements in C ... 65

4.8.5. 3.8.5 New features in Java ... 65

4.9. 3.9 Branch structures ... 66

4.9.1. 3.9.1 Branching in COBOL ... 67

4.9.2. 3.9.2 Conditional statement in Pascal ... 68

4.9.3. 3.9.3 Multiway branching in Pascal ... 68

4.9.4. 3.9.4 Safe branching: innovations of Modula-3 ... 69

4.9.5. 3.9.5 Safe CASE in Modula-3 ... 70

4.9.6. 3.9.6 Branch structures in C ... 70

4.9.7. 3.9.7 Multiway branching in C ... 70

4.9.8. 3.9.8 Multiway branching in C# ... 71

4.9.9. 3.9.9 Conditional statement in Smalltalk ... 71

4.10. 3.10 Loops ... 71

4.10.1. 3.10.1 Loops in COBOL ... 72

4.10.2. 3.10.2 Loops in Pascal ... 73

4.10.3. 3.10.3 Modula-3: safe loops ... 74

4.10.4. 3.10.4 Loop-end-exit loops ... 75

4.10.5. 3.10.5 Features of the Ada language ... 75

4.10.6. 3.10.6 Repeating structures in C and Java ... 76

4.10.7. 3.10.7 Novelties of the C# language ... 77

4.10.8. 3.10.8 Iterators ... 78

4.10.9. 3.10.9 Loop statement in Smalltalk ... 80

4.11. 3.11 Self-invoking code (recursion) ... 81

4.12. 3.12 Summary ... 82

4.13. 3.13 Exercises ... 82

4.14. 3.14 Useful tips ... 83

4.15. 3.15 Solutions ... 84

5. 4 Scope and lifespan (Iván József Balázs, Zoltán Porkoláb) ... 88

5.1. 4.1 The types of memory storage ... 89

5.1.1. 4.1.1 The static memory ... 89

5.1.2. 4.1.2 The automatic memory ... 90

5.1.3. 4.1.3 Dynamic memory ... 90

5.1.4. 4.1.4 A simple example ... 91

5.2. 4.2 Scope ... 92

5.2.1. 4.2.1 Global scope ... 93

5.2.2. 4.2.2 Compilation unit as a scope ... 93

5.2.3. 4.2.3 Functions and code blocks as scope ... 94

5.2.4. 4.2.4 A type as scope ... 94

5.3. 4.3 Lifespan ... 95

5.3.1. 4.3.1 Creation and destruction of objects ... 95

5.3.2. 4.3.2 Static ... 96

5.3.3. 4.3.3 Automatic ... 96

5.3.4. 4.3.4 Dynamic ... 97

5.4. 4.4 Examples ... 97

5.4.1. Usage of a static buffer ... 97

5.4.2. Resource management through objects ... 98

5.5. 4.5 Summary ... 99

5.6. 4.6 Exercises ... 100

5.7. 4.7 Useful Tips ... 100

5.8. 4.8 Solutions ... 101

6. 5 Data types (Gábor Pécsy) ... 110

6.1. 5.1 What is a data type? ... 111

6.1.1. 5.1.1 The programming language perspective ... 111

6.1.2. 5.1.2 The programmers' perspective ... 111

6.1.3. 5.1.3 Type systems of programming languages ... 113

6.1.4. 5.1.4 Type conversions ... 114

6.2. 5.2 Taxonomy of types ... 117

6.2.1. 5.2.1 Type classes ... 117

6.2.2. 5.2.2 Attributes in Ada ... 117

6.3. 5.3 Scalar type class ... 118

6.3.1. 5.3.1 Representation ... 118

6.3.2. 5.3.2 Operations ... 119

6.3.3. 5.3.3 Scalar types in Ada ... 119

6.4. 5.4 Discrete type class ... 119

6.4.1. 5.4.1 Enumerations ... 120

6.4.2. 5.4.2 Integer types ... 122

6.4.3. 5.4.3 Outliers ... 125

6.5. 5.5 Real type class ... 127

6.5.1. 5.5.1 Type-value set ... 127

6.5.2. 5.5.2 Operations ... 128

6.5.3. 5.5.3 Programming languages ... 129

6.6. 5.6 Pointer types ... 130

6.6.1. 5.6.1 Memory management ... 130

6.6.2. 5.6.2 Type-value set ... 132

6.6.3. 5.6.3 Operations ... 133

6.6.4. 5.6.4 Dereference ... 136

6.6.5. 5.6.5 Pointers to subprograms ... 136

6.6.6. 5.6.6 Language specialties ... 137

6.7. 5.7 Expressions ... 139

6.7.1. 5.7.1 Structure of expressions ... 140

6.7.2. 5.7.2 Evaluating expressions ... 141

6.8. 5.8 Other language specialties ... 144

6.8.1. 5.8.1 Ada: Type derivation and subtypes ... 144

6.9. 5.9 Summary ... 146

6.10. 5.10 Exercises ... 148

6.11. 5.11 Useful tips ... 148

6.12. 5.12 Solutions ... 149

7. 6 Composite types (Gábor Pécsy) ... 155

7.1. 6.1 Type equivalence ... 155

7.2. 6.2 Mutable and immutable types ... 156

7.3. 6.3 Cartesian product types ... 157

7.3.1. 6.3.1 Type-value set ... 158

7.3.2. 6.3.2 Operations ... 158

7.3.3. 6.3.3 Representation of cartesian product types ... 160

7.3.4. 6.3.4 Language specific features ... 162

7.4. 6.4 Union types ... 163

7.4.1. 6.4.1 Type-value set ... 163

7.4.2. 6.4.2 Operations ... 163

7.4.3. 6.4.3 Union-like composite types ... 164

7.5. 6.5 Iterated types ... 170

7.6. 6.6 Array ... 170

7.6.1. 6.6.1 Type-value set ... 170

7.6.2. 6.6.2 Operation ... 171

7.6.3. 6.6.3 Language specific features ... 172

7.6.4. 6.6.4 Arrays in Java ... 172

7.6.5. 6.6.5 Generalization - multi-dimensional arrays ... 174

7.7. 6.7 Sets ... 175

7.7.1. 6.7.1 Type-value set ... 175

7.7.2. 6.7.2 Operations ... 176

7.8. 6.8 Other iterated types ... 176

7.8.1. Hashtables in Perl ... 176

7.9. 6.9 Summary ... 177

7.10. 6.10 Exercises ... 179

7.11. 6.11 Useful tips ... 180

7.12. 6.12 Solutions ... 180

8. 7 Subprograms (Tamás Kozsik, Attila Kispitye, Judit Nyéky-Gaizler) ... 183

8.1. 7.1 The effect of subprograms on software quality ... 184

8.1.1. Reusability ... 184

8.1.2. Readability ... 184

8.1.3. Changeability ... 184

8.1.4. Maintainability ... 184

8.2. 7.2 Procedures and functions ... 185

8.2.1. 7.2.1 Languages with no difference between procedures and functions ... 185

8.2.2. 7.2.2 Languages which distinguish between procedures and functions ... 186

8.3. 7.3 Structure of subprograms and calls ... 187

8.3.1. 7.3.1 What could be a parameter or return value? ... 187

8.3.2. 7.3.2 Specification of subprograms ... 192

8.3.3. 7.3.3 Body of subprograms ... 197

8.3.4. 7.3.4 Calling subprograms ... 200

8.3.5. 7.3.5 Recursive subprograms ... 204

8.3.6. 7.3.6 Declaration of the subprograms ... 205

8.3.7. 7.3.7 Macros and inline subprograms ... 206

8.3.8. 7.3.8 Subprogram types ... 207

8.4. 7.4 Passing parameters ... 208

8.4.1. 7.4.1 Parameter passing modes ... 208

8.4.2. 7.4.2 Comparison of parameter passing modes ... 214

8.4.3. 7.4.3 Parameter possibilities in some programming languages ... 215

8.5. 7.5 Environment of the subprograms ... 220

8.5.1. 7.5.1 Separate compilability ... 220

8.5.2. 7.5.2 Embedding ... 221

8.5.3. 7.5.3 Static and dynamic scope ... 222

8.5.4. 7.5.4 Lifetime of the variables ... 224

8.6. 7.6 Overloading subprogram names ... 225

8.6.1. 7.6.1 Operator overloading ... 227

8.7. 7.7 Implementation of subprograms ... 227

8.7.1. 7.7.1 Implementation of subprograms passed as parameters ... 230

8.8. 7.8 Iterators ... 231

8.9. 7.9 Coroutines ... 233

8.10. 7.10 Summary ... 234

8.11. 7.11 Exercises ... 235

8.12. 7.12 Useful tips ... 237

8.13. 7.13 Solutions ... 239

9. 8 Exception handling (Attila Rajmund Nohl) ... 252

9.1. 8.1 Introduction ... 253

9.1.1. 8.1.1 Basic concepts ... 253

9.1.2. 8.1.2 Why is exception handling useful ... 254

9.1.3. 8.1.3 The aspects of comparing exception handling ... 258

9.2. 8.2 The beginnings of exception handling ... 259

9.2.1. 8.2.1 Exception handling of a single statement: FORTRAN ... 259

9.2.2. 8.2.2 Exception handling of multiple statements: COBOL ... 259

9.2.3. 8.2.3 Dynamic exception handling: PL/I ... 260

9.3. 8.3 Advanced exception handling ... 260

9.3.1. 8.3.1 Static exception handling: CLU ... 260

9.3.2. 8.3.2 Exception propagation: Ada ... 261

9.3.3. 8.3.3 Exception classes: C++ ... 263

9.3.4. 8.3.4 Exception handling and correctness proving: Eiffel ... 264

9.3.5. 8.3.5 The 'finally' block: Modula-3 ... 266

9.3.6. 8.3.6 Checked exceptions: Java ... 266

9.3.7. 8.3.7 The exception handling of Delphi ... 268

9.3.8. 8.3.8 Nested exceptions: C# ... 268

9.3.9. 8.3.9 Exception handling with functions: Common Lisp ... 269

9.3.10. 8.3.10 Exceptions in concurrent environment: Erlang ... 270

9.3.11. 8.3.11 New solutions: Perl ... 272

9.3.12. 8.3.12 Back to the basics: Go ... 273

9.4. 8.4 Summary ... 275

9.5. 8.5 Examples for exception handling ... 275

9.5.1. 8.5.1 C++ ... 275

9.5.2. 8.5.2 Java ... 276

9.5.3. 8.5.3 Ada ... 277

9.5.4. 8.5.4 Eiffel ... 278

9.5.5. 8.5.5 Erlang ... 279

9.6. 8.6 Exercises ... 281

9.7. 8.7 Useful tips ... 283

9.8. 8.8 Solutions ... 283

10. 9 Abstract data types (Gábor Pécsy, Attila Kispitye) ... 286

10.1. 9.1 Type constructs and data abstraction ... 286

10.2. 9.2 Expectations for programming languages ... 287

10.3. 9.3 Breaking down to modules ... 287

10.3.1. 9.3.1 Modular design ... 288

10.3.2. 9.3.2 Language support for modules ... 291

10.4. 9.4 Encapsulation ... 299

10.5. 9.5 Representation hiding ... 299

10.5.1. 9.5.1 Opaque type in C ... 299

10.5.2. 9.5.2 Private view of Ada types ... 300

10.5.3. 9.5.3 CLU abstract data types ... 301

10.5.4. 9.5.4 Visibility levels ... 301

10.6. 9.6 Separation of specification and implementation ... 302

10.6.1. C and C++ header files ... 302

10.6.2. Mapping to pointers ... 302

10.6.3. Visibility areas ... 303

10.6.4. Languages not supporting physical separation ... 303

10.7. 9.7 Management of module dependency ... 304

10.8. 9.8 Consistent usage ... 304

10.9. 9.9 Generalized program schemes ... 305

10.9.1. Subprograms ... 305

10.9.2. Parametrization of subprograms ... 305

10.9.3. Parametrization of types ... 305

10.9.4. Subprograms as parameters ... 305

10.9.5. Types as parameters ... 306

10.9.6. Higher level structures as parameters ... 307

10.10. 9.10 Summary ... 307

10.11. 9.11 Exercises ... 309

10.12. 9.12 Useful tips ... 309

10.13. 9.13 Solutions ... 309

11. 10 Object-oriented programming (Judit Nyéky-Gaizler, Balázs Zaicsek, István L. Kovács D., Szabolcs Sergyán) ... 318

11.1. 10.1 The class and the object ... 319

11.1.1. 10.1.1 Classes and objects in different languages ... 320

11.2. 10.2 Notations and diagrams ... 326

11.2.1. 10.2.1 Class diagram ... 326

11.2.2. 10.2.2 Object diagram ... 327

11.2.3. 10.2.3 The representation of instantiation ... 327

11.3. 10.3 Constructing and destructing objects ... 327

11.3.1. C++ ... 327

11.3.2. Object Pascal ... 328

11.3.3. Java ... 329

11.3.4. Eiffel ... 329

11.3.5. Ada ... 329

11.3.6. Python ... 330

11.3.7. Scala ... 330

11.3.8. 10.3.1 Instantiation and the concept of Self (this) ... 331

11.4. 10.4 Encapsulation ... 332

11.5. 10.5 Data hiding, interfaces ... 332

11.5.1. Data hiding solutions of Smalltalk ... 333

11.5.2. Access control in C++ ... 333

11.5.3. Data hiding of Object Pascal ... 334

11.5.4. Accessibility categories of Java ... 334

11.5.5. Selective visibility of Eiffel ... 334

11.5.6. 10.5.1 Friend methods and classes ... 335

11.5.7. 10.5.2 The private notation of Python ... 336

11.5.8. 10.5.3 Visibility Rules of Scala ... 336

11.6. 10.6 Class data, class method ... 337

11.6.1. Smalltalk ... 338

11.6.2. C++, Java, C# ... 338

11.6.3. Object Pascal ... 339

11.6.4. Python ... 339

11.6.5. Scala ... 340

11.6.6. 10.6.1 Class diagrams ... 340

11.7. 10.7 Inheritance ... 340

11.7.1. Inheritance in SIMULA 67 ... 341

11.7.2. Inheritance in Smalltalk ... 343

11.7.3. Inheritance example of C++ ... 343

11.7.4. Inheritance in Object Pascal ... 344

11.7.5. Inheritance in Eiffel ... 344

11.7.6. Inheritance in Java ... 345

11.7.7. Inheritance in C# ... 345

11.7.8. Inheritance in Ada ... 345

11.7.9. Inheritance in Python ... 346

11.7.10. Inheritance in Scala ... 346

11.7.11. 10.7.1 Data hiding and inheritance ... 347

11.7.12. 10.7.2 Polymorphism and dynamic dispatching ... 349

11.7.13. 10.7.3 Abstract class ... 358

11.7.14. 10.7.4 Common ancestor ... 362

11.7.15. 10.7.5 Multiple inheritance ... 363

11.7.16. 10.7.6 Interfaces ... 372

11.7.17. 10.7.7 Nested classes, inner classes ... 376

11.8. 10.8 Working with classes and objects ... 376

11.8.1. 10.8.1 The Roman Principle ... 376

11.8.2. 10.8.2 Testing doubles ... 377

11.8.3. 10.8.3 SOLID object hierarchy ... 378

11.8.4. 10.8.4 The Law of Demeter ... 381

11.9. 10.9 Summary ... 382

11.10. 10.10 Exercises ... 382

11.11. 10.11 Useful tips ... 383

11.12. 10.12 Solutions ... 383

12. 11 Type parameters (Attila Góbi, Tamás Kozsik, Judit Nyéky-Gaizler, Hajnalka Hegedűs, Tamás Marcinkovics) ... 385

12.1. 11.1 Control abstraction ... 386

12.2. 11.2 Data abstraction ... 388

12.3. 11.3 Polymorphism ... 391

12.3.1. 11.3.1 Parametric polymorphism ... 392

12.3.2. 11.3.2 Inclusion polymorphism ... 394

12.3.3. 11.3.3 Overloading polymorphism ... 397

12.3.4. 11.3.4 Coercion polymorphism ... 397

12.3.5. 11.3.5 Implementation of polymorphism in monomorphic languages ... 399

12.4. 11.4 Generic contract model ... 400

12.5. 11.5 Generic parameters ... 401

12.5.1. 11.5.1 Type and type class ... 401

12.5.2. 11.5.2 Template ... 404

12.5.3. 11.5.3 Subprogram ... 404

12.5.4. 11.5.4 Object ... 405

12.5.5. 11.5.5 Module ... 405

12.6. 11.6 Instantiation ... 406

12.6.1. 11.6.1 Explicit instantiation ... 407

12.6.2. 11.6.2 On-demand instantiation ... 407

12.6.3. 11.6.3 Lazy instantiation ... 407

12.6.4. 11.6.4 Generic parameter matching ... 408

12.6.5. 11.6.5 Specialization ... 409

12.6.6. 11.6.6 Type erasure ... 410

12.7. 11.7 Generics and inheritance ... 411

12.8. 11.8 Summary ... 413

12.9. 11.9 Examples ... 413

12.9.1. 11.9.1 C++ ... 414

12.9.2. 11.9.2 Java ... 416

12.9.3. 11.9.3 C# ... 418

12.9.4. 11.9.4 Comparing the examples ... 420

12.10. 11.10 Exercises ... 421

12.11. 11.11 Useful tips ... 421

12.12. 11.12 Solutions ... 422

13. 12 Correctness in practice (András Juhász, Languages section: Judit Nyéky-Gaizler) ... 425

13.1. 12.1 Introduction ... 425

13.1.1. 12.1.1 Thought-provoking ... 427

13.2. 12.2 Flavor of object-oriented approach ... 427

13.2.1. 12.2.1 Abstract data types ... 427

13.2.2. 12.2.2 Type system ... 428

13.2.3. 12.2.3 Dynamic properties ... 428

13.2.4. 12.2.4 Object-oriented problem solving ... 428

13.2.5. Approach ... 428

13.2.6. Overview ... 429

13.3. 12.3 The correctness specification language ... 429

13.3.1. 12.3.1 Eiffel and first-order predicate Logic ... 429

13.3.2. 12.3.2 Stack as an example ... 434

13.3.3. 12.3.3 Partial and total functions ... 436

13.3.4. 12.3.4 Precondition ... 436

13.3.5. 12.3.5 Postcondition ... 436

13.3.6. 12.3.6 Pre- and postconditions in Eiffel ... 437

13.3.7. 12.3.7 Design aspects ... 439

13.3.8. 12.3.8 Class invariant ... 439

13.3.9. 12.3.9 Check construct ... 441

13.3.10. 12.3.10 Loops ... 442

13.3.11. 12.3.11 Assertions and inheritance ... 444

13.4. 12.4 Program-correctness in Eiffel ... 448

13.4.1. 12.4.1 Hoare-formulas ... 448

13.4.2. 12.4.2 Correctness of attributes ... 449

13.4.3. 12.4.3 Loop correctness ... 450

13.4.4. 12.4.4 Check correctness ... 451

13.4.5. 12.4.5 Exception correctness ... 451

13.4.6. 12.4.6 Class consistency ... 452

13.4.7. 12.4.7 Class correctness ... 453

13.4.8. 12.4.8 Note on method correctness ... 453

13.4.9. 12.4.9 Program correctness ... 454

13.5. 12.5 Program correctness issues ... 455

13.5.1. 12.5.1 Dependencies ... 455

13.5.2. 12.5.2 Void-safety ... 456

13.5.3. 12.5.3 Type safety ... 458

13.5.4. 12.5.4 Concurrency ... 460

13.6. 12.6 Correctness specification language ... 460

13.6.1. 12.6.1 Practical limits ... 460

13.6.2. 12.6.2 Model classes: an interim solution? ... 461

13.6.3. 12.6.3 Theoretical limits ... 463

13.7. 12.7 Languages and tools supporting Design by Contract ... 463

13.7.1. 12.7.1 D language ... 463

13.7.2. 12.7.2 Cobra language ... 464

13.7.3. 12.7.3 Oxygene language ... 466

13.7.4. 12.7.4 Correctness in .NET ... 466

13.7.5. 12.7.5 Java language and additional tools ... 470

13.7.6. 12.7.6 Ada 2012 language ... 476

13.8. 12.8 Summary ... 476

13.9. 12.9 Example source code ... 477

13.10. 12.10 Exercises ... 479

13.11. 12.11 Useful tips ... 481

13.12. 12.12 Solutions ... 482

14. 13 Concurrency (Richard O. Legendi, Ákos Balaskó, Máté Tejfel, Viktória Zsók) ... 487

14.1. 13.1 Reasons for concurrency ... 489

14.2. 13.2 An abstract example ... 489

14.2.1. The problem ... 490

14.2.2. The first attempt for finding the solution ... 490

14.2.3. The second attempt for finding the solution ... 490

14.2.4. The third attempt for finding the solution ... 491

14.2.5. The fourth attempt for finding the solution ... 492

14.2.6. The fifth attempt for finding the solution ... 492

14.2.7. The sixth attempt for finding the solution ... 493

14.3. 13.3 Fallacies of concurrent computing ... 494

14.4. 13.4 Possible number of execution paths ... 495

14.4.1. 13.4.1 Amdahl's law ... 498

14.5. 13.5 Taxonomy of concurrent architectures ... 498

14.6. 13.6 Communication and synchronization models ... 499

14.7. 13.7 Mutual exclusion and synchronization ... 500

14.7.1. 13.7.1 Deadlocks ... 500

14.7.2. 13.7.2 Starvation ... 501

14.7.3. 13.7.3 Techniques for synchronization ... 501

14.7.4. 13.7.4 Solutions for managing critical sections ... 502

14.8. 13.8 Taxonomy of languages supporting concurrency ... 504

14.8.1. 13.8.1 Processes, tasks, threads: Concurrent execution units ... 506

14.8.2. 13.8.2 Monitors ... 511

14.8.3. 13.8.3 Alternative approaches ... 512

14.9. 13.9 Common execution models ... 513

14.9.1. 13.9.1 Producers-consumers problem ... 514

14.9.2. 13.9.2 Readers-writers problem ... 514

14.9.3. 13.9.3 Dining philosophers problem ... 514

14.10. 13.10 Ada ... 515

14.10.1. 13.10.1 Tasks ... 515

14.10.2. 13.10.2 Entry, entry calls, accept statement ... 520

14.10.3. 13.10.3 Selective handling of incoming messages ... 523

14.10.4. 13.10.4 Exception handling ... 527

14.10.5. 13.10.5 Examples ... 527

14.11. 13.11 CSP ... 533

14.11.1. Example: Dining philosophers problem ... 534

14.12. 13.12 Occam ... 535

14.13. 13.13 MPI ... 537

14.13.1. MPI control methods ... 538

14.13.2. Creating tasks ... 538

14.13.3. Groups ... 538

14.13.4. Communication methods ... 539

14.13.5. Communication in groups ... 540

14.13.6. 13.13.1 Case study: Matrix multiplication ... 541

14.14. 13.14 Java ... 541

14.14.1. The Runnable interface and the thread class ... 541

14.14.2. Thread groups ... 544

14.14.3. The concurrent API ... 544

14.14.4. Concurrent collections ... 544

14.14.5. The Executor framework ... 545

14.15. 13.15 C#/.NET ... 550

14.15.1. 13.15.1 Comparison of .Net with Java ... 550

14.16. 13.16 Scala ... 552

14.16.1. Actors in general ... 553

14.16.2. 13.16.1 Comparison with concurrent processes ... 553

14.16.3. 13.16.2 Parallel collections ... 557

14.17. 13.17 General tips for creating concurrent software ... 560

14.17.1. 13.17.1 Single responsibility principle ... 560

14.17.2. 13.17.2 Restrict access to shared resources ... 560

14.17.3. 13.17.3 Independency ... 560

14.17.4. 13.17.4 Do not reinvent the wheel! ... 561

14.17.5. 13.17.5 Know the library support ... 561

14.17.6. 13.17.6 Write thread-safe modules ... 562

14.17.7. 13.17.7 Testing ... 562

14.18. 13.18 Summary ... 562

14.19. 13.19 Exercises ... 563

14.20. 13.20 Useful tips ... 563

14.21. 13.21 Solutions ... 564

15. 14 Program libraries (Attila Kispitye, Péter Csontos) ... 574

15.1. 14.1 Requirements against program libraries ... 575

15.1.1. 14.1.1 Skills of a good program library developer ... 575

15.1.2. 14.1.2 Basic quality requirements ... 576

15.1.3. 14.1.3 Special requirements for program libraries ... 578

15.1.4. 14.1.4 Conditions for fulfillment of the requirements ... 580

15.2. 14.2 Object-oriented program library design ... 581

15.2.1. 14.2.1 Class hierarchy ... 581

15.2.2. 14.2.2 Size of the classes ... 583

15.2.3. 14.2.3 Size of services ... 586

15.2.4. 14.2.4 Types of classes ... 587

15.3. 14.3 New paradigms ... 590

15.4. 14.4 Standard program libraries ... 590

15.4.1. 14.4.1 Data structures ... 591

15.4.2. 14.4.2 I/O ... 591

15.4.3. 14.4.3 Memory management ... 591

15.5. 14.5 Lifecycle of program libraries ... 592

15.5.1. 14.5.1 Design phase ... 592

15.5.2. 14.5.2 Implementation phase ... 592

15.5.3. 14.5.3 Maintenance phase ... 592

15.6. 14.6 Summary ... 593

16. 15 Elements of functional programming languages (Zoltán Horváth, Gábor Páli, coauthors in Concurrent section: Viktória Zsók, Máté Tejfel) ... 593

16.1. 15.1 Introduction ... 593

16.1.1. 15.1.1 The functional programming style ... 594

16.1.2. 15.1.2 Structure and evaluation of functional programs ... 594

16.1.3. 15.1.3 Features of modern functional languages ... 596

16.1.4. 15.1.4 Brief overview of functional languages ... 599

16.2. 15.2 Simple functional programs ... 600

16.2.1. 15.2.1 Definition of simple functions ... 600

16.2.2. 15.2.2 Guards ... 601

16.2.3. 15.2.3 Pattern matching ... 601

16.3. 15.3 Function types, higher-order functions ... 602

16.3.1. 15.3.1 Simple type constructions ... 603

16.3.2. 15.3.2 Local declarations ... 608

16.3.3. 15.3.3 An interesting example: queens on the chessboard ... 609

16.4. 15.4 Types and classes ... 612

16.4.1. 15.4.1 Polymorphism, type classes ... 612

16.4.2. 15.4.2 Algebraic data types ... 614

16.4.3. 15.4.3 Type synonyms ... 617

16.4.4. 15.4.4 Derived types ... 617

16.4.5. 15.4.5 Type constructor classes ... 618

16.5. 15.5 Modules ... 619

16.5.1. 15.5.1 Abstract algebraic data types ... 619

16.6. 15.6 Uniqueness, monads, side effects ... 622

16.6.1. 15.6.1 Unique variables ... 622

16.6.2. 15.6.2 Monads ... 623

16.6.3. 15.6.3 Mutable variables ... 625

16.7. 15.7 Interactive functional programs ... 626

16.8. 15.8 Error handling ... 628

16.9. 15.9 Dynamic types ... 629

16.10. 15.10 Concurrent, parallel and distributed programs ... 630

16.10.1. 15.10.1 Parallel and distributed programming in Concurrent Clean ... 630

16.10.2. 15.10.2 Distributed, parallel and concurrent programming in Haskell ... 633

16.10.3. 15.10.3 Parallel and distributed language constructs of JoCaml ... 642

16.11. 15.11 Summary ... 644

16.12. 15.12 Exercises ... 644

16.13. 15.13 Useful tips ... 645

16.14. 15.14 Solutions ... 646

17. 16 Logic programming and Prolog (Tibor Ásványi) ... 651

17.1. 16.1 Introduction ... 652

17.2. 16.2 Logic programs ... 652

17.2.1. 16.2.1 Facts ... 653

17.2.2. 16.2.2 Rules ... 654

17.2.3. 16.2.3 Computing the answer ... 656

17.2.4. 16.2.4 Search trees ... 657

17.2.5. 16.2.5 Recursive rules ... 658

17.3. 16.3 Introduction to the Prolog programming language ... 660

17.4. 16.4 The data structures of a logic program ... 661

17.5. 16.5 List handling with recursive logic programs ... 663

17.5.1. 16.5.1 Recursive search ... 665

17.5.2. 16.5.2 Step-by-step approximation of the output ... 665

17.5.3. 16.5.3 Accumulator pairs ... 666

17.5.4. 16.5.4 The method of generalization ... 667

17.6. 16.6 The Prolog machine ... 668

17.6.1. 16.6.1 Executing pure Prolog programs ... 668

17.6.2. 16.6.2 Pattern matching ... 669

17.6.3. 16.6.3 NSTO programs ... 670

17.6.4. 16.6.4 First argument indexing ... 672

17.6.5. 16.6.5 Last call optimization ... 672

17.7. 16.7 Modifying the default control in Prolog ... 673

17.7.1. 16.7.1 Disjunctions ... 673

17.7.2. 16.7.2 Conditional goals and local cuts ... 674

17.7.3. 16.7.3 Negation and meta-goals ... 676

17.7.4. 16.7.4 The ordinary cut ... 677

17.8. 16.8 The meta-logical predicates of Prolog ... 680

17.8.1. 16.8.1 Arithmetic ... 680

17.8.2. 16.8.2 Type and comparison of terms ... 681

17.8.3. 16.8.3 Term manipulation ... 682

17.9. 16.9 Operator symbols in Prolog ... 683

17.10. 16.10 Extra-logical predicates of Prolog ... 685

17.10.1. 16.10.1 Loading Prolog program files ... 686

17.10.2. 16.10.2 Input and output ... 686

17.10.3. 16.10.3 Dynamic predicates ... 687

17.11. 16.11 Collecting solutions of queries ... 689

17.12. 16.12 Exception handling in Prolog ... 690

17.13. 16.13 Prolog modules ... 691

17.13.1. 16.13.1 Flat, predicate-based module system ... 691

17.13.2. 16.13.2 Module prefixing ... 693

17.13.3. 16.13.3 Modules and meta-predicates ... 693

17.14. 16.14 Conclusion ... 694

17.14.1. 16.14.1 Some classical literature ... 695

17.14.2. 16.14.2 Extensions of Prolog ... 695

17.14.3. 16.14.3 Problems with Prolog ... 695

17.14.4. 16.14.4 Fifth generation computers and their programs ... 696

17.14.5. 16.14.5 Newer trends ... 696

17.15. 16.15 Summary ... 697

17.16. 16.16 Exercises ... 697

17.17. 16.17 Useful tips ... 699

17.18. 16.18 Solutions ... 701

18. 17 Aspect-oriented programming (Péter Csontos, Tamás Kozsik, Attila Kispitye) ... 709

18.1. 17.1 Overview of AOP ... 710

18.1.1. 17.1.1 Aspects and components ... 710

18.1.2. 17.1.2 Aspect description languages ... 711

18.1.3. 17.1.3 Aspect weavers ... 711

18.2. 17.2 Introduction to AspectJ ... 711

18.2.1. 17.2.1 Elements and main features of AspectJ ... 712

18.2.2. 17.2.2 A short AspectJ example ... 712

18.2.3. 17.2.3 Development tools and related languages ... 713

18.3. 17.3 Paradigms related to AOP and their implementations ... 713

18.3.1. 17.3.1 Multi-dimensional separation of concerns (MDSC) ... 713

18.3.2. 17.3.2 Adaptive programming (AP) ... 714

18.3.3. 17.3.3 Composition filters (CF) ... 714

18.3.4. 17.3.4 Generative programming (GP) ... 715

18.3.5. 17.3.5 Intentional programming (IP) ... 715

18.3.6. 17.3.6 Further promising initiatives ... 715

18.4. 17.4 Summary ... 716

19. 18 Appendix (Péter Csontos, Attila Kispitye et al.) ... 716

19.1. 18.1 Short descriptions of programming languages ... 716

19.1.1. 18.1.1 Ada ... 717

19.1.2. 18.1.2 ALGOL 60 ... 718

19.1.3. 18.1.3 ALGOL 68 ... 718

19.1.4. 18.1.4 BASIC ... 718

19.1.5. 18.1.5 BETA ... 718

19.1.6. 18.1.6 C ... 718

19.1.7. 18.1.7 C++ ... 719

19.1.8. 18.1.8 C# ... 720

19.1.9. 18.1.9 Clean ... 720

19.1.10. 18.1.10 CLU ... 720

19.1.11. 18.1.11 COBOL ... 720

19.1.12. 18.1.12 Delphi ... 721

19.1.13. 18.1.13 Eiffel ... 721

19.1.14. 18.1.14 FORTRAN ... 721

19.1.15. 18.1.15 Haskell ... 722

19.1.16. 18.1.16 Java ... 722

19.1.17. 18.1.17 LISP ... 723

19.1.18. 18.1.18 Maple ... 723

19.1.19. 18.1.19 Modula-2 ... 723

19.1.20. 18.1.20 Modula-3 ... 723

19.1.21. 18.1.21 Objective-C ... 723

19.1.22. 18.1.22 Pascal ... 724

19.1.23. 18.1.23 Perl ... 724

19.1.24. 18.1.24 PHP ... 724

19.1.25. 18.1.25 PL/I ... 725

19.1.26. 18.1.26 Python ... 725

19.1.27. 18.1.27 Ruby ... 725

19.1.28. 18.1.28 SIMULA 67 ... 725

19.1.29. 18.1.29 Smalltalk ... 726

19.1.30. 18.1.30 SML ... 726

19.1.31. 18.1.31 SQL ... 726

19.1.32. 18.1.32 Tcl ... 726

19.2. 18.2 Codetables ... 727

19.2.1. 18.2.1 The ASCII character table ... 727

19.2.2. 18.2.2 The ISO 8859-1 (Latin-1) printable character table ... 728

19.2.3. 18.2.3 The ISO 8859-2 (Latin-2) printable character table ... 729

19.2.4. 18.2.4 The IBM Codepage 437 ... 729

19.2.5. 18.2.5 The EBCDIC character table ... 730

20. References ... 732

1. Introduction

Many people regard programming languages merely as a notation for describing programs. This view ignores, or is simply unaware of, how much high-level or user-centered languages can help in managing program complexity. Different languages, through the features they offer, suggest different programming approaches. The practice of teaching programming methodology through one particular programming language rather than independently of any language, which is still common in many places, is therefore highly dangerous: it can only lead to a narrow view of the possibilities of programming.

The goal of programming is to produce a software product of good quality, so the teaching of programming must start with the general definition of the task and of the program that solves it [Fóthi, 1983]. Building on this foundation, programmers should then be introduced to the concrete language tools that support the implementation. However, just as it is questionable to teach methodology through one particular programming language, it is equally a dead end to claim, for the sake of the methodology, that the programming language used does not matter. This is, as Bertrand Meyer puts it [Meyer, 2000], like "a bird without wings". An idea is inseparable from the means of formulating it. It is no coincidence that no single language has become dominant in programming, and that new programming languages keep being designed to better support putting different methodological concepts and requirements into practice.

Designers of programming languages must deal with three problems [Horowitz, 1994]:

• The representation provided by the language must fit the hardware and the software at the same time.

• The language must provide a good nomenclature for the description of algorithms.

• The language must serve as a tool to manage program complexity.

1.1. Aspects of software quality

Software is a product and, like every product, it has various quality characteristics and requirements, as many authors have described ([Meyer, 2000], [Horowitz, 1994] and [Liskov and Guttag, 1996]). One of the most important goals of programming methodology is to provide a theoretical approach for creating program products of good quality. Methodological considerations strongly influence both the design of programming languages and the evaluation of existing ones.

Next, we discuss the characteristics of "good" software following the work of Bertrand Meyer [Meyer, 2000]. After that, we examine, across numerous programming languages, how language features support the methodology.

Software quality is influenced by many factors. Some of them, such as reliability, speed, or ease of use, are mainly perceived by the users of the program. Others, such as how easily parts of the program can be reused for a different but similar problem, affect the developers.

1.1.1. Correctness

Correctness of a program product means that the program solves exactly the given problem and conforms to the desired specification. This is the first and most important criterion: if a program does not work as it should, the other requirements hardly matter. Its foundation is a specification that is precise and as complete as possible.

1.1.2. Reliability

A program is called reliable if it is correct and if abnormal circumstances (those not described in the specification) do not lead to catastrophe but are handled in some "reasonable" way.

This definition shows that reliability is by far not as precise a notion as correctness. One could argue that, with a more detailed specification, reliability would coincide with correctness; in practice, however, there are always cases that the specification does not cover explicitly. That is why reliability is so important for the quality of a program product.

1.1.3. Maintainability

Maintainability refers to how easy it is to adjust the program product to specification changes.

Users often demand further development, modification, or adjustment of the program product to new external conditions. According to some surveys, 70% of the cost of a program product is spent on maintenance, so it is understandable that this requirement significantly affects the quality of the program. (This is especially relevant when developing large programs and program systems, since in small programs usually no change is too complex.)

Design simplicity and decentralization (having independent modules) can be seen as the two most important principles for increasing maintainability.

1.1.4. Reusability

Reusability is the property of a software product that it can be reused, in part or as a whole, in new applications.

This differs from maintainability: there, the same specification was modified, whereas here the aim is to exploit the experience that many elements of software systems follow common patterns, so that reimplementing already solved problems can be avoided.

This question is particularly important not only when producing individual program products but also for the global optimization of software development: the more reusable components are available to help solve a problem, the more energy remains (at the same cost) to improve other quality characteristics.

1.1.5. Compatibility

Compatibility shows how easy it is to combine software products with each other. Programs are not developed in isolation, so efficiency can increase by orders of magnitude if existing software can simply be connected to other systems. (Communication between programs is based on standards, as is the case, for example, in Unix.)

1.1.6. Other characteristics

Among the quality characteristics of a program product, portability, efficiency, user friendliness, testability, clarity, etc. also deserve attention.

Portability concerns how easy it is to port the program to another machine, configuration, or operating system, or more generally to run it in different runtime environments.

The efficiency of a program depends on its running time and on the amount of memory it uses: the faster it runs and the less memory it needs, the more efficient it is. (These requirements often conflict: a faster run is often paid for with higher memory requirements, and vice versa.)
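The conflict between running time and memory use can be illustrated with a small sketch. The Python fragment below is only an illustration (the function names `fib_slow` and `fib_fast` are our own choice): both functions compute Fibonacci numbers, but the second one trades memory for speed by storing every intermediate result.

```python
from functools import lru_cache


def fib_slow(n):
    """Recomputes every subproblem: little memory, exponential running time."""
    return n if n < 2 else fib_slow(n - 1) + fib_slow(n - 2)


@lru_cache(maxsize=None)
def fib_fast(n):
    """Stores every result once computed: fast, but the cache grows with n."""
    return n if n < 2 else fib_fast(n - 1) + fib_fast(n - 2)
```

After a call such as `fib_fast(20)`, the cache holds one entry for each argument from 0 to 20, which is the memory paid for the shorter running time.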

User friendliness is very important for the user: it requires that data input be logical and simple and that results be presented in a clearly formatted way.

Testability and clarity are important for the developers and maintainers of the program; without them, the reliability of the program cannot be guaranteed.

1.2. Aspects of software design

Some of these requirements, namely improving correctness and reliability, primarily call for specification tools. The easier it is to verify that a piece of program code really implements its specification, the easier it is to develop correct and reliable programs. Programming language features for describing specifications (type invariants, pre- and postconditions) play the main role here; they are supported, for example, by Eiffel [Meyer, 2000] and by Ada 2012 [Nyéky-Gaizler et al., 1998].
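As a minimal sketch of what such specification features express, the following Python fragment emulates a precondition, a postcondition and a type invariant with plain assertions. The class `BoundedCounter` and its methods are hypothetical names chosen for illustration; Eiffel and Ada 2012 provide dedicated language constructs for these checks rather than assertions scattered through the code.

```python
class BoundedCounter:
    """Hypothetical counter whose value must stay within 0..limit."""

    def __init__(self, limit):
        # Precondition of the constructor.
        assert limit >= 0, "precondition: limit must be non-negative"
        self.limit = limit
        self.value = 0
        self._check_invariant()

    def _check_invariant(self):
        # Type invariant: must hold after every public operation.
        assert 0 <= self.value <= self.limit, "invariant violated"

    def increment(self):
        # Precondition: there must be room for one more step.
        assert self.value < self.limit, "precondition: counter is full"
        old = self.value
        self.value += 1
        # Postcondition: the value grew by exactly one.
        assert self.value == old + 1, "postcondition violated"
        self._check_invariant()
        return self.value
```

A caller who violates the precondition (for example, by incrementing a full counter) is stopped at the point of the call, which makes it easier to decide whether the caller or the implementation is at fault.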

Another group of requirements (mainly maintainability, reusability and compatibility) is best supported by designing programs as independent program units with well-defined interconnections.

This is the basis of so-called modular design. (A module here is not a programming language concept but a unit of the design.) This question is discussed in more detail in Chapter 9.3.

Our goal is to examine the features of different programming languages to support professional programmers in developing reliable software of good quality.

1.3. Study of the tools of programming languages

It is natural to ask why knowing a single programming language is not enough, and what is gained by studying the full range of features of different programming languages. In the following, we summarize the advantages, relying primarily on the work of Robert W. Sebesta [Sebesta, 2013]:

1.3.1. Increase of the expressive power

Our thinking, and even our capacity for abstraction, is strongly influenced by the possibilities of the language we use: we can express only what we have words for. Likewise, when designing a solution and developing a program, knowledge of diverse programming language features helps programmers widen their horizons. This holds even if a particular language must be used, since good principles can be applied in any environment.

1.3.2. Choosing the appropriate programming language

Many programmers have learnt programming through one or two languages. Others know older languages that are now considered obsolete and are not familiar with the features of modern languages. As a result, when several programming languages are available for a new task, they may fail to select the most appropriate one, because they do not know what the other languages could offer. If these programmers knew the distinctive features of the available tools, they could make considerably better decisions.

1.3.3. Better attainment of new tools

New programming languages keep appearing, so quality programming requires continuous learning.

The better one knows the basic elements of programming languages, the easier it is to learn new ones and to keep up with progress.

Most examples of language constructs in our book are given in Ada, C/C++ or Java; there are only a few chapters (apart, of course, from those on logic and functional programming) in which these languages are not referenced in almost every paragraph.

Our book is aimed primarily at helping university and college students learn about programming languages, and at supporting the work of IT and computer specialists. Some background in informatics is a prerequisite for fully understanding the book: readers should already have solved some programming tasks in some programming language.

1.4. Acknowledgements

The authors wish to thank the TÁMOP tender on developing teaching materials for its support.

We also thank Zoltán Horváth, the dean of the Faculty of Informatics at the Eötvös Loránd University, for permitting the use of the infrastructure of the Faculty of Informatics. Without his kind contribution this work could not have been completed.

We would like to thank the PhD students for their generous assistance; their feedback helped us improve this new edition of the book.

In such a voluminous book, despite the best efforts of the authors and the editor, errors may remain. We ask you, dear reader, to notify us of any error you find by email at proglang@inf.elte.hu. We also welcome every kind of constructive criticism.

The current version of the whole book can be found as a downloadable pdf at: http://nyelvek.inf.elte.hu/APL

2. 1 Language Design (Szabina Fodor)

In this chapter, we provide a general overview of the concepts of programming language design (such as syntax, semantics and pragmatics) and discuss the various implementation options (compiler, interpreter, etc.). We then discuss the evolution of programming languages. We identify the features of a good programming language. We also examine the consequences of the dramatic increase in the number of novel programming languages, how that explosion has affected the principles of computer programming and what historical and methodological categories the large number of languages can be grouped into. Finally, we analyze how external factors, such as programming and communication environments, have shaped the development of programming languages.

There are thousands of high-level programming languages, and new ones continue to emerge. However, most programmers only use a handful of those languages during their work. Then why are there so many languages?

There are several possible answers to that question:

• Evolution of programming paradigms. Programming languages and the principles behind them are being continuously improved. The late 1960s and the early 1970s saw the revolution in "structured programming". In the late 1980s the nested block structure of languages such as Pascal began to give way to the object-oriented structure of C++ and Eiffel.

• Different problem domains. Many languages were specifically designed for a special problem domain. For example, LISP works well for manipulating symbolic data and complex data structures. Prolog is suitable for reasoning about logical relationships between data sets. Most of the programming languages can be used successfully for a wide range of tasks, but some of them are better than others in solving specific problems.

• Personal preferences. Different people like different things. Some people like to work with pointers, others prefer the implicit dereferencing of Java, ML, or LISP.

• Expressive power. The expressive power of a language is the spectrum of ideas that can be expressed using the given language. Though this could, in theory, be an important basis for comparison, the majority of languages are all suitable for implementing any algorithm (a feature closely related to Turing completeness). Therefore, the expressive power of the various languages is mostly equivalent.

• Easy to learn. The success of Basic was in part due to its simplicity. Pascal was taught for many years as an introductory language because it was very easy to learn.

• Ease of implementation. Basic became successful not only because it was easy to learn but also because it could easily be implemented on smaller machines with limited resources.

• Standardization. Almost every language in use has an official international standard, or a canonical implementation. Standardization of the language is an effective way of ensuring the portability of the code across different platforms.

• Open source. Many programming languages have open source compilers or interpreters, but some languages are more closely associated with freely distributed, peer-reviewed, community-supported computing than others.

• Excellent compilers. Fortran owes much of its success to extremely good compilers. Some other languages (e.g. Common Lisp) are successful, at least in part, because they have compilers and supporting tools that effectively help programmers.

• Patronage. Technical features are not the only relevant factors, though. Cobol and Ada owe their existence to the U.S. Department of Defense (DoD): Ada contains a wealth of excellent features and ideas, but the sheer complexity of implementation would have killed it without the DoD backing. Similarly, C# probably would not have received the same attention without the backing of Microsoft.

Clearly no single factor determines whether a language is "good" or "bad". Therefore, the study and assessment of programming languages requires a careful look at a number of issues [Scott, 2009].

2.1. 1.1 Programming languages: syntax, semantics, and pragmatics

Programming languages are artificial formalisms designed to express algorithms in a clearly defined and unambiguous form. Despite their artificial nature, they nevertheless fully conform to the criteria of a language.

Programming languages are structured around several descriptional/structural levels [Horowitz, 1994]. Three such levels discussed below are syntax, semantics and pragmatics [Gabbrielli and Martini, 2010].

• Syntax describes the correct grammar of the language, i.e. how to formulate a grammatically correct phrase in the language.

• Semantics defines the meaning of a syntactically correct phrase, i.e. it gives meaning and significance to each phrase of the language.

• Pragmatics determines the usefulness of a meaningful phrase of the language, i.e. it defines how to use the given phrase for a useful purpose within the program.

The three structural levels can be illustrated in the assignment let year = 2013. At the syntax level, the question is whether this formula is grammatically correct (let us assume that it is). At the level of semantics, the question is what this phrase means (in this case, the meaning is that the value of the variable year is set to 2013). At the level of pragmatics, the question is what this assignment is used for (e.g. to calculate, by using another formula, the remaining value of a mortgage at the end of year 2013).

As programming languages are bona fide languages, their structural levels are very similar to those of natural languages. Indeed, a novel written in a natural language can be analogous to a program written in a programming language. At the syntax level, "fishes swim in the ocean" and "suitcases drive pine trees" are both correct. Yet, at the level of semantics, the latter one is wrong due to the lack of an appropriate meaning. At the level of pragmatics, the former sentence would make sense as part of a story on a little mermaid but it would most likely not fit into a technical guide on how to survive for a week in the Saharan desert.
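The difference between the first two levels can be observed mechanically. The sketch below uses Python as the example language (so the assignment is written `year = 2013`, without `let`); the helper names `is_syntactically_valid` and `is_meaningful` are hypothetical. The first helper only parses the text, the second also executes it. Pragmatics, i.e. whether the statement serves a useful purpose, cannot be checked in this way.

```python
import ast


def is_syntactically_valid(source):
    """Return True if the text parses as Python; the code is not run."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def is_meaningful(source):
    """Return True if the statement also executes without an error."""
    try:
        exec(source, {})
        return True
    except Exception:
        return False
```

For instance, `'suitcase' + 1` is accepted by the parser, much like "suitcases drive pine trees" is grammatical English, but it fails when executed because adding a number to a string has no meaning in Python.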

In technical terms, syntax defines how programs are written and read by programmers, and parsed by computers. Semantics determines how programs are composed and understood by programmers, and interpreted by computers. Finally, pragmatics guides programmers in how to design and implement programs in real life [Watt, 2006].

In the following sections, we will discuss each of the above structural levels of programming languages in more detail.

2.1.1. 1.1.1 Syntax

As mentioned above, syntax in principle corresponds to the grammatical rules of the language. Like natural languages, programming languages are also sets of characters (symbols) of a predefined alphabet. At the lowest level, syntax requires definitions of the sequences of characters that constitute the smallest logical units (words or tokens) of the language. Once the alphabet and the words have been defined, syntax describes which sequences of words constitute legitimate phrases, the smallest meaningful units of the language. At a higher syntactic level, strings of those phrases combine into sentences or statements, which are then again combined into program modules or entire programs.

The syntactic rules of a language specify which strings of characters are valid, i.e. grammatically correct. The theoretical basis of syntactic descriptions dates back to the mid-20th century. In the 1950s, the American linguist Noam Chomsky developed techniques to describe syntactic phenomena in natural languages in a formal manner. Though his descriptions originally used formalisms designed to limit the ambiguity present in natural languages, this formalism also applies to the syntax of artificial languages, such as programming languages [Gabbrielli and Martini, 2010]. Shortly after Chomsky's work on language classes and structures, the ACM-GAMM group began designing ALGOL 58, one of the early programming languages. John Backus, a prominent member of this group, introduced a new formal notation for specifying programming language syntax. This notation was then modified by Peter Naur; this revised method of syntax description is now known as the Backus-Naur form or BNF. Though the development of BNF occurred independently from Chomsky's work, it is remarkable that the basic principles of BNF are very similar to those of one of Chomsky's language classes, the so-called context-free languages [Sebesta, 2013].
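To give an impression of the notation, the sketch below states a toy grammar in BNF (in the comment) and checks strings against it. The grammar and all function names are hypothetical and were made up for this illustration; they are not taken from any language definition. Each BNF rule is mirrored by one Python function, a technique usually called recursive descent.

```python
# Toy grammar in BNF:
#   <expr>  ::= <term> | <term> "+" <expr>
#   <term>  ::= <digit> | "(" <expr> ")"
#   <digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

def parse_expr(text, pos):
    """Try to match <expr> at pos; return the position after it, or -1."""
    pos = parse_term(text, pos)
    if pos != -1 and pos < len(text) and text[pos] == "+":
        return parse_expr(text, pos + 1)
    return pos


def parse_term(text, pos):
    """Try to match <term> at pos; return the position after it, or -1."""
    if pos < len(text) and text[pos] in "0123456789":
        return pos + 1
    if pos < len(text) and text[pos] == "(":
        pos = parse_expr(text, pos + 1)
        if pos != -1 and pos < len(text) and text[pos] == ")":
            return pos + 1
    return -1


def belongs_to_language(text):
    """A string is in the language if <expr> consumes all of it."""
    return parse_expr(text, 0) == len(text)
```

With these rules, `1+(2+3)` is a valid string of the language, whereas `1+` and `(1` are not.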

For the different lexical elements of programming languages, see Chapter 2.

2.1.2. 1.1.2 Semantics

While syntax only concerns itself with the appropriate format of the language, semantics is a higher level feature that deals with the meaning and significance of the given phrase [Gabbrielli and Martini, 2010]. The meaning of a phrase can be very diverse, such as a mathematical function, a relationship between program components, or an exchange of information between the different parts of the program and the environment, etc. The semantics of the programming language describes what processes the computer will follow during the execution of the program. The description of the semantics of a programming language is more complex than its syntactic description. This complexity is caused by technical problems in describing abstract features, as well as by the need to balance between the opposing requirements for exactness and flexibility of implementation [Gabbrielli and Martini, 2010]. Indeed, it is relatively easy to design exact semantics if only one route of implementation is expected. However, as soon as the implementation platform changes, additional questions arise which further complicate the semantic definition. It is also relatively difficult to describe semantic issues in computer language. Therefore, most semantic definitions are provided in natural languages and are then implemented/translated to a computer language [Gabbrielli and Martini, 2010].

2.1.3. 1.1.3 Pragmatics

Semantics defines whether a given phrase is meaningful, i.e. whether it can be interpreted and executed, but it does not tell whether the phrase is used for any purpose. The level of pragmatics ensures that the program composed of meaningful phrases makes sense and that it is indeed a useful tool for a given purpose [Gabbrielli and Martini, 2010]. The precise description of the pragmatics of a programming language is difficult, if at all possible. This is in part due to the highly abstract nature of pragmatics. In addition, pragmatics deals with the purpose or use of a syntactically and semantically correct phrase. While both syntax and semantics may be clearly defined and unambiguously understood, the same phrase may be used for a number of different purposes, and its uses may change during the use of the language. Therefore, no single definition of the pragmatics of a given phrase is possible. One component of pragmatics is programming style. While it is relatively easy to clearly describe some programming style issues (such as the avoidance of jumps or gotos), others are more of vague guidance than clear instructions. Undoubtedly, pragmatics is an integral part of the concept of programming languages and it strongly affects the usefulness of a given programming language for a particular purpose.

In this book, we will discuss semantic and pragmatic issues in detail, while little emphasis will be placed on the syntax of the various programming languages. Readers interested in syntactic issues of a given language are referred to the vast literature on the technical details of the different programming languages.

2.2. 1.2 Implementation of computer programs

Besides the above issues of syntax, semantics and pragmatics, the overall performance of a computer program also strongly depends on how the program is implemented on a given run-time environment. This implementation level therefore is added on top of the above three levels. A program written in a given programming language can eventually be implemented using several separate and even conceptually different implementation approaches. Nevertheless, most programming languages are designed for a given implementation strategy and there is little communication between the different strategies in the case of a given language. Though implementation is in most cases beyond the programmer's scope and perspective, the actual implementation may strongly influence the eventual efficiency of the program, and thus the possible ways of implementation may also determine the choice of the most suitable programming language. Here we will outline the different strategies for the automated translation and implementation of programs developed using higher-level programming languages. The most widely used implementation strategies use one of the following methods:

• compiler implementation;

• pure interpretation;

• hybrid implementation systems.

2.2.1. Compiler implementation

In the case of compiler implementation, the program is first translated to machine language to generate a code that can later be executed directly on a computer. The original program code is called the source code while its language is known as the source language. The resulting machine-executable code is the object code and its language is the object language. The translation of the source code to machine language is called compilation, which is completely separated from the execution of the program. This approach has several advantages, mainly in large-scale industrial program development. Given that no re-translation of the source code is required, the execution of the program is very fast. Another advantage is that the final executable program can be distributed without distributing the source code, thus providing protection for the programmer's intellectual property rights.

Disadvantages, on the other hand, include the compilation process itself which is rather slow. The program needs to be re-compiled every time the source code is altered, and there is a limited number of opportunities for checking and correcting the code. However, the widely available professional code writing, compilation and de-bugging tools make the compiler implementation approach a very viable strategy overall.

The process of compilation takes place in several phases, the most important of which are shown in Figure 1 and discussed in the next paragraphs.

Lexical analysis - The aim of lexical analysis is to read the program text and to group the characters (symbols) into meaningful logical units called tokens. The input text of the source program is scanned in a sequential manner, taking a single pass to recognize tokens. No further analysis of whether, for instance, the separators or the number of attributes are correct, is performed at this point.
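A lexical analyzer for a toy language might be sketched as follows. The three token classes and the names `TOKEN_PATTERN` and `tokenize` are assumptions made for this example; the lexer of a real language distinguishes many more kinds of tokens.

```python
import re

# Hypothetical token classes of a toy language.
TOKEN_PATTERN = re.compile(r"""
    (?P<NUMBER>\d+)
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<OP>[=+\-*/()])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(source):
    """Scan the text once, left to right, and return (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":       # white space is dropped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Note that `tokenize("= = year")` succeeds even though no statement can look like this: deciding whether the tokens appear in a legal order is left to the next phase.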

Syntactic analysis - Once the list of tokens has been constructed, the syntactic analyzer (or parser) attempts to construct a derivation tree (or parse tree), a structured composition of the input string (the source code), in line with the grammatical restrictions of the language. At the end of syntactic analysis, each unit (leaf) of the derivation tree has to form a correct phrase in the given language.
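The following sketch shows how a parser can turn a list of tokens into a tree. It assumes a hypothetical toy grammar (given in the docstring of `parse`) and represents tokens as plain strings. For brevity it builds a simplified tree that keeps only operators and numbers, not a full derivation tree with one node per grammar rule.

```python
def parse(tokens):
    """Build a tree for: expr ::= term { "+" term } ; term ::= NUMBER | "(" expr ")"."""
    tree, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def parse_expr(tokens, pos):
    left, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_term(tokens, pos + 1)
        left = ("+", left, right)             # inner node with two subtrees
    return left, pos


def parse_term(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token.isdecimal():
        return ("num", int(token)), pos + 1   # leaf of the tree
    if token == "(":
        tree, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    raise SyntaxError(f"unexpected token {token!r}")
```

A token sequence that violates the grammar, such as `["1", "+"]`, is rejected with a `SyntaxError` instead of producing a tree.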

Semantic analysis - The derivation tree, which is a structured representation of the input string, is subject to checks of the language's various context-based constraints. It is at this stage that declarations, types, number of function parameters, etc., are processed. As these checks are performed, the derivation tree is complemented with the relevant additional information and new structural complexities are generated.
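Two typical context-based constraints, "a variable must be declared before it is used" and "the assigned value must have the declared type", can be sketched as below. The statement format and the function name `check` are hypothetical simplifications; a real compiler performs these checks on the derivation tree rather than on a flat list of statements.

```python
def check(program):
    """Return a list of context-constraint violations in a toy program.

    Each statement is ("declare", name, type_name) or ("assign", name, value).
    """
    symbols = {}    # symbol table: variable name -> declared type name
    errors = []
    for statement in program:
        kind, name = statement[0], statement[1]
        if kind == "declare":
            if name in symbols:
                errors.append(f"{name} declared twice")
            else:
                symbols[name] = statement[2]
        elif kind == "assign":
            actual = type(statement[2]).__name__
            if name not in symbols:
                errors.append(f"{name} used before declaration")
            elif actual != symbols[name]:
                errors.append(f"{name} expects {symbols[name]}, got {actual}")
    return errors
```

The symbol table built during these checks is an example of the additional information with which the tree is complemented.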

Generation of intermediate forms - In this phase, an initial intermediate code is generated from the derivation tree. This intermediate code is not yet in the object language since a substantial amount of code optimization - independent of the object language - has to be performed, and this optimization can best be done without restrictions of the object language.
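One commonly used intermediate form is three-address code, in which every instruction has at most one operator and stores its result in a temporary variable. The text above does not prescribe a particular form, so the sketch below is only one possible choice; the tree format (nested tuples) and the function name `generate` are assumptions made for this example.

```python
import itertools


def generate(node, code, counter):
    """Append three-address instructions for node; return the name holding its value."""
    if node[0] == "num":
        return str(node[1])
    if node[0] == "var":
        return node[1]
    op, left, right = node
    left_name = generate(left, code, counter)
    right_name = generate(right, code, counter)
    target = f"t{next(counter)}"            # fresh temporary variable
    code.append(f"{target} = {left_name} {op} {right_name}")
    return target


# Usage: translate the tree of a + b * 2.
code = []
result = generate(("+", ("var", "a"), ("*", ("var", "b"), ("num", 2))),
                  code, itertools.count(1))
# code is now ["t1 = b * 2", "t2 = a + t1"] and result is "t2"
```

The instructions do not refer to registers or to any particular machine, which is what keeps the later optimization steps independent of the object language.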

Code optimization - The code obtained in the first translation attempt is usually inefficient. Therefore, several steps of optimization need to be performed at this phase. This includes removal of the redundant code, optimization of loop structures, etc. All this optimization precedes the generation of the object code.
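A simple example of such an optimization is constant folding: subexpressions whose operands are all known at translation time are evaluated once by the translator instead of at every run. The sketch below works on a hypothetical tree format (nested tuples with the operators `+` and `*`); the function name `fold_constants` is our own.

```python
def fold_constants(node):
    """Replace subtrees whose operands are all constants by their value."""
    if node[0] in ("num", "var"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "num" and right[0] == "num":
        value = left[1] + right[1] if op == "+" else left[1] * right[1]
        return ("num", value)
    return (op, left, right)
```

For example, the tree of `x + 2 * 3` is rewritten into the tree of `x + 6`, while the addition involving the variable `x` has to remain.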

Generation of object code - Once an optimized intermediate code has been generated, it has to be translated to the object language to obtain the final object code. This will be a machine-readable code that will be directly executed by the computer. An important part of the object code generation is the register assignment.

2.2.2. Pure interpretation

A conceptually different approach from compiler implementation is that the program is interpreted by another program, called an interpreter, every time the program is executed. This interpretation occurs parallel to the execution itself, and thus no separate translation of the entire program to machine code is performed, and no executable machine code is generated. In principle, the interpreter simulates a machine that is capable of dealing with high-level programming languages and statements rather than with low-level machine code only. Since such a machine does not physically exist (it is only simulated by the interpreter), the execution environment generated by the interpreter is often called a virtual machine. The advantage of this approach is that it makes the execution and optimization of the program code relatively easy. In particular, the de-bugging of programs in pure interpretation languages is straightforward since run-time error messages can directly be connected to the units of the original program code. On the other hand, the pure interpreter approach is not quite suitable for large-scale industrial development of highly complex and structured programs due to the time consuming nature of the interpretation of the entire program code at every instance of program execution.

For the functioning of an interpreter, see Figure 2.
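The principle can be sketched with an interpreter for a hypothetical toy language that has only two kinds of statements, `name = operand + operand ...` and `print name`. The function name `run` and the language itself are made up for this illustration. It assumes that operands and `+` signs alternate and performs no further checking.

```python
def run(source):
    """Interpret the program text directly, one line at a time.

    Nothing is translated ahead of time; every run re-reads the source.
    """
    variables = {}
    output = []
    for number, line in enumerate(source.splitlines(), start=1):
        words = line.split()
        if not words:
            continue
        try:
            if words[0] == "print":
                output.append(variables[words[1]])
            elif len(words) >= 3 and words[1] == "=":
                operands = words[2::2]      # every second word, skipping "+"
                variables[words[0]] = sum(
                    int(w) if w.isdecimal() else variables[w] for w in operands
                )
            else:
                raise ValueError("unknown statement")
        except (KeyError, ValueError, IndexError) as error:
            # The error is reported against the original source line.
            raise RuntimeError(f"line {number}: {line.strip()!r}: {error!r}")
    return output
```

Since the interpreter works on the source text itself, a run-time error such as the use of an undefined variable can be reported together with the offending source line.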

2.2.3. Hybrid implementation systems

As described above, compiler implementation allows the fastest execution of the program but its compilation phase is time consuming and de-bugging is more difficult; on the other hand, pure interpretation allows immediate execution (without delay of compilation) and de-bugging is fast and straightforward, but ultimately the execution of the program is slow. Some language implementation systems combine the two approaches so as to exploit the advantages of both the compiler and the interpreter systems. In such cases, the high-level language is translated (partially compiled) to an intermediate level code which is then executed by an interpreter of the intermediate code (the virtual machine). The language of the intermediate code is designed in such a way that it allows very fast interpretation for machine execution. As a result, the source code is translated only once in a faster manner than in the case of compiler implementation, and the resulting intermediate code is executed rapidly by the intermediate code interpreter (the virtual machine). A classical example of such a hybrid system is Java which first translates the Java source code to an intermediate code called byte code, which is then executed by an interpreter approach using the Java Virtual Machine. A similar system is used by Perl, another hybrid implementation system. An additional advantage of such systems is that the intermediate code (e.g. Java byte code) is independent of the execution platform and can be run on virtual machines (e.g. Java Virtual Machines) implemented on any operating system. In addition, the intermediate code is different from the source code, and therefore, it can be distributed without compromising the intellectual property linked to the source code.

The process used in a hybrid implementation system is shown in Figure 3.
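The two stages of a hybrid system can be sketched as follows. The instruction set (`PUSH`, `ADD`, `MUL`), the tree format and both function names are hypothetical and far simpler than, for example, real Java byte code. The first function translates an expression tree once into a flat intermediate code; the second is a tiny virtual machine that interprets this code.

```python
def compile_to_bytecode(node):
    """Translate an expression tree once into a flat list of stack instructions."""
    if node[0] == "num":
        return [("PUSH", node[1])]
    op, left, right = node
    instruction = "ADD" if op == "+" else "MUL"
    return (compile_to_bytecode(left)
            + compile_to_bytecode(right)
            + [(instruction, None)])


def run_bytecode(code):
    """A tiny virtual machine: interpret the instructions using a value stack."""
    stack = []
    for instruction, argument in code:
        if instruction == "PUSH":
            stack.append(argument)
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instruction == "ADD" else left * right)
    return stack.pop()
```

The list returned by `compile_to_bytecode` can be stored and executed any number of times by `run_bytecode`, on any platform for which such a virtual machine exists, without access to the original tree.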

After the discussion of the various features of program design and implementation, we next describe the emergence of the programming languages from a historical and evolutionary perspective.

2.3. 1.3 The evolution of programming languages

2.3.1. 1.3.1 The early years

Thousands of programming languages have been developed over the last 50 years, but only the ones with the best features have received wider recognition. Every language is judged on the basis of its features. Initially,