

In document TRANSLATION STUDIES (pages 47-56)

János Kohn

4. Transcending “borders”: the rise of corpus linguistics

Having registered such opposed attitudes as these, we must underline that modern Translation Theory cannot avoid transcending “borders”, whether linguistic or literary: there has to be a zone of convergence where a scientific (i.e. linguistic) methodology can be a real support not only for the evaluation of translation results, but also, and particularly, for an exact and exhaustive description of the translated literature. This point of convergence has been reached with the rise of corpus linguistics.

The turning point in the history of TS will come as a direct consequence of access to large corpora of both original and translated texts, and of the development of specific methods and tools for interrogating such corpora in ways which are appropriate to the needs of translation scholars. Large corpora will provide theorists of translation with a unique opportunity to observe the object of their study and to explore what it is that makes it different from other objects of study, such as language in general or indeed any other kind of cultural interaction. It will also allow us to explore, on a larger scale than was ever possible before, the principles that govern translational behaviour and the constraints under which it operates (Baker 1993: 235).

The first machine-readable corpus, compiled by Nelson Francis and Henry Kucera at Brown University in 1961 (one million words of printed American English), and its “sister corpus” of British English (the Lancaster-Oslo-Bergen corpus, compiled during the late seventies) represented “salient events” in the history of corpus linguistics (Clear 1993: 164) and proved very valuable resources for many linguists. The COBUILD project (set up at Birmingham University, with funding from Collins publishers) was and continues to be a crucial enterprise which led to an impressive expansion of corpus-based computational linguistic research (for example corpus-based lexicography; Sinclair 1987; Clear 1993). Interest in this new field grew dramatically towards the end of the eighties: numerous corpus initiatives (especially in the European Community, the United States, and Japan) attest to the value and usefulness of large corpora of authentic running text.⁸ The British National Corpus (BNC) consortium was formed during 1989. The proposal was to collect a corpus of at least 100 million words, to add linguistic annotations (word-class tags to the whole corpus and skeleton parse trees to a much smaller “core” subset) and to make the results available for a wide range of industrial and academic research tasks (Clear 1993: 167). From its difficult beginnings (“...Chomsky had, effectively put to flight the corpus linguistics of the early generation.”⁹ Leech 1991: 8) corpus linguistics has, step by step, expanded its scope and influence, and developed into an acknowledged research framework, a paradigm with specific methods and purposes. Offering, in particular, opportunities for quantitative studies (word frequencies, probabilistic grammar, the “ecology” of linguistic elements, collocation etc.) and making syntactic and lexical phenomena accessible to empirical investigation, it represents a unique source of insight into the nature of language. The existence of large bodies of running text enables the testing of linguistic theories and hypotheses, providing a fascinating new perspective on language analysis:

For the linguist, one exciting prospect arising from computational analysis of very large corpora is the potential to see entirely new patterns and features of language by means of statistical methods. The theoretical models with which linguists have worked throughout this century seem to be challenged by the massive data that can be sifted and sorted on a machine (Clear 1993: 185).

It is now gradually being recognized that linguistic knowledge and models gained and proved by corpus techniques are more reliable than those furnished by “blackboard” linguistics. This paradigm shift has occurred not only because of the novelty of the discipline; it also “reflects a move away from conceptual and formal representations of language, which have not proved very helpful in the past, to addressing natural language in all its variety and infiniteness” (Baker 1994: 1). The development of corpus-based research is thus a natural consequence of the need for a more accurate methodology in most humanistic areas, and therefore in various aspects of interlingual communication as well.

It was Firth who in the fifties addressed the problem theoretically, stressing the relevance of an approach which links language structures and systems to structures in their situational context and which is “more easily related to problems of translation” (1956). This shift of perspective from meaning to usage, from a conceptual to a situational orientation, gave a decisive impetus to corpus-based studies. Several tendencies and developments prepared the ground for corpus work in Translation Studies: the attempt to define the notion of equivalence more exactly created a need to explore not only the source text but also the target language text type, in order to determine such categories as stylistic and/or functional equivalence.

The “semantic view” of the relatedness between ST and TT had been declining in popularity and the notion of equivalence was beginning to be reassessed. In the seventies Even-Zohar began to develop a theory of literature as a dynamic and orderly structured polysystem which is different in different cultures, translated literature having an important (even central) position in this system. This view of literature as a conglomerate of systems “has gradually undermined the status of the source text in translation studies” (Baker 1993: 236).

Although the usefulness of corpora in Translation Studies is undisputed, and their applicability has been argued abundantly and well in theory, their practical exploitation for a more revealing insight into the specific structure of the languages involved, and for improving translational results, is still at a preliminary stage. Exceptions to this are the domains of a) improving the performance of machine translation systems and b) corpus-based terminology compilation, where the use of corpora has become so widely accepted as to represent almost a mainstream in itself (Baker 1994: 1). One of the first advocates was Hans Lindquist (1984), who argued for the use of corpus-based studies in the preparation of materials for training translators. J. Sinclair has developed a pertinent research methodology applicable to the study of interlingual communication:

The new corpus resources are expected to have a profound effect on the translations of the future. Attempts at machine translation have consistently demonstrated to linguists that they do not know enough about the languages concerned to effect an acceptable translation. In principle, the corpora can provide the information (Sinclair 1992: 395).

Mona Baker (1993; 1994) not only furnishes evidence for the powerful stimulus which theoretical research into the essence of translation phenomena will receive from the impact of corpus techniques, but also provides a systematic overview of corpus typology and of the relevant software already used on a wide scale, outlining the obvious applications and future perspectives of corpora in Translation Studies. Corpora used for translation purposes do not necessarily contrast two or more languages. In addition to parallel corpora (“bilingual corpora”, Leech 1991), which consist of “source language texts in language A and their translated version in language B” (for example the Hansard Corpus¹⁰), and multilingual corpora, “two or more monolingual corpora in different languages, built up on the basis of similar design criteria” (the Council of Europe Multilingual Lexicography Project), Mona Baker stresses the benefit of using comparable corpora, which consist of “two separate collections of texts in the same language: a corpus of original texts in the language in question and another of translations into that language” (Baker 1994: 9). The most important contribution of this last-mentioned type (which has not yet been set up anywhere) to translation research is to identify structures which are characteristic of translated texts, regardless of the source and target languages involved. Comparable corpora would enable not only the comparison of two languages (SL vs. TL) or two texts, but also that of translation (where a coherent model already exists in language A) with free text production in the same language. Parallel corpora have the essential function of discovering translation norms in specific contexts, and of revealing both the solutions and the methods by which professional translators overcome the numerous problems they face in their work, while offering realistic models.
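
How a parallel corpus exposes translators' solutions can be illustrated with a very small sketch. It assumes that the corpus has already been aligned sentence by sentence, and it is not a description of any particular concordance package: the function name `parallel_concordance` and the German-English sentence pairs are invented purely for illustration.

```python
import re

# Invented toy data: (source sentence, translated sentence) pairs.
# A real parallel corpus would be far larger and aligned beforehand.
TOY_CORPUS = [
    ("Der Zug kommt um acht Uhr.", "The train arrives at eight o'clock."),
    ("Sie kommt morgen nach Hause.", "She is coming home tomorrow."),
    ("Er sagt kein Wort.", "He does not say a word."),
]


def parallel_concordance(aligned_pairs, keyword):
    """Return every aligned pair whose source side contains `keyword`
    as a whole word (case-insensitive)."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in aligned_pairs if pattern.search(src)]


if __name__ == "__main__":
    # Each hit shows the source sentence next to its translation, so the
    # different renderings of the same source word can be compared.
    for src, tgt in parallel_concordance(TOY_CORPUS, "kommt"):
        print(f"{src}  |  {tgt}")
```

Reading the hits side by side shows, even in this toy case, that one source verb is rendered in more than one way; on a real corpus the same listing would be the raw material for describing regular translation solutions.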

The concept of norm has played an important role in the development of Translation Studies since the 1970s. In Toury’s model, as mentioned earlier, norms represent an intermediate level between competence (i.e. an inventory of all options available in a given context) and performance (the options actually selected by translators from this inventory). Thus, norms are the options regularly taken by translators in a given socio-cultural situation¹¹, a concept very similar to that of typicality, a notion from corpus linguistics. The real nature of norm/typicality can be described reliably only on the basis of objective data provided by large bodies of running text. Corpus techniques are appropriate for revealing the “operational norms”, i.e. norms “which affect the matrix of the text (matricial norms), that is, the modes of distributing linguistic material (the occurrence of omissions, additions, substitutions and transpositions) in translated texts, and the actual verbal formulation of the text (textual norms)” (Toury 1978: 87). Toury lays stress on the purely descriptive character of translation norms (which describe and explain, but do not incorporate any evaluative elements), an attitude which, though widely accepted, has recently been supplemented by a fertile complementary viewpoint (a conception also supported by the powerful impetus of corpus-based studies).

A. Chesterman (1993: 2) considers the one-sided emphasis of the descriptive approach (which loses sight of the motivation for studying translation behaviour) a heritage of the long tradition of confusion in Translation Studies “between descriptive and prescriptive aims”. It would be necessary to know (“...not instead, but in addition to these general descriptive laws”) what principles govern a good translation: “Translation theory, if it is to take the form of a theory of translation behaviour, must include both a descriptive and an evaluative element” (ibidem: 4). In other words, instead of a “one-legged” theory it has to be a normative science in a dual sense. Postulating a subset of translated texts which embody a model, a practical example of a desired quality¹², Chesterman proposes two kinds of norm: professional norms (constituted by competent professional behaviour) and expectancy norms (established by the expectations of the target readership). The translator proceeds heuristically by choosing the best solution s/he can find at the time, i.e. adopts a strategy. A strategy regularly used by competent translators will “take the status of a normative law”: it will describe a translation behaviour which is recognised as being maximally compatible with professional and expectancy norms. From an ontological viewpoint normative laws have a prescriptive force “for members of a given translating community” (ibid.: 14). In the training of translators these could be acquired by the observation of model behaviour (“...an important property of norms is that they have to be learned”, Bartsch 1987: 178).

If we accept with Mona Baker that one of the major contributions of parallel corpora is to provide models of highly professional translation behaviour (“...how professional translators overcome difficulties of translation in practice...”) and of a desired quality (“...realistic models for trainee translators”), then we must realize at the same time the remarkable potential of corpus-based research for outlining and constituting a complex translation theory. Translation scholars and corpus linguists will have to find ways of adapting the corpora to these specific needs: for example, the criterion of typicality has to be refined by taking into consideration the range of translators represented in the corpus. In setting up the syntactic and semantic expectancy norms for a given target language, an important part might be played by the multilingual corpora, which enable the study of linguistic structures and features in their home environment, highlighting the natural patterns of a given idiom.

A corpus-assisted complex translation theory “catering” for both description and evaluation will be able to diminish the traditional differences between a linguistic and a literary approach, integrating the two (not opposed, but in this sense complementary) attitudes.¹³

The study of parallel concordancing (parallel corpora processed by MicroConcord or another concordance package) in the language learning or translator training classroom can reveal the specific structures of the languages involved and some of the most striking translation regularities. Thus, for example, the participants of a workshop on parallel concordancing organized by the RECALL (Eastern European Computer Assisted Language Learning) Centre in Szombathely in conjunction with the German Department of the Berzsenyi College in May 1994, analysing the original (German) and translated (Hungarian) versions of the short story Hundeblume (The Dandelion) by W. Borchert, identified a number of characteristic patterns by which German items or structures are regularly rendered into Hungarian (Burgess 1995). German compounds cannot always be reproduced by a similar Hungarian word structure, in which case the only solution is a paraphrase, which of course does not have the same stylistic force. Two words in the German text, which are repeated so obsessively that they acquire symbolic value in the original, do not have corresponding equivalents in the Hungarian version, or in English for that matter: ‘Vordermann’ (‘man in front’, 14 occurrences) and ‘Hintermann’ (‘man behind’, 12 occurrences). The possibility of contrastive interpretation offered by parallel concordancing allowed the participants to discover how such words were regularly translated. In most cases, the translator paraphrased the term with a present participle construction: ‘mögöttem jövő (ember)’, ‘előttem menő (ember)’. The frequent use of the present participle in the translated Hungarian text (probably more frequent than in original Hungarian texts) suggests that this form may be a candidate for a typical feature (norm) of the Hungarian subsystem of translated texts. This unusual (recurrent) distribution “has a subliminal effect¹⁴ on the way we respond to translated text and contributes to its identification as a translation”. We need of course reliable data to support this hypothesis, and this may be facilitated by access (in addition to the parallel corpora) to comparable corpora. They might enable us to distinguish the characteristic structures of translated texts (which differ from those of both the source and the target language, and are not a result of interference between the two codes) and thus give us an insight into the nature of translation generally as a mediated communicative event; in other words, to discover the “universal features of translation”. Among these universals (which have to be confirmed and validated by wide-scale corpus research) the following may be considered:¹⁵

- A marked rise in the level of explicitness compared to the source text and to original texts. Baker assumes that lexical density (the percentage of lexical as opposed to grammatical items) contributes to information load. If corpus techniques could demonstrate that a translated text has a lower lexical density than an original in the same language, this feature might be interpreted as a subconscious tendency of the translator to make the text more accessible to the new target public.

- A tendency towards disambiguation and simplification. Corpus techniques may support this presupposition, for example, by exploring the type/token ratio in translated texts, compared with that of a body of original running text. A low type/token ratio may be explained as a result of lexical simplification (for instance a limited number of synonyms), which is specific to mediated communicative acts. As a counter-argument we may mention that in some Hungarian translations from German, frequent verbs like sagen, machen, kommen have a richer and more diverse “synonymic chain” than in the source language. This circumstance is probably due to the specific traditions of the target culture and to the expectations of the target readership (Klaudy 1995: 167).
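
The two measures mentioned in this list can be computed in a few lines, which may help to make them concrete. The sketch below is only an approximation under stated assumptions: the tokenizer simply extracts runs of letters, and `FUNCTION_WORDS` is a deliberately tiny, hypothetical list of English grammatical words, whereas a real study would rely on a part-of-speech tagger or a full stop list. Lexical density is taken as the percentage of tokens that are not grammatical words, and the type/token ratio as the number of distinct word forms divided by the number of running words, so that a smaller ratio points to a less varied vocabulary.

```python
import re

# Hypothetical, deliberately short list of English grammatical words.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "is", "are", "was", "were", "it", "he", "she", "they", "that", "this",
}


def tokenize(text):
    """Lower-case the text and return its runs of letters as tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def lexical_density(tokens):
    """Percentage of tokens that are lexical (not in FUNCTION_WORDS)."""
    if not tokens:
        raise ValueError("cannot compute lexical density of an empty text")
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return 100.0 * len(lexical) / len(tokens)


def type_token_ratio(tokens):
    """Number of distinct word forms divided by number of running words."""
    if not tokens:
        raise ValueError("cannot compute type/token ratio of an empty text")
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    sample = tokenize("The cat sat on the mat.")
    print(lexical_density(sample), type_token_ratio(sample))
```

Because the type/token ratio falls as a text grows longer, a comparison between translated and original texts is only meaningful on samples of equal size.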

The new corpus resources have already had a profound effect on the development of the various areas of linguistic research, particularly through the wide and firm empirical basis they provide for verifying hypotheses and for formulating theories. Questions that have long been around but have not been answered, or answered only imperfectly (if not deliberately rejected during the hegemony of transformational grammar), are finding reliable answers. For Translation Studies this new linguistic discipline represents the basis on which a complex and coherent theory could be elaborated, integrating the different approaches (descriptive and evaluative, literary and linguistic etc., considered not opposed but complementary): a theory accepted equally by translation scholars and (which would be a first) by the community of professional translators.

Notes

1. Although the beginnings of interlingual translation lie in time immemorial, practice-based, self-conscious reasoning on translation, on its principles and rules, with decisive reverberations in the thinking of the modern age, originated in the last century of the Roman republic, which produced vigorous literary and translation activity. Cf. Rolf Kloepfer: Die Theorie der literarischen Übersetzung, Tübingen: Fink Verlag, 1967, p. 22-24; Fr. Blatt: Remarques sur l’histoire des traductions latines. In: Classica et Mediaevalia, 1-2, 1936-39; Ingrid Kurz: Dolmetschen im alten Rom. In: Babel 4 1986/vol. XXXII, p. 215-221: “... Auch über die Regeln, die beim Übersetzen bzw. Dolmetschen zu befolgen sind, haben sich römische Autoren Gedanken gemacht. So schrieb Cicero, daß nur ein ungewandter interpres Wort für Wort übersetzt, und Horaz schlägt in die gleiche Kerbe: ...nec converti ut interpres; non verbum pro verbo; genus omne verborum vimque servavi...” p. 216; For further details see also János Kohn (1983): “...the first cultural period showing an openness towards a culture which was considered to be superior - that of ancient Greece - was the Golden Era of Roman literature. The impressive number of translations produced at Rome has had as a natural consequence the appearance of the first theoretical preoccupations” (32).

2. Nida claims that his theory of translation was already well developed before Chomsky. In an article called “A Framework for the Analysis and Evaluation of Theories of Translation”, he argues: Before the formulation of generative-transformational grammar by Chomsky Nida had already adopted an essentially deep-structure approach to certain problems of exegesis. (...) he advocated the back-transformations of complex surface structures onto an underlying level, in which the fundamental elements are objects, events, abstracts, and relations (1976: 71).

3. The existence of deep-seated formal universals ... implies that all languages are cut to the same pattern, but does not imply that there is any point by point correspondence between particular languages. It does not, for example, imply that there must be some reasonable procedure for translating between languages (Chomsky 1965: 30).

4. In spite of the widespread and justified view that many of the theories contesting the possibilities of translation are rooted in Humboldt’s “sprachliche Weltansicht” (linguistic view of the world; the famous Whorf/Sapir hypothesis, for example), Wilss emphasises Humboldt’s optimistic attitude, namely “that the natural predisposition to language is a universal one and that all (languages) must hold within them the key to understanding all languages” (Kloepfer 1967: 55). That means that everything can be expressed in any language.

5. Gy. Radó was a leading expert not only in Hungary: he was vice-president of FIT and for many years editor-in-chief of its journal Babel. He received the FIT-Nathorst-Prize for Literary Translation in 1987. Radó György, “one of the great figures of international translation” as G. Kingscott calls him, died on 2 July 1994. I was privileged to be one of his
