2.5 Learner corpora: Issues and implications

Another value of DDL lies in the manner in which teachers can establish and maintain a classroom-based research interest themselves. By applying corpora in their syllabus design and class materials development efforts, they are bridging the gap between research and pedagogic activities, a trend welcomed by Dornyei (1997) and Ellis (1995, 1998), among others. One example of such involvement was offered by Tribble (1997), who described an innovative use of a multimedia product whose text component was used as a corpus.

The author proposed that teachers who find it difficult to access large corpora or who do not regard the use of one as relevant can use multimedia encyclopedias as language learning resources. Targeting EFL students beginning to work with academic writing, the syllabus incorporated the multimedia product Encarta, a set of hypertexts, movies and graphics containing such diverse text types as, for example, articles by experts in the fields of physical science, geography, history, social science, language and performing arts.

Tribble claimed that using this resource not only caters for diverse student interests in the writing course but can also lead students to recognize different text organization and lexical preferences in descriptive and discursive essays, process descriptions, physical descriptions and biographies.

error analysis, and interlanguage (Selinker, 1992), the ICLE-based project enables researchers and educators to directly analyze and compare the written output of students from such countries as France, Germany, the Netherlands, Spain, Sweden, Finland, Poland, the Czech Republic, Bulgaria, Russia, Italy, Israel, Japan and China.

The developers of the ICLE, itself part of the ICE project, trace the origins of interest in the analysis of learner language to early error analysis studies in SLA. Granger (1998a) pointed out that although the investigations and theoretical explanations of learner errors were grounded in data observation, the corpora for those studies did not take full account of the variables that affected the samples. For example, the number of students, their learning experience and the often non-comparable test elicitation techniques raised doubts about the reliability of some of those observations. By contrast, the ICLE project has worked out a system of sampling scripts that allows for more reliable studies in the description phase as well as in contrasting individual subcorpora with each other and a subcorpus with an L1 corpus.

With each script, detailed information is recorded in the contributor's profile. This not only ensures that the data comes from a valid source, but also allows for specific analyses of types of language use in clearly defined subcorpora. The descriptors include, according to Granger (1996, p. 16):

> biographical information: nationality, age and gender

> English learning experience: years of formal English studies and stay in an English-speaking country

> other learning experience: knowledge of other languages

> task- and text-related details: conditions of writing the script (test or non-test, timed or untimed, and use of reference tools).
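The descriptors above could be stored with each script as a structured profile, which is what makes it straightforward to select clearly defined subcorpora. The following is a minimal sketch in Python, not the ICLE project's actual format: the classes `LearnerProfile` and `Script`, their field names, the helper `select_subcorpus`, and the sample data are all hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LearnerProfile:
    """Hypothetical record of the descriptor groups listed above."""
    # biographical information
    nationality: str
    age: int
    gender: str
    # English learning experience
    years_of_english: int
    months_in_english_speaking_country: int = 0
    # other learning experience
    other_languages: List[str] = field(default_factory=list)
    # task- and text-related details
    test_conditions: bool = False
    timed: bool = False
    reference_tools_used: bool = False


@dataclass
class Script:
    """A learner text paired with its contributor profile."""
    text: str
    profile: LearnerProfile


def select_subcorpus(
    scripts: List[Script], keep: Callable[[LearnerProfile], bool]
) -> List[Script]:
    """Return the scripts whose profile satisfies the given condition."""
    return [script for script in scripts if keep(script.profile)]


# Invented sample data, for illustration only.
scripts = [
    Script("Essay text A", LearnerProfile("Polish", 21, "F", 8, timed=True)),
    Script("Essay text B", LearnerProfile("Finnish", 23, "M", 10)),
    Script("Essay text C", LearnerProfile("Polish", 22, "M", 9)),
]

# Untimed essays written by Polish learners.
polish_untimed = select_subcorpus(
    scripts, lambda p: p.nationality == "Polish" and not p.timed
)
print(len(polish_untimed))
```

Because every script carries the same set of descriptors, any combination of variables (for instance, nationality and timing conditions) can define a subcorpus for comparison.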

2.5.2 The composition of the ICLE

The target word count of the ICLE is two million words. The scripts are primarily argumentative essays, covering a variety of topics, with a smaller set of scripts made up of literature examination essays (see a list of the essay titles recommended for national contributors in Appendix A). As the aim is to collect and analyze authentic learner scripts, the designers pointed out in their call for submissions that essays should be "entirely the student's own" and that "no help should be sought from third parties." This specification, however, raises two problems: one theoretical, the other pedagogical.

First, a number of the assignments do not appear to involve much of the students' own deliberation: they present an argument that students need to support, whatever their own positions. The claim that such a text is a student's "own" is therefore dubious. Even if students have the chance of choosing a title or a theme, they cannot "entirely own" their writing as they play a limited role in deciding on the focus of their essays. For this reason, the title "Europe" may be regarded as the most authentic of the suggested ones: it does define a clear enough focus, allowing students to develop an argument which is truly their own, yet it is specific enough for lexical or rhetorical analysis when the text becomes part of the corpus.

As for the pedagogical implications of the preferred mode of submitting a student's "own" essay with "no help...sought from third parties," the authenticity of the task may be lessened. With so much written production viewed and undertaken as a collaborative effort in the L1 field, it is somewhat surprising that no peer or teacher involvement is allowed. The specification also raises the issue of audience: the themes appear to favor the production of writer-based prose; yet the task is defined as an argumentative one where awareness of the position of the audience is crucial. Furthermore, why deny the opportunity of consulting a reader before the script is finalized if one were to follow, even for such a basically product-oriented enterprise as corpus development, a process syllabus? Considering the role that editors, colleagues and publishers play in the finalization of the written work of L1 authors (represented in L1 corpora), it stands to reason that such a restriction in the development of L2 corpora may bias the comparative analyses.

These constraints notwithstanding, the ICLE has ushered in a period of interest in more specific analyses of learner language. Each of the national subcorpora will be about 200,000 words, which allows for grammatical and lexical investigations but is too small for research into words and phrases of lower frequencies (Granger, 1996, p. 16). However, the project has been instrumental in helping an international team of researchers and teachers to join forces in the field (Ringbom, 1998; Lorenz, 1998; Virtanen, 1998; Petch-Tyson, 1998; Kaszubski, 1998, among others), and in leading the way to new inquiries, such as the development of more specialized ESL and ESP corpora. Another area where the ICLE has motivated research is the advanced spoken learner corpus and the intermediate corpus, both under development. Work on L2 corpora is gaining recognition, and the practical implications of these efforts may be seen shortly in the new reference and teaching materials that take account of L2 learners' language use (Gillard & Gadsby, 1998; Granger, 1998a, 1998b; Granger & Tribble, 1998; Kaszubski, 1997, 1998).
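The size limitation of a 200,000-word subcorpus can be made concrete with simple arithmetic: an item that occurs f times per million words is expected to occur about f × 0.2 times in a collection of that size. The sketch below illustrates this under stated assumptions; the function name and the per-million frequencies are hypothetical values chosen for illustration, not figures taken from the ICLE or any other corpus.

```python
def expected_occurrences(per_million: float, corpus_size: int) -> float:
    """Expected raw count of an item in a corpus of the given size (in words)."""
    return per_million * corpus_size / 1_000_000


SUBCORPUS_SIZE = 200_000  # approximate size of one national subcorpus

# Invented relative frequencies (occurrences per million words).
illustrative_items = [
    ("frequent item", 500.0),
    ("mid-frequency item", 50.0),
    ("rare item", 2.0),
]

for label, per_million in illustrative_items:
    count = expected_occurrences(per_million, SUBCORPUS_SIZE)
    print(f"{label}: about {count:.1f} expected occurrences")
```

On these assumed figures, the rare item would be expected less than once in a subcorpus, which illustrates why research into lower-frequency words and phrases calls for larger collections.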

2.5.3 Other written learner corpora

Besides the large-scale work of the ICLE and the LonLC, there are several other projects that have attempted to capture what is significant in learner texts. Of these endeavors, Tono's (1999) and Mark's (1997a, 1997b, 1997c, 1998) work merits recognition. Both are individual teachers' initiatives, but their aims and applications are slightly different. The Tokyo Gakugei University Learner Corpus consists of 700,000 words written by lower-grade and upper-grade Japanese students of EFL. One of the largest such collections in Japan, it has been used primarily for interlanguage error studies (Tono, 1999). The Meiji University Learner Corpus is smaller, made up of 220,000 words (Mark, 1997a, p. 93). Mark's interest focused on exploiting the data in syllabus design, helping students in examinations, and materials development. This last objective was conceived as especially important because textbooks available for advanced Japanese students of EFL did not seem to reflect the needs arising from the status of their interlanguage (Mark, 1997a).

In this chapter, I have presented the case for employing corpora for language description and education. Describing corpus linguistics as an empirical study of naturally occurring language use in context, I have evaluated the theoretical contrast between generative linguistics and text-based language analysis. I have reviewed the development of various types of L1 and L2 corpora and recent work done in the field internationally. The scope of application has widened, with corpora set to affect the way language tests are validated (see, for example, Alderson, 1997; and Horvath, 1998c). In addition, teacher education and materials development can benefit from corpus linguistic techniques (Bocz & Horvath, 1996; Hughes, 1997; Minugh, 1997; Renouf, 1997; Wilson, 1997).

Interest in applying corpora in linguistic analysis and materials development is on the rise in Hungary, too. Studies that are partly or entirely based on such corpora as the Bank of English represent a new trend in current Hungarian linguistics. Among these, Andor (1998), for example, used a sample from this corpus, together with data elicited from forty native speakers of English, in the study of the mental representation and contextual basis of ellipsis and suggested that a combined use of psycholinguistic and corpus linguistic research methods would enable linguists to arrive at more valid and reliable conclusions. Csapo (1997) studied the viability of the convergence of pedagogical grammars and learner dictionaries, Hollosy (1996, 1998) reported on work to develop a corpus-based dictionary of academic English, whereas Szirmai (2000) investigated translation equivalence by using corpus linguistic methods.

The framework of DDL and the increasing interest in analyzing learner English on the basis of learner corpora will be applied in the following chapters: the next describing and analyzing writing pedagogy at the English Department of Janus Pannonius University, and the fourth giving an account and analysis of the JPU Corpus. The study of learner scripts contributes to the authenticity of writing pedagogy: those who collect, describe, and analyze L2 texts can test, in a valid and reliable way, hypotheses about the effectiveness of writing pedagogy. Also, such collections can serve as a basis for an innovative type of learning material that can be applied directly in the writing classroom.