Saturday, December 25, 2010

Investigation of Using Text-Critiquing Programs in a Process-Oriented Writing Class


Hsien-Chin Liou
National Tsing Hua University, Taiwan, Republic of China
Abstract:
While the cost vs. gain of text-analysis CALL programs has been discussed (e.g., Brock, 1990) and major drawbacks of commercial packages such as Grammatik for specific learner groups were pointed out (Liou, 1991), few research studies on integrating available text-analysis programs into realistic writing classroom activities have been formally conducted so far. This research report addresses the issue by documenting the college EFL writers' revision process in which human teachers' feedback, students' own revision, and the use of two commercial packages, Grammatik and Complete Writer's Toolkit, were incorporated. A small-scale quasi-experimental study was conducted to examine the effect of the packages while assessing the writing performance of 39 subjects placed in either the control group or the experimental group. In addition, an interview was conducted to elicit the subjects' response to this CALL strategy. A comparison was made regarding (a) the performance of the two-group subjects, (b) the effects of the programs vs. subjects' peer comments (peer editing), and (c) the differences between the effects of critiquing of Grammatik and Complete Writer's Toolkit. Other factors which may affect the gain while using the program were qualitatively documented. Results showed that each of the two packages has a role to play for students of various proficiency levels and that weak subjects benefited more from Writer's Toolkit and liked such programs better. No group difference in writing quality was found regarding students who used or did not use the programs. Attitudes toward the use of such programs tend to be positive. It seems that text-analysis programs may be beneficial to learners in writing revision given careful classroom design and individual attention to learners' writing proficiency. Several pedagogical implications are raised.
INTRODUCTION
Focusing on accuracy of form while not sacrificing expressiveness of composition, this paper addresses the issue of writing revision from the perspectives of (a) kinds of help which computer-assisted language learning (CALL) may provide and (b) pragmatic concerns which exist in the Taiwanese context.
Writing revision tasks can be divided into an examination of five aspects of an essay: contents, organization, language use, mechanics, and vocabulary (Jacobs, Zinkgraf, Wormuth, Hartfiel, and Hughey, 1981, provided an evaluation profile with these five categories). It is argued in this paper that human teachers and CALL should cooperate to make writing revision cost-effective. In this cooperation, CALL may serve well in assisting grammar, spelling, and punctuation editing/revision tasks. These tasks mainly deal with accurate form in writing because computers, as machines, will function best in mechanical aspects in the foreseeable future of technological development. Admittedly, correct form alone, or accuracy of writing, does not constitute a good paper. Further, a shift in focus toward fluency and appropriateness occurred when communicative language teaching started to influence the teaching of English in Taiwan several years ago. However, communicative approaches do not neglect accuracy at all but try to add another dimension of consideration by counterbalancing the predominant single focus on accuracy of form alone as in the past. In addition, recurrent errors in a piece of student writing demonstrate the learner's immature English proficiency. Thus, the pursuit of correctness of form in EFL writing can be justified, though research is required in terms of how the gravity of grammatical mistakes influences the total evaluation of a paper.
In Taiwan, the English-as-a-foreign-language (EFL) writing class has been regarded as one of the most formidable courses to teach due mainly to the task of correcting student papers. It occupies much of the writing teachers' evaluation time, among other revision or commenting tasks. Most of the students tend to make grammatical mistakes repeatedly regardless of the teacher's efforts in correcting the same errors several times. In addition, correcting composition mistakes has not only become the writing teachers' headache in formal schools but has also created an urgent demand for semi-government-sponsored institutes. Researchers, professors, and graduate students in the fields of science and technology, with a great need to frequently publish scholarly papers, resort to writing revisors in some sort of writing center. National Tsing Hua University is a good example of this phenomenon. The University is famous in Taiwan for its science and engineering programs. Most of its graduates are required to finish their theses in English. However, this is never an easy task for either the graduate students or the advisors, the latter suffering from correcting errors in student papers. In view of this, in 1988 the Dean of Academic Affairs in our university launched a project which was aimed at promoting English proficiency of the students. A writing center was set up where graduate assistants are responsible for correcting the research papers of graduate students in the fields of science and engineering. One of the other ideas was to use computer programs to correct the grammatical errors in students' papers. It would save a great deal of time if computer programs could help with some revision tasks. During the period of 1989 to 1992, we purchased four text-critiquing packages: Right Writer (Rightsoft, 1988), Grammatik IV (Price, 1989), Complete Writer's Toolkit (System Compatibility Co., 1990), and Power Edit (Artificial Linguistics, Inc., 1991).
Since the version of Right Writer we obtained did not allow an interactive mode of processing and could not satisfy our needs in terms of error detection, it was given up for widespread use. Power Edit, rated as the first (Rabinovitz, 1991), is hard to learn to use in a short period of time due to its complicated functions. Furthermore, as a pilot test, these packages were first used with students of Foreign Languages, who had little computer literacy. Power Edit was regarded as inappropriate for the moment. Grammatik IV was examined by the author but found ineffective for the students in a self-access context (Liou, 1991). Thus, Complete Writer's Toolkit is the main package which is investigated in this study.
REVIEW OF THE LITERATURE
Since word-processing packages gained popularity in writing classes, text-critiquing programs, or grammar checkers, have become widespread. However, common criticisms against text-critiquing programs stem from several arguments. First, they tend to foster student dependence on programs. Second, most of the commercial programs generate incorrect analysis due to their limitations in using primitive computational techniques. Then, there is a concern that students — with deficient language proficiency — tend to accept the incorrect analysis. Last, the use of such programs encourages a product-oriented approach to writing. In spite of the claims that appear to be against the use of text-critiquing programs, empirical data are needed to validate them.
In second language (L2) contexts, several studies on the use of text-critiquing programs have been conducted. Reid (1986), using Writer's Workbench, found that students generally liked such programs, but she did not examine whether or not the quality of writing was influenced by the use of the program. Pennington and Brock (1989) compared the writing quality of two small groups of students, with one applying tutor-facilitated process-oriented revision and the other the feedback of Critique alone. They found that the Critique group performed worse, producing more short sentences, shorter drafts, and fewer revisions. However, it is hardly fair to compare a human tutor who understands content and organization with a grammar-based program, which has inherent limitations due to its inability to analyze the semantic or pragmatic aspects of writing. Brock (1990), using the new self-formulated rule sets provided as facilities in Grammatik, argued that though there is the possibility of customizing text-critiquing programs for English-as-a-second-language (ESL) writers, important questions remain as he explored the cost versus gain:
• For what purpose is the text analysis program to be used?
• To what extent will a grammar checking program, such as Grammatik, improve student writing?
• What messages about writing are teachers conveying when they encourage ESL students to use text analysis?
• Could the time students spend with a text analysis program be given to other writing tasks that offer greater returns?
• If text analysis is utilized, where in the writing process should this occur? (p. 59)
Liou (1991), investigating the usefulness of Grammatik IV (G4) for English-as-a-foreign-language writers in a study in which no control group was used, found that only 14 percent of the errors G4 detects are substantive, though student writers found the process interesting. She then launched a project to develop a grammar checker specifically for Chinese students.
A thorough investigation of text-critiquing programs should be conducted, as not enough empirical evidence has been provided yet to determine their usefulness. Previous research has either lacked a control group for comparison of writing quality or a valid comparison, or was not conducted in a realistic class setting. It is believed that CALL should be integrated into regular classroom activities so that it becomes a mode of learning (also supported by Levy, 1992). Further, as a teacher-researcher, I believe that conducting classroom-oriented research can directly benefit classroom practice because the findings can provide immediate feedback for teaching practices. Thus, the purposes of the present study are to address the effectiveness of text-critiquing programs through empirical investigation and, if possible, to suggest a good way of employing them in EFL writing classrooms.
RESEARCH STUDY
The study was designed in such a way that each of the procedures was integrated into regular classroom activities in a first-year writing course, Grammar and Writing (G & W), for EFL majors in a university in Taiwan, Republic of China. The design is a quasi-experimental study with 19 students in the CALL group and 20 in the control group. Three months before the research study was conducted (Spring 1992), a pretest measure including a 20-question sentence-level test and an essay-writing task (titled "My College Life") was used to see if there was a difference between the groups. The verification of the lack of difference is seen in Table 1.
[Table 1: image not reproduced]
In the G & W class, in addition to one-on-one conferences, freshman students were asked to write five essays according to the five units/topics designed for this semester. Two weeks (four hours) of activities formed one unit, during which students had to write one paper. The G & W course was designed to teach writing through a process-oriented approach whose activities are summarized in Table 2.
[Table 2: image not reproduced]
In the first two hours of each unit, classroom activities involved lecture and practice of grammatical points through sentence combining exercises, vocabulary presentation by a small group of students about a particular topic, group editing of a student essay selected from the previous unit, and editing practice of another essay from the previous unit. During the second two hours, some kind of pre-writing activity was carried out before the task of composing the writing assignment started in class. Lastly, all the students formed pairs to do a peer editing task where students had to critique their partners' papers regarding clarity and structure. This was sometimes done outside of class due to time constraints. There are forty first-year students in our department; they were evenly divided into two sections. Instructors of the two sections followed the same course design and materials. Students of the two sections served as the CALL group and the control group.
For the writing assignment in this study, a choice was given of "The Advantages and Disadvantages of Being a/an ______ (name of a profession)" or "Some Career Tips for College Graduates." The research procedures are summarized in Table 3.
[Table 3: image not reproduced]
When the students finished the first draft, they did the peer editing task. Then they went to a computer laboratory, typed in the paper, and submitted it to the instructor. For their first drafts, the instructor looked closely at the content and organization and made general comments like "Re-write the five lines" or "You need a conclusion," etc. In addition, the instructor underlined each of the problematic areas (ungrammatical points, wrong word form, inappropriate word usage, etc.) without any other marks or directions for correction. It took about 11 to 12 minutes to comment on each essay. In the second week of the unit, students were given their own essays back and did the group editing and individual editing activities; then they were asked to revise the first drafts after class on the computer. After one or two days, each of the students came back to the instructor's office and ran Complete Writer's Toolkit (CWT) one by one with the instructor observing the process (about 15 minutes for each subject). Because they had never used any text-critiquing programs like CWT, the instructor told them how to proceed. Some of them were not satisfied with the performance of CWT; the investigator then introduced G4 to them and let them run the essay again using G4. After the grammar checking process, their essays were printed out to await final grading. Then, each of the students was orally interviewed by the instructor about how they felt about the package. Notes were taken during the process. The following six questions were asked:
1. Have you used CWT or G4 before?
2. Do you like packages such as CWT or G4?
3. Do you think they are useful in writing revision?
4. How do they compare with the effectiveness of peer editing?
5. Do you think their use can save you time in writing revision?
6. Do you think they are useful for EFL learning?
Meanwhile, the control group followed the same procedures except that they skipped the commenting task done by CWT. Both groups turned in their final drafts after revising their first drafts: the control group through their own revision alone, and the CALL group through their own revision together with CWT's critique. The investigator ran CWT on the control group's drafts for comparison.
Data Analysis
Error analysis was done on each of the subjects' drafts; the total numbers of errors and words in each of the drafts were recorded. The types of errors were marked but were not quantitatively considered for this study. The CALL group's drafts analyzed by CWT or G4 were recorded and analyzed. Final drafts were graded according to the following criteria: 30% of the total score was allocated to grammar/sentence structures and 70% to the other aspects. This resulted in a score out of 100 points. The reason for the special allocation to grammar and sentence structures is that the course was designed to strengthen student writing and, in particular, grammatical accuracy in sentence-level performance. In grading, when a paper contained many errors or simple and naive sentence structures, it was scored low in the 30% portion. Papers of the two groups were scored by the same instructor. The t-test procedure was used to compare the means of writing performance of the CALL group and the control group. Most of the interview data were coded on a scale from positive to negative. Responses from the 19 subjects were tallied according to the scale, and a percentage was computed for each question. Idiosyncratic comments were recorded.
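The 30/70 weighting described above can be illustrated with a small Python sketch. The paper gives only the two weights; the 0-100 sub-ratings, the `EssayRating` type, and the `final_score` function are assumptions introduced here for illustration and are not the instructor's actual grading sheet.

```python
from dataclasses import dataclass

# Weights stated in the paper: 30% grammar/sentence structure, 70% other aspects.
GRAMMAR_WEIGHT = 0.30
OTHER_WEIGHT = 0.70


@dataclass
class EssayRating:
    """Hypothetical sub-ratings, each on a 0-100 scale (an assumption)."""
    grammar: float  # grammar and sentence structure
    other: float    # all remaining aspects (content, organization, etc.)


def final_score(rating: EssayRating) -> float:
    """Combine the two sub-ratings into a score out of 100."""
    for value in (rating.grammar, rating.other):
        if not 0 <= value <= 100:
            raise ValueError("sub-ratings must lie between 0 and 100")
    return GRAMMAR_WEIGHT * rating.grammar + OTHER_WEIGHT * rating.other


# Made-up example: weak grammar (50) but solid content (80).
example = final_score(EssayRating(grammar=50, other=80))
print(round(example, 1))  # 0.3 * 50 + 0.7 * 80 = 71.0
```

Under this scheme a paper with many form errors can lose at most 30 points, which is worth keeping in mind when reading the comparison of final scores.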
Results
A comparison of total words used in the subjects' first drafts and final drafts showed that the mean difference was small (first drafts, 339 words on average; final drafts, 359 words). This indicated that most of the subjects tended to keep their content and possibly organization (no further analysis of organization was done).
Comparison of Writing Performance
Errors were located in the subjects' first version, the CWT version, and the final version. Throughout the revision process, subjects might have created new errors when attempting to revise content, organization, or expressions. The newly added errors and those in the first drafts became the grand total number of errors for each group. The results are shown in Table 4.
[Table 4: image not reproduced]
More than half of the total errors remained in the final drafts. The control group made more errors and corrected almost none of them in the second drafts: only 0.5% were rectified. In contrast, the CALL group made fewer errors and rectified more of them in the second drafts: 7% of the errors were removed. It is evident that subjects were not able to correct most of their mistakes by themselves even after devices to raise their consciousness of form, such as marks, were used.
Further, about one tenth of the errors came from the student revision process (see the added errors). It seems that the more they wrote, the more errors they tended to make.
Most importantly, the results indicated that CWT had exerted some influence on the CALL group: 38% of the total errors could be detected by CWT. It would have eased one fifth of the instructor's burden if it had been used in the control group (20.5%).
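The percentages reported in this section can be derived from raw error counts along the following lines. This is a minimal Python sketch: the function names are hypothetical, the counts are placeholders chosen for easy arithmetic rather than the Table 4 figures, and using the grand total as the denominator for every percentage is an assumption.

```python
def percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def summarize_errors(first_draft: int, added: int,
                     corrected: int, detected: int) -> dict:
    """Summarize one group's error counts as percentages."""
    # Grand total = first-draft errors + errors added during revision,
    # as the paper defines it. Using it as the denominator for every
    # percentage is an assumption of this sketch.
    grand_total = first_draft + added
    return {
        "grand_total": grand_total,
        "added_pct": percent(added, grand_total),
        "corrected_pct": percent(corrected, grand_total),
        "detected_pct": percent(detected, grand_total),
    }


# Placeholder counts chosen for easy arithmetic; NOT the Table 4 figures.
print(summarize_errors(first_draft=90, added=10, corrected=25, detected=40))
```

Keeping the added errors in the denominator means that a group which revises more heavily can show a lower correction rate even when it fixes the same number of original errors.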
As for the final evaluation of the papers, no difference in writing performance was found regardless of whether or not CWT had been used. Results from the t-test procedure run on the final scores to compare the means of the CALL group and the control group indicated no significant group difference (p > .05; see Table 5), though the mean of the CALL group (68.3/100) is larger than that of the control group (66.4/100). This may suggest that error reduction due to use of CWT does not contribute to a gain in the final grade of a paper, but the result may also be attributable to the small sample of subjects who participated in this study.
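For readers who want to see how such a group comparison is computed, the following Python sketch calculates a two-sample t statistic using only the standard library. The pooled-variance (Student's) form is an assumption, since the paper does not say which variant was used; the `pooled_t` function and the scores are made up for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Student's two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Pooled variance is an assumption;
    the paper does not say which t-test variant was used.
    """
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a)
                  + (n2 - 1) * variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("scores show no variation; t is undefined")
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Made-up scores for illustration only; NOT the study's data.
call_scores = [70, 65, 72, 68, 66]
control_scores = [64, 69, 67, 63, 66]
t, df = pooled_t(call_scores, control_scores)
print(f"t = {t:.2f}, df = {df}")
```

With 19 and 20 subjects the test would have 37 degrees of freedom, for which the two-tailed critical value at the .05 level is roughly 2.03; a t statistic smaller than that in absolute value corresponds to p > .05.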
Comparison Between Use of CWT and G4
One of the major drawbacks of commercial packages like CWT, other than the fact that they use primitive computational techniques, is that they are designed for first language (L1) writers. Among other differences, native speakers do not make the same types of grammatical mistakes as EFL learners, nor do they make them as frequently. However, CWT and G4, as used in this study, seem to be of some use.
[Table 5: image not reproduced]
As mentioned, when subjects were not satisfied with the critique from CWT, the observer asked if they would like to try another package and introduced G4 if the offer was accepted. This happened mostly with the more advanced student writers. No quantitative measure was made to compare the performance of CWT with that of G4; however, more than 30 error-detection examples were taken from the running sessions of each package. An impressionistic summary of these samples indicated that CWT did outperform G4 in terms of accuracy of error detection in that CWT covered more error types and generated fewer false alarms. Sophisticated student writers did not think CWT useful in substantive ways, but the less proficient students found it useful. While G4 is less accurate in grammar and mechanics, it is superior in stylistic checking (see examples in Table 6), which may benefit advanced students more.
[Table 6: image not reproduced]
The two examples taken from G4 critiques caused an advanced learner to look up the usage of rather and fairly in an off-line dictionary before making the final word choice.
Other than these considerations, the two packages are comparable regarding user-friendliness and easy-to-operate interface design, the focus of error types to be detected (see the following) as well as false alarms and the misleading messages they tend to generate (detailed examples are listed in Appendix A).
EXAMPLE: to unconsciously suggest
message generated by G4: Avoid splitting the infinitive to suggest. Try changing the position of the intervening modifier(s).
message generated by CWT: The sequence {to ... suggest} may be a split infinitive. It is preferable to avoid split infinitives by placing adverbial material before or after the verb.
Furthermore, false alarms may not become detrimental for learners. For example, in one case CWT generates a false alarm with irrelevant feedback, as shown in one of the subjects' drafts:
"As teachers, our great achievement and satisfaction are that all of our students can get somewhere," said my teacher and aunt. Indeed, this feeling is exceedingly strong in my father's mind whose is usually proud of his students' success.
CWT's feedback: Some writers prefer to use that rather than which in restrictive (defining) relative clauses. If the relative clause is non-restrictive, which should be preceded by a comma.
(See Appendix B for an elaborate long message for this point)
CWT flags whose. Even though it did not give an accurate feedback message, this consciousness raising caused the subject to revise the sentence drastically into the following, which is beyond her regular revision habits.
"As teachers, our great achievement and satisfaction are that all of our students can get somewhere," said my teacher and aunt. Naturally, they are always proud of their students' successes.
For this grammatical point, there is a corresponding message in G4.
Check: which
Advice: That is almost always preferred to which in this situation. If you really mean which, then it usually needs to be replaced by a comma. Press the Help key for more information.
Replace: that
Likewise, a false alarm in G4 had a similar result in the following case:
To forge ahead toward what they intend to do without considering the inner and outer factors does not guarantee successful; [Advice: Consider using an adverb instead of the adjective {successful}.]
The advice message, though misleading, raised the subject's consciousness of form and finally caused her to replace successful with success.
Attitudes Toward the Use of Text-Critiquing Programs
As to the question of student perceptions of the use of such a tool, results from the interview are summarized in the following figures (see details in Appendix C). Fifty-two percent of the subjects liked such packages; 69% of them thought the programs were useful for writing revision; 69% of them found the programs could save revision time; and 52% thought the programs useful for learning. Generally speaking, subjects had a positive attitude toward the use of such programs. However, the role of peer editing in our syllabus needs to be re-assessed; at least the way we designed its use needs to be improved. Because the peer editing task requires good enough English proficiency and accommodation of scheduling to arrange for the meeting of pairs, the interview results suggested that use of CWT may save more time than asking the partner to criticize the paper. In addition, the partner tended to be lenient about the peer's paper, but CWT faithfully pointed out the mistakes. This also suggests that pair dynamics in the syllabus design may influence the degree of usefulness of the programs.
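The tallying of coded interview responses into percentages can be sketched as follows. This is a minimal Python illustration: `tally_percentages` is a hypothetical helper, and the answers are invented for a group of 20 respondents; they are not the study's interview data.

```python
from collections import Counter


def tally_percentages(responses):
    """Map each response label to its rounded percentage of all responses."""
    if not responses:
        raise ValueError("no responses to tally")
    counts = Counter(responses)
    total = len(responses)
    return {label: round(100 * n / total) for label, n in counts.items()}


# Hypothetical coded answers from 20 respondents; NOT the study's data.
answers = ["yes"] * 12 + ["no"] * 5 + ["unsure"] * 3
print(tally_percentages(answers))
```

With only 19 respondents, each individual answer is worth about 5.3 percentage points, and percentages rounded to whole numbers need not sum to exactly 100.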
CONCLUSION AND IMPLICATIONS
In conclusion, though no difference was found in the quality of writing regardless of the use of text-analysis programs, the error number counts showed that the programs may help remove some portion of mistakes in a paper, if one uses them. Either CWT or G4 has a role to play in text-analysis for learners with a certain proficiency level. Moreover, students have a generally positive attitude toward the use of such programs. Given sound design and preparation from classroom teachers, the use of text-analysis programs does not have to lead to a product-oriented writing class.
The cost versus gain in using a text-analysis program to assist in writing revision involves many variables over which classroom teachers may have most of the control. Several implications can be drawn from this study. First, it is suggested that student attention should be called to the use of the program by a writing instructor, if s/he decides to incorporate the programs into the students' writing. Teachers may warn students that such programs have limitations; thus, students will not have too high expectations nor completely depend on them for revision. Besides, instructors should give detailed instructions before they ask students to use such programs or even give a demonstration to familiarize them with terms or explanations used in the packages. Brock (1990) warned that Grammatik III may not be a suitable addition to ESL composition pedagogy because students may believe the incorrect analyses and adopt them into their revision. Indeed, some of the false alarms from such programs are very misleading to weak student writers. It is safer if the instructor can observe the process several times before having students use it on a widespread basis. Whether the programs encourage an orientation toward product depends on how a teacher designs the writing class and when or where to encourage students to use the program. In our case, it is used to refine students' final drafts, at a time when form becomes important after the content and organization are fixed.
Another implication is for the developers of text-analysis programs. Overt feedback for an error in such systems may not help students develop their own revision strategies. The system might instead produce messages telling student users how many errors of a certain type are detected in the paper, with or without marks. That is, it should summarize the error types and frequencies but not direct students how to correct the errors (communication with Liz Hamp-Lyons at TESOL '92, Vancouver). Students should learn how to correct their own mistakes if we hope they will become independent learners and writers. Another nice feature the programs could incorporate is a tutorial or elaborate explanation, such as those in CWT, to be at the user's disposal. Those in CWT are barely satisfactory because they tend to address a point irrelevant to the error detected; moreover, the examples and explanation messages are not very helpful. Better instructional design in these aspects is what language teachers can contribute. For non-EFL majors in the Taiwanese context, the programs might also provide explanation messages in Chinese to facilitate understanding.
Equally important are the teachers' efforts. Classroom teachers should be responsible for teaching students strategies to enhance clarity and quality of writing, which text-analysis programs cannot achieve in the foreseeable future. It is suggested that the text-analysis programs be used in process-oriented writing instruction, given clear instructions and sound design, when the purpose of their use is to call upon students' explicit knowledge in editing their final drafts.
Last, this study suggests that, given human effort in revising contents and organization, from either peers or the teacher, text-analysis programs can help with the revision of grammar, spelling, and punctuation. Such use encourages a collaborative task in which humans and CALL complement each other to make the revision task cost-effective.
APPENDIX A
Examples of Output from CWT and G4
OUTPUT from CWT
False alarms
• may be various from person to person --> people or persons
• Not only is teaching a stable career but I like children very much. Why do they have this feeling? --> not contain a main clause
• Some of them pay no attention to what the teacher teaches but disturb the learning atmosphere of whole class. --> disturbs.
• the disadvantage of being a teacher is that ethical morality between teachers and students is gradually disappearing. --> are
• many guiltiness young students [The word {many} does not agree with {guiltiness}.]
• First of all --> [Consider firsts instead of {First}.]
• First, you must find out what kind of career you interest in most automatically. [Advice: The word {First} does not agree with {interest}.]
Errors detected
• the contact with unsophisticated children often keep a young mind --> keeps
• Like my father and relatives often tell me "I can get more achievement of being a teacher than any other career." [Advice: Consider {As} or {As if} instead of {like}.]
• Take an example, a man who likes young students will be more patient with his students so he will like to be a teacher. [Advice: This appears to be a run-on sentence.]
• may refused --> refuse
• The author claims that "an important issues that has involved journalist in legal procedures (and even put it in jail) is the matter of protecting the identities of sources." [Advice: The quoted material appears to be improperly punctuated.] -->issue
• must goes --> go
• an news --> a
• upholded --> upheld
System failure
Message: Sentence was too long to process for grammatical structure. (as compared with related phenomenon detected in G4: long sentences can be difficult to read and understand. Consider revising so that no more than one complete thought is expressed in each sentence.)
OUTPUT from G4
False alarms
• Thirdly, most teachers agree with the fact that getting along with many young and active students is a great pleasure of being a teacher which also brings them young spirits. [Be sure you are using is with a singular subject. (It is.).] [The singular noun teacher may be used incorrectly with the plural form of the verb bring.]
• several career tips --> several careers tips
• as being famous, making a lot of money --> as
Errors detected
• more direction ("direction" should be in plural form.)
• You may also
a unsuitable one.
• amounts of their work does
• the both (word order)
• Well, I am very fond of this job because what waiting for me in everyday life are challenges, not routines, --> is
• "As teachers, our great achievement and satisfaction are that all of our students can get somewhere," said my uncle and aunt. (CWT caught this, too.)
Style
• A paragraph contains more than one sentence.
APPENDIX B
On-Line Long Tutorial-Like Message in CWT
Some writers prefer to use that rather than which in restrictive (defining) relative clauses. If the relative clause is non-restrictive, which should be preceded by a comma.
Relative clauses provide additional information about a noun. They are classified as restrictive or non-restrictive depending upon whether they are essential to the definition of the noun or supplementary to it.
A restrictive clause is essential to the meaning of the sentence in that it restricts the definition of the noun to the class described by the clause. For example,
The painting that won first prize is hanging in the foyer.
(The relative clause "that won the first prize" specifies exactly which painting is hanging in the foyer.)
A non-restrictive clause contains information that is not essential to the specification of the noun it modifies. It is more like a parenthetical comment. For example,
The use of seat belts, which can prevent serious injury in auto accidents, is now mandatory in several states.
(The relative clause "which can prevent serious injury in auto accidents" provides supplementary information about seat belts, but is not essential to the statement that their use is now mandatory in several states.)
When a relative clause is non-restrictive (and the noun being modified is not a person), the relative pronoun is "which," and the clause must be separated from the rest of the sentence by commas.
When a relative clause is restrictive (and the noun being modified is not a person), the relative pronouns "which" and "that" are often used interchangeably. If "which" introduces a restrictive relative clause you can substitute "that" without changing the meaning. For example,
A black hole is a star which has collapsed.
Some writers argue that the use of "which" in restrictive relative clauses may be confusing. They prefer to use "that" for restrictive clauses and to limit "which" to non-restrictive clauses.
APPENDIX C
Results of the Interview Data (case = 19)
1. Have you used CWT or G4 before?
0% of them have.
2. Do you like packages such as CWT or G4?
Yes 52% No 36% Don't know yet 11%
(Note: The figures were rounded off to the nearest whole number.)
Other comments
• acceptable
• like it if it is useful
• funny but confusing (because subject didn't know much of the grammar checker like a spelling checker)
3. Do you think they are useful to writing revision?
Yes 69% Very little 16% No 5% Don't know yet 11%
Other comments
• better than nothing
• It takes time to run but functions are few and unsatisfactory.
• little on contents and organization
• Only the computer considers the use of "secondly" incorrect; nobody else thinks so
4. How are they compared with the effect of peer editing? (prompt more about the peer editing task)
Positive comments
• partners are fine
• fine but lenient; courteous, did not correct word usage (only read for clarity)
• corrected my paper to a limited extent
• corrected grammar
• conscientious partner, commented on every aspect of paper
Negative comments
• reluctant to comment (because the subject's paper is at least twice as long as that of average subjects for most of her writing assignments) — impatient to read a long essay
• partner nastily criticizes
• partner not easily accessible
• neither partner nor computer helps
• partner often tired, could not match mutual schedules
• almost didn't revise any
Other comments
• peer corrected semantics, not mechanics, depending on peer's attitude and proficiency level (if peer is good or compatible, the commenting task tends to be satisfactory)
• peers are humble but comments not useful, would rather replace a partner, like teacher's comment, very helpful; if he corrected others, others would feel unhappy
5. Do you think their use can save your time in writing revision?
Yes 69% No 11% Barely 5% Don't know yet 16%
Other comments
• In using the program, we do not have to find a classmate to comment on our paper.
6. Do you think they are useful for EFL learning?
Yes 52% No 22% Don't know yet 21%
Other comments
• To some extent. But we should read more books instead of using CALL programs (due to limited experience with CALL).
• No, but would like to try it later.
ACKNOWLEDGMENT
Special thanks go to Ms. Yuli Yeh, who taught the control group for this study and helped with data collection. I would like to acknowledge Ms. Kui-Pen Hsu's contribution in helping analyze part of the data.
NOTES
1 Correctness of form has been a focus in our EFL education; one of the reasons is that learners do develop variable interlanguage systems, or errors at every stage of language development, especially at the beginning level; the other reason is long-term pedagogical emphasis on accuracy.
2 A subject in the CALL group dropped out in the middle of the study; thus, there were 39 subjects in total.
3 For those who only used CWT, the investigator herself ran their papers again using G4 and recorded the output.
4 The T-unit (Hunt, 1965) was used to analyze the final drafts of four subjects whose texts grew by more than 30 words from their first drafts. Two of them added one to four T-units without an increase in the number of errors, whereas one, interestingly, added 5 T-units with an increase of 3 errors, and another added 4 T-units with an increase of 5 errors. This may indicate that their interlanguage systems are still developing.

The Applied Linguist, School Reform, and Technology: Challenges and Opportunities for the Coming Decade


G. Richard Tucker
Carnegie Mellon University

Abstract:
In this presentation, I propose to describe briefly the rapidly changing demography of the U.S. school population and the implications of these changes for pedagogical practice(s). I will then summarize current research priorities for the language education profession that have recently been articulated by several professional associations and task forces. I will conclude by discussing some of the implications of these suggested research priorities and directions for the work of those concerned with improving the effectiveness of language learning and teaching through the use of innovative technologies.
CHANGING DEMOGRAPHY
As we are all aware, the demography of the United States is changing dramatically and, concomitantly, that of our school systems and of the workplace.1 (For a review of these issues, see Crandall, 1995.) As a nation, we are becoming markedly more culturally and linguistically diverse. The number of foreign-born as a percentage of the total population and the percentage of individuals who typically speak a language other than English at home have increased significantly since 1980. More and more of the entrants to our schools and to our workforce are language minority individuals, and this trend will continue for the foreseeable future.
Consider the following observations by Jodi Crandall, a former colleague of mine and a scholar well known to many of you:
• During the 1980s more than nine million individuals immigrated to the US—more than at any time in the twentieth century except for the period from 1905-1914;
• Between 1980 and 1990, the Asian-American population more than doubled; the Hispanic-American population grew by more than 50%;
• In the five years from 1986-1991, the nation's school-age population grew by approximately 4%; but the percentage of limited-English-proficient youngsters rose by more than 50%; and
• It is estimated that in 1990 approximately 30% of all students were minority, along with 12% of all teachers, but that by the year 2000 approximately 38% of pupils will be minority while only 5% of all teachers will be.
These changes have profound implications for our educational system and for the types of individuals who provide services to our students.
Clearly, as the composition of the American population continues to change, individuals who possess at least some degree of even latent bilingual proficiency will increasingly comprise the pool of prospective students and members of the workforce. However, if present educational practices continue, these individuals will not be encouraged, nor will they even be assisted, to nurture or to maintain their native language skills as they add English to their repertoire. They will comprise a rapidly expanding pool of individuals that Wallace Lambert has characterized as "subtractive bilinguals."
Every indication that we have seen suggests that these trends will be even more prominent when the data from the year 2000 census have been analyzed—regardless of whether one uses the "exact count" or the statistical approximation method.
Language Education for Minority Students
Unfortunately, available data suggest that language minority students (and, in particular, those who are so-called limited-English-proficient youngsters) do not perform as well academically as their language majority counterparts. They often—as a class—do not develop the academic English language skills that they need to participate effectively in educational instruction; they drop out of school in disproportionately higher numbers than their counterparts; if they do remain in school, they are less likely to proceed to colleges or universities (del Pinal, 1995); if they proceed to college or university, they are less likely to study professional subjects such as engineering, medicine, etc.; if they find employment, they are less likely to be retained and more likely to earn lower wages than their counterparts, etc. The outlook is not a positive one. Let's turn for a moment to consider so-called language-majority students.
Language Education for Majority Students
What is the prognosis for developing bilingual language competence in our students? With respect to so-called language majority residents, by all accounts we are not achieving the level of success in foreign or second language teaching programs necessary for them to compete effectively in the commercial world of the 21st century. Although the absolute number of students enrolled in modern foreign languages at the postsecondary level has increased substantially from 1960 to 1990, enrollments in relative terms have actually fallen from 16.1 per 100 college students in 1960 to 8.5 per 100 in 1990. Nor do American students, for the most part, study abroad. In a typical year, fewer than 3% of American postsecondary students study abroad. In 1998, there were approximately 480,000 international students studying in the United States, but there were only 98,000 American students studying abroad—and the majority of those students were in English-speaking countries!
The picture is equally bleak at the elementary and secondary levels, where it is estimated that fewer than 5% and 38%, respectively, of public school students participate in any foreign language study whatsoever. Furthermore, a majority of the relatively small number of individuals who do have an opportunity for foreign language study achieve disappointingly low levels of proficiency in their target languages (see Branaman & Rhodes, 1997).
We have talked a great deal in professional educational circles in recent years about the effectiveness of so-called foreign language immersion programs and of developmental bilingual education programs, and indeed they are effective; but the truth of the matter is that to date it is estimated that fewer than 50,000 American youngsters participate in one or the other of these types of programs, that is, roughly one tenth of one percent of the youngsters enrolled in public schools.
A National Need
Unfortunately, U.S. schools and colleges have been strikingly unsuccessful in expanding (or in conserving) our country's language resources. And we have a crying national need. Why? What makes addressing this need so urgent?
Let's talk for a moment about our students, what they will be doing tomorrow, how they will support themselves, and how they will (or will not) help our nation to prosper economically and socially (Cetron & Gayle, 1993).
In schools,
• By the year 2000, public school enrollments will be approximately 43.8 million;
• One million youngsters will continue to drop out of school annually at an estimated cost of $240 billion in lost earnings and foregone taxes over their lifetimes;
• The number of at-risk students will increase as academic standards rise and social problems intensify;
In the workplace,
• The major management issues of the 1990s will be quality, productivity, and the decline of the work ethic;
• The decline of employment in agricultural and manufacturing industries will continue;
• The emerging service economy will provide jobs for 85% of the work force by 2000;
• A new category of "knowledge workers" has resulted from the unprecedented growth of information and knowledge industries. By 2000, knowledge workers will fill 43% of available jobs;
On the international scene,
• Alliances such as those created through NAFTA, GATT, increasing ties with the Pacific Rim, and other multinational linkages will increase rapidly;
• These alliances, welcomed in so many parts of the world, provide potential opportunities for American workers, but to date these trade agreements have been riddled with pitfalls for us.
The majority of our negative experiences to date can be attributed to the glaring lack of expertise in languages other than English and the lack of cross-cultural competence on the part of U.S. professionals. Fully two thirds of our gross domestic product is now accounted for by "services." By removing artificial trade barriers, treaties allow U.S. professionals to provide services freely in the signatory nations in exchange for access to U.S. markets by foreign professionals. Providers of services must be able to speak the target language with a high degree of fluency and have a basic comprehension of the cultural assumptions and norms of the society in which they are operating. This requirement has not proven problematic for foreign professionals wishing to enter the American marketplace, but it has virtually paralyzed our workers wishing to gain access to foreign markets (Brecht & Walton, 1995).
I have spoken of the lack of language resources as a national need, but in fact as one looks throughout the world at the social and political changes occurring in East and Central Europe, the former Soviet Union, southern Africa and many other settings, it is clear that multilingual proficiency and advanced education will be prerequisites for participation in industrial development. Against this backdrop of our rapidly changing demography, let me now make a few remarks about instructed second language learning and teaching in North America.
INSTRUCTED SECOND LANGUAGE LEARNING AND TEACHING
As a backdrop for this presentation, I propose to begin by reviewing briefly selected information concerning the ways in which language educators reacted to the wave of immigration to the United States in the early 1900s. (This example is drawn from the field of English as a second language.)
The Formative Years
In the early years of the 20th century, hundreds of thousands of newcomers migrated to the United States. A large majority of these individuals were non-English speakers, and it is interesting—and perhaps instructive—to examine briefly the types of services that were made available to these newcomers. Within the context of the so-called Americanization movement, a broad network of ESL classes was developed and offered by industry and by community organizations in many parts of the country.
During the period from 1918-1923, there was a great deal of activity. In reviewing government archival records from that period, I was surprised to find so many explicit references to topics of current interest. In 1918, a convention was held in Connecticut that brought together representatives from more than 100 industries in the state to begin to develop standards for English classes. A number of other states reported that they offered summer training programs for teachers who would be called upon to teach English to the "foreign born." A promotional film, The Making of an American, was prepared to encourage newcomers to study English. California established a Council on Immigration to coordinate service provision.
At the national level, much of the work of coordination was taken on by the Director of Americanization of the (federal) Bureau of Education, then housed within the U.S. Department of the Interior. The Department convened a national conference in May 1919 on "Methods of Americanization," and a number of sessions were devoted to the topic of teaching English to the foreign born. At the time Dr. Henry Goldberger, an Instructor in Methods of Teaching English to Foreigners at Teachers College, Columbia University, appears to have been widely recognized as the leading authority in this emerging field (Claxton, 1920).
As Claxton (1920), the Commissioner of Education, noted,
It is therefore of great importance both to the welfare of the country and to the happiness and prosperity of those among us who have recently come from countries of other speech than ours that these be given every possible opportunity to learn English, and that they be induced to make use of these opportunities, and that the methods of teaching be adapted to their needs, so that the task of learning a new language after they have passed the age when languages are most easily learned may be made as easy and attractive as possible. To offer the opportunity and formulate the method is a large part of the task of Americanization in which this bureau is engaged.
Let me call your attention in passing to the number of issues foreshadowed in Claxton's letter of transmittal which remain as major concerns for language researchers and language educators today—namely issues such as the role of individual differences in second language learning, the fit between student needs, learning characteristics and instructional methods, the critical or sensitive period for language learning and language teaching, and the role of government in developing standards or guidelines for such programs. I shall return to a number of these issues later in this paper.
By the mid 1920s, state reports began to include information about the development of in-service as well as pre-service training programs. Classes were offered at universities and local schools of education (e.g., the Cleveland School of Education in which teachers could enroll in a 40-hour course covering issues such as the organization and administration of Americanization activities, methods, etc.). In addition, some states such as Delaware required ESL teachers to complete a 30-hour training course, supplemented by in-service training which included monthly conferences, demonstration lessons, inspection of lesson plans, and supervision of classroom instruction.
Statewide and regional conferences were held regularly to discuss problems encountered in ESL classrooms and to generate lists of possible solutions for vexing issues such as pupil classification and placement, class size, number and length of sessions, and curricular matters. Useful resource materials such as the comprehensive handbooks by Goldberger (1919) and Hill (1923) were produced and distributed by the Government Printing Office. Another guide, The Ohio Manual for Teachers (Americanization Bulletin No. 2), was cited as one of the best guides available (Mahoney, 1923). So in summary, we can draw a number of positive inferences about the professionalization of the teaching of English as a foreign or second language during this period. Educators and policy makers then were concerned about many of the same issues that concern us today.
At the same time, however, it is interesting to note that "success" can only be inferred by allusions in reports such as that from South Dakota which called attention to the diminishing need for such classes since "the bulk of the potential students will have been trained over a two- to three-year period." That is, one finds no evidence of systematic research or evaluation accompanying this large-scale program.
Recent Research in Adult Second Language Acquisition
I turn now to a brief review of recent research in the area of adult second language acquisition. I will focus in particular on issues related to language learning and literacy in instructed settings, program design, instructional content, effective practices, and assessment. I want to preface this section by noting that I tend to share the view articulated by Avery (in Burnaford et al., 1996) that "Learning is a messy, mumbled, nonlinear, recursive and sometimes unpredictable process." Therefore research intended to examine, describe, and clarify the process of learning and teaching should be characterized by its systematicity and accessibility to public scrutiny and by a process of systematic inquiry that will yield reliable information permitting valid conceptualization. Furthermore, it often seems to me to be the case that research yields numerous implications but that these frequently result in few direct applications. At least that seems to be the case for educational research conducted in North America.
As preparation for this paper, I reviewed all issues of a number of scholarly journals for the past several years (e.g., Applied Linguistics, Journal of Multilingual and Multicultural Development, Language Learning, Second Language Research, Studies in Second Language Acquisition, TESOL Journal, and TESOL Quarterly), volumes of the recent Encyclopedia of Language and Education, recent issues of the Annual Review of Applied Linguistics, as well as a variety of published and still unpublished policy statements or reports (e.g., from organizations such as the Center for Applied Linguistics, TESOL, and the newly created TESOL International Research Foundation) and various web sites.
Theoretical Underpinnings
One of the more interesting paradoxes to emerge from this review for me is that the theoretical underpinnings or approaches, or at least the so-called "linguistic propositions," which inform or guide a good deal of the current second language acquisition research, such as the generative approach (see Ritchie & Bhatia, 1996), are quite explicit while, according to Schumann (1998) and others, the theories that account for or motivate research that examines individual differences among second language learners are for the most part fragmentary or relatively inexplicit. As a corollary, one might note in particular that much current second language acquisition (SLA) research grounded in universal grammar (UG), which is intended to test specific claims or predictions of current syntactic theory, does not concern itself for the most part with so-called applied language learning or teaching questions.
For researchers working within a UG framework, for example, two major questions seem to have emerged in the past decade: (a) to what extent does the syntactic representation of a second language reflect the parametric settings adopted for the first language and (b) how does this initial representation develop and how is it replaced by target language parameters over time through interaction with other UG principles (Sánchez, personal communication, September, 1998). According to Bayley and Preston (1996), this "hegemony" of the generative influence on much SLA research has tended to inhibit those working within a variationist perspective and, one might argue by extension, those working within a cognitive or sociocultural framework as well.
Selective Review of Recent Research Syntheses
In the following section, I offer a very selective overview of current research directions in adult SLA as reflected in recently published research studies or syntheses. There have been a number of quite informative syntheses published recently. As Shirai (1997) notes, research on second language acquisition has shed some light on the process of how second language competence is attained. As he reports, there has been a good deal of research in recent years on topics such as linguistic universals, markedness, and noun phrase accessibility. And we know, for example, that unmarked meanings are easier to acquire in a second language than marked meanings and that learners generally start with unmarked prototypical forms. Shirai suggests, however, that there is a great deal that we do not yet know or, conversely, that much of what we apparently know does not have immediate pedagogical implications and that future research would do well to address the issue of which linguistic features are developmental, variational, or projectional.
Schachter (1997) focuses her review on what she identifies as three major areas that require future research attention: (a) the critical (or sensitive) period for language learning, (b) the extent to which postadolescent learners can and do learn the components of a language implicitly and the associated value of explicit teaching of linguistic form, and (c) the issue of whether negative feedback (from the teacher) has pedagogical value. Each of these areas offers promise for impact on pedagogical practice, but, to date, extant research studies clearly seem to provide conflicting evidence. (I should add parenthetically here that I personally see great promise in the current generation of brain-imaging studies, both for clarifying the functional impact of learning and teaching and for one day documenting that learning or consolidation has in fact occurred. I refer, for example, to work such as that of Kim et al. [1997], who found definitively with functional magnetic resonance imaging [fMRI] studies that second languages learned in adulthood are spatially separated from native languages [in Broca's area], as opposed to the two languages of early bilinguals, which are represented in common frontal cortical areas.)
Foster-Cohen (1999) reminds us that "most of the 'big questions' in the two fields [first and second language acquisition] are inherently connected." For her, the major questions are discovering whether there really is a "language instinct," what constitutes the nature of the input to which the learner is effectively exposed in either natural or instructed settings, and what kind of individual differences there are and how these differences impact adults' language learning and teaching. Reviews such as these cumulatively provide evidence that many of the specific questions frequently asked by planners and policy makers have not yet received definitive answers, nor have many even been addressed.
Selective Review of Recent Molecular Studies
Likewise, there have been a number of recent studies that examine specific modalities or features of second language learning such as vocabulary (Schmitt, 1998), writing (Gibbons & Lascar, 1998) and reading (Koda, 1998). Some of these studies do have potential short-term impact on adult language and literacy training. Koda (1998) found, for example, in her work with Chinese and Korean learners of English that L2 readers of English from nonalphabetic backgrounds may be seriously handicapped and that prior processing experience, task requirements, and learners' metacognitions interact in complex ways to account for procedural variations among L2 learners. She concluded that performance data among L2 readers cannot be explained through simplistic analyses of L1 variables.
With respect to vocabulary, Schmitt (1998), for example, found differential acquisition over time by adult ESL learners in terms of their spelling, development of associative networks, and acquisition of grammatical information and semantic information for a small number of continually studied lexical items. Similarly, Laufer (1998) finds differential development of vocabulary knowledge with periods of rapid gain and plateauing that seems to occur when learners are not encouraged to take risks and use more complex vocabulary.
In a different set of research foci, individuals such as Gibbons and Lascar (1998) and Shaw and Liu (1998) have extended the ways in which counts of "register features" are used to facilitate an evaluation of second language learners at different stages of development, and they have pilot tested a set of register-sensitive multiple-choice cloze tests that may have pedagogical utility.
The point that I wish to make through the use of these highly selective examples is that there is extant research which addresses concerns central to adult second language acquisition that does have pedagogical value, but that such research is relatively scarce.
Selective Review of Effective Schools Research
In the following section, I review briefly highlights of findings from the so-called "Effective Schools" research (see, for example, August & Hakuta, 1997, chap. 12) with a view toward understanding whether there are generally applicable practices and policies that would be relevant to adult ESL and literacy education. These studies typically focus on practice or process in relation to some type of outcome, normally either student outcomes or nomination. The unit of analysis—despite the nomenclature—is usually the classroom rather than the school. In effective schools, in general, staff design the learning environment to reflect contextual factors from the communities of their students. In particular when the bulk of the students are non-English speakers, effective schools possess attributes such as the following: a customized learning environment, some use of native language and culture in the instructional process, opportunities for student directed activities, systematic and continual student assessment, and staff development.
Despite these generalizations, August and Hakuta (1997) describe a "clear need for research to examine the effects of instructional interventions and social environments on the linguistic, social and cognitive development [of the participating students]." They also call for research that more broadly examines the school-change process.
Summary Observation
The major point that I want to make on the basis of this brief review of extant research in adult second language education is that we know relatively little about the course, the correlates, or the causes of instructed adult (ESL) second language learning despite more than 75 years of experience with program implementation in many parts of the world.
Suggestions for Future Collaboratives
I would like to conclude this section by noting that I believe that research should be a collaborative activity with teachers, administrators, policy makers, and researchers serving as equal partners in the enterprise and that it is important that the concerns of disparate audiences be represented equally in the research process. From my vantage point, the optimal planning, implementation, and dissemination of research involves, of necessity, a continuing dialogue among the diverse "stakeholders" in the various phases of teaching and learning.
Such a view has implications for graduate training. I believe we should encourage our students to seek the broadest possible training in qualitative and ethnographic as well as in quantitative techniques and that we should then work to ensure that we all use the tools that are most appropriate for the questions we are asking from among the broad array of techniques and procedures currently available to us. Oftentimes, this is more difficult for seasoned faculty members to do than for our students, which is at least partially attributable to advances in the research field. (For example, when I was a graduate student in experimental psychology, I took the most advanced courses that were then offered in quantitative data analysis at McGill University. Today, our courses for incoming doctoral students in Psychology at Carnegie Mellon begin by assuming that students have already done prerequisite work that is more advanced than that with which I finished my formal training.)
I also believe firmly that we should not encourage the importation of any one relatively restricted research paradigm or tradition but that we should be equipping our students and other individuals with the tools they need to participate as full and equal partners in what Swain (1996) and others have referred to as a recursive "cycle of discovery." Thus, I resist strongly the assertion that one or another methodological framework (or analytical procedure) is uniquely suited for the type of collaborative, action research to which I am referring as one that would be desirable in all adult ESL and literacy education.
RESEARCH PRIORITIES
In the section to follow, I will summarize and discuss the implications of several recently conducted metareviews of research priorities for the field. I note that a surprising number of substantive research reviews have appeared within the last two years which have ended by offering a purposeful list of priorities for the language education field (see, for example, August & Hakuta, 1997; NCLE, 1998; Tucker, 1998) as well as statements or position papers that have been prepared and broadly disseminated by organizations such as the American Council on the Teaching of Foreign Languages, the National Network for Early Language Learning, the international organization Teachers of English to Speakers of Other Languages (TESOL), and the newly created TESOL International Research Foundation.
Interestingly, almost every review of language education research (and I am using this term quite broadly) decries the lack of longitudinal research or data. Skehan (1998), for example, identifies what he refers to as serious omissions in research activity: "Most important of these is the lack of any longitudinal research." Alampresse (1994), likewise, notes that there are no major longitudinal studies that yield valid data on the impact of participation in adult ESL programs. This same argument was advanced rather forcefully a number of years ago by Spolsky (1989) and is similar in spirit to the suggestions provided by Clark and Davidson (1993) when they argued for the desirability of designing and implementing what they referred to as a "Research Consolidation Project."
Two additional themes seem to be emerging frequently in recent reviews. The first is a growing concern with the increasing heterogeneity of the young adult student population (Gardner, 1991) in relationship to the population of more typically examined school-aged instructed learners from past years. A second major concern centers on the evolving views of psychometricians regarding issues of validity and the implications for assessment (see, for example, Chapelle, 1999; McNamara, 1996). In the current view, validity is seen as comprising the extent to which test interpretations and use can be justified, and test developers are increasingly concerned about testing consequences, that is, according to Chapelle, about the "value interpretations made from test scores and the social consequences of test use." I mention this latter area since the search for a useful or appropriate "metric" for assessing adult language attainment or growth will, of necessity, involve to some extent the use of standard tests. And it is fair to say that the field of language assessment, together with its underlying assumptions, is in ferment.
There is a marvelous quote from Hamayan (in Donato, 1998) who is reflecting on the difficulties of capturing the essence of a person's language capability via some type of traditional standard test.
Assessing foreign language abilities is in many ways similar to painting a chameleon. Because the animal's colors depend on its physical surroundings, any one representation becomes inaccurate as soon as that background changes.
Let me now return to the notion of longitudinal research and its importance. As the TESOL organization (TESOL reauthorization statement, August 1993) noted in testimony before a committee of the U.S. Congress on the occasion of the reauthorization of the legislation for the federal Office of Educational Research and Improvement,
We propose that a series of truly longitudinal studies be undertaken to identify and track successful ESL students and the educational programs in which they have participated. Students would be followed over a number of years following their initial enrollment and progression through the system (using the model of the Terman "gifted-student" studies) to document the cumulative effects of various types of program participation as well as to describe clearly the correlates and consequences of developing bilingual proficiency. We believe that such a series of studies would not only serve as the basis for the dissemination of successful educational practices to others; but that it could also—and perhaps equally importantly—portray bilingual students and programs developed to promote bilingualism in a more favorable light.
Since the research project conducted by Terman and his associates at Stanford University is frequently cited as an exemplary tracking study, let me provide a few details (see, for example, Terman, 1925; Terman & Oden, 1959). Approximately 75 years ago, a large number of gifted students (approximately 1,000) were identified using some of the then-common assessment, identification, and nomination procedures. These individuals were systematically tracked for many decades. At periodic intervals, detailed information was collected from them about educational experience, attainment, work experiences, health issues, etc., as well as from randomly drawn samples of individuals from the general population matched in terms of age. In general, the research showed that these "gifted" individuals came over time to differ markedly from supposedly similar cohorts of individuals. Thus, the gifted group maintained and, on many dimensions, even increased their early superiority. Their incidence of ill health, mortality, insanity, delinquency, and alcoholism was well below that of the general population of corresponding age. In addition, higher proportions of the gifted cohort entered college, graduated, earned honors and awards, and went on to postgraduate education, etc. The differences became accentuated over time and added to the perceived value of collecting longitudinal data of this type, research that is typically conducted within the medical and biological sciences.
I would now like to take a moment to review very briefly the results of extant longitudinal research in the language education field in areas that have informed both policy and practice. Here I propose to review the 12-year longitudinal evaluation of the effects of participation in French immersion programs by English-speaking youngsters (Lambert & Tucker, 1982), the tracking study of the cumulative effects on teachers of participating in a National Defense Education Act (NDEA) summer training institute (Bruck, Lambert, Tucker & Bowen, 1975), and the 23-year study [to date] of early participation in high quality preschool programs (Schweinhart & Weikart, 1997).
Canadian Immersion Programs
In academic year 1965-1966, a randomly assigned cohort of English-speaking youngsters entered school and were "immersed" in a second language, French. These youngsters proceeded through an instructional program that gradually introduced English Language Arts and the teaching of content material in English beginning in grade two and proceeding to an approximately equal allocation of instructional time in French and English by grades five and six. At secondary school, the balance shifted in favor of English with some content-based instruction in French and a French Language Arts program. At the very beginning, a commitment was made to assess the affective, cognitive, and linguistic development of the participants over time and to compare their progress with two other groups: a randomly assigned counterpart control group of English-speaking youngsters who participated in English-medium instruction and a "matched" control group of French-speaking youngsters who attended French-medium instruction. The Canadian government, through a series of research grants, funded what became a 12-year program of research to document the effects of program participation on these students. To make a complex story very brief, the data revealed effects or patterns of development that would not have been predicted by a series of cross-sectional snapshots taken at any one point or even at multiple points in time.
The data revealed among other things that, over time, the immersion youngsters developed significantly greater cognitive flexibility and nonverbal intelligence than their monolingually instructed control counterparts; they developed a broader, more positive, and more charitable view toward other people, values, attitudes, and traditions than their counterparts; and they developed a basic level of bilingual language proficiency that prepared them for a broader array of educational or occupational options than their monolingually educated counterparts. The results of this research (see, for example, Lambert & Tucker, 1982) demonstrated to us that such longitudinal study was absolutely essential for documenting the cumulative benefits associated with the instructed development of bilingual proficiency.
ESL Staff Development Under Realistic Conditions
As another example, I would like to say a few words about the evaluation of a summer in-service training program that was conducted in 1968 for novice ESL teachers from the United States. The NDEA-supported program provided for the identification and training of groups of 30 to 40 teachers at various campuses throughout the country. The University of California at Los Angeles proposed to host such an institute, but to house the program on the campus of the Philippine Normal College in Manila rather than in Los Angeles. Participants were recruited, identified, and sent to Manila where they lived with Filipino families, studied Tagalog one hour per day, and did all of the other things that their counterparts at summer institutes in the U.S. did (e.g., took courses in contrastive analysis, phonetics and phonology, "methods" of teaching English as a second language, etc.). A variety of preprogram and postprogram information (expectations, attitudes, experiences) was collected from the Philippine participants and from those at two institutes with similar goals held in the United States. The idea was to create a situation for the participants in the Philippines that modeled the experiences through which many of their own students in the United States would be going on a daily basis. For those who participated in the Institute in the Philippines, the postprogram data appeared to indicate that the program had been a terrible failure (see Tucker & Lambert, 1970). Participants did not enjoy their sojourn to the Philippines, did not like Manila, did not like their host families, and apparently had learned little that they considered to be transferable to their classes. The participants in the comparison institutes, however, fared better. They had moderately productive summers and learned material, approaches, and techniques that they foresaw the possibility of using in their schools.
However, a tracking study that was conducted seven years later revealed a completely different participant profile. On that occasion, an overwhelming majority of the Philippine participants reported that the experience had significantly and positively affected their personal and professional lives. They reflected on the many concrete ways in which they had incorporated material from that summer into their teaching and the ways in which they had sought career paths that enabled them to draw on and amplify their experiences from the summer of 1968. In retrospect, the summer for the vast majority had, over time, become a "defining moment" in their professional lives (see Bruck, Lambert, Tucker & Bowen, 1975). This story could not have been told accurately without this longitudinal perspective.
Head Start and the Values of Early Participation
As the last example, let me mention a current continuing study from a different domain of language education. In the early 1970s, 68 children born and living in poverty were randomly assigned to one of three treatment conditions at ages 3 and 4: (a) direct instruction (a teacher-centered curriculum using the DISTAR approach); (b) traditional nursery school (interactive and collaborative); and (c) a so-called High/Scope curriculum (in which teacher and child planned and initiated collaborative activities based upon a model developed at the High/Scope Educational Research Foundation). A recent report provides data for these participants who have now been tracked for 23 years. Apparently, a high-quality innovative preschool program involving interactive and collaborative activities on the part of the students and teachers (High/Scope and Traditional nursery school above) has resulted in persistent long-term benefits for participants. For example, the data indicate that participation has led to significantly improved educational and subsequent economic success (which incidentally provides taxpayers a return equal to 716% of their investment in the program), to greater optimism about the future, to a greater awareness of the needs of others as measured by participation in volunteer work, and even to a sharply reduced lifetime arrest rate. The long-term prognosis for participants in such programs appears to be definitely more positive than for those who participate in the more traditional programs or those who participate in no program at all.
These results, and indeed those from each of the studies referred to above, would not necessarily be obvious on the basis of limited cross-sectional research. But the results are important, and the cumulative story that can be told is important. Such results, when available, have demonstrably profound policy implications. To take but one example, a program of French-immersion education which began in one school in one community in the Province of Quebec has spread to become a natural and viable educational option for students in all 10 Canadian provinces and in many of the states of Australia and the United States. As Carey (1997) noted, "one of the most successful educational examples of language planning for unity has been the French immersion programs across Canada."
Recommendations for Action
In the extant reviews and position papers, there are repeated calls for longitudinal studies of all kinds in the language education domain. In addition, there are numerous suggestions that we need to understand more clearly phenomena such as the critical role of "literacy" in the development of the cognitive academic language skills needed for effective participation in instruction and the various aspects of the development and transfer of native language literacy skills (particularly for speakers of noncognate, less commonly taught languages) since they provide the underpinnings for effective development of cognitive academic language abilities in English. Furthermore, numerous groups (e.g., TESOL and the TESOL International Research Foundation) favor encouraging theoretically motivated research relevant to various instructional issues (e.g., studies on the validation [in the senses described above] of developmentally sensitive measures of language proficiency, the value of cultivating metalinguistic awareness, and the usefulness of language strategy training).
These priorities also seem congruent with those explicitly articulated in the report of a specialist panel of the U.S. National Academy of Sciences (see August & Hakuta, 1997), which recommends that funders and policy makers consider examining their needs within the general framework offered by four principles.
• Principle 1
Give priority to important topics to which insufficient attention has been paid, but for which there already exist promising theories and research methodologies;
• Principle 2
Give priority to addressing gaps in population coverage such as certain age or language groups;
• Principle 3
Give priority to legitimate questions that are of strong interest to well defined constituencies;
• Principle 4
Give priority to building the nation's capacity to conduct high quality research on English-language learners.
ACCESS TO TECHNOLOGY
Let me turn my attention now to the question of access to technology and utilization of technologies in the service of what I have referred to as instructed language learning or teaching.
On a personal note, my own involvement in TEL (technology-enhanced learning), albeit somewhat sporadic, stems from the early 1980s when I was head of the Center for Applied Linguistics in Washington, DC. A wealthy Indonesian entrepreneur (the then head of Pertamina) had a vision. He believed that the downtime experienced by workers on oil rigs could be used more productively by providing them with self-access materials designed to enable them to improve their language, literacy, numeracy, and other foundation skills.
After a series of trips to Indonesia for needs assessment and meetings, we proposed and ultimately received a substantial grant to develop a product called Skillpac (or English for Industry). This was a 15-unit English for Special Purposes course designed and developed for workers in the petroleum industry that was intended for self-study and delivered using a networked configuration of IBM PCs, videodisc players, and audiotape players, combined with selected print materials. There were video vignettes, authentic dialogues, cultural notes, grammar and vocabulary reviews, periodic tests, etc. Students could proceed to cover material to a mastery criterion at their own pace. By today's standards, the materials would be viewed as relatively primitive (and some of you may have seen them or even tried them at an early CALICO conference in Baltimore in 1990). However, they appeared to work; students enjoyed using the materials and seemed to learn.
But I had another concern. On the basis of our experience, I came away with two impressions that continue to influence my thinking today. (1) Many groups (entrepreneurial business people, specialists from AID and the World Bank, and others) were willing to underwrite the cost of program development; no one [at that time] was willing to fund research—either short term or preferably longitudinal—to examine the effectiveness of this type of instructed second language learning and teaching in relationship to the more traditional classroom-based delivery. (2) Access to the technological innovation was severely restricted to a small "privileged" class of individuals (e.g., individuals working on a Pertamina oil rig in the Celebes Sea and individuals studying a foreign language in the Central Intelligence Agency language school).
I continue to be deeply disturbed by the social inequities that seem to correlate with access to innovative technology in the service of educational innovation. An early report by the federal Office of Technology Assessment (Roberts, 1988) decries the limited access that Black and Latino students have to TEL, and this situation from the 1980s appears not to have improved significantly in the intervening years. Virtually every month, I see yet another article in the New York Times, the Washington Post, or The Atlantic Monthly calling attention to the differential access to technologies on the part of majority students versus minority students (see Honan, 1999; Mathews, 1999; Walton, 1999). The disturbing pattern that we see in many areas is that the so-called newer technologies are more likely to be available to (upper) middle class students than to working class students and to white students than to Black or Latino students. They are also more likely to be used in rote or other "mechanistic" ways for repetitive drill and practice exercises by Black students than by white students, who more frequently use them for simulations and other exciting real-life applications. It also appears to be the case that the teachers who work with "majority" students are much more likely to have received training in the uses of technologies than the teachers who work with minority students. I realize that there are exceptions to these general observations, perhaps many exceptions. Nonetheless, I am disturbed about perpetuating, or even worse about enlarging, the social inequalities that now exist in our schools.
In a recent review, "Technology and its Continual Rise and Fall" (Education Week, May 19, 1999), it was noted:
But for one reason or another—after a decade or so in the sun—most of the technological innovations drifted into the margins of school practice, to be used only occasionally or for peripheral activities by a few gung-ho teachers. Most didn't disappear but they failed to achieve the impact for which they had seemed destined.
At least two of the reasons often cited for this limited impact are the differential access to technologies and the differential training opportunities for teachers. For example, during recent testimony before the Education & Workforce subcommittee of the House of Representatives in preparation for the reauthorization of the Elementary and Secondary Education Act (see Education Week, May 19, 1999), it was noted:
At the top of the list, I'd put strengthening the professional development component of technology programs.
… but to be effective, … technology must be concentrated, distributed to both teachers and students, and sustained financially over a period of years.
Let me here offer one other observation intended to make clear the magnitude of the discrepancy worldwide between classes of students by drawing attention to distinctions between the state of education in the so-called industrialized world and that in the third world (excerpted from the United Nations Human Development Report as reported in the New York Times, September 27, 1998):
• The richest fifth of the world's peoples consumes 86 percent of all goods and services while the poorest fifth consumes just 1.3 percent;
• Of the 4.4 billion people in developing countries, nearly three-fifths lack access to safe sewers, a third have no access to clean water, a quarter do not have adequate housing, and a fifth have no access to modern health services of any kind;
• Three of four children in the "poorest nations" in the world are not in school;
• Nearly one-sixth of the 5.9 billion people in the world cannot read or write according to a survey published by UNICEF (as reported in the New York Times, December 9, 1998);
• It is estimated that the additional cost of achieving and maintaining universal access to basic education for all, basic health care for all, reproductive health care for all women, adequate food for all, and clean water and safe sewers for all is roughly $40 billion a year, or less than 4% of the combined wealth of the 225 richest people in the world;
• Providing universal primary education for all would cost about $8 billion extra per year, which equals:
    • About four days' worth of global military spending;
    • Seven days' worth of currency speculation in international markets;
    • The amount spent by Americans each year on cosmetics (New York Times, September 27, 1998);
    • Less than half of what Americans spend on toys for their children each year;
    • Less than the annual amount that Europeans spend on mineral water.
Now I am not naïve enough to think that this is a problem that CALICO members can solve single-handedly. But I do wish to indicate that my review of the extant educational literature and my constant review of demographic trends and projections combine to suggest that (a) we must give concerted individual and collective attention to the problem of allocation and distribution of resources, including but not limited to technological resources, and (b) we must implement as soon as is practical a multifaceted and longitudinal research agenda to examine, from multiple perspectives, the value added to students who pursue some or all of their language education using innovative technologies.
Future Directions
By way of conclusion, let me note that the United States continues to receive over a million immigrants, legal and illegal, each year and that the foreign born are the fastest-growing segment of the population. The most accurate prediction we can make is that this trend, evident in a number of other industrialized nations as well, will continue for the foreseeable future together with an increase in the number of native-born individuals who will speak English as a second language.
Likewise, a recent study commissioned by the British Council on the future of the English language (Graddol, 1997) underscores the need for all individuals to develop bilingual proficiency in order to be able to participate effectively in our increasingly global society.
This situation poses challenges and opportunities for the applied linguist to craft a truly applied research agenda that can examine longitudinally, and from multiple perspectives, some of the perplexing questions that persist. For example (from NCLE, 1998): What is the role of native language oral and literate proficiencies in the acquisition of ESL? What instructional sequences and approaches work best? … With whom? How can technology be effectively utilized? What is the relationship between staff training and program quality and learner attainment? What immediate and long-term impact can be expected from various types of adult ESL programs? Which extant assessment instruments can reliably document change in learner proficiency at which levels?
Hopefully, during the decade ahead—with support from various public and private philanthropic sources—collaborative groups of researchers, educators, policy makers and administrators will begin to ask questions such as those identified above and gradually begin to aggregate relevant qualitative and quantitative data to shed light on the issues.
NOTE
1 This article is a revised version of a keynote address given by the author at CALICO '99, the sixteenth annual CALICO symposium, at Miami University in Oxford, Ohio in June 1999. CALICO wishes to thank Dr. Tucker for his keynote address at CALICO '99 and his permission to publish it here.