Experienced and Novice L2 Raters’ Cognitive Processes while Rating Integrated and Independent Writing Tasks

Keywords: cognitive processes, independent writing task, integrated writing task, rating experience, verbal protocol, task type, rubric

Abstract

Background. Recently, there has been growing interest in the personal attributes of raters that shape the quality of the cognitive processes involved in their writing-rating practice.

Purpose. Accordingly, this study explored how L2 raters' rating experience might affect their rating of integrated and independent writing tasks.

Methods. To pursue this aim, 13 experienced and 14 novice Iranian raters were selected through criterion sampling. After attending a training course on rating writing tasks, both groups produced introspective verbal protocols while rating integrated and independent writing tasks produced by an Iranian EFL learner. The verbal protocols were recorded, transcribed, and content-analyzed by the researchers.

Results. The content analysis yielded six major themes: content, formal requirement, general linguistic range, language use, mechanics of writing, and organization. The results indicated that the type of writing task (integrated vs. independent) was a determining factor in the number of references experienced and novice raters made to the TOEFL-iBT rating rubric. Further, the raters' rating experience determined the proportions of references they made. Yet, the proportional differences between experienced and novice raters were statistically significant only for language use, mechanics of writing, organization, and the total number of references.

Conclusion. The variations in L2 raters' rating performance on integrated and independent writing tasks underscore the urgency of professional training in using and interpreting the components of various writing rating scales for both experienced and novice raters.

References

Ahmadi, A., & Mansoordehghan, S. (2015). Task type and prompt effect on test performance: A focus on IELTS academic writing tasks. Journal of Teaching Language Skills, 6(3), 1-20. DOI: https://doi.org/10.22099/jtls.2015.2897

Allen, S. (2004). Task representation of a Japanese L2 writing and its impact on the usage of source text information. Journal of Asian Pacific Communication, 14(1), 77-89. DOI: https://doi.org/10.1075/japc.14.1.06all

Attali, Y. (2016). A comparison of newly-trained and experienced raters on a standardized writing assessment. Language Testing, 33(1), 99-115. DOI: https://doi.org/10.1177/0265532215582283

Barkaoui, K. (2010a). Variability in ESL essay rating processes: The role of the rating scale and rater experience. Language Assessment Quarterly, 7(1), 54-74. DOI: https://doi.org/10.1080/15434300903464418

Barkaoui, K. (2010b). Do ESL essay raters' evaluation criteria change with experience? A mixed-methods, cross-sectional study. TESOL Quarterly, 44(1), 31-57. DOI: https://doi.org/10.2307/27785069

Beck, S. W., Llosa, L., Black, K., & Anderson, A. T. (2018). From assessing to teaching writing: What teachers prioritize. Assessing Writing, 37, 68-77. DOI: https://doi.org/10.1016/j.asw.2018.03.003

Brookhart, S. M. (2013). How to create and use rubrics for formative assessment and grading. ASCD.

Brown, H. D., & Abeywickrama, P. (2018). Language assessment: Principles and classroom practices (3rd ed.). Pearson Education.

Davis, L. (2016). The influence of training and experience on rater performance in scoring spoken language. Language Testing, 33(1), 117-135. DOI: https://doi.org/10.1177/0265532215582282

Dörnyei, Z. (2007). Research methods in applied linguistics. Oxford University Press.

Duijm, K., Schoonen, R., & Hulstijn, J. H. (2018). Professional and non-professional raters' responsiveness to fluency and accuracy in L2 speech: An experimental approach. Language Testing, 35(4), 501-527. DOI: https://doi.org/10.1177/0265532217712553

Eckes, T. (2011). Introduction to many-facet Rasch measurement: Analyzing and evaluating rater-mediated assessments. Peter Lang.

Eckes, T. (2012). Operational rater types in writing assessment: Linking rater cognition to rater behavior. Language Assessment Quarterly, 9(3), 270-292. DOI: https://doi.org/10.1080/15434303.2011.649381

Elder, C., Barkhuizen, G., Knoch, U., & Von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24(1), 37-64. DOI: https://doi.org/10.1177/0265532207071511

Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data. MIT Press.

ETS. (2019). The TOEFL family of assessments. Retrieved from https://www.ets.org/toefl

ETS. (2020). TOEFL iBT test independent and integrated writing rubrics. Retrieved from https://www.ets.org/s/toefl/pdf/toefl_writing_rubrics.pdf

Fahim, M., & Bijani, H. (2011). The effects of rater training on raters' severity and bias in second language writing assessment. International Journal of Language Testing, 1(1), 1-16.

Gallagher, N. (2005). Delta's key to the next generation TOEFL test: Advanced skill practice for the iBT. Delta Publishing Company.

Harsch, C., & Martin, G. (2013). Comparing holistic and analytic scoring methods: Issues of validity and reliability. Assessment in Education: Principles, Policy & Practice, 20(3), 281-307. DOI: https://doi.org/10.1080/0969594X.2012.742422

Hoenig, J. M., & Heisey, D. M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis. The American Statistician, 55(1), 19-24. DOI: https://doi.org/10.1198/000313001300339897

Hyland, K. (2003). Second language writing. Cambridge University Press.

James, C. L. (2006). Validating a computerized scoring system for assessing writing and placing students in composition courses. Assessing Writing, 11(3), 167-178. DOI: https://doi.org/10.1016/j.asw.2007.01.002

Khodi, A. (2021). The affectability of writing assessment scores: A G-theory analysis of rater, task, and scoring method contribution. Language Testing in Asia, 11(30), 1-27. DOI: https://doi.org/10.1186/s40468-021-00134-5

Klimova, B. F. (2013). Developing thinking skills in the course of academic writing. Procedia-Social and Behavioral Sciences, 93, 508-511. DOI: https://doi.org/10.1016/j.sbspro.2013.09.229

Krahmer, E., & Ummelen, N. (2004). Thinking about thinking aloud: A comparison of two verbal protocols for usability testing. IEEE Transactions on Professional Communication, 47(2), 105-117. DOI: https://doi.org/10.1109/TPC.2004.828205

Leung, C., & Lewkowicz, J. (2006). Expanding horizons and unresolved conundrums: Language testing and assessment. TESOL Quarterly, 40(1), 211-234. DOI: https://doi.org/10.2307/40264517

Lim, G. S. (2011). The development and maintenance of rating quality in performance writing assessment: A longitudinal study of new and experienced raters. Language Testing, 28(4), 543-560. DOI: https://doi.org/10.1177/0265532211406422

Long, H., & Pang, W. (2015). Rater effects in creativity assessment: A mixed methods investigation. Thinking Skills and Creativity, 15, 13-25. DOI: https://doi.org/10.1016/j.tsc.2014.10.004

Michel, M., Révész, A., Lu, X., Kourtali, N. E., Lee, M., & Borges, L. (2020). Investigating L2 writing processes across independent and integrated tasks: A mixed-methods study. Second Language Research, 36(3), 307-334. DOI: https://doi.org/10.1177/0267658320915501

Myford, C. M., & Wolfe, E. W. (2004). Detecting and measuring rater effects using many-facet Rasch measurement: Part II. Journal of Applied Measurement, 5(2), 189-223. PMID: 15064538

Nikmad, F., & Tavassoli, K. (in press). The impact of test length on raters' mental processes while scoring test-takers' writing performance. Journal of Language Horizons. DOI: https://doi.org/10.22051/LGHOR.2022.37340.1545

Plakans, L. (2010). Independent vs. integrated writing tasks: A comparison of task representation. TESOL Quarterly, 44(1), 185-195. DOI: https://doi.org/10.5054/TQ.2010.215251

Pourdana, N., Nour, P., & Yousefi, F. (2021). Investigating metalinguistic written corrective feedback focused on EFL learners' discourse markers accuracy in mobile-mediated context. Asian-Pacific Journal of Second and Foreign Language Education, 6(7). DOI: https://doi.org/10.1186/s40862-021-00111-8

Ruiz-Funes, M. (2001). Task representation in foreign language reading-to-write. Foreign Language Annals, 34(3), 226-234. DOI: https://doi.org/10.1111/j.1944-9720.2001.tb02404.x

Şahan, Ö., & Razı, S. (2020). Do experience and text quality matter for raters' decision-making behaviors? Language Testing, 37(3), 311-332. DOI: https://doi.org/10.1177/0265532219900228

Shi, B., Huang, L., & Lu, X. (2020). Effect of prompt type on test-takers' writing performance and writing strategy use in the continuation task. Language Testing, 37(3), 361-388. DOI: https://doi.org/10.1177/0265532220911626

Suskie, L. (2008). Using assessment results to inform teaching practice and promote lasting learning. In G. Joughin (Ed.), Assessment, learning and judgment in higher education (pp. 1-20). Springer Science & Business Media. DOI: https://doi.org/10.1007/978-1-4020-8905-3_8

Swales, J. M., & Feak, C. B. (2004). Academic writing for graduate students: Essential tasks and skills (Vol. 1). University of Michigan Press. DOI: https://doi.org/10.3998/mpub.2173936

Uludag, P., McDonough, K., & Payant, C. (2021). Does prewriting planning positively impact English L2 students' integrated writing performance? Canadian Journal of Applied Linguistics, 24(3), 166-185. DOI: https://doi.org/10.37213/cjal.2021.31313

Van Moere, A. (2014). Raters and ratings. In A. Kunnan (Ed.), The companion to language assessment, Vol. III: Evaluation, methodology, and interdisciplinary themes (pp. 1358-1374). John Wiley & Sons, Inc. DOI: https://doi.org/10.1002/9781118411360.wbcla106

Weigle, S. C. (1998). Using FACETS to model rater training effects. Language Testing, 15(2), 263-287. DOI: https://doi.org/10.1177/026553229801500205

Weigle, S. C. (2002). Assessing writing. Cambridge University Press.

Wolfersberger, M. A. (2007). Second language writing from sources: An ethnographic study of an argument essay task. Unpublished doctoral dissertation, University of Auckland, Auckland, NZ.

Zabihi, R., Mehrani-Rad, M., & Khodi, A. (2019). Assessment of authorial voice strength in L2 argumentative written task performances: Contributions of voice components to text quality. Journal of Writing Research, 11(2), 331-352. DOI: https://doi.org/10.17239/jowr-2019.11.02.04

Zanders, C. J., & Wilson, E. (2019). Holistic, local, and process-oriented: What makes the University of Utah's Writing Placement Exam work. Assessing Writing, 41, 84-87. DOI: https://doi.org/10.1016/j.asw.2019.06.003

Published
2022-12-26
How to Cite
Tavassoli, K., Bashiri, L., & Pourdana, N. (2022). Experienced and Novice L2 Raters' Cognitive Processes while Rating Integrated and Independent Writing Tasks. Journal of Language and Education, 8(4), 169-181. https://doi.org/10.17323/jle.2022.13466