Experienced and Novice L2 Raters’ Cognitive Processes while Rating Integrated and Independent Writing Tasks

Kobra Tavassoli; Leila Bashiri; Natasha Pourdana

doi:10.17323/jle.2022.13466

Kobra Tavassoli Department of ELT, Karaj Branch, Islamic Azad University, Karaj, Iran https://orcid.org/0000-0002-8246-8584
Leila Bashiri Department of ELT, Karaj Branch, Islamic Azad University, Karaj, Iran https://orcid.org/0000-0002-3851-3501
Natasha Pourdana Department of ELT, Karaj Branch, Islamic Azad University, Karaj, Iran https://orcid.org/0000-0002-5738-1137

Keywords: cognitive processes, independent writing task, integrated writing task, rating experience, verbal protocol, task type, rubric

Abstract

Introduction. Recently, there has been growing interest in the personal attributes of raters that shape the cognitive processes involved in rating written work.

Purpose. Accordingly, this study explores how raters' experience in assessing English as a second language may affect their rating of integrated and independent writing tasks.

Methods. To this end, 13 experienced and 14 novice raters from Iran were selected through criterion sampling. After attending a training course on rating writing tasks, both groups produced introspective verbal protocols while rating integrated and independent writing tasks completed by an Iranian learner of English as a foreign language. The verbal protocols were audio-recorded and transcribed, and their content was analyzed by the researchers.

Results. The content analysis yielded six main themes in the raters' comments: content, formal requirements, general linguistic range, language use, mechanics of writing, and organization. The results showed that the type of writing task (integrated or independent) is a determining factor in the number of comments that experienced and novice raters made with reference to the TOEFL-iBT rating rubric. Moreover, the raters' rating experience determined the proportions of the comments they made. Nevertheless, the proportional differences between the comments of experienced and novice raters were statistically significant only for language use, mechanics of writing, organization, and the total number of comments.

Conclusion. The variations in L2 raters' rating performance on integrated and independent writing tasks underscore the urgent need for professional training of both experienced and novice raters in using and interpreting the components of various writing rating scales.
