Automated Measures of Lexical Sophistication: Predicting Proficiency in an Integrated Academic Writing Task
Abstract
Background. Advances in the automated analysis of written discourse have made available a wide range of indices for examining the linguistic features of language users’ discourse and how those features relate to human raters’ assessments of writing.
Purpose. The present study extends previous research in this area by using the TAALES 2.2 software application to automatically extract 484 single and multi-word metrics of lexical sophistication to examine their relationship with differences in assessed L2 English writing proficiency.
Methods. Correlations and multiple regressions were run on a graded corpus of timed, integrated essays from a major academic English language test to identify the specific metrics that best predict L2 English writing proficiency scores.
Results. The most parsimonious regression model retained four predictor variables: total word count, orthographic neighborhood frequency, lexical decision time, and word naming response time. Together, these accounted for 36% of the variance in proficiency scores.
Implications. Results emphasize the importance of writing fluency (by way of total word count) in assessments of this kind. Thus, learners looking to improve writing proficiency may benefit from writing activities aimed at increasing speed of production. Furthermore, although the final regression model explained a substantial amount of variance, the findings suggest the need for a wider range of metrics that tap into additional aspects of writing proficiency.
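As an illustration of the two-step analysis summarized in the Methods and Results (correlation screening followed by multiple regression), the sketch below runs both steps on synthetic data. It is not the study's actual procedure or data: the predictor names other than `word_count`, the 0.1 screening threshold, the simulated effect sizes, and the helper functions `pearson_r` and `fit_ols` are all assumptions made for this example.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot


# Synthetic stand-in for a graded essay corpus (assumption: NOT the study's data).
rng = np.random.default_rng(0)
n_essays = 200
names = ["word_count", "index_a", "index_b", "noise_index"]  # hypothetical names
X = rng.normal(size=(n_essays, len(names)))  # standardized index values
scores = 0.6 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 2] + rng.normal(size=n_essays)

# Step 1: keep indices whose correlation with the score reaches a threshold.
threshold = 0.1  # hypothetical cut-off chosen for this example
correlations = {name: pearson_r(X[:, j], scores) for j, name in enumerate(names)}
kept = [j for j, name in enumerate(names) if abs(correlations[name]) >= threshold]

# Step 2: regress the score on the retained indices and report R^2.
beta, r_squared = fit_ols(X[:, kept], scores)
print("retained predictors:", [names[j] for j in kept])
print("R^2 of the fitted model:", round(r_squared, 3))
```

The R² printed at the end plays the same role as the 36% figure reported in the Results: the share of variance in scores that the retained predictors account for. A fuller analysis would typically also check multicollinearity among predictors before fitting, which this sketch omits.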
Copyright (c) 2023 National Research University Higher School of Economics

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish in the journal agree to the terms of its copyright policy.