Hope Speech Detection Using Social Media Discourse (Posi-Vox-2024): A Transfer Learning Approach

  • Muhammad Ahmad Instituto Politecnico Nacional (CIC-IPN), Mexico City, Mexico https://orcid.org/0009-0003-8799-8212
  • Usman Sardar Institute of Arts and Culture, Lahore, Pakistan
  • Farid Humaira Independent Researcher, California, USA
  • Ameer Iqra Pennsylvania State University at Abington, PA, USA
  • Muhammad Muzzamil Islamia University of Bahawalpur, Pakistan
  • Ameer Hmaza Islamia University of Bahawalpur, Pakistan
  • Grigori Sidorov Instituto Politecnico Nacional (CIC-IPN), Mexico City, Mexico
  • Ildar Batyrshin Instituto Politecnico Nacional (CIC-IPN), Mexico City, Mexico https://orcid.org/0000-0003-0241-7902
Keywords: Hope Speech, BERT, Mashine learning, Twitter Analysis, Social Media, Transfer learning, NLP

Abstract

Background: The notion of hope is characterized as an optimistic expectation or anticipation of favorable outcomes. In the age of extensive social media usage, research has primarily focused on monolingual techniques, and the Urdu and Arabic languages have not been addressed.

Purpose: This study addresses joint multilingual hope speech detection in the Urdu, English, and Arabic languages using a transfer learning paradigm. We developed a new multilingual dataset named Posi-Vox-2024 and employed a joint multilingual technique to design a universal classifier for multilingual dataset. We explored the fine-tuned BERT model, which demonstrated a remarkable performance in capturing semantic and contextual information.

Method: The framework includes (1) preprocessing, (2) data representation using BERT, (3) fine-tuning, and (4) classification of hope speech into binary (‘hope’ and ‘not hope’) and multi-class (realistic, unrealistic, and generalized hope) categories.

Results: Our proposed model (BERT) demonstrated benchmark performance to our dataset, achieving 0.78 accuracy in binary classification and 0.66 in multi-class classification, with a 0.04 and 0.08 performance improvement over the baselines (Logistic Regression, in binary class 0.75 and multi class 0.61), respectively.

Conclusion: Our findings will be applied to improve automated systems for detecting and promoting supportive content in English, Arabic and Urdu on social media platforms, fostering positive online discourse. This work sets new benchmarks for multilingual hope speech detection, advancing existing knowledge and enabling future research in underrepresented languages.

Downloads

Download data is not yet available.

References

Alawadh, H. M., Alabrah, A., Meraj, T., & Rauf, H. T. (2023). English language learning via YouTube: An NLP-based analysis of users’ comments. Computers, 12(2), 24.

Anand, M., Sahay, K. B., Ahmed, M. A., Sultan, D., Chandan, R. R., & Singh, B. (2023). Deep learning and natural language processing in computation for offensive language detection in online social networks by feature selection and ensemble classification techniques. Theoretical Computer Science, 943, 203-218.

Anjum, & Katarya, R. (2024). Hate speech, toxicity detection in online social media: a recent survey of state of the art and opportunities. International Journal of Information Security, 23(1), 577-608.

Arif, M., Shahiki Tash, M., Jamshidi, A., Ullah, F., Ameer, I., Kalita, J., ... & Balouchzahi, F. (2024). Analyzing hope speech from psycholinguistic and emotional perspectives. Scientific Reports, 14(1), 23548.

Austin, D., Sanzgiri, A., Sankaran, K., Woodard, R., Lissack, A., & Seljan, S. (2020). Classifying sensitive content in online advertisements with deep learning. International Journal of Data Science and Analytics, 10(3), 265-276.

Balouchzahi, F., Sidorov, G., & Gelbukh, A. (2023). Polyhope: Two-level hope speech detection from tweets. Expert Systems with Applications, 225, 120078.

Chakravarthi, B. R. (2022). Hope speech detection in YouTube comments. Social Network Analysis and Mining, 12(1), 75.

Chakravarthi, B. R. (2022). Multilingual hope speech detection in English and Dravidian languages. International Journal of Data Science and Analytics, 14(4), 389-406. https://doi.org/10.1007/s41060-022-00341-0

Chinnappa, D. (2021). Dhivya-hope-detection@ LT-EDI-EACL2021: multilingual hope speech detection for code-mixed and transliterated texts. In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion (pp. 73-78). Publisher. https://aclanthology.org/2021.ltedi-1.11

Davidson, T., Bhattacharya, D., & Weber, I. (2019). Racial bias in hate speech and abusive language detection datasets. arXiv preprint arXiv:1905.12516.

Gowen, K., Deschaine, M., Gruttadara, D., & Markey, D. (2012). Young adults with mental health conditions and social networking websites: seeking tools to build community. Psychiatric Rehabilitation Journal, 35(3), 245.

Ghanghor, N., Ponnusamy, R., Kumaresan, P. K., Priyadharshini, R., Thavareesan, S., & Chakravarthi, B. R. (April). IIITK@ LT-EDI-EACL2021: Hope speech detection for equality, diversity, and inclusion in Tamil, Malayalam and English. In Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion (pp. 197-203). Publisher.

Irfan, A., Azeem, D., Narejo, S., & Kumar, N. (2024, January). Multi-Modal Hate Speech Recognition Through Machine Learning. In 2024 IEEE 1st Karachi Section Humanitarian Technology Conference (KHI-HTC) (pp. 1-6). IEEE.

Khouloud Mnassri, Reza Farahbakhsh, Razieh Chalehchaleh, Praboda Rajapaksha, Amir Reza Jafari, Li, G., & Crespi, N. (2024). A survey on multi-lingual offensive language detection. PeerJ. Computer Science, 10, e1934–e1934. https://doi.org/10.7717/peerj-cs.1934

Kogilavani, S. V., Malliga, S., Jaiabinaya, K. R., Malini, M., & Kokila, M. M. (2023). Characterization and mechanical properties of offensive language taxonomy and detection techniques. Materials Today: Proceedings, 81, 630-633. https://doi.org/10.1016/j.matpr.2021.04.102

Kumar, A. Saumya, S., & Roy, P. (2022, May). SOA_NLP@ LT-EDI-ACL2022: An ensemble model for hope speech detection from YouTube comments. In Proceedings of the second workshop on language technology for equality, diversity and inclusion (pp. 223-228). DOI: 10.18653/v1/2022.ltedi-1.31

Lee, Y., Yoon, S., & Jung, K. (2018). Comparative studies of detecting abusive language on twitter. arXiv preprint arXiv:1808.10245.

Louati, A., Louati, H., Albanyan, A., Lahyani, R., Kariri, E., & Alabduljabbar, A. (2024). Harnessing machine learning to unveil emotional responses to hateful content on social media. Computers, 13(5), 114.

Malik, M. S. I., Nazarova, A., Jamjoom, M. M., & Ignatov, D. I. (2023). Multilingual hope speech detection: A Robust framework using transfer learning of fine-tuning RoBERTa model. Journal of King Saud University-Computer and Information Sciences, 35(8), 101736.

Mnassri, K., Farahbakhsh, R., Chalehchaleh, R., Rajapaksha, P., Jafari, A. R., Li, G., & Crespi, N. (2024). A survey on multi-lingual offensive language detection. PeerJ Computer Science, 10, e1934.

Nagar, S., Barbhuiya, F. A., & Dey, K. (2023). Towards more robust hate speech detection: using social context and user data. Social Network Analysis and Mining, 13(1), 47.

Nath, T., Singh, V. K., & Gupta, V. (2023). BongHope: An annotated corpus for Bengali hope speech detection. Research Square. https://doi.org/10.21203/rs.3.rs-2819284/v1

Palakodety, S., KhudaBukhsh, A. R., & Carbonell, J. G. (2020). Hope speech detection: A computational analysis of the voice of peace. In ECAI 2020 (pp. 1881-1889). IOS Press.

RamakrishnaIyer LekshmiAmmal, H., Ravikiran, M., Nisha, G., Balamuralidhar, N., Madhusoodanan, A., Kumar Madasamy, A., & Chakravarthi, B. R. (2023). Overlapping word removal is all you need: Revisiting data imbalance in hope speech detection. Journal of Experimental & Theoretical Artificial Intelligence, 36(8), 1837–1859. https://doi.org/10.1080/0952813X.2023.2166130

Roy, P., Bhawal, S., Kumar, A., & Chakravarthi, B. R. (2022, May). IIITSurat@ LT-EDI-ACL2022: Hope speech detection using machine learning. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion (pp. 120-126). Publisher. https://aclanthology.org/2022.ltedi-1.13

Schmidt, A., & Wiegand, M. (April). A survey on hate speech detection using natural language processing. In Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media (pp. 1-10). Publisher.

Snyder, C. R., Rand, K. L., & Sigmon, D. R. (2002). Hope Theory: A Member of the Positive Psychology Family. In C. R. Snyder, & S. J. Lopez (Eds.), Handbook of Positive Psychology (pp. 257-276). Oxford University Press.

Subramanian, M., Sathiskumar, V. E., Deepalakshmi, G., Cho, J., & Manikandan, G. (2023). A survey on hate speech detection and sentiment analysis using machine learning and deep learning models. Alexandria Engineering Journal, 80, 110-121.

Wang, Z., & Jurgens, D. (2018). It’s going to be okay: Measuring access to support in online communities. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 33-45). Publisher.

Yates, A., Cohan, A., & Goharian, N. (2017). Depression and self-harm risk assessment in online forums. arXiv preprint arXiv:1709.01848.

Yenala, H., Jhanwar, A., Chinnakotla, M. K., & Goyal, J. (2018). Deep learning for detecting inappropriate content in text. International Journal of Data Science and Analytics, 6, 273-286.

Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., & Kumar, R. (2019). Predicting the type and target of offensive posts in social media. arXiv preprint arXiv:1902.09666

Published
2024-12-30
How to Cite
AhmadM., SardarU., HumairaF., IqraA., MuzzamilM., HmazaA., SidorovG., & BatyrshinI. (2024). Hope Speech Detection Using Social Media Discourse (Posi-Vox-2024): A Transfer Learning Approach. Journal of Language and Education, 10(4), 31-43. https://doi.org/10.17323/jle.2024.22443