Synchronic and Diachronic Predictors of Socialness Ratings of Words
Introduction: Recent work has introduced and studied a new psycholinguistic concept, the socialness of a word. A socialness rating reflects a word's social significance, and dictionaries of socialness ratings have been compiled using either surveys or machine methods. Unfortunately, the dictionaries compiled by survey are relatively small.
Purpose: The study aims to compile a large dictionary of English word socialness ratings by machine extrapolation, to transfer the rating estimates to other languages, and to obtain diachronic models of socialness ratings.
Method: The socialness ratings of words are estimated using multilayer feedforward neural networks. To obtain synchronic estimates, pre-trained fastText vectors were used as input. To obtain diachronic estimates, word co-occurrence statistics from a large diachronic corpus were used.
Results: The Spearman's correlation coefficient between human and machine socialness ratings is 0.869. The trained models yielded socialness ratings for 2 million English words, as well as for a wide range of words in 43 other languages. An unexpected result is that a linear model provides a highly accurate estimate of the socialness ratings, one that can hardly be improved further. Apparently, this is because the space of word vectors contains a distinguished direction responsible for meanings associated with socialness, driven by social factors that influence word representation and use. The article also presents a diachronic neural network predictor of concreteness ratings that uses word co-occurrence vectors as input data. It is shown that data for a single year from the large diachronic corpus Google Books Ngram are enough to obtain accuracy comparable to that of synchronic estimates.
Conclusion: The resulting large machine-generated dictionary of socialness ratings can be used in psycholinguistic and cultural studies. Changes in socialness ratings can serve as a marker of word meaning change and can be used in lexical semantic change detection.
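The observation in the Results that a linear model is already highly accurate can be illustrated with a small sketch. It is not the authors' pipeline: the word vectors and ratings below are synthetic stand-ins (random vectors instead of fastText vectors, and ratings generated from a hidden direction plus noise), and the names `fit_linear`, `predict`, and `spearman` are hypothetical helpers introduced only for this illustration. The sketch fits a ridge-regularized linear predictor and evaluates it with Spearman's correlation on held-out words.

```python
import numpy as np

def spearman(a, b):
    """Spearman correlation for tie-free data: Pearson correlation of ranks."""
    ranks_a = np.argsort(np.argsort(a))
    ranks_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])

def fit_linear(X, y, l2=1.0):
    """Ridge regression with a bias column, solved in closed form.
    For simplicity the penalty is also applied to the bias term."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(X, w):
    """Apply the learned weights (last entry of w is the bias)."""
    return X @ w[:-1] + w[-1]

# Synthetic data (assumption): ratings depend on one direction in vector space.
rng = np.random.default_rng(0)
n_words, dim = 2000, 50
X = rng.normal(size=(n_words, dim))            # stand-in for word vectors
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)         # hidden "socialness" direction
ratings = X @ direction + 0.5 * rng.normal(size=n_words)

# Train on the first 1600 words, evaluate on the remaining 400.
train, test = slice(0, 1600), slice(1600, None)
w = fit_linear(X[train], ratings[train])
rho = spearman(predict(X[test], w), ratings[test])

# Cosine similarity between the learned weights and the hidden direction.
cos = float(w[:-1] @ direction / np.linalg.norm(w[:-1]))
print(f"Spearman rho on held-out words: {rho:.3f}")
print(f"cosine(learned weights, hidden direction): {cos:.3f}")
```

If ratings really are governed by a single direction, as in this toy setup, the learned weight vector aligns closely with it, and a nonlinear network has little left to improve. With real data, the rows of `X` would be pre-trained word vectors and `ratings` the human survey ratings.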
