The Foreign Language Teaching Anxiety Scale: Preliminary Tests of Validity and Reliability

Although anxiety in the foreign language learning context has been studied extensively, the anxiety experienced by foreign language teachers, who are important stakeholders of classroom contexts and language learners themselves, seems to be overlooked. While research mainly focuses on foreign language anxiety in a learning context, there is not sufficient research to contextualize foreign language teaching anxiety (FLTA). In addition, in the current literature, few studies were performed to measure FLTA. In light of this, this study aims to present the preliminary results of the validity and reliability of the Foreign Language Teaching Anxiety Scale (FLTAS). A background questionnaire and the FLTAS were administered to 100 senior pre-service teachers of English as a foreign language (EFL), before performing Cronbach’s Alpha and exploratory factor analysis. The findings showed that the scale obtains a high reliability coefficient and internal consistency in a five-factor solution. The study ends with recommendations for further research.


Introduction
Language teachers' emotions constitute a burgeoning field of research that acknowledges the emotional labor of the profession; accordingly, Mercer and Gregersen (2020) stated that language teaching, as inherently emotional work, can generate feelings of anger, frustration, disappointment, and anxiety as well as positive feelings such as happiness, excitement, delight, and joy. In line with the research focusing on language teachers' emotional labor, the current study focuses on a specific teacher emotion, foreign language teaching anxiety (FLTA). While many studies have appeared on foreign language anxiety concerning its identification, causes, and effects in the context of learning, FLTA has not drawn much attention among researchers (Tüm, 2012, 2015). The same tendency to investigate predominantly the psychology of language learners while neglecting the psychology of language teachers exists in the field of language learning psychology (Mercer, 2018). More narrowly, while foreign language anxiety from the learners' perspective and the ways to measure it in a valid and reliable way have been popular research issues (e.g., Horwitz, 2010; MacIntyre & Gardner, 1994), little research has appeared on the complex nature of foreign language teaching anxiety (Horwitz, 1996). It is mostly associated with perceiving language teachers as speakers of the target language in the classroom context (Horwitz, 1996). This view is consistent with previous research findings showing the relationship between anxiety and performance in the target language (Woodrow, 2006) or listening comprehension (Bekleyen, 2009; Elkhafaifi, 2005). Among these, some studies reported foreign language anxiety among participants who are language teachers or teacher candidates (Bekleyen, 2009; Tüm, 2015). Nevertheless, no rigorous measurement tool is available to investigate the anxiety of foreign language teachers.
The current study intends to fill this gap by providing a reliable and valid tool to measure FLTA.
Measuring FLTA is potentially important for understanding a major negative emotion for foreign language teachers, anxiety (Mercer, 2018); thus, more light can be shed on the emotional labor of foreign language teachers and its potential impact on teacher attrition (Acheson, Taylor, & Luna, 2016). Moreover, research has shown that learners and teachers are not even aware of the debilitating and subtle factors of anxiety (Tran, Baldauf, & Moni, 2013). However, Aydın (2016) showed that the factors underlying FLTA are not necessarily related to the anxiety of teachers as foreign language speakers. Instead, many factors related to teaching a foreign language in a classroom context appeared. Therefore, in this paper, we present the Foreign Language Teaching Anxiety Scale (FLTAS), a scale designed to measure FLTA, together with data on its validity and reliability.

Aydin, S., & Ustuk, O. (2020). The Foreign Language Teaching Anxiety Scale: Preliminary Tests of Validity and Reliability. Journal of Language and Education, 6(2), 44-55. https://doi.org/10.17323/jle.2020.10083

A brief overview of foreign language (teaching) anxiety
Theoretically, anxiety is one of the most commonly studied affective factors in the field of applied linguistics. Anxiety experienced by language learners has been categorized into three main types. First, Scovel (1978) defined trait anxiety to conceptualize the dispositional type of anxiety, that is, anxiety as a behavioral pattern. Second, state anxiety was suggested by Spielberger (1983) to explain anxiety that emerges as a temporary emotion attributed to a particular moment and situation. Finally, situation-specific anxiety was proposed to conceptualize anxiety that is associated with specific situations and events. Foreign language classroom anxiety, as proposed by Horwitz et al. (1986), is described as a situation-specific anxiety. They also proposed three constructs that constitute it: communication apprehension, test anxiety, and fear of negative evaluation. Thus far, their theory has provided a solid understanding of anxiety experienced by foreign language learners. This situation-specific anxiety is naturally experienced by foreign language teachers, who are also life-long language learners. However, this theoretical framework needs to be revised to bring a more holistic explanation of the anxiety that foreign language teachers experience in the classroom context. FLTA, first discussed by Horwitz (1996), has conventionally not been treated separately from anxiety in the foreign language learning context. In other words, foreign language teachers are assumed to experience anxiety while teaching in the classroom mostly because they are also language learners. This perspective was also echoed by Mercer (2018), who suggested that anxiety as a negative teacher emotion might be provoked among non-native foreign language teachers by their low language proficiency and/or self-efficacy. Despite their importance, these views do not sufficiently capture the complexity of FLTA.
While Horwitz (1996) claimed that teachers experience anxiety because they are still language learners, Aydın (2016) stated that anxiety in the learning context may differ from anxiety in the teaching context (p. 629). Merç (2011) noted that FLTA has not been defined in the related literature and underlined several factors that included classroom management, specific language teaching approaches, or power-related issues such as supervisor-teacher relations. Drawing on this issue, Aydın (2016) defined FLTA in his qualitative study as "an emotional and affective state that a teacher feels tension due to personal, perceptional, motivational and technical concerns before, during and after teaching activities" (p. 639). In short, the controversy regarding FLTA remains, and the contextual factors underlying FLTA need to be investigated. Such an investigation can lead to a more holistic understanding that goes beyond viewing FLTA as a type of foreign language anxiety experienced by language learners. This requires descriptive studies that examine the relationship between levels of foreign language teaching anxiety and the factors that may influence them. To this end, a measurement tool needs to be developed in order to identify the components of FLTA and to support a deeper understanding of the relationships between anxiety levels and potential variables in a descriptive research context.

Literature Review
Since there was a focus in language education on learner-centeredness, psychological studies within the field mostly aimed to empower learners in language learning, whereas little attention has been paid to FLTA (Mercer, Oberdorfer, & Saleem, 2016). Similarly, Mercer (2018) underlined the imbalance between the research focused on learners and teachers and argued that this imbalance should be addressed. They argued that there is an urgent need to have better insight into teachers' psychological responses to education. Furthermore, Mercer et al. (2016) stated that positive "teacher psychology is not only beneficial for teachers themselves, but teachers' well-being is vital for learners, too" (p. 216). However, the current literature shows that language teachers suffer from a variety of stressors that affect the positive psychology of language teachers. Below, studies on language teachers' emotions and various stressors are synthesized. Then, the literature in a narrower focus on FLTA is presented.
Research shows that there are certain stressors that are specific to foreign language teachers (Cowie, 2011; Wieczorek, 2014, 2016). This is probably because foreign language teaching requires many non-native teachers to use a language within an instructional context. Since using a language is a skill-based competence (Mercer et al., 2016), the low performance of non-native foreign language teachers can arguably lead to foreign language anxiety among those teachers, which eventually has negative consequences on the process of language teaching (Horwitz, 1996).
Specifically, studies that regarded FLTA as an affective factor for foreign language learners mainly used a qualitative research design and focused on the factors that cause FLTA. In addition to the use of the target language (Horwitz, 1996; İpek, 2016; Mercer, 2018; Tüm, 2012, 2015), some other factors include insufficient grammatical knowledge, difficulties with time management (Numrich, 1996), mentor observations, low levels of language proficiency, problems related to classroom management (Kim & Kim, 2004), and a lack of familiarity with technology (Merç, 2011). Moreover, the fear of failure (İpek, 2006, 2016), low levels of learner proficiency (Kongchan & Singhasiri, 2008), using the native language while teaching (İpek, 2016), and the lack of preparation (Yoon, 2012) are related factors that may provoke FLTA. Furthermore, research indicates that FLTA is interrelated with pedagogical competence and the use of the target language (Tüm, 2012), while Güngör and Yaylı (2012) demonstrated that FLTA was not correlated with self-efficacy.
Two earlier studies addressed scale development for FLTA. In the first one, a holistic scale that assessed FLTA was developed by Kim and Kim (2004). They administered a 30-item test to 147 Korean in-service EFL teachers. In the study, while Cronbach's Alpha was calculated as .96, no factor analysis was performed. Therefore, serious concerns may emerge in regard to the validity of the scale. In the second scale development study, İpek (2006) constructed a 26-item, five-factor scale as a result of a two-phase doctoral dissertation study. The first phase was a qualitative study to compose the item pool for the scale. Based on a diary study with 32 non-native EFL teachers, a preliminary scale with 42 items was structured. After the pilot test, the final version of the scale was administered to 241 in-service non-native EFL teachers. The reliability of the scale was calculated at .92; moreover, a series of factor analyses reduced the number of items to 26. In this comprehensive inquiry, FLTA was discussed as a teacher affect that is related to factors such as teaching a particular language skill, worrying about target language performance, making mistakes, being compared to colleagues, and using their native language instead of the target language. Here, it is also important to note that Merç (2010) investigated the experiences of pre-service EFL teachers and developed a scale in his dissertation called the Foreign Language Student-Teacher Anxiety Scale. However, his scale was limited to pre-service teachers.
In conclusion, several issues drawing on the existing literature and prior discussion on FLTA guided this study. First, while the literature mainly focuses on anxiety in a language-learning context rather than a teaching context, there is a strong need to contextualize FLTA as a complex construct that includes aspects of both the teacher as a language learner/user and the teacher as an instructor. A measurement tool such as the FLTAS can include multiple factors as perspectives to better understand the complex phenomenon of FLTA. For this purpose, a new measurement tool should be developed since only one study has focused on the reliability of the scales that aim to measure FLTA and there are no studies regarding validity. Second, given that the learning and teaching contexts are different, valid and reliable tools should be developed to perform descriptive studies. In this way, it will be possible to reach a comprehensive definition of FLTA. To conclude, this study aims to present preliminary results of the development of the FLTAS in terms of reliability and validity. The paper consists of the initial findings of a longer project that aimed to redefine FLTA and contribute to the ongoing discussions on teachers' emotions with a specific focus on FLTA from a post-positivist perspective. The findings of the pilot study and preliminary tests of the FLTAS are presented. Accordingly, two research aims drove the current study. First, we wanted to measure the reliability scores of the FLTAS. Second, the factor structure of the FLTAS was investigated and discussed in the light of the related theoretical framework.

Methodology

Participants
The participants in this study were 100 senior pre-service EFL teachers enrolled in a state university in Turkey. Of the participants, 71 were female and 29 were male. The gender distribution was a reflection of the overall population in the Department of English Language Teaching at the research site. Their mean age was 22.4 with a range of 21 to 28. The participation criteria included at least one semester of teaching practice. Given the practicum practice in Turkey, all of the participants had taught more than one semester before participating in the current study.
All of the participants were senior pre-service teachers who had their teaching practicum during the fall and spring semesters of the 2017-2018 academic year. This means they had school experience including teaching practice at various state schools. In the Turkish context, pre-service EFL teachers are undergraduate students, who study in an eight-semester English Language Teaching BA program. Students may start teaching in their fifth semester as community service, which is a part of their formal studies. Their practicum starts in the seventh semester and continues until their last semester of their undergraduate program; accordingly, they observe and teach EFL in public schools designated by the administration of the local public school districts. Therefore, senior pre-service teachers had at least one semester of classroom EFL teaching experience by the time the data were collected.

Procedure
The study consisted of three main phases: (1) qualitative data collection, (2) designing and administering the FLTAS, and (3) statistical procedure. The procedure of the scale development framework suggested by DeVellis (2016) was followed for the FLTAS. The three phases that constituted this procedure are as follows.

Phase 1: Qualitative data collection
To obtain items for the FLTAS, a qualitative data collection procedure was carried out, which was reported in the previous study authored by Aydın (2016). The participant group in this step consisted of 60 pre-service teachers of EFL who were studying in the English Language Teaching (ELT) Department of a state university. The group contained 31 female (51.7%) and 29 male (48.3%) students with a mean age of 21.6 within the range of 20 to 28. The first data collection tool was a background survey examining the participants' ages and gender. These variables were specifically investigated because age (e.g., Dewaele, 2007; Onwuegbuzie, Bailey, & Daley, 1999) and gender (e.g., Dewaele, MacIntyre, Boudreau, & Dewaele, 2016) were found to significantly influence foreign language learner anxiety, but little is known about their influence on FLTA. Second, essay papers, reflections, and semi-structured interviews were utilized to collect qualitative data. The participants reflected systematically on their teaching activities with respect to what they learned, how they felt about their teaching performances, problems they encountered in their practices, and the strategies they developed to overcome these problems (if any). The first author supervised the participants during the data collection procedure, interviewed them, and instructed the participants on other essay papers and reflections. All data were collected in the participants' native language (Turkish), and translations were member-checked with the designated participant for validity purposes before they were used as data excerpts in the study. As the participants were pre-service EFL teachers, they felt proficient enough to check the data that were related to them.
The procedure of this stage included instruction, practice, data collection, and analysis. In other words, the participants were instructed about general topics on teaching EFL from a theoretical perspective. Throughout the practicum, when they were assigned to teach in actual classroom settings, the participants wrote reflections and essay papers. They were also interviewed regarding specific details about their teaching activities, their performance, the problems encountered, and the strategies to solve the potential problems. The reason for using the three data sources was to ensure the trustworthiness of the data. After the statements related to teaching anxiety were found, the data were transferred into three concept maps. Since the triangulation indicated that the data obtained were trustworthy and valid, the data from the three concept maps were combined and listed. Below, Figure 1 is a sample concept map.

Figure 1
Sample concept map as shown in Aydın (2016, p. 635)

As reported in the study, the author indicated that several anxiety-provoking factors emerged, such as personality aspects, perceived proficiency and language skills, fear of negative evaluation (both as a language speaker and as a teacher), teaching demotivation and amotivation, lack of experience, and technical and technological concerns (Aydın, 2016). Following the content analysis of the data, an array of sources of teaching anxiety was also presented as a list (p. 636). Drawing and building upon this earlier study, published as part of the same research agenda as the current study, the construction of the FLTAS began. The process continued as described in the following phases.

Phase 2: Designing and administering the FLTAS
From the data obtained from the qualitative research, 45 items in relation to teaching anxiety comprised the item pool and were utilized in the FLTAS. As suggested by DeVellis (2016), the 45 items were written to reflect the purpose of the FLTAS. As for the response format of the items, a Likert scale ranging from one to five (never=1, rarely=2, sometimes=3, often=4, always=5) was utilized. These 45 items on the Likert scale constituted the pilot form of the FLTAS.
The pilot form was reviewed by two external experts who had experience teaching English as a foreign language. One of them was a native speaker EFL teacher, whereas the other was an experienced non-native EFL teacher. They agreed upon the comprehensibility of the items as well as their scope.

Phase 3: Statistical procedure
The data collected were analyzed via the Statistical Package for Social Sciences (SPSS, v. 21.0) software. After participants' gender frequencies were found, the mean score for the participants' age was calculated. Then, Cronbach's Alpha was calculated to determine the reliability of the items in the FLTAS. Finally, an exploratory factor analysis was carried out to examine the construct validity of the FLTAS. To accomplish this, principal component analysis and the Varimax method were performed. After this step, 18 items that did not function well or relate to any factor were removed from the scale, leaving 27 items in the FLTAS (see Appendix A).
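The extraction and rotation steps named above can be sketched as follows. This is an illustrative NumPy stand-in, not the SPSS procedure the authors ran: the function names (`pca_loadings`, `varimax`), the iteration settings, and the synthetic 100-by-6 data are all assumptions introduced here, and SPSS applies additional options (for example, Kaiser normalization) that this sketch omits.

```python
import numpy as np


def pca_loadings(data, n_factors):
    """Unrotated principal-component loadings from the correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # reorder, largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loadings = eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation using the common SVD-based iteration."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):   # no further improvement
            break
        criterion = new_criterion
    return loadings @ rotation


# Synthetic stand-in data (100 respondents, 6 items); NOT the study's data.
rng = np.random.default_rng(0)
fake = rng.normal(size=(100, 6))
eigvals, unrotated = pca_loadings(fake, n_factors=2)
rotated = varimax(unrotated)
# Share of total variance attributed to each rotated component.
explained = (rotated**2).sum(axis=0) / fake.shape[1]
```

Because the rotation is orthogonal, each item's communality (the row sum of squared loadings) is the same before and after rotation; only the distribution of variance across components changes.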

Descriptive data
The possible total score on the 27-item FLTAS ranges from 27 to 135; the scores obtained from the data set ranged from 32 to 126, with a mean of 70.89 and a standard deviation of 21.80. The range value was 4 for every item on the 1-5 response scale. Below, the descriptive details for each item in the FLTAS are given.
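As a rough check on these figures, the possible total-score range follows directly from the 27 retained items and the 1-5 response format. The sketch below is illustrative only; the function name `total_score` and the sample response vector are hypothetical and are not data from the study.

```python
# Response options of the FLTAS items: never=1 ... always=5.
MIN_OPTION, MAX_OPTION = 1, 5
N_ITEMS = 27  # items retained after the factor analysis


def total_score(responses):
    """Sum one respondent's item responses after validating them."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    if any(r < MIN_OPTION or r > MAX_OPTION for r in responses):
        raise ValueError("responses must lie between 1 and 5")
    return sum(responses)


# Theoretical bounds of the total score.
lowest = N_ITEMS * MIN_OPTION    # 27
highest = N_ITEMS * MAX_OPTION   # 135

# Hypothetical respondent who answers "sometimes" (3) to every item.
example = total_score([3] * N_ITEMS)  # 81
```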

Reliability
The values demonstrated that the reliability level of the FLTAS was acceptable. In other words, the internal consistency was found to be .95 for Cronbach's Alpha as illustrated in Table 2. Furthermore, Table 3 presents reliability scores for each factor constituting the FLTAS.
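For readers who want to reproduce this kind of coefficient outside SPSS, Cronbach's Alpha can be computed as k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total scores), where k is the number of items. The following is a minimal sketch under that standard formula; the function name and the toy data are assumptions made here and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores.

    rows: list of equal-length lists, one list per respondent.
    """
    k = len(rows[0])                      # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical toy data (4 respondents, 3 items), NOT from the study.
toy = [
    [1, 2, 1],
    [2, 2, 3],
    [4, 3, 4],
    [5, 5, 4],
]
alpha = cronbach_alpha(toy)  # approximately 0.93
```

The same function could be applied per factor by passing only the columns of the items that load on that factor.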

Validity
As previously noted, the FLTAS was analyzed by exploratory factor analysis, using principal component analysis with Varimax rotation. The items and their loadings on each factor, presented in Tables 4 and 5, indicated that the rotated factors explained 69.09% of the variance. The 12 items loaded on the first factor explained 45.47% of the variance. With the five items loaded on the second factor, the cumulative percentage rose to 53.57%; with the four items loaded on the third factor, to 59.70%; and with the three items loaded on the fourth factor, to 65.07%. Finally, with the three items loaded on the fifth factor, the cumulative percentage reached 69.09%. In sum, a five-factor solution was found to account for 69.09% of the variance. The eigenvalues, the scree test, and the amount of variance explained showed that the FLTAS reached an optimal factor solution, as seen in Table 3 and Figure 2.
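The percentages reported here end at the 69.09% total, so they are most naturally read as cumulative. Under that reading (an assumption of this sketch, not a statement from the authors' tables), the share contributed by each individual factor can be recovered by differencing, as the short example below shows; the variable names are illustrative.

```python
# Cumulative percentages of variance as reported for factors 1 to 5.
cumulative = [45.47, 53.57, 59.70, 65.07, 69.09]
# Number of items reported as loading on each factor.
items_per_factor = [12, 5, 4, 3, 3]

# Difference consecutive cumulative values to get each factor's own share.
previous = [0.0] + cumulative[:-1]
per_factor = [round(c - p, 2) for p, c in zip(previous, cumulative)]
# per_factor is [45.47, 8.1, 6.13, 5.37, 4.02]

total_items = sum(items_per_factor)  # 27, matching the final scale length
```

On this reading, the first factor accounts for roughly two-thirds of the explained variance, which may be worth keeping in mind when interpreting the five-factor structure.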

Figure 2
Scree plot

Discussion
This preliminary study was performed to develop and examine the FLTAS. Regarding the first research aim, the FLTAS showed a high level of reliability. Results showed that the FLTAS is as reliable as the scale developed by Kim and Kim (2004). Their tool consisted of 30 items and its reliability was calculated as .96; however, the calculation method was not specified. The FLTAS is composed of fewer items (n=27) and had a very similar reliability score (.95), which was calculated using Cronbach's Alpha. In the other FLTA scale, constructed in İpek's (2006) dissertation study, the reliability coefficient was calculated as .92 using Cronbach's Alpha.
The second conclusion reached in the study was that the scale obtained a high level of internal consistency. More specifically, the scale resulted in a five-factor solution based on pre-service teachers' self-perceptions of foreign language proficiency, teaching inexperience, lack of student interest in classes, fear of negative evaluation by observers and students, and difficulties with time management. The related items for each factor can be found in Table 4.
The theoretical background of the FLTAS can be discussed in comparison to earlier studies on FLTA. To illustrate, the FLTAS supported prior studies in terms of negative emotions among foreign language teachers in regard to teachers' (perceived) proficiency in the target language (Horwitz, 1996; Tüm, 2015), time (Numrich, 1996) and classroom management (Kim & Kim, 2004), fear of negative evaluation by mentors (Merç, 2011) and learners (İpek, 2006), and low levels of learner proficiency (Kongchan & Singhasiri, 2008). In addition to these alignments with the previous studies, the FLTAS also revealed some factors that are new to FLTA research; specifically, teaching inexperience was an important factor in the FLTAS. This was probably due to the participant profile, as participants were mostly new to teaching in a classroom setting, notwithstanding their EFL teaching practicum experience. Moreover, FLTA was also associated in this study with students' lack of interest. This might have been due to issues related to student engagement, as the items related to this factor mainly concerned negative affect among foreign language teachers as a result of a lack of student interest and engagement in EFL classrooms.

Conclusion
These conclusions provide evidence for the potential use of the FLTAS as an appropriate tool to measure teaching anxiety among foreign language teachers. On the other hand, as this study presents the results of the preliminary tests of validity and reliability of the scale, further research may focus on an additional investigation of the factor complexity in larger and more diverse sample groups to find evidence on the relationships among the factors that emerged in the current study. Readers should note that this study, which presents the preliminary results of the FLTAS, is one part of a longer research process. This process began with an earlier study by the first author, which explored the classroom phenomena underlying FLTA in a qualitative research design. As the FLTAS is still being developed, this paper presented the results of the pilot administration, which led to the 27-item FLTAS and its reliability and validity measurements. Our findings are limited to the research context, and these limitations can be addressed in future studies.
Another limitation is that the participants in the current study included pre-service teachers whose teaching experience was mostly limited to their teaching practicum. The lack of experienced/veteran non-native EFL teachers might have influenced the results. Therefore, it is very important to use the FLTAS with a wider group of foreign language teachers who have more varied backgrounds in terms of experience. Finally, further studies should also include native speaker foreign language teachers and investigate whether their insight could help researchers gain a better understanding of FLTA. In light of these limitations, the authors' research agenda includes a descriptive study to further discuss the effectiveness of the FLTAS with a larger sample that also represents the teacher population on a global scale. Therefore, the results of this preliminary research report can be developed and investigated further. Anxiety is a multifaceted and dynamic phenomenon; it cannot be limited to a certain set of universal factors. Therefore, prospective studies should consider investigating FLTA in exploratory and explanatory mixed-method studies with an ecological approach. For these future studies, the FLTAS can serve as a quantitative tool that should be supported by qualitative contextual data.
Several implications can be drawn from the findings of the preliminary research on the FLTAS. First, as teaching inexperience was an important factor underlying FLTA, administrators and policymakers should take all the necessary precautions while working with teachers with a lack of experience; accordingly, the FLTAS can serve as an applicable tool to measure the phenomenon. Secondly, as FLTA is closely related to insufficient student engagement and interest, motivating students to increase classroom engagement can be seen as a strategy to overcome FLTA; once students' interest and engagement in the classroom increase, one factor leading to FLTA can be eliminated. Nevertheless, more correlational research should be conducted to support those implications.