AI-driven language learning in higher education: an empirical study on self-reflection, creativity, anxiety, and emotional resilience in EFL learners

The analysis began with descriptive statistics to establish the statistical properties of the variables, as shown in Table 1. According to the table, all 205 respondents completed the questionnaire, so the data contain no missing values. The majority of the respondents are female, and the average age is 19 years. With respect to language proficiency level (LPL), most respondents are intermediate, followed by beginners and then advanced learners.
Result presentation of RQ1: How do the types of AI-powered feedback (e.g., corrective vs. motivational) causally affect self-reflection processes in EFL learners?
Figure 2 displays the SEM results for model 1. The figure shows that all the explanatory variables, i.e., CAIF, MAIF, Age, Gender, and LPL, are positively related to the response variable, SR, as indicated along the arrows running from the explanatory variables to the response variable. Among the explanatory variables, CAIF is positively correlated with MAIF (0.23), Gender (0.01), and LPL (0.02), but negatively correlated with Age (−0.18). MAIF is positively correlated with LPL (0.03), but negatively correlated with Age (−0.12) and Gender (−0.01). Age is positively correlated with Gender (0.16) and LPL (0.04), whereas Gender and LPL are negatively correlated (−0.01), as indicated along the double-headed arrows running between each pair of explanatory variables.

Figure 2. Result presentation of SEM for model 1.
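The path structure described for model 1 can be written compactly as a single structural equation. This is only a sketch under stated assumptions: the coefficient symbols $\beta_0, \dots, \beta_5$, the intercept, the error term $\varepsilon_i$, and the covariance symbols $\sigma_{jk}$ are notation introduced here for illustration, and every variable is treated as an observed score for learner $i$, which the text does not state explicitly.

```latex
\begin{align}
  SR_i &= \beta_0 + \beta_1\,CAIF_i + \beta_2\,MAIF_i + \beta_3\,Age_i
          + \beta_4\,Gender_i + \beta_5\,LPL_i + \varepsilon_i, \\
  \operatorname{Cov}(x_{ji}, x_{ki}) &= \sigma_{jk},
  \qquad x_j, x_k \in \{CAIF,\ MAIF,\ Age,\ Gender,\ LPL\},\ j \neq k .
\end{align}
```

In this notation, the single-headed arrows in the figure correspond to the coefficients $\beta_1, \dots, \beta_5$, and the double-headed arrows correspond to the covariances $\sigma_{jk}$ between pairs of explanatory variables.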
Table 2 displays the relationship between the type of AI-powered feedback (corrective and motivational) and the self-reflection (SR) of the EFL learners. According to the table, corrective AI-powered feedback (CAIF), such as grammar and vocabulary corrections, and motivational AI-powered feedback (MAIF), such as encouragement and progress tracking, are positively related to the SR of the EFL learners, and the impacts are statistically significant at the 5% and 10% levels, respectively. This is reinforced by the coefficient of the learners' language proficiency level (LPL), which is both positive and significant. Turning to the model fit diagnostics, the p-value of the likelihood ratio (LR) test statistic is 0.07, the comparative fit index (CFI) is 0.96, and the root mean square error of approximation (RMSEA) is 0.05. The rule is that, for a model to fit, the LR probability value must be greater than 0.05 (Suhr, 2006); acceptable model fit is indicated by a CFI value of 0.90 or greater (Hu and Bentler, 1999); and acceptable model fit is indicated by an RMSEA value of 0.06 or less (Hu and Bentler, 1999). The values of GFI, AGFI, and TLI are 0.93, 0.90, and 0.91, respectively, where acceptable values are greater than or equal to 0.90 (Hooper et al. 2008). The estimated model therefore fits the data acceptably. Considering the content of the questions used to measure the self-reflection of the EFL learners, the results indicate that corrective AI-powered feedback, such as grammar and vocabulary corrections, and motivational AI-powered feedback, such as encouragement and progress tracking, lead the EFL learners to improve their self-reflection, identify areas for improvement in their English language skills, and set goals for improving those skills.
The implications of these findings are significant for both pedagogical practice and the design of AI feedback systems. Corrective feedback, particularly in terms of language mechanics like grammar and vocabulary, provides learners with concrete areas for improvement. Meanwhile, motivational feedback, through encouragement and tracking progress, fosters a sense of achievement and forward momentum. Both types of feedback appear to facilitate a deeper level of self-reflection, enabling learners to identify their strengths and weaknesses and set meaningful goals for language improvement.
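The fit criteria applied to model 1 can be made explicit with a short check. The following is a minimal sketch, not the authors' procedure: the function name `check_fit` and the dictionary keys are hypothetical, the cutoffs are the ones cited in the text, and the model 1 values are those reported for Table 2 (reading the GFI value as 0.93).

```python
# Cutoffs as cited in the text: LR p-value > 0.05 (Suhr, 2006);
# CFI >= 0.90 and RMSEA <= 0.06 (Hu and Bentler, 1999);
# GFI, AGFI, TLI >= 0.90 (Hooper et al., 2008).
# The key names below are hypothetical labels chosen for this sketch.
CUTOFFS = {
    "lr_p": (">", 0.05),
    "cfi": (">=", 0.90),
    "rmsea": ("<=", 0.06),
    "gfi": (">=", 0.90),
    "agfi": (">=", 0.90),
    "tli": (">=", 0.90),
}


def check_fit(indices):
    """Return {index_name: True/False} for each supplied fit index."""
    results = {}
    for name, value in indices.items():
        op, cutoff = CUTOFFS[name]
        if op == ">":
            results[name] = value > cutoff
        elif op == ">=":
            results[name] = value >= cutoff
        else:  # "<="
            results[name] = value <= cutoff
    return results


# Values reported in the text for model 1.
model_1 = {"lr_p": 0.07, "cfi": 0.96, "rmsea": 0.05,
           "gfi": 0.93, "agfi": 0.90, "tli": 0.91}

report = check_fit(model_1)
print(report)
print("acceptable fit:", all(report.values()))
```

The same function could be applied to models 2 to 4 by substituting the values reported for each of them; a model is treated as acceptable here only when every supplied index meets its cutoff.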
Result presentation of RQ2: What is the causal relationship between the use of AI-powered feedback and the development of creativity in EFL learners?
Figure 3 shows the SEM results for model 2. The figure indicates that the explanatory variables AIFC, Gender, and LPL are positively related to the response variable, C, whereas Age is negatively related to it, as indicated along the arrows running from the explanatory variables to the response variable. Among the explanatory variables, there are positive correlations between AIFC and LPL (0.4), Age and Gender (0.2), and Age and LPL (0.4), and negative correlations between AIFC and Age (−0.2), AIFC and Gender (−0.02), and Gender and LPL (−0.01), as indicated along the double-headed arrows running between each pair of explanatory variables.

Figure 3. Result presentation of SEM for model 2.
Table 3 reports the results for the causal effect of AI-powered feedback on the development of creativity (C) in EFL learners. According to the table, the coefficient of creativity-related AI-powered feedback (AIFC) is positive, and the impact is statistically significant at the 1% level. This is supported by the coefficient of the learners' language proficiency level (LPL), which is also positive. Regarding the diagnostic tests of the model, the p-value of the LR test statistic is 0.06, the comparative fit index (CFI) is 0.93, and the root mean square error of approximation (RMSEA) is 0.05. Because the LR probability value is greater than 0.05 (Suhr, 2006), the values of the CFI, GFI, AGFI, and TLI are greater than 0.90 (Hu and Bentler, 1999; Hooper et al. 2008), and the value of RMSEA is below 0.06 (Hu and Bentler, 1999), the estimated model fits the data acceptably. Considering the content of the questions used to measure the creativity of the EFL learners, the results indicate that creativity-related AI-powered feedback leads the EFL learners to be creative in their English, to be confident in expressing their ideas in English, and to enjoy writing and speaking in English to express their creativity. These findings have significant implications for both educators and AI feedback system designers. By integrating creativity-oriented feedback mechanisms, educators can promote not only language proficiency but also creativity in their learners. Encouraging creativity in language learning through personalized AI-powered feedback helps learners feel more confident in their ability to express ideas in English and may inspire more engagement with creative tasks, such as writing and speaking.
Result presentation of RQ3: To what extent does AI-powered feedback reduce performance anxiety in EFL learners, and how do mediating variables (e.g., learner’s familiarity with AI, feedback delivery style) influence this relationship?
Figure 4 illustrates the SEM results for model 3. The figure shows that the explanatory variable AIFRA is positively related to the response variable, RA, while Age, LPL, Gender, AIFM1, and AIFM2 are negatively related to it, as indicated along the arrows running from the explanatory variables to the response variable. Among the explanatory variables, there are positive correlations between AIFRA and LPL (0.05), Age and Gender (0.16), and Age and LPL (0.04), and negative correlations between AIFRA and Age (−0.10), AIFRA and Gender (−0.01), and Gender and LPL (−0.01), as indicated along the double-headed arrows running between each pair of explanatory variables.

Figure 4. Result presentation of SEM for model 3.
Table 4 shows the results for the extent to which AI-powered feedback reduces performance anxiety (RA) in EFL learners, and how mediating variables (e.g., learner’s familiarity with AI (AIFM1) and feedback delivery style (AIFM2)) influence this relationship. According to the table, the coefficient of anxiety-reduction-related AI-powered feedback (AIFRA) is statistically insignificant, and its sign is not in line with the a priori expectation (i.e., negative). As for the mediating variables, neither is significant, but both satisfy the a priori expectation (i.e., negative) with respect to anxiety reduction. This is supported by the coefficient of the learners' language proficiency level (LPL), which is also negative and is statistically significant. Regarding model fit, the p-value of the LR test statistic is 0.07; the values of the CFI, GFI, AGFI, and TLI are greater than 0.90 (Hu and Bentler, 1999; Hooper et al. 2008); and the value of the RMSEA is 0.04. The estimated model therefore fits the data acceptably. With respect to the content of the questions used to measure the reduction of performance anxiety, familiarity with AI-powered feedback and the method by which learners receive feedback on their English learning (including written comments, audio/video recordings, and face-to-face conversations) partially reduce the performance anxiety of the EFL learners when speaking or writing in English, so that they worry less about making mistakes and feel less nervous when speaking English. These findings suggest that while AI-powered feedback alone may not significantly reduce performance anxiety, the way feedback is delivered and learners’ familiarity with AI tools could hold the key to alleviating anxiety in EFL contexts.
This points to the need for educators and AI developers to focus on not only the content of feedback but also the manner in which it is presented to learners.
Result presentation of RQ4: Does the integration of AI-powered feedback in language learning affect the long-term improvements in EFL learners’ emotional resilience?
Figure 5 shows the SEM results for model 4. According to the figure, the explanatory variables AIFER and Age are positively related to the response variable, ER, while Age and LPL are negatively related to it, as indicated along the arrows running from the explanatory variables to the response variable. Among the explanatory variables, there are positive correlations between AIFER and LPL (0.04), Age and Gender (0.15), and Age and LPL (0.04), and negative correlations between AIFER and Age (−0.21), AIFER and Gender (−0.05), and Gender and LPL (−0.01), as indicated along the double-headed arrows running between each pair of explanatory variables.

Figure 5. Result presentation of SEM for model 4.
Table 5 presents the results for whether the integration of AI-powered feedback in language learning affects long-term improvements in EFL learners’ emotional resilience (ER). According to the table, the coefficient of emotional-resilience-related AI-powered feedback (AIFER) is positive, and the impact is statistically significant at the 1% level. The model diagnostics show that the p-value of the LR test statistic is 0.13; the values of the CFI, GFI, AGFI, and TLI are greater than 0.90 (Hu and Bentler, 1999; Hooper et al. 2008); and the value of RMSEA is below 0.06 (Hu and Bentler, 1999). The estimated model therefore fits the data acceptably. Considering the content of the questions used to measure the emotional resilience of the EFL learners, the results suggest that emotional-resilience-related AI-powered feedback helps the EFL learners stay relaxed when facing challenges in learning English and feel confident in handling language learning setbacks. These results have significant implications for both educators and the design of AI-powered feedback systems. Emotional resilience is a key factor in sustained language learning success, particularly in environments where learners may face repeated failures or difficulties. AI-powered feedback designed to strengthen emotional resilience could play a crucial role in helping learners stay motivated and maintain a positive outlook, even when confronted with challenging language learning tasks.
Robustness check
Table 6 reports the results of the quantile regression for model 1. According to the table, CAIF and MAIF are positively related to SR, and the impacts are statistically significant at the 5% and 1% levels, respectively, at both the 0.25 and 0.75 quantiles. When the significant quantile coefficients of each variable are combined and compared, the impact of CAIF is greater than that of MAIF; thus, corrective AI-powered feedback is more effective than motivational AI-powered feedback. The lower part of the table reports the quantile slope equality test and the symmetric quantiles test. The slope equality test rejects the null hypothesis of slope equality at the 1% level, which means that the slopes differ across quantile levels. Likewise, the symmetry test rejects the null hypothesis of symmetry at the 1% level, so there is evidence of asymmetry across the quantiles. Hence, the quantile coefficients at 0.25, 0.5, and 0.75 are significantly different, and their marginal effects are of significantly different magnitudes. Therefore, CAIF and MAIF encourage the growth of SR. This quantile regression result reaffirms the SEM estimate for model 1 presented in Table 2.
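Quantile regression estimates coefficients by minimizing the check (pinball) loss rather than squared error, which is why separate estimates arise at the 0.25, 0.5, and 0.75 quantiles. The following is a minimal sketch of that loss for the simplest case of a constant predictor, assuming the standard check-loss formulation; the function names and the `scores` data are hypothetical and are not taken from the study. In a full quantile regression the constant would be replaced by a linear function of the explanatory variables.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def total_loss(values, q, tau):
    """Sum of check losses when every value is predicted by the constant q."""
    return sum(pinball_loss(v - q, tau) for v in values)


def best_constant(values, tau):
    """Constant minimizing the total check loss.

    The loss is convex and piecewise linear in q, so a minimizer is
    always attained at one of the observed values.
    """
    return min(sorted(set(values)), key=lambda q: total_loss(values, q, tau))


# Hypothetical scores, used only to illustrate the loss.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for tau in (0.25, 0.5, 0.75):
    print(tau, best_constant(scores, tau))
```

For these illustrative data the minimizers are the lower quartile, the median, and the upper quartile of `scores`, which shows how changing the quantile level shifts the fitted value toward a different part of the distribution.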
Figure 6 presents the quantile regression coefficients across the various quantiles for model 1. The figure shows that CAIF and MAIF significantly and positively influence SR at the lower and upper quantiles, and that the impact of CAIF is greater than that of MAIF.

Figure 6. Graphical presentation of coefficients of the quantile regression for model 1.
Table 7 shows the results of the quantile regression for model 2. According to the table, AIFC is positively related to C, and the impact is statistically significant at the 1% level at the 0.25 quantile. In the lower part of the table, the quantile slope equality test rejects the null hypothesis of slope equality at the 10% level, which suggests that the slopes differ across quantile levels. Moreover, the symmetry test rejects the null hypothesis of symmetry at the 5% level, which provides evidence of asymmetry across the quantiles. Hence, the quantile coefficients at 0.25, 0.5, and 0.75 are significantly different, and their marginal effects are of significantly different magnitudes. Thus, AIFC significantly promotes the growth of C. This result reiterates the SEM estimate for model 2 presented in Table 3.
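The two diagnostic tests reported in the lower part of the quantile regression tables can be stated formally. The following is a sketch that assumes the conventional Wald-type formulations of the slope equality test and the symmetric quantiles test at the three quantile levels used here; the text does not specify the exact variants applied, and the symbol $\beta(\tau)$ for the coefficient vector at quantile level $\tau$ is notation introduced for illustration.

```latex
\begin{align}
  \text{Slope equality:}\quad
    & H_0:\ \beta_k(0.25) = \beta_k(0.50) = \beta_k(0.75)
      \quad \text{for every slope coefficient } k, \\
  \text{Symmetry:}\quad
    & H_0:\ \frac{\beta(0.25) + \beta(0.75)}{2} = \beta(0.50).
\end{align}
```

Under these formulations, rejecting the first null indicates that the effect of an explanatory variable differs across the quantiles of the response, and rejecting the second indicates that the coefficients at the lower and upper quantiles are not centred on the median coefficient, that is, evidence of asymmetry.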
Figure 7 presents the quantile regression coefficients across the various quantiles for model 2. The figure shows that AIFC significantly and positively influences C at the lower quantiles.

Figure 7. Graphical presentation of coefficients of the quantile regression for model 2.
Table 8 presents the results of the quantile regression for model 3. According to the table, AIFRA is positively related to RA, but the impact is statistically insignificant, and the sign contradicts the a priori (theoretical) expectation of the relationship. Hence, AIFRA does not drive RA. The mediating variables, AIFM1 and AIFM2, are negatively related to RA, but neither is significant; their signs are nevertheless in line with the a priori (theoretical) expectation. In the lower part of the table, the quantile slope equality test and the symmetry test reject the null hypothesis of slope equality and the null hypothesis of symmetry, respectively, both at the 1% level; thus, the slopes differ across quantile levels, and there is evidence of asymmetry across the quantiles. Therefore, familiarity with AI-powered feedback (AIFM1) and the manner in which EFL learners receive this feedback (AIFM2) partially reduce the performance anxiety of the EFL learners. These findings echo the SEM estimate for model 3 presented in Table 4.
Figure 8 presents the quantile regression coefficients across the various quantiles for model 3. It shows that AIFRA is positively related to RA but the impact is insignificant, whereas AIFM1 and AIFM2 are negatively related to RA, although neither is significant. Hence, AIFM1 and AIFM2 partially reduce RA, but AIFRA has no impact on it.

Figure 8. Graphical presentation of coefficients of the quantile regression for model 3.
Table 9 reports the results of the quantile regression for model 4. According to the table, AIFER is positively related to ER, and the impact is statistically significant at the 1% and 10% levels at the 0.25 and 0.75 quantiles, respectively. In the lower part of the table, the quantile slope equality test and the symmetry test reject the null hypothesis of slope equality and the null hypothesis of symmetry, respectively, both at the 1% level, which means that the slopes differ across quantile levels and there is evidence of asymmetry across the quantiles. Therefore, AI-powered feedback related to emotional resilience (AIFER) enhances the emotional resilience (ER) of the EFL learners. This finding confirms the SEM estimate for model 4 presented in Table 5.
Figure 9 presents the quantile regression coefficients across the various quantiles for model 4. It shows that AIFER is positively related to ER and that the impact is significant around the lower and upper quantiles.

Figure 9. Graphical presentation of coefficients of the quantile regression for model 4.