This study investigates complexity, accuracy and fluency (CAF) features of speaking performances on the Aptis test across different Common European Framework of Reference (CEFR) levels, in an effort to examine criterion-related and cognitive validity evidence for the test. Benchmark speech sets from 125 examinees (25 sets at each of the five levels from A1 to C) were sampled, each comprising responses to four speaking tasks, for a total of 500 speech samples. An array of CAF features was measured, spanning six sub-components: lexical sophistication and appropriateness, grammatical complexity and accuracy, fluency, and pronunciation. These linguistic features were then subjected to both univariate and multivariate statistical analyses to identify distinguishing CAF features that significantly predict examinees’ CEFR levels.
The results revealed distinguishing features in all three CAF components. Post-hoc comparisons showed significant differences on various features between all adjacent levels except B2 and C. These findings provide supporting evidence for the criterion-related and cognitive validity of the Aptis speaking test, demonstrating alignment between key criteria assessed in Aptis and components of speaking ability on the CEFR. The discriminating CAF features can also assist in rater calibration and training processes for the test.