Glyn Jones. The Core Inventory for General English aims to inform teachers about the levels at which learners of English master certain aspects of the language. This project set out to check the accuracy of this information by examining written answers to the British Council’s Aptis test in order to find out whether candidates really do reproduce those aspects at the expected levels.

The British Council – EAQUALS Core Inventory for General English (CIGE) (North, Ortega and Sheehan, 2010) lists the linguistic features – classified as functions, grammar, discourse markers, vocabulary and topics – which, according to its authors, characterise each of the first five levels of the Common European Framework of Reference for Languages (CEFR), A1 to C1. The aim of the present project was to investigate the validity of this information with respect to discourse and grammar features. A corpus of 416 responses to the Writing module of the Aptis test was compiled, and instances of the respective linguistic features were coded manually using qualitative data analysis software. The occurrences of each feature were counted and cross-tabulated with the CEFR levels (as assigned by the Aptis rating equivalents) of the responses in which they occurred. In addition, a sub-corpus of 115 responses was rated blind by a panel of 12 judges, all experienced language teachers and/or applied linguists, who had undergone CEFR familiarisation and training. The judges’ ratings were analysed using Rasch statistical analysis software in order to derive a “fair average” CEFR rating for each response. A marked disparity was found between these ratings and the CEFR grades awarded to the same responses by Aptis raters. For the purpose of the study, alternative CEFR cut-scores were derived from the judges’ ratings and used to re-grade the 416 responses analysed. Using these revised ratings, it was possible to obtain validity evidence with respect to approximately a quarter of the CIGE inventory items under consideration. Slightly under half of these items appeared consistently in responses at the expected level. In a substantial additional proportion of cases, the evidence was inconclusive. A small number of items appear to be characteristic of a lower level than that assigned to them in the CIGE.
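
The counting and cross-tabulation step described above can be illustrated with a minimal sketch. It assumes that each manually coded instance is exported as a (feature, CEFR level) pair; the function name `cross_tabulate`, the feature labels and the data are hypothetical and are not taken from the study or from the CIGE.

```python
from collections import Counter

# The five CEFR levels covered by the CIGE.
LEVELS = ["A1", "A2", "B1", "B2", "C1"]


def cross_tabulate(coded_instances):
    """Count how often each feature occurs in responses at each CEFR level.

    coded_instances: iterable of (feature, level) pairs, one per coded
    occurrence. This input format is an assumption made for illustration.
    """
    counts = Counter(coded_instances)
    features = sorted({feature for feature, _ in counts})
    # A Counter returns 0 for missing keys, so every level gets a cell.
    return {
        feature: {level: counts[(feature, level)] for level in LEVELS}
        for feature in features
    }


# Invented placeholder data, not results from the study.
coded = [
    ("feature_a", "A2"),
    ("feature_a", "A2"),
    ("feature_a", "B1"),
    ("feature_b", "B1"),
    ("feature_b", "B2"),
]

table = cross_tabulate(coded)
for feature, row in table.items():
    print(feature, [row[level] for level in LEVELS])
```

Each row of the resulting table shows how the occurrences of one feature are distributed across levels, which is the kind of distribution that would then be compared with the level the CIGE assigns to that feature.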