ALTE - The Association of Language Testers of Europe
 





Studies in Language Testing 6 (Cambridge University Press)

The Multilingual Glossary of Language Testing Terms was originally developed by ALTE members with funding from the European Commission's LINGUA programme (94-09/1801/UK - lll). The idea of producing a multilingual glossary of assessment terms grew out of the needs experienced by the members of ALTE – the difficulties of talking about language testing issues in the range of languages of ALTE member organisations.

The glossary is of use not only to members of ALTE but to many others who are involved in language testing and assessment. The original glossary contains entries in ten languages: Catalan, Danish, Dutch, English, French, German, Irish, Italian, Portuguese and Spanish. It is available in paperback and on CD-ROM.

Available from Cambridge University Press. For more information go to publications.

The Glossary has been produced in other languages by ALTE partners. For instance, under the TiPS Development Programme for Testing in Polish and Slovene – a Socrates Lingua 2 project – an English, Polish and Slovene version was produced. Similarly, under the Socrates-funded Devprothell project, a group of ALTE partners produced a glossary in Estonian, Hungarian, Latvian and Lithuanian.

Some example terms from the glossary follow:

ACCREDITATION
The granting of recognition of a test, usually by an official body such as a government department, examinations board, etc.

ADMINISTRATION
The date or period during which a test takes place. Many tests have a fixed date of administration several times a year, while others may be administered on demand.

ANCHOR ITEM
An item which is included in two or more tests. Anchor items have known characteristics, and form one section of a new version of a test in order to provide information about that test and the candidates who have taken it, e.g. to calibrate a new test to a measurement scale.

ASSESSOR
Someone who assigns a score to a candidate’s performance in a test, using subjective judgement to do so. Assessors are normally qualified in the relevant field, and are required to undergo a process of training and standardisation. In oral testing the roles of Assessor and Interlocutor are sometimes distinguished. Also referred to as Examiner or Rater.

CALIBRATION
The process of determining the scale of a test or tests. Calibration may involve anchoring items from different tests to a common difficulty scale (the theta scale). When a test is constructed from calibrated items, then scores on the test indicate the candidates’ ability, i.e. their location on the theta scale.

CLERICAL MARKING
A method of marking in which Markers do not need to exercise any special expertise or subjective judgement. They mark by following a mark scheme which specifies all acceptable responses to each test item.

COMMUNICATIVE TASK / ACTIVITY
A classroom or examination exercise which involves or tests an individual’s ability to deal with a communication event.

COMPONENT
Part of an examination, often presented as a separate test, with its own instruction booklet and time limit. Components are often skills-based, and have titles such as Listening Comprehension or Composition. Also referred to as subtest.

COMPUTERISED MARKING (SCORING)
Various ways of using computer systems to minimise error in the marking of objective tests. For example, this can be done by scanning information from the candidate’s mark sheet by means of an optical mark reader, and producing data which can be used to provide scores or analyses.

CONJUNCTION
A word used to connect clauses or sentences or words in the same clause: for example and, but, if.

CONTENT ANALYSIS
A means of describing and analysing the content of test materials. This analysis is necessary in order to ensure that the content of the test meets its specification. It is essential in establishing content and construct validity.

DESCRIPTOR
A brief description accompanying a band on a rating scale, which summarises the degree of proficiency or type of performance expected for a candidate to achieve that particular score.

DIRECTED WRITING TASK
See definition for Guided Writing Task.

DISCRETE ITEM
A self-contained item. It is not linked to a text, other items or any supplementary material. An example of an item type used in this way is multiple-choice.

DISCRIMINATION
The power of an item to discriminate between weaker and stronger candidates. Various indices of discrimination are used. Some (e.g. point-biserial, biserial) are based on a correlation between the score on the item and a criterion, such as total score on the test or some external measure of proficiency. Others are based on the difference in the item’s difficulty for low and high ability groups. In item response theory, the two- and three-parameter models estimate item discrimination as the a-parameter.
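
As a rough illustration of the two families of index mentioned in this entry, the sketch below computes a point-biserial correlation and a simple upper/lower group difference. The function names and the sample data are invented for this example and are not part of the glossary; the 50% split between upper and lower groups is likewise an assumption.

```python
from math import sqrt


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and a criterion score."""
    n = len(item_scores)
    mean_item = sum(item_scores) / n
    mean_total = sum(total_scores) / n
    cov = sum((i - mean_item) * (t - mean_total)
              for i, t in zip(item_scores, total_scores))
    var_item = sum((i - mean_item) ** 2 for i in item_scores)
    var_total = sum((t - mean_total) ** 2 for t in total_scores)
    return cov / sqrt(var_item * var_total)


def upper_lower_index(item_scores, total_scores, fraction=0.5):
    """Facility in the top group minus facility in the bottom group."""
    ranked = sorted(zip(total_scores, item_scores),
                    key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    upper = [item for _, item in ranked[:k]]
    lower = [item for _, item in ranked[-k:]]
    return sum(upper) / k - sum(lower) / k


# Hypothetical data: one item (1 = correct) and total test scores
# for eight candidates.
item = [1, 1, 1, 0, 1, 0, 0, 0]
totals = [38, 35, 30, 28, 25, 20, 15, 10]

r_pb = point_biserial(item, totals)
d_index = upper_lower_index(item, totals)
print(round(r_pb, 2), d_index)
```

Both values are positive here because candidates with higher total scores answered the item correctly more often than weaker candidates did.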

DISCURSIVE COMPOSITION
A writing task in which the candidate has to discuss a topic on which various views can be held, or argue in support of personal opinions.

DOUBLE MARKING
A method of assessing performance in which two individuals independently assess candidate performance on a test.

EDITING
The process by which examination materials submitted by item writers are modified and put into the form in which they will appear on an examination paper.

EXAMINER
Refer to definition for Assessor.

FACILITY INDEX
The proportion of correct responses to an item, expressed on a scale of 0 to 1. It is also sometimes expressed as a percentage. Also referred to as facility value or p-value.
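
A minimal sketch of the calculation, assuming each response is coded 1 for correct and 0 for incorrect. The function name and the sample responses are illustrative only.

```python
def facility_index(responses):
    """Proportion of correct responses, on a scale of 0 to 1."""
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


# Hypothetical responses from ten candidates to a single item.
item_responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

p = facility_index(item_responses)
print(p)             # 0.7
print(f"{p:.0%}")    # 70%
```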

GAP-FILLING ITEM
Any type of item which requires the candidate to insert some written material – letters, numbers, single words, phrases, sentences or paragraphs – into spaces in a text. The response may be supplied by the candidate or selected from a set of options.

GRADE
A test score may be reported to the candidate as a grade, for example on a scale of A to E, where A is the highest grade available, B is a good pass, C a pass and D and E are failing grades.

GRADING
The process of converting test scores or marks into grades.
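
One way grading might be carried out is with a table of cut-off scores. The sketch below assumes percentage marks and a five-grade scale from A to E; the boundary values are invented for illustration and do not come from any real examination.

```python
from bisect import bisect_right

# Hypothetical minimum marks for grades D, C, B and A respectively.
# Anything below the first boundary receives grade E.
BOUNDARIES = [40, 50, 65, 80]
GRADES = "EDCBA"


def grade_for(mark):
    """Convert a percentage mark into a grade using the cut-offs above."""
    return GRADES[bisect_right(BOUNDARIES, mark)]


for mark in (39, 40, 72, 80):
    print(mark, grade_for(mark))
```

A mark that falls exactly on a boundary receives the higher grade, which is a design choice of this sketch rather than a rule stated in the glossary.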

GUIDED WRITING TASK
A task which involves the candidate in the production of a written text, where graphic or textual information, such as pictures, letters, postcards and instructions, is used to control and standardise the expected response.

INFORMATION TRANSFER
A technique of testing which involves taking information given in a certain form and presenting it in a different form. Examples of such tasks are: taking information from a text and using it to label a diagram; rewriting an informal note as a formal announcement.

INTERVAL SCALE
A scale of measurement on which the distance between any two adjacent units of measurement is the same, but in which there is no absolute zero point.

INTONATION
The tone given to words with the effect that, for example, a question can be distinguished from a statement.

ITEM
Each testing point in a test which is given a separate mark or marks. Examples are: one gap in a cloze test; one multiple-choice question with three or four options; one sentence for grammatical transformation; one question to which a sentence-length response is expected.

ITEM BANKING
An approach to the management of test items which entails storing information about items so that tests of known content and difficulty can be constructed. Normally, the approach makes use of a computer database, and is based on latent trait theory, which means that items can be related to each other by means of a common difficulty scale.

ITEM RESPONSE THEORY
A group of mathematical models for relating an individual’s test performance to that individual’s level of ability. These models are based on the fundamental theory that an individual’s expected performance on a particular test question, or item, is a function of both the level of difficulty of the item and the individual’s level of ability.

LANGUAGE FOR SPECIFIC PURPOSES (LSP)
Language teaching or testing which focuses on the area of language used for a particular activity or profession; for example, English for Air Traffic Control, Spanish for Commerce.

LEXIS
A term used to refer to vocabulary.

LINK ITEM
Refer to definition for Anchor Item.

MARK
The outcome of an examination, often expressed as a percentage. Because of adjustments such as heavier weighting for some items, the mark is not always the same as the total score.

MARK SCHEME
A list of all the acceptable responses to the items in a test. A mark scheme makes it possible for a Marker to assign a score to a test accurately.

MARKER
Someone who assigns a score to a candidate’s responses to a written test. This may involve the use of expert judgement or, in the case of a clerical Marker, the relatively unskilled application of a mark scheme.

MARKING
Assigning a mark to a candidate’s responses to a test. This may involve professional judgement, or the application of a mark scheme which lists all acceptable responses.

MATCHING TASK
A test type which involves bringing together elements from two separate lists. One kind of matching test consists of selecting the correct phrase to complete each of a number of unfinished sentences. A type used in tests of reading comprehension involves choosing from a list something like a holiday or a book to suit a person whose particular requirements are described.

MEASUREMENT
Generally, the process of finding the amount of something by comparison with a fixed unit, e.g. using a ruler to measure length. In the social sciences, measurement often refers to the quantification of characteristics of persons, such as language proficiency.

MULTIPLE-CHOICE GAP-FILLING
A type of test item in which the candidate’s task is to select from a set of options the correct word or phrase to insert into a space in a text.

MULTIPLE-CHOICE ITEM
A type of test item which consists of a question or incomplete sentence (stem), with a choice of answers or ways of completing the sentence. The candidate’s task is to choose the correct option (key) from a set of three, four or five possibilities, and no production of language is involved. For this reason, multiple-choice items are normally used in tests of reading and listening. They may be discrete or text-based.

MULTIPLE-MATCHING TASK
A test task in which a number of questions or sentence completion items, generally based on a reading text, are set. The responses are provided in the form of a bank of words or phrases, each of which can be used an unlimited number of times. The advantage is that options are not removed as the candidate works through the items (as with other forms of matching) so that the task does not become progressively easier.

NARRATIVE TEXT
A text in which a story is told or events recounted.

OBJECTIVE TEST
A test which can be scored by applying a mark scheme, without the need to bring expert opinion or subjective judgement to the task.

OPEN-ENDED QUESTION
A type of item or task in a written test which requires the candidate to supply, as opposed to select, a response. The purpose of this kind of item is to elicit a relatively unconstrained response, which may vary in length from a few words to an extended essay. The mark scheme therefore allows for a range of acceptable answers.

OPTICAL MARK READER (OMR)
An electronic device used for scanning information directly from mark sheets or answer sheets. Candidates or Examiners can mark item responses or tasks on a mark sheet and this information can be directly read into the computer. Also referred to as scanner.

PAPER CONSTRUCTION
The process of selecting the items which will make up an examination paper, and adding rubrics and an answer key.

PREPOSITION
A word which expresses the relationship between a noun or pronoun and another word: for example on, with, for.

PRETESTING
A stage in the development of test materials at which items are tried out with representative samples from the target population in order to determine their difficulty. Following statistical analysis, those items that are considered satisfactory can be used in live tests.

PROMPT
In tests of speaking or writing, graphic materials or texts designed to elicit a response from the candidate.

PROOF-READING TASK
A test task which involves checking a text for errors of a specified type, e.g. spelling or structure. Part of the task may also consist of marking errors and supplying correct forms.

QUESTION
Sometimes used to refer to a test task or item.

RASCH MODEL
A mathematical model, also known as the simple logistic model, which posits a relationship between the probability of a person completing a task and the difference between the ability of the person and the difficulty of the task. Mathematically equivalent to the one-parameter model in item response theory. The Rasch model has been extended in various ways, e.g. to handle scalar responses or multiple facets accounting for the ‘difficulty’ of a task.
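
The relationship described in this entry can be sketched for the dichotomous case as follows. The function and parameter names are chosen for this example; ability and difficulty are assumed to be expressed on the same logit scale.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Probability of success under the simple logistic (Rasch) model.

    The probability depends only on the difference between the
    person's ability and the task's difficulty.
    """
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


# When ability equals difficulty, the chance of success is 50%.
print(rasch_probability(0.0, 0.0))            # 0.5
# A person one logit above the task's difficulty.
print(round(rasch_probability(1.0, 0.0), 3))  # 0.731
```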

RAW SCORE
A test score that has not been statistically manipulated by any transformation, weighting or re-scaling.

REGISTER
A distinct variety of speech or writing characteristic of a particular activity or a particular degree of formality.

ROLE PLAY
A task type which is sometimes used in speaking tests in which candidates have to imagine themselves in a specific situation or adopt specific roles.

RUBRIC
The instructions given to candidates to guide their responses to a particular test task.

SCALE
A set of numbers or categories for measuring something. Four types of measurement scale are distinguished: nominal, ordinal, interval and ratio.

SCALE DESCRIPTOR
Refer to definition for Descriptor.

SCAN
To read something quickly, in order to look for a specific piece of information or answer to a question. A scanning exercise often consists of questions placed before a text.

SCRIPT
The paper containing a candidate’s responses to a test, used particularly of open-ended task types.

SEMI-AUTHENTIC TEXT
A text taken from a real-life source that has been edited for use in a test, e.g. to adapt the vocabulary and/or grammar to the level of the candidates.

SENTENCE COMPLETION
An item type in which only half of a sentence is given. The candidate’s task is to complete the sentence, either by supplying suitable words (possibly based on the reading of a text) or by choosing them from various options given.

SENTENCE TRANSFORMATION
An item type in which a complete sentence is given as a prompt, followed by the first one or two words of a second sentence which expresses the content of the first in a different grammatical form. For example, the first sentence may be active, and the candidate’s task is to present the identical content in passive form.

SETTING
The whole process by which examination materials are produced and papers constructed.

SKIM
To read rapidly so that the main point is understood, although details will be missed.

SPECIFICATIONS
A description of the characteristics of an examination, including what is tested, how it is tested, details such as number and length of papers, item types used, etc.

STRESS
The emphasis put on a syllable or word in spoken language.

STRUCTURAL COMPETENCE
Structural competence refers to an individual’s ability in and knowledge of the grammatical structures of a language.

SYLLABUS
A detailed document which lists all the areas covered in a particular programme of study, and the order in which content is presented.

SYNONYM
Two words which mean the same, or almost the same, as each other; for example, ‘shut’ and ‘close’ in ‘shut the door’ and ‘close the door’.

SYNTACTIC STRUCTURES
The grammatical structures of language.

'TABLE-TOP' MARKING
A method of marking examination papers which involves gathering all the Markers together to mark for a limited period of time, rather than sending papers out to be marked by people in their own homes.

TASK
A combination of rubric, input and response. For example, a reading text with several multiple-choice items, all of which can be responded to by referring to a single rubric.

TEST METHOD CHARACTERISTICS
The defining characteristics of different test methods. These may include environment, rubric, language of instructions, format, etc.

TEXT
A piece of connected discourse, written or spoken, used as the basis for a set of test items.

THRESHOLD LEVEL
An influential specification in functional terms of a basic level of foreign language competence, published by the Council of Europe in 1976 for English, and updated in 1990. Versions have since been produced for a number of European languages.

TRANSFORMATION ITEM
Refer to definition for sentence transformation.

UTTERANCE
A chain of spoken words.

VETTING
A stage in the cycle of test production at which the test developers assess materials commissioned from item writers and decide which should be rejected as not fulfilling the specifications of the test, and which can go forward to the editing stage.

WAYSTAGE LEVEL
A specification of an elementary level of foreign language competence first published by the Council of Europe in 1977 for English and revised in 1990. It provides a less demanding objective than Threshold, being estimated to have approximately half the Threshold learning load.

WEIGHTING
The assignment of a different number of maximum points to a test item, task or component in order to change its relative contribution in relation to other parts of the same test. For example, if double marks are given to all the items in Task One of a test, Task One will account for a greater proportion of the total score than other tasks.
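
The example in this entry can be worked through numerically. The sketch below assumes a test of three tasks with ten one-mark items each, which is an invented layout; only the doubling of Task One comes from the entry itself.

```python
# Hypothetical test: three tasks, each with ten one-mark items.
max_points = {"Task One": 10, "Task Two": 10, "Task Three": 10}

# Double marks for every item in Task One; other tasks are unweighted.
weights = {"Task One": 2, "Task Two": 1, "Task Three": 1}


def contribution(task, weights):
    """Share of the maximum total score that a task accounts for."""
    total = sum(weights[t] * max_points[t] for t in max_points)
    return weights[task] * max_points[task] / total


unweighted = {task: 1 for task in max_points}

print(contribution("Task One", unweighted))  # one third of the total
print(contribution("Task One", weights))     # 0.5
```

With double marks, Task One rises from a third of the maximum total score to half of it, while each of the other tasks falls from a third to a quarter.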

WORD FORMATION
An item type where the candidate has to produce a form of a word based on another form of the same word which is given as input.