Talking with experts in ELT!

Interviews on education and language teaching

George Drivas explores the ‘why’ and the ‘how’ and investigates different forms of assessment.


George Drivas studied English Literature at the University of Athens and Theoretical Linguistics at the University of Reading, UK. He has worked in Foreign Language education since 1981 as a teacher and teacher trainer. He was Director of Studies at the Department of Foreign Languages at Doukas School between 1994 and 2021. He is an inspector for the European Association for Quality Language Services and a certified assessor for the European Foundation for Quality Management.

Many people are involved in assessing someone’s abilities: teachers, Directors of Studies, external examination bodies, managers and heads of department in any field.

In ELT, it is commonly believed that assessment, evaluation and testing are the same, and most teachers use these terms interchangeably. However, this is not quite true, since there are differences among them. Historically, standardized tests have tested grammar, vocabulary, pronunciation, reading and listening comprehension, and writing ability.

Are these traditional and standardized tests valid, practical, reliable and authentic? If the answer is yes, then you have found the perfect test for your students. Unfortunately, such tests usually lack one or more of these qualities.



  • Why do we have school examinations?

In this day and age, school examinations or tests are administered to provide evidence that students learn and teachers instruct. In this sense, tests present a record of performance as well as of the work of all parties involved. They are also used as a safeguard of objectivity against student negligence and teacher bias. The expressions “I passed” or “They failed me” are indicative of where praise or blame is placed according to results.

One basic distinction needs attention: testing versus assessment. The terms are often used interchangeably, but they differ significantly.

Testing is used to examine someone’s knowledge of something to determine what that person knows or has learned. It measures the level of knowledge that the examinee has reached. In a broader sense, testing also means experimentally verifying that something, such as a process, actually works.

Assessment is the systematic process of documenting and using empirical data on knowledge, skills, attitudes, and beliefs. In a broader sense, assessment also means expressing an opinion about the value of something, such as a process.

Testing focuses on the “here and now”, providing a snapshot of what examinees have achieved at a particular point in their development. A norm-referenced test is a uniform test that ranks and compares students in relation to one another, measuring their performance against a theoretical average. A criterion-referenced test uses test scores to generate a statement about the behavior that can be expected of a person with that score. Most tests and quizzes written by schoolteachers can be considered criterion-referenced tests.
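The difference between the two kinds of test can be sketched in a few lines of Python. This is only an illustration: the student names, the scores and the pass mark of 70 are invented for the example and do not come from the interview.

```python
from statistics import mean

# Hypothetical raw scores (out of 100) for one class; invented for illustration.
scores = {"Anna": 82, "Ben": 67, "Chloe": 74, "Dimitris": 58, "Eleni": 91}

PASS_MARK = 70  # assumed criterion (cut score), not a standard value


def norm_referenced(scores):
    """Place each student relative to the others and to the group mean."""
    group_mean = mean(scores.values())
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {
        name: {
            "rank": ranked.index(name) + 1,
            "vs_mean": round(scores[name] - group_mean, 1),
        }
        for name in scores
    }


def criterion_referenced(scores, pass_mark=PASS_MARK):
    """Judge each student against a fixed standard, ignoring the other students."""
    return {name: score >= pass_mark for name, score in scores.items()}


if __name__ == "__main__":
    print(norm_referenced(scores))
    print(criterion_referenced(scores))
```

With these made-up numbers, Chloe sits slightly below the class average in the norm-referenced view, yet she meets the criterion. The same score supports two different statements, depending on what it is referenced to.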

Assessment can focus on the individual learner or on a group, such as the whole class, an institution or a specific program. Formative assessment gives you an overview of your students early in, and during, your instruction, while you still have the chance to improve it. Summative assessment gives you the outcome of the whole instruction.

  • Do we have exams to provide a public record of how well an examinee understands a subject? Is this public record important?

Public records are important for holding stakeholders accountable and for attributing praise. They provide transparency. Without transparency, there would be no evaluation of outcomes, no formal appraisal of performance, no educated decision-making. As such, public examination records provide evidence of the status of the educational system, its advantages and its possible shortcomings.

However, the data included in the public records, in this case the test results, are only as valid and relevant as the tests they are based on. For instance, a test of the least frequently used words may yield a high level of failure. Any conclusions based on these results can be grossly misleading: “Learners are not engaged” or “Teachers are underperforming” or, most significantly, “The educational system is failing”.

The test results that are included in the public records should be consistent and dependable. They should reflect the teaching goals realistically and over time. A more demanding test may yield scores that are lower than expected, whereas a less demanding test may yield scores that are higher than expected. These results are not comparable until the difficulty of each test has been taken into account.
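One simple way to see why raw scores from tests of different difficulty cannot be compared directly is to express each score relative to the other results on the same test. The sketch below uses z-scores for this. It is an illustration under stated assumptions, not the method of any particular examination board: the two score lists are invented, and the comparison only makes sense if the two groups of test takers are of broadly similar ability.

```python
from statistics import mean, pstdev

# Hypothetical results on two tests of different difficulty; invented for illustration.
harder_test = [45, 50, 55, 60, 65]
easier_test = [70, 75, 80, 85, 90]


def standardise(raw_score, all_scores):
    """Return the z-score: distance from the test mean, in standard deviations."""
    return (raw_score - mean(all_scores)) / pstdev(all_scores)


if __name__ == "__main__":
    # A raw 65 on the harder test versus a raw 70 on the easier one.
    print(standardise(65, harder_test))
    print(standardise(70, easier_test))
```

In this made-up case, the 65 on the harder test is the top result of its group, while the 70 on the easier test is the lowest of its group, even though 70 is the higher raw score.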

  • Are examinations what they seem to be? That is, are they viewed as a thorough and systematic investigation into something, looking at it from many angles in order to provide an objective account of it?

This may easily be considered a case of perception vs reality. Perception, in simple terms, can be defined as the way an individual thinks. The thinking patterns differ from one individual to another, and the way of thinking is decided by several factors. Reality, on the other hand, refers to the true state of something that may not be realized by individuals easily.

Test taking is considered tantamount to education. This is a common perception: no learning has taken place unless it is validated by a test. In many cases, the more rigorous the test, the better. Language exams are a testament to this. Certain exams are considered more demanding, and so more suitable for high achievers.

However, reality is different. Craftsmen from Eastern European countries, who acquired their skill as apprentices, are considered more reliable and more capable than their Western European counterparts, who acquired their trade through formal vocational training.

In this respect, school examinations are perceived as a thorough and systematic tool to investigate the effectiveness of an educational act. Yet, they fall short of providing anything more than a snapshot of teacher and learner performance within a given educational framework at a particular point in time.

  • Are there ways of investigating thoroughly what a learner knows about something that do not involve grading and marking?

Grading and marking are considered safe ways to make quality statements about an individual’s progress and development. Perception: numbers do not lie.

However, not all learners are created equal, not all learners have the same skills, not all learners have the same needs. One way to overcome the obstacle of “one size fits all” is portfolio assessment.

Portfolio assessment is based on the systematic collection of learner work (such as written assignments, drafts, artwork, presentations, etc.) that represents competencies, exemplary work, or the learner's developmental progress according to specific goals and needs. In addition to examples of their work, most portfolios include reflective statements prepared by learners. Portfolios are assessed for evidence of learner achievement with respect to established learning outcomes and standards looked at from different perspectives.

Portfolio assessment has an added bonus that transcends education: it offers students the opportunity to take responsibility for their own learning and ownership of their work. Portfolio assessment is an authentic method: it draws on meaningful work and often has personal relevance. It promotes creativity, individuality, and uniqueness in the assessment of learning. It encourages motivation and makes the final assessment process more visible.

  • In a grading system, it is assumed that a mark given for a script is an objective record unbiased by an examiner’s values. Leaving aside idiosyncratic judgments, even if all examiners shared the same set of values, is there any guarantee that they would all give exactly the same mark?

Save for the multiple-choice test, where all answers are either black or white, all other forms of testing or assessment carry an inherent degree of bias, either perceived or real. The more itemized the test is, the easier it is to be true to the test creator’s intentions. The more global the test is, the greater the need for grading guidelines, moderation, and standardisation. In addition, depending on the scale of the work, there may be a need for evidence gathering and record keeping by the student. Experience in grading is pivotal if different examiners are to assign the same ranking to a piece of learner work. Intentional bias as an act of vengeance I have not experienced in my 40+ years in teaching. Yet, differences do exist between subject matter cultures. The well-tested, though not cost-effective, solution of assigning two examiners to mark a script blind remedies the problem to an extent. In the foreseeable future, artificial intelligence may come to the rescue of weary examiners.
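The double-marking idea mentioned above can be sketched as a small reconciliation rule. The details here are assumptions made for illustration only: the function name `reconcile`, the tolerance of two marks, averaging close marks, and referring large disagreements to a third examiner are one plausible procedure, not one described in the interview.

```python
def reconcile(mark_a, mark_b, tolerance=2):
    """Combine two independent (blind) marks for the same script.

    Returns a tuple (final_mark, needs_third_examiner).
    The tolerance and the averaging rule are assumed values for this sketch.
    """
    if abs(mark_a - mark_b) <= tolerance:
        # The examiners broadly agree: use the average of the two marks.
        return (mark_a + mark_b) / 2, False
    # The examiners disagree too much: no final mark yet, refer for moderation.
    return None, True


if __name__ == "__main__":
    print(reconcile(14, 15))  # marks within tolerance
    print(reconcile(10, 16))  # marks too far apart
```

The tolerance is where grading guidelines and standardisation come in: the clearer the shared criteria, the less often a script needs to be referred.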

  • Is the idea that some of us are inherently ‘bright’ and others ‘slow’ or ‘stupid’ still massively influential in the way teachers think of learners and organize their classes? Are these differences associated with social class, race, gender, or location?

Before exploring what factors influence or define cognitive ability, the following needs to be made clear: terms like the ones mentioned in the question presuppose a quality judgment on the part of the speaker. Today, when multiple intelligence theory is shaping our thinking and actions, a single measure is inadequate.

Multiple intelligences is the theory of human intelligence first proposed by the psychologist Howard Gardner in his book Frames of Mind (1983). At its core, it is the proposition that individuals have the potential to develop a combination of eight separate intelligences, or spheres of intelligence. That proposition is grounded in Gardner’s assertion that an individual’s cognitive capacity cannot be represented adequately in a single measurement, such as an IQ score.

Rather, because each person manifests varying levels of separate intelligences, a unique cognitive profile would be a better representation of individual strengths and weaknesses, according to this theory. It is important to note that, within this theory, every person possesses all intelligences to some degree.

Furthermore, Bloom’s Taxonomy was created by Benjamin Bloom and published in 1956 as a classification of learning outcomes and objectives. It has since been used for everything from framing digital tasks and evaluating apps to writing questions and assessments.

The original sequence of cognitive skills was Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation. The framework was revised in 2001 by Lorin Anderson and David Krathwohl, yielding the revised Bloom’s Taxonomy. The most significant change to the Cognitive Domain was the removal of ‘Synthesis’ and the addition of ‘Creation’ as the highest level of Bloom’s Taxonomy. Being at the highest level, it is implied to be the most complex or demanding cognitive skill, or at least to represent a kind of pinnacle for cognitive tasks.

In other words, learners display a range of different abilities. It should be the aim of the teacher and the teaching act to support and enhance those that are required to successfully achieve the learning goals, rather than try to use a single approach in testing and assessment to verify what has been accomplished. 

Of course, learner knowledge, skills and values are influenced by social class, race, gender, or location. The dominant beliefs in a particular society directly affect how an individual develops, what goals that individual sets, and what his or her role in society will be. Differences among learners based on these factors definitely exist, but if the teacher, and the educational system in general, is aware of them, then they can be overcome. For instance, mere access to the internet provides a powerful way around the hindrance posed by location.

  • Can students be compared to each other in different ways – in the same school, in the same country, or across the world?

If the aim of education is to help individual students achieve their personal goals, even help them transcend the boundaries set by their social and professional environment, such comparisons are unwarranted.

However, they may serve a greater purpose: They can assist in identifying “good practices” or “practices of excellence”.

Consider the following: The Practice Principles articulate how teachers can deliver the curriculum and engage students. They are designed to link directly to a school’s documented teaching and learning program, which outlines what is to be taught, and the approach to assessment, which helps teachers determine student learning needs and how students can demonstrate their levels of understanding. Each Principle is supported by a theory of action that describes how the work of teachers can generate improved student learning over time. It explains the specific changes that can be expected and creates a brief evidence-based synopsis.

  • Has the importance of character in education, especially the promotion of qualities such as grit and resilience, completely disappeared today?

Consider the following: Character education is not old-fashioned, and it is not about bringing religion into the classroom. Character is the "X factor" that experts in parenting and education have deemed integral to success both in school and in life.

Knowledge, skills and values are the three aspects that should define and shape the educational system. They cannot but be interlinked. One flows into the other. Each one is influenced by and modeled on the other. Education should be holistic rather than atomistic: it needs to shift from a view that isolates subjects to a holistic perspective. Content Teaching cannot, and should not, be separated from the Skills Development that is necessary to comprehend and internalize content, or from the Character Education that is paramount in actually putting both to beneficial use.



Assessment | Definition of Assessment at Dictionary.com

Assessment vs Testing: what's the difference? | Onlineassessmenttool.com

CETL- Assessment Resource Centre (hku.hk)

Criterion-Based Assessment in a Norm-Based World: How Can We... : Academic Medicine (lww.com)

Difference Between Perception and Reality | Compare the Difference Between Similar Terms

Multiple intelligences | psychological theory | Britannica

Norm-Referenced vs. Criterion-Referenced Assessments - BrightHub Education

Practice principles for excellence in teaching and learning (education.vic.gov.au)

Test | Definition of Test at Dictionary.com

The Benefits of Character Education - The Atlantic

The Criterion Referenced Assessment: What It Is and Other Types (growthmastery.net)

The Difference Between Assessment and Testing | Learning Sciences International

The importance of public records - Americans for Prosperity

What Is Bloom's Taxonomy? A Definition For Teachers (teachthought.com)