
Diversifying Assessment 2: Setting standards

Section 2: Setting standards

Reproduced with permission from Brown, S., Rust, C. and Gibbs, G., Strategies for Diversifying Assessment in Higher Education, Oxford: Oxford Centre for Staff Development (1994).

If assessment is to achieve any of the functions listed in the previous section, it is vital that it has validity. Whether it is intended as feedback to students on their progress or as part of the grading process for a final qualification, common standards must be applied. However, this is easier said than done; assessment is by no means a pure science. Nevertheless, marking - of essays and exams in particular - is something virtually all academics have to do and most rapidly gain a good deal of experience in it. The marking exercise below draws on this experience.

2.1 An essay marking exercise

Assume that the two essays reproduced below were written by students on an introductory course in technology and its social implications. Mark each of them out of 10 (with 4 as a pass mark). In the spaces indicated at the end of each essay, comment on its strengths and weaknesses, and give the student advice on how to improve the essay.

Assess the noise pollution problems caused by Concorde around airports.
Answer 1

The sound limit at Kennedy airport, New York, is 112 PNdB*, and at Heathrow, London, 110 PNdB. The manufacturers of Concorde (Sud-Aviation and the British Aircraft Corporation) have promised that Concorde will range between 104 and 108 PNdB, depending on its weight at take-off.

At the start of Concorde operations at Heathrow, 21 of the first 35 departures exceeded 110 PNdB, and in the first eight months of operations 72% of the 97 departures exceeded 110 PNdB. Overall in 1976 there were 109 infringements of Heathrow's limit by Concorde. These measurements of Concorde were about 7 PNdB lower than during its early endurance trials. At the same time there were 1,941 infringements by subsonic jets. Concorde rarely features in the list of the ten noisiest take-offs each month at Heathrow, and subsonic aircraft at Kennedy have been recorded at 121 PNdB - twice the limit.

At Dulles airport, Washington, Concorde has averaged 119.9 PNdB at take-off and 117.8 PNdB on landing. This is 12-13 PNdB higher than the averages for subsonic aircraft. The noise levels have been going down, and with them, the number of complaints. In September 1976 the average level was 121.3 PNdB and there were 186 complaints (29 of these to one take-off). In October the average was 117.4 PNdB and there were 101 complaints. During this time polls of opinion concerning Concorde's trial period at Dulles showed an initial opposition of 36.9% drop to 26.2%. In New York, opposition to Concorde landing at Kennedy has dropped from 63% in January 1976 to 53% in April 1977.

While 500,000 people are affected by aircraft noise in Washington, 2,000,000 are affected at Kennedy. It has been estimated that 40,000 extra people will be affected by noise if 80 Concordes serve 12 US cities. This represents a 1% increase. Bumps in the runway at Kennedy force Concorde to take off closer to heavily populated areas, but due to advanced flight control characteristics Concorde can begin to bank at an altitude of 100 ft. compared with an average of 480 ft. for subsonic aircraft, and so can turn away from heavily populated areas sooner after take-off.

*PNdB means Perceived Noise Decibels - a logarithmic scale of noise

Strengths:

Weaknesses:

How to improve your essay:

Mark out of 10:

Answer 2

Opposition to Concorde based on arguments concerning noise pollution takes two main themes. The first is concerned with the 'sonic boom' - a phenomenon of supersonic flight unique to Concorde amongst commercial aircraft. The second is concerned with noise levels around airports caused during take-off and landing. This second theme is common to all aircraft, and the issue at stake is whether Concorde is significantly noisier than subsonic aircraft.

Comparisons with other aircraft are complicated by the changing nature of jet fleets. Early jet aircraft (e.g. the DC8 and 707) used turbojet engines, and whilst these have been quietened, they are much noisier than second-generation fan-jet engine aircraft (e.g. DC10 and Jumbo 747). Eventually those older aircraft will be phased out, but at the moment Concorde is being compared with them.

There are also problems of measurement. Objective measures (meters giving a reading in decibels) cannot give any impression of 'shrillness' or subjectively experienced nuisance. An aircraft giving higher decibel readings may not be experienced as 'noisier' by someone hearing it take off.

Subjective measures also involve problems such as 'noise', which is a multi-faceted phenomenon, and different people use different criteria in assessing it. There are dangers, also, in questionnaire surveys of reactions of people living around airports. Average ratings of 'nuisance' change over time without any changes in objectively measured decibel levels or frequency of aircraft movements and so other factors must be involved. These factors can be political. Boeing took care to subcontract for parts for its SST at factories surrounding Kennedy airport, so that votes concerning whether SST's should be allowed to use the airport would be influenced by residents' concerns for their jobs! Workers at Filton and Toulouse would hardly try to ban Concorde landing near their homes, however noisy it is!

Finally, there is a variation in recorded noise levels dependent on the skill of the pilot, and load factors of the aircraft. Subsonic aircraft have been measured at twice the legal noise level, struggling to take off with heavy loads in adverse conditions. Concorde has been flying under-loaded, with skilled pilots, who have even been reported banking away from noise monitors.

Given this variety of problems, it would seem likely that Concorde causes even more noise pollution than data suggests, and that in comparison with subsonic jets will become comparatively worse as time goes on.

Strengths:

Weaknesses:

How to improve your essay:

Mark out of 10:

This exercise was previously published in Gibbs, G. Module 3 Assessment, Certificate in Higher Education by Open Learning, OCSD (1989).

Commentary

Many hundreds of teachers from all subject disciplines have marked these two essays. There are broad similarities in their comments and their marks, but also important differences. While teachers are often doing something quite similar when they are assessing students' work, they can also disagree profoundly.

Essay 1 is usually given a mark in the range 3-7 with an average of about 5. It is generally considered to have no introduction, no main body, and no conclusion. The order in which the content of the essay is presented could be changed and it would make little difference, because there is no developing argument. The essay consists of facts in a list with no analysis. It is usually thought not to answer the question. The student who wrote it is often commended for having done some relevant background research but criticised for having made little appropriate use of this research.

In contrast, Essay 2 is usually given a mark in the range 5-10, with an average of 7. Tutors comment that it goes beyond the scope of the question and considers the nature of the question itself, and the difficulty of reaching simple conclusions. It has a clear argument, draws a conclusion of its own, has a beginning, a middle and an end, and raises and considers a number of issues along the way. It is often considered to be well written.

In short, the author of the first essay is regarded as a poor student and the author of the second a comparatively good student. Whatever our subject expertise, we seem capable of making such general assessments of students and their work.

However, there are complications. The general unanimity in the tutors' assessment of the two essays masks some sharply divergent views. The range of marks given for each essay is wider than the difference between the essays. Some people give Essay 1 7/10, almost a first, and some give Essay 2 a bare pass. Some even give higher marks for Essay 1 than Essay 2. This variation extends to the comments teachers write. Some think that Essay 1 does all you could reasonably ask: it provides a good range of facts upon which a decision can be made. Essay 2 is then described as being all over the place, providing little evidence to back up its irrelevant assertions. To some extent, those who feel this way tend to come from the 'hard' sciences. It would appear that sometimes, but not always, different sets of values and expectations operate in different subject areas.

These similarities and differences illustrate some important points about assessment:

  • there is broad agreement as to what constitutes quality: about what we should be looking for when we assess students in higher education;
  • there are wide differences between teachers in their standards and judgements: often the difference between teachers' appraisal is greater than the difference between students' performance;
  • there are identifiably different values operating, to some extent related to subject disciplines;
  • these similarities and differences have little to do with the knowledge the authors of the essays displayed: these general characteristics are exhibited independently of the specific content of the essay.
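
The second point can be checked against the figures quoted in the commentary. The short Python sketch below does only that arithmetic: it takes the reported mark ranges and averages (treating "about 5" as 5) and compares the spread of marks for each essay with the gap between the two averages. The variable and function names are illustrative and not part of the original exercise.

```python
# Mark ranges and averages quoted in the commentary (marks out of 10).
# Essay 1's average is reported as "about 5"; 5 is used here.
essays = {
    "Essay 1": {"low": 3, "high": 7, "average": 5},
    "Essay 2": {"low": 5, "high": 10, "average": 7},
}


def spread(essay):
    """Width of the range of marks that markers gave to one essay."""
    return essay["high"] - essay["low"]


def gap_between_averages(first, second):
    """Absolute difference between the average marks of two essays."""
    return abs(first["average"] - second["average"])


spread_1 = spread(essays["Essay 1"])
spread_2 = spread(essays["Essay 2"])
gap = gap_between_averages(essays["Essay 1"], essays["Essay 2"])

print(spread_1, spread_2, gap)  # 4 5 2
```

On these figures the markers disagree with each other by 4 or 5 marks on a single essay, while the essays themselves are separated by only about 2 marks on average.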

2.2 Making the criteria explicit

You were able to mark the essays in the previous exercise because you had certain criteria in your mind. The reason why teachers in higher education vary in the marks they award to the two essays is that they are using different criteria or, if they are using the same criteria, they are giving different weightings to them in terms of importance. This exercise underlines the importance of working to explicit criteria.

There are good reasons for making criteria explicit.

To be fair to the student.
When you set a task or assignment, make clear the criteria on which it will be assessed, so the student can tailor her or his response to your requirements.
To avoid 'academic drift'.
This is the term for what tends to happen when the criteria are unclear: students assume that the key criterion is an emphasis on content.
To encourage staff to adopt common standards in marking.
If a number of different staff members are involved in marking a piece of work, clearly stated criteria should help to bring closer the standards to be applied. It also gives a clearer basis for discussion of any disagreements either between staff or between students and staff.

Assessment criteria can be introduced to the students in a number of ways. Some of these are suggested below.

Simply specify them

Below is an example of a marking protocol where the assessment criteria have been standardised for a whole degree programme within a modular structure. This grid is reproduced in the front of each module's student handbook. While not every criterion would necessarily be relevant to every piece of work in every module, where the criterion is relevant the standards for the different grades are clearly defined. They can then be applied in exactly the same way in each module. The student's overall grade for the piece of work is then based on the column in which the statements are concentrated.

Marking criteria protocol for Health Care Studies degree in a modular programme

Grade columns, from lowest to highest: Refer/Fail, Grade C, Grade B, Grade B+, Grade A. Each group of lines below is one row of the grid; where a row has no statement for a grade, that grade is omitted.

Refer/Fail: Failure to address the actual question asked/task set.
Grade A: Overall presentation shows a professional and innovative approach to the topic.

Refer/Fail: Purpose and meaning of assignment unclear. Language, grammar and spelling poor.
Grade C: Meaning apparent, but... language not always fluent, grammar and spelling still poor.
Grade B: Language mainly fluent. Grammar and spelling mainly accurate.
Grade B+: Thoughts and ideas clearly expressed. Grammar and spelling accurate and language fluent.
Grade A: Clarity of expression excellent. Consistently accurate use of grammar and spelling with fluent professional/academic writing style.

Refer/Fail: Fails to demonstrate understanding of the subject/topic area.
Grade C: Attempts a logical and coherent understanding of the subject area.
Grade B: Demonstrates understanding in a style which is logical, coherent and flowing.
Grade B+: Consistent understanding demonstrated in a logical, coherent and lucid manner.
Grade A: Work shows a well co-ordinated, grounded and reasoned understanding of topic and its relevance to practice.

Refer/Fail: Significantly under/over required length as specified in module guide.

Refer/Fail: Referencing inaccurate or absent.
Grade C: Referencing present but had inconsistencies and inaccuracies.
Grade B: Minor inconsistencies and inaccuracies in referencing using the Harvard System.
Grade B+: Referencing relevant and mostly accurate using the Harvard System.
Grade A: Referencing clear, relevant and consistently accurate using the Harvard System.

Refer/Fail: Inaccurate or inappropriate content/theory.
Grade C: Appropriate selection of content/theory but some key aspects missed/misconstrued.
Grade B: Most key theories included in work in an appropriate manner.
Grade B+: Insightful and appropriate selection of content/theory in key areas.
Grade A: Assignment demonstrates considerable innovation in the handling of content/theory.

Refer/Fail: Little or no evidence of reading around the subject.
Grade C: Evidence of some limited reading around the subject.
Grade B: Clear evidence and application of readings relevant to the subject within the text.
Grade B+: Ability to appraise critically the theory and literature from a variety of courses, developing own ideas in the process.
Grade A: Has developed own ideas and justified using a wide range of sources of theories and literature which has been thoroughly analysed, applied and tested.

Refer/Fail: Makes no attempt to address module focus, aims or themes of the assignment.
Grade C: Some of the writing is focussed on module aims and themes of assignment.
Grade B: Mainly focussed on aims and themes of the assignment.
Grade B+: Clear focus on module aims and themes of the assignment.
Grade A: Module's aims and themes are integral to the assignment.

Refer/Fail: No attempt at evaluation within assignment.
Grade C: Some attempt at evaluation within assignment.
Grade B: Evaluation reasonably well carried out.
Grade B+: Good clear evidence of evaluation carried out within assignment.
Grade A: Evaluation within assignment rigorous and appropriate.

Refer/Fail: Unsubstantiated/invalid conclusion, based on anecdotes and generalisations only.
Grade C: Limited evidence of findings and conclusions supported by the literature and theory.
Grade B: Evidence of findings and conclusions grounded in theory/literature.
Grade B+: Good development shown in summary of arguments based on theory/literature and beginnings of synthesis.
Grade A: Analytical and clear conclusions well grounded in theory and literature, showing development of new concepts.

Refer/Fail: Lack of critical thought/analysis/reference to theory.
Grade C: Some evidence of critical thought and rationale for work.
Grade B: Demonstrates applications of theory/critical analysis to the topic area.
Grade B+: Clear evidence of application of theory/critical analysis.
Grade A: Assignment consistently demonstrates application of theory/critical analysis integrated.

Refer/Fail: Failure to apply topic to personal, societal and professional practice.
Grade C: Superficial application to personal, societal and professional practice.
Grade B: Begins to show application to personal, societal and professional practice.
Grade B+: Appropriate application to personal, societal and professional practice.
Grade A: Application of topic to personal, societal and professional practice relevant and innovative.

Refer/Fail: (For oral presentations) Unsatisfactory speed of delivery and audibility in presentation.
Grade C: (For oral presentations) Speed of delivery and audibility fluctuate during presentation.
Grade B: (For oral presentations) Well paced delivery.
Grade B+: (For oral presentations) Well paced and clear and confident delivery.
Grade A: (For oral presentations) Excellent clarity, pace and confident delivery.

Refer/Fail: Inability to stimulate/facilitate discussion.
Grade C: Some ability to facilitate discussion but tendency to miss opportunities or be directive.
Grade B: Some ability to stimulate and facilitate discussion.
Grade B+: Clear evidence of ability to stimulate, facilitate and summarise discussion.
Grade A: Excellent enabling pacing and summarising of discussion.

Consider whether a version of this grid might be useful on your courses. What changes might you need to make? What problems might there be?
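
The rule that the overall grade is "based on the column in which the statements are concentrated" can be sketched in a few lines of Python. This is a minimal illustration rather than the programme's actual procedure: the criterion labels in the example are hypothetical, and breaking ties in favour of the lower grade is an assumption, since the protocol as reproduced here does not say how ties are resolved.

```python
from collections import Counter

# Grade columns of the grid, ordered from lowest to highest.
GRADES = ["Refer/Fail", "Grade C", "Grade B", "Grade B+", "Grade A"]


def overall_grade(judgements):
    """Return the grade column that holds the most statements.

    `judgements` maps each relevant criterion to the grade column whose
    statement best describes the work. Ties go to the lower grade; that
    tie-break is an assumption, not part of the published protocol.
    """
    if not judgements:
        raise ValueError("at least one criterion must be judged")
    unknown = set(judgements.values()) - set(GRADES)
    if unknown:
        raise ValueError(f"unknown grade column(s): {sorted(unknown)}")
    counts = Counter(judgements.values())
    most = max(counts.values())
    # GRADES is ordered lowest first, so the first match is the lowest tied grade.
    return next(grade for grade in GRADES if counts.get(grade, 0) == most)


# Hypothetical judgements for one piece of work (labels are illustrative).
example = {
    "language": "Grade B",
    "understanding": "Grade B+",
    "referencing": "Grade B",
    "content/theory": "Grade B",
    "evaluation": "Grade C",
}
print(overall_grade(example))  # Grade B
```

A marker may well want to weigh some rows more heavily than others; a simple count like this treats every relevant criterion as equally important.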

Specify the criteria, and get the students to use them in a 'marking' exercise

However clearly criteria are stated, students may not immediately understand them, or appreciate which are the most important, especially at the beginning of a course. A marking exercise can help to overcome this.

Get students to mark work.
Take one or two examples of work, possibly from a previous year, and copy them for this year's cohort to mark individually using your criteria. They can then discuss each other's marking in groups and negotiate an appropriate grade. The experience of applying the criteria and discussing the outcomes is likely to reveal any lack of understanding and help students to see what is required of a 'good answer'.
Set a marking assignment.
Give students the task of marking three pieces of work from a previous year's students. Grades for this can be based on how closely their marks for the three pieces correspond to those given by the tutor. This may sound harsh but the tutor on one engineering course where this has been done argues that as he will be marking much of their work for the next three years, it is important that students understand what he is looking for!

If you do not currently involve your students in any kind of marking exercise, consider how and where it might be possible to introduce it on your course.
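
One way a marking assignment of this kind could be graded is sketched below in Python. The source says only that grades "can be based on how closely" the students' marks match the tutor's, so the scoring formula (mean absolute difference, scaled to a score out of 100), the function name and the sample marks are all assumptions made for illustration.

```python
def marking_accuracy(student_marks, tutor_marks, max_mark=10):
    """Score a student's marking from 0 to 100 by closeness to the tutor's.

    The score is based on the mean absolute difference between the two
    sets of marks, scaled by the maximum possible mark. This formula is
    an illustrative assumption, not the method used on the course cited.
    """
    if len(student_marks) != len(tutor_marks) or not tutor_marks:
        raise ValueError("both markers must mark the same, non-empty set of work")
    total_gap = sum(abs(s - t) for s, t in zip(student_marks, tutor_marks))
    mean_gap = total_gap / len(tutor_marks)
    return round(100 * (1 - mean_gap / max_mark), 1)


# Hypothetical marks out of 10 for three pieces of work.
tutor = [4, 6, 8]
student = [5, 6, 6]
print(marking_accuracy(student, tutor))  # 90.0
```

Here the student is out by 1, 0 and 2 marks, a mean gap of 1 mark in 10, which gives 90.0. A student whose marks matched the tutor's exactly would score 100.0.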

Run an exercise for students to generate criteria

  • For inexperienced students, this could be as simple as asking them to identify the characteristics of 'the perfect essay' individually or in pairs.
  • Consider using the 'First class answer' exercise below.

This is an exercise you might use to help your students see the standards, values and criteria that are being applied in the assessment of their work. Try it for yourself.

Consider the following essay question:

Compare and contrast the consequences of blindness and deafness on language development.

The marks students will achieve in responding to this question will be determined to a considerable extent by whether they answer the question. Or to put it another way, the standards students achieve will be judged by whether they undertake the task implied by the question or whether they respond as if to some other task. The accounts below describe the tasks students appear to have set themselves, and record how this was reflected in the grade awarded.

A First Class Answer
Identify the consequences of blindness and deafness for language development. Compare and contrast these consequences, drawing conclusions about the nature of language development. Comment on the adequacy of theories of language development in the light of your conclusions.

Upper second
Identify the consequences of blindness and deafness for language development. Compare and contrast these consequences.

Lower second
List some of the features of blindness and deafness. List some of the consequences for development, including a few for language development.

Third
Write down almost anything you can think about blindness, deafness, child development and language development. Do not draw any conclusions.


Take an essay or exam question, laboratory report, project or other assessed task which you set your students and rewrite it in a similar way: define the task that students are actually responding to when they achieve each grade.


Question or task as set:

First:

Upper second:

Lower Second:

Third:

From your definition of the task you expect of a student producing a first class answer, it should now be possible to identify the assessment criteria.

(This exercise was previously published in Habeshaw, T., Gibbs, G., and Habeshaw, S. (1987), 53 Interesting Ways of Helping your Students to Study. Bristol: TES.)

Once your students have generated their own criteria, you then have to decide whether you use them, or ignore them and use your own. A compromise may be to have certain criteria which are compulsory and non-negotiable, but to allow the students to add their own to these.

For example, in assessing an essay your three non-negotiable criteria might be:

  • length of between two and three thousand words;
  • use of at least three referenced texts;
  • attention to appropriate syntax, punctuation and spelling.

Feedback sheets

Making the criteria explicit not only makes clear to students what is expected of them but also offers you a way of giving detailed feedback in a relatively undemanding way, e.g. by turning the listed criteria into a feedback sheet, as in the example below.

Essay assessment profile

Name:                    Mark:

The sheet has two columns, GOOD POINTS and SUGGESTIONS. In this example, items the tutor has ticked are marked [tick].

STRUCTURE/ARGUMENT

Introduction
  Good points:
    Thesis/focus well defined
  Suggestions:
    Clearer focus needed [tick]
    Explain your approach

Argument
  Good points:
    Well-stated [tick]
    Well supported by data/secondary sources
    Historically sound
    Good synthesis skills
    Critical use of material
    Good understanding of question
  Suggestions:
    Explain the direction of the argument
    Analyse more; describe less
    Deeper analysis needed
    Identify problems with interpretation
    Rethink organisation
    Stay in touch with question
    Re-work/extend conclusion

Style presentation
  Suggestions:
    Develop neater presentation
    Improve your handwriting
    Did you proof-read? [tick] (Careless errors)
    Get sentence structure right
    Avoid bland quotes. Paraphrase!
    Ensure meaning is clear. Be critical!

Subject specific skills

Content
  Good points:
    Well-chosen, relevant
    Avoids trivial content
    Differences between historians noted
    Ideological positions explained [tick]
    Good selection of books used
  Suggestions:
    Be ruthless with irrelevant material
    Is all the detail strictly necessary?
    Incorporate a range of interpretations
    Use more sources - good ones! [tick]
    Demonstrate use of sources

Technical
  Good points:
    Sources acknowledged
    References/bibliography correctly set out
  Suggestions:
    Check your referencing format [tick]
    Use references
    Check layout of bibliography

Additional comments:

Feedback sheets can also form a useful starting point for any dialogue with the student in a subsequent tutorial.

A feedback sheet can help to make explicit the allocation of marks to particular criteria within the overall mark. This indicates to the student which are the most valued criteria and which therefore they need to concentrate on. The use of these sheets is considered in more detail later in the section on mechanising assessment (Section 8.3).
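
As a rough illustration of how a feedback sheet might make the allocation of marks explicit, the Python sketch below attaches a number of available marks to each criterion and reports both the total and a per-criterion breakdown, with the most heavily weighted criterion first. The criterion names and weightings are hypothetical: they are not taken from the example sheet or from Section 8.3.

```python
# Hypothetical criteria with the marks available for each (total 100).
# Neither the names nor the weightings come from the source document.
WEIGHTS = {
    "structure/argument": 40,
    "content": 30,
    "style/presentation": 20,
    "technical": 10,
}


def check_awarded(awarded):
    """Raise ValueError unless every criterion has a mark within its allocation."""
    for criterion, available in WEIGHTS.items():
        if criterion not in awarded:
            raise ValueError(f"no mark given for {criterion!r}")
        if not 0 <= awarded[criterion] <= available:
            raise ValueError(f"mark for {criterion!r} must be between 0 and {available}")


def overall_mark(awarded):
    """Total mark out of 100 from the marks awarded per criterion."""
    check_awarded(awarded)
    return sum(awarded[criterion] for criterion in WEIGHTS)


def feedback_lines(awarded):
    """One line per criterion, most heavily weighted criterion first."""
    check_awarded(awarded)
    ordered = sorted(WEIGHTS, key=WEIGHTS.get, reverse=True)
    return [f"{c}: {awarded[c]}/{WEIGHTS[c]}" for c in ordered]


# Hypothetical marks awarded to one essay.
awarded = {"structure/argument": 28, "content": 21, "style/presentation": 12, "technical": 8}
print(overall_mark(awarded))  # 69
for line in feedback_lines(awarded):
    print(line)
```

Listing the criteria in order of weight shows the student at a glance where most of the marks are won or lost.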

The strategies outlined in this section are all designed to make explicit to students the criteria by which their work will be assessed. These strategies stem from the conviction that students will then be better placed to deliver work likely to fulfil these criteria. Detailed feedback on the extent to which students have been successful in meeting the criteria then becomes the point of departure for the next phase of the student's learning.

     

Contact deliberations@londonmet.ac.uk

  Page last updated 25 July 2005

ISSN 1363-6715

© 2010 London Metropolitan University