The Scottish Learning Festival 2009
Graham Samuel Maxwell – Educational Consultant
Imagining the Future of Educational Assessment:
Lessons from Queensland and other Places
Thanks, I am really pleased to be here, the Scottish Learning Festival is quite an exciting occasion, and I think I might be taking back some ideas about how we could do something similar in Queensland – nothing like it.
I am also pleased to be here because my heritage is Scottish; I am a descendant of five pairs of great grandparents who migrated to Australia in the 1860/70s, and I think it is appropriate in a year of Scottish homecoming that I should be able to come and talk to you. My background is about fifty years of work in education; teaching, teaching teachers, research, consultancy, and for three years I was Deputy Director of the Queensland Studies Authority, which is a bit like, I suppose, the Scottish Qualifications Authority. So I bring that kind of perspective to things.
Now I put down this theme for the talk today, and then of course when you come to write these kinds of things you have second thoughts about whether you have taken on too much, and I think perhaps imagining the future is always a dangerous kind of thing, and most people get it wrong don’t they? There’s a tendency to overestimate what will happen. And I will really speak a bit about that, but I thought this was a better title really: “The Teacher’s Role in Educational Assessment: Towards the Future”, that is, the teacher’s role in educational assessment, and some sense of where we might hope things are going to go in the future. So I have changed the focus a little bit from what was advertised, but tried to keep faith with what I said in the abstract for the session.
So what I am going to do today is first of all give you a bit of background on Queensland assessment, because there are a couple of things in there that are really important to know about, in terms of where I am coming from. Although some of what I am going to say is really me, my perspective on things, not an official view, or necessarily exactly what happens at the moment either. Because I want to talk about teachers as assessors, and explore that idea a bit. And for that I have got five broad issues that hang under the title of teachers as assessors, and the last one of those deals with moderation. And I will probably try to structure things in such a way that I amplify a bit on that because of the interesting features that come out of Queensland on that.
So some Queensland background first of all. We abolished external public examinations in 1974, for thirty-five years we haven’t had external public examinations at any level in schools. At the time they were abolished there was one at what you would think of as secondary 4, we would say year ten, so my slides say year ten for reference to that level, and one at year 12, that is S6. The one at year ten atrophied and disappeared a long time ago, and I was having an interesting conversation a moment ago about the fact that you still do think that is really important, to have a certificate at the end of year ten, but we don’t. Mainly because people can’t do much with a school qualification at that level, and need to go on and do some form of education or training in order to obtain satisfactory employment.
What we did was replace the external examinations with a system of what we refer to as externally moderated school based assessment. So all of the assessment is done by teachers, and not by external agencies, and not even in the sense of giving them any set kinds of tasks to apply to students within the schools; the teachers invent it all. Moderation is based on external review panels, and the review panels are themselves constituted of teachers who are released by schools for the times when they need to do the work of the review panels. They are trained in the role, and they tend to be expert teachers, and people want to be on them, for reasons that might become obvious later. The review panels offer feedback to schools on samples of student work in every subject that the school offers. And they are called review panels because they do actually review what the school has done; these are not panels that second mark anything, but panels that look at what the schools have done and provide advice as to whether they are on track and applying a common set of standards across the state.
This system is unique in many ways, and yet I think we would say it is a fairly obvious kind of thing to be doing, and teachers certainly don’t want to give it up now; they might have had some doubts at the beginning, but after thirty-five years it is business as it should be conducted in Queensland. But it is unique even within Australia, there are eight states and territories in Australia, which you probably know, and education is a responsibility of the states more or less - I will say something about that in a moment - and so each state tends to have a distinctive way of doing things. Even the transition between primary and secondary differs across states.
So this system is well entrenched, it is supported by teachers, it is supported by students, because it is an interactive kind of process in which transparency is really important, students know where they stand, it is supported by the public, it is supported by the politicians, and I have to add also supported by the universities. The universities no longer have any explicit or direct control over the secondary school curriculum. The syllabuses of study for subjects are devised by syllabus advisory committees, on which there might be university representation, but it is not any direct control, it is assumed that secondary education has its own rationale and reason for being, and services other needs besides universities.
A couple of caveats; first of all what I have been talking about so far applies only to the final two years of secondary education, and that leads to the Queensland Certificate of Education now, it used to be called the Senior Certificate. And there are two tiers of subjects that are offered within those two years – well they haven’t got this title but I thought it was better to represent them as being first tier subjects - which are substantially cognitively demanding and are seen as servicing preparation for university, as at least an important function that they serve - not the only - so they have that kind of attention. The second tier subjects tend to have a vocational orientation, and can be tied into explicit vocational education certificates. And there is a difference in the way those two tiers of subjects are treated for quality assurance, so we need substantial quality assurance for the first tier, but in the second tier the consequences that result from those subjects are more diffuse, less obviously critical, and therefore are treated somewhat differently, and I am not going to go into those sorts of procedures.
When it comes to years 1-10, or P-10 if you include the pre-school year, then the amount of external surveillance is extremely low indeed at the moment. It has always been felt that there has been little need for external comparability of what sorts of standards schools are applying, but, of course, that is being questioned these days, and the result is that there is a tendency now to start introducing tests in order to monitor schools, and their quality of delivery. I want to argue that it would be preferable to boost the capacity of teachers to undertake assessment, and to introduce systems of moderation in those years in order to ward off the worst of the effects of introducing tests that have high stakes attached to them.
Some recent developments have been the development of a framework, it is referred to as essential learnings – I hate putting the letter S on the end of learning personally, but that is how it is represented – so a set of expectations about what students might be learning in different years. And it is supported by these things, a set of comparable assessment tasks, which are modelled on the kinds of assessments that are done in years 11 and 12, which tend to be largish tasks that are accomplished over several days perhaps, and with some flexibility about the time constraints involved, and they are not tests in the sense of sit down and do it in a particular time frame. And these are being taken in years 4, 6 and 9. There is a process to support the judgements of teachers about the quality of the students’ performance on these tasks, and that involves consultations amongst teachers and a fairly low key, almost, consensus approach to that, because this is still not high stakes stuff, it is simply growing a process that may develop into something more. There are guidelines given for reporting the performance on these tasks against five grades. Now the five grades issue A-E came in because the federal government legislated that all schools would use A-E or some equivalent, of five grades to report student performance at the end of each semester or year, and that came about because of some confusions about what different schools were doing – different schools approach that in a different way – I think it is fairly unfortunate that they did go down that line, and I might say some things about that in a moment. And the last thing was the assessment bank, that is really only just beginning to be done, to provide examples to teachers of the kinds of assessments that they might choose to use.
I think the limitations of what we have got in place at the moment are that the prototype of Queensland comparable assessment tasks is one type of assessment only, and it needs to be extended to other forms of assessment; besides the big task maybe smaller tasks, maybe even longer tasks in some cases. And I think also the focus on the summative role of assessment inherent in the notion that we are going to grade it, rather than represent performance in some other way, I think is an unfortunate view too.
I am really impressed with the way in which Scotland has framed your Curriculum for Excellence, and especially your Assessment is for Learning, and the kinds of support ideas that are provided for doing that. You have got a more comprehensive view about what assessment involves, which incorporates all of the activities that teachers might be engaging in for assessment of students. And I think we are lacking that at the moment, and I think you could teach us quite a few things on that.
Recent developments in Australia as a whole: the establishment of a curriculum authority at the national level, in order to produce a national curriculum, starts to impose on what the separate states are doing, and it will be interesting to see how that plays out, particularly with respect to assessment, it is a bit ambiguous at the moment. We don’t expect that it will undercut particular ways in which states undertake assessment – in particular, we expect it will not displace the school-based assessment system for years 11 and 12 – but it is early days yet, and it won’t even be implemented until 2012, and that is only for a small number of subjects in the first instance, a handful of the core subjects. What will happen to the others I don’t know, there are sixty-two subjects offered at first tier level in Queensland – not all in an individual school obviously, but there are sixty-two subjects that can be chosen from – and the national curriculum at the moment will only deal with five. There is a national assessment programme at the lower levels, census testing in years 3, 5, 7 and 9 in literacy and numeracy, and periodic testing in those other fields that is on a three year cycle – one of those each year. The sample testing allows you to do more interesting things, but, of course, it doesn’t tell you about every student, it only tells you about the overall state of learning in those subjects.
So I want to say a few words about the teacher’s role in assessment, just to say I think it is a key part of a teacher’s work to be involved in assessment. And we highlight it to the extent that external assessments have no role at all to play in the certification of students, and I really think that that is preferable. The teacher is in the situation of being able to make sense of the variety of evidence, even if there is external testing, and ought to be in the driving seat to reconcile the differences between the information they have about student progress, and any piece of information from an external test that is only derived on the basis of a one-off test on a single occasion when the student might be perhaps not performing at their best.
And I think those are three things I would say about assessment as a key part of the teacher’s work. Assessment is about knowing what it means to learn something, and recognising when learning has occurred, and monitoring and directing student progress; this marries the notion of formative and summative. Also making sense of the variety of evidence, gathering and interpreting information, recognising and so on. And making professional judgements. Those are the sorts of roles that I picked for the teacher in assessment.
I am going to quickly go through these five points about assessment, and I want to skip down as quickly as possible to the last of these, so I might actually skip a few of my slides along the way. But first of all, the need for broader authentic tailored assessment. All of our aims for learning of students have expanded enormously haven’t they? We expect that students be able to apply and to use knowledge, not just simply to reproduce, and not just to reproduce under constrained circumstances. We keep on talking about broader and deeper learning aims for the twenty-first century, and there are lots of lists of those, and various movements around the world talking about the skills that we need for the twenty-first century.
I will talk too about authentic situations and tasks, and how we need to make the learning meaningful for students, which means we need to think about tasks – even assessment tasks – that students can feel are actually worth doing, rather than simply things that are imposed on them. And engaging with the students in a negotiation about what those tasks might be is an important feature of our system.
I really think handwritten examinations are archaic, and it is about time we admitted that. How many of you have recently engaged in writing a piece of connected prose in half an hour, by handwriting, on a topic which you have been handed five minutes in advance, and on which some serious consequence might hang? Too many people doing that kind of thing in life? No I don’t think so, not really those kinds of things, not even letter writing if you want to think about that as an activity of that kind. We need to move on, particularly in technological ways, to deal with the need to assess students in ways that go beyond the handwritten examination I think. That is not to say that in our system some of that kind of activity doesn’t go on, but it is by no means the dominant form of assessment within schools. I think we need to value, and recognise and nurture all worthwhile learning, and recognise that assessment shapes the learning. And that is a bit of a problem, so long as you constrain your assessment to one particular form, then that particular form dominates the way in which students learn. I learned French for four years in High School under the old system and never learned really to speak it, because we were never asked and never examined in terms of being able to communicate in the language orally, only to write about it on a piece of paper, and I think that is something that has completely changed now. Communication and oral communication is a really serious part of learning a language, in the Queensland system of education at least.
The second thing I wanted to say something about was formative and summative purposes of assessment, and say that these are really not different types of assessment, but ought to be seen as different uses to which assessment is put. And so you can use external test data formatively with students, and you can use teacher judgements about the progress that students are making in their learning for summative purposes, which is what we do of course even for certification, but we need to recognise that is a legitimate part of teachers’ jobs; they ought to complement each other. The teacher is actually in the driver’s seat on this – you as teachers are in the driver’s seat - you know your students in great detail; you probably aren’t confident that you know your students in detail because you have been told for a long time that you are not really expert in this, but believe me you really are. And if you are not, in a sense, the next task to engage in is to develop skills in order to be able to feel confident in your judgements of the progress that students are making. You can’t really teach adequately if you don’t have some way of gauging whether students are learning, it is as simple as that really isn’t it? The way in which you gauge student learning is therefore part of your teaching, and the information you are gaining from moment to moment and day to day and week to week and month to month on students’ progress is legitimate information that ought to be recognised and reflected in the way in which we report officially on students.
So I have said some things there, but I am not going to expand on all of those. There is a thing about detail and aggregation too, where it is sometimes seen as a conflict between formative and summative uses, because formatively we need a lot of information, and it is a lot of information that you use when you are teaching students and adapting things to suit the circumstances. Whereas reporting requires succinct information; the capacity to sum it up and provide “How is little Johnny going?” in a succinct way, but behind that there needs to be the capacity to drill down and have more detailed conversations with parents, and with students, about the detail. So I don’t see them as being conflicting, it is simply two different ways of representing information.
Managing learning expectations and performance standards: I want to say just a couple of brief things here, because there are some other things I want to say about moderation and I am running out of time, but part of the problem about some of the discourse at the moment is that where we set minimum standards, even if minimum standards are supposed to be reasonably high, students who persistently fail to meet minimum standards are in a bit of a bind. And in terms of distorting effects, an example is the United States, where under the No Child Left Behind legislation schools are required to get as many students as they can over the minimum standard boundaries. And what happened, of course, is that teachers decided that there were some students who were too far below the minimum standard to be worth spending the effort on, and it was better to spend the effort on trying to get students who were fairly close and could be got over the boundary with some degree of effort, but with a reasonable chance of success. And ignore the students above the boundary of course, and that has distorting effects on education, teaching for the test and so on. Teaching for the test is a distortion, in other words, because it narrows the curriculum etc.
High standards: there is a lot of discussion too about setting high standards – we want everybody to achieve excellence in a sense – but that of course can be alienating as well if students can’t actually achieve them. And I would suggest to you that we need to have personalised expectations of students’ performance that are adaptable to the individual student, and that are challenging but achievable within reasonable time frames, rather than to have set things that we expect of all students at all year levels; it is just not going to happen.
Tracking student development and reporting student progress: I just want to say briefly, and move on, that I have a problem in general with using grades, because they do leave some students with a persistent “I am an E student and a failure” kind of problem, with the possibility of alienating students. And I think the notion of key stage developmental targets is a much better idea, and it seems that you have a chance of realising that, I think, whereas in Australia the only state that has managed to keep that idea alive in the current environment has been Victoria, with its Victorian Essential Learning Standards, and I think that’s a better way to go, at least for years 1-10 up to secondary 4. But the thing that hasn’t been recognised is that if we adopt that kind of approach it actually has radical implications for the way in which schools are managed; the notion of classes that are year level classes becomes a bit stretched. Still I don’t want to expand on that, I want to just simply say this; we would be better to report student development positively, choose supportive language and so on.
I want to talk about quality assurance and moderation, because that is the interesting thing about Queensland with years 11 and 12. For public confidence in school-based assessment at any level you need some form of quality assurance. Some forms of quality assurance simply look at the kinds of things that teachers are doing in schools in order to assess students, but quality assurance can be light or strong. That is, in the certification years it needs to be strong because of the greater degree of expectation on certificates, whereas in the early years of education perhaps a lighter form of quality assurance is possible. Moderation, as a part of a quality assurance process, focuses on the comparability of the outcomes, the equivalence of the way in which you have established standards for reporting across the whole school community. And that is needed for high stakes, we need to do it proactively, you can’t do it after the fact, you need to assure the product, and it is based on comparability. Part of that is training up teachers to do this, to do assessment, and I think that is sometimes forgotten as being an important thing. Training teachers to do good assessment is a part of any quality assurance system and that is what we do. And the last thing, I think, is important as well, the collegiality idea is important.
Queensland moderation in the senior years: I want to say just a few words about that, and you can ask me some questions if you want more information. I have said school-based, within school; within school consistency is obviously important as a prelude to any between school consistency, so if there is more than one teacher teaching a subject within a school, clearly those teachers need to get together and moderate their assessments within the school before ever anything happens externally. In our system there are subject requirements, and there has to be an approval of a school implementation plan for a subject. And that is an important part of the whole process, it is actually considered part of the moderation process, but I would really say it is actually a part of an overall quality assurance process, and moderation hangs off it, depends on it. In our system there is continuous assessment over the two years, but at the end of the process of two years, when we give the certificates to students, the final result is dependent on the latest and fullest information about the student’s performance. And some earlier examples of a student’s work fall off, they are taken out, and are not used, and so too can be anomalous performances if the school considers that that is appropriate. Student work is gathered into a portfolio, and the portfolio becomes the basis on which a judgement about the final result for that subject is made.
There are five standards, only five standards – they are not A-B-C-D-E, they are very high achievement down to very low achievement, but the same principle – they are called levels of achievement in order to stress the notion that these are objectively assessed standards. I will just give you a couple of examples of standards; in senior English there are three dimensions, or criteria, to develop a rubric, in drama for example there are those three criteria or dimensions. And that is one set of representations of standards using words to describe the five levels for ... I have lost what that was about, but it doesn’t really matter, it is just giving you a visual representation of how a set of standards is written on one of the dimensions that I just referred to. I think it was English, and it was the first dimension. There is also, in English, an expression of a minimum standard for a sound level of achievement, which is the C. And all three of the dimensions are represented there with an explanation about what the minimum standard looks like for getting a sound level of achievement. So there is a sort of objectification of the standard. And that is another example from a design course, but when I tried to do this it chopped off the A, so I am just giving you the B, C, D and E there, but it gives you an idea that there is more elaboration of what is involved at the higher levels of performance and the lower levels of performance.
Judgement against those standards is considered to be a holistic judgement, which of these standards matches the student’s performance. Or, if you like, which of these standards best represents the performance of the student. And it is an on-balance judgement, because if you have got three dimensions and three different sets of standards for the dimension you have got to work out the trade-offs; they might be better on one and lower on the other, so how do you reconcile the differences across the three dimensions in order to arrive at an overall grade in the subject. But it is a judgement, not something that is arrived at by an algorithmic method of adding up marks. We don’t do it that way; you don’t have to have any marks against any of the assessments that students do, it is what’s in the folio at the end. We might have a profile, and some of that profile might actually be marks on tests, for example, but at the end of the day it is what is in the folio and how you make a judgement of the performance shown by the examples of work in the folio against the standards for each of those grades.
The review panels consider only a small sample of portfolios from each school in each subject and there is a review panel for every subject. The state is broken up into thirteen districts and there is a review panel in each subject – that is sixty-two subjects – in each district, and then there is a state review panel that moderates across the district review panels for the state as a whole, to make sure that they are all lining up as well. And the important feature here is that that panel operation is collegial; they are looking in the portfolio – not second marking it – to find the evidence that justifies the decision of the school as to the standard that it represents. And that is the absolutely critical fulcrum idea in the whole system. And then advice is given back to schools, which might say we can’t find the evidence to justify giving this student a B, how do you arrive at that judgement? And there may be a process of discussion between the panel and the school about that until some resolution is arrived at, and some agreement is reached as to what shifts in the results that are being awarded to students need to occur in order for there to be consistency across all schools.
So the school has agency; the school is getting advice on the basis of a sample of its student work, and it is assumed that there will be attention to that, and that it is not just those samples that are affected by any advice back to the schools, but the whole set. And there are mechanisms for checking up on whether the school has actually taken the advice on board, and has actually made its adjustments in the way in which it has been agreed it should. So there is some agency from the school, but there is also some control by the Queensland Studies Authority, the authority that controls all this.
Qualities of the system: continuous assessment distributes the pressures. Students aren’t engaged in a last minute frightening panic of having to sit for an external examination, and it’s progressively over a time they are getting feedback along the way. They are being told what they need to do in order to move up to the next level of performance, the next time of an assessment of that kind is done, and their best effort is represented in the portfolio. There is transparency; teachers and students have a set of standards against which everything is being judged, and students have to be engaged in conversations about those standards. Revealing what the standards are is an important part of the whole process. So, at the end of the day, there should be no surprises for students, there may be some adjustment at the margins in some schools because of the advice that comes back about their work from the panels. But even that advice comes back before the very last day of school, and therefore before everything is all wrapped up. Students should know exactly what result they are getting the day they leave school.
There is accountability of teachers, of course, because you are accountable upwards towards the panels. You are accountable down to the students, because you have got to be able to justify your positions, the students will ask you, why did you give me a B? And if you haven’t got a good answer it is not good enough, there will be an expectation that you can provide an answer about the judgement that you have made. And sideways in the sense that you have got peers with whom you are working as well, and so there is a collegiality issue there.
All of this is seen as enhancing teacher professionalism, which is why the teachers like this system, and prefer it to anything else. There is a high degree of comparability, we do an annual random sampling across the whole state to see how things have shaken down each year, and try to test the places where it is likely to have fallen down for some reason or other. And we get extremely high agreement across the state for those levels of achievement.
Benefits: I have run out of time, but I am just going to quickly run through these. I think that assessment moderation creates communities of practice, it supports teachers, it values their judgements and so on. It disseminates new ideas; assessment never stands still in Queensland because teachers are inventing new ways of doing assessment from year to year, and saying I think I can improve that next year and do it a different way. Now people see other examples of assessment in other schools, from their participation on panels, and take them back to their own schools. We do get a convergence of judgements of standards, and that is a good thing as well, it builds confidence and we get a degree of comparability. It provides professional development, one shouldn’t underestimate that. The professional development consequences of being involved in panels are really quite strong, and that is the reason why there are waiting lists for some panels – teachers and schools – some schools take great pride in the fact that they have a very large number of their staff who are involved in the panels, and use it as an advertising aspect for their school as well. Obviously you have got very high participation in something that is important.
If you want more information, there is the website www.qsa.qld.edu.au, but you can just say Queensland Studies Authority in Google and go there. And you can get the annual reports from state review panels about how things went in a subject each year. You can look at the random sampling reports from each year, way back to about 1998 I think. And you can get other information about syllabuses and about the whole thing.
Five lessons on assessment: I started off talking about these kinds of things, and I think we need to emphasise the authentic and comprehensive aspects of learning. We need to take teachers’ assessment roles seriously, we need to personalise student learning goals, we need to report student development positively – that might mean that grades are not the best way to do that, but a developmental sequence might be - and we can use moderation to develop a common understanding of performance standards.
Some conclusions on teachers as assessors:-
- I think teachers are uniquely placed for assessment of students, and we ought to value that and use it even in the public domain.
- We need to train teachers for these assessment roles, and I don’t think we do a very good job of that at the moment.
- Teachers should use and reconcile all the data available to them; I am thinking mainly in years 1-10 rather than in years 11 and 12, but the same principle really applies if you are going skill based at that level.
- And we should value and support and develop teacher professional expertise in assessment.
Thank you.
[End of Recording]