The Cheese Story, Part Two

In 2015, pre-blog, I was not the dedicated thug that I am today. I had already taken a few potshots at VCAA, but I hadn’t gone in hard. I understood, of course, that VCAA was prone to being arrogant and inept, but I hadn’t yet concluded that they were systemically arrogant and inept. I hadn’t yet realised the magnitude of the target. So, in 2015, when two teachers approached me complaining about a VCE exam question, I handled it differently than I would now. Then, I was polite and patient with VCAA. We all learn.

The exam question that troubled the teachers was a now infamous multiple choice question on that year’s Further Mathematics Exam 1. The question concerned a block of cheese, pictured above. As indicated, a cut is to be made and the question begins,

A smaller, similar wedge of cheese is cut from the larger wedge of cheese, as shown in the diagram.

This question is undoubtedly familiar to many readers. For all you other guys, we’re guessing it took about a half second to see why the question is mangled beyond repair.

To complete the question, which turns out to matter, the problem was to determine the pictured distance, d, so that the “similar wedge” has half the volume of the original wedge. Cutting the wedge as intended, this leads to d ≈ 2.3 cm: answer B. If, however, students shrank the wedge to be similar, this leads to d ≈ 1.7 cm, which was also an option: answer A.
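For anyone wanting to check the two answers, here is a rough sketch of the arithmetic. It assumes the edge being cut is 8 cm long, which is not stated in this post and is only inferred from the "8 − d" expression mentioned in the comments below; the function name is just for illustration. One cut shrinks two dimensions of the wedge by a common factor, while a genuinely similar wedge shrinks all three.

```python
# Assumption (not stated in the post): the edge being cut is 8 cm long.
EDGE_CM = 8.0


def cut_distance(volume_ratio: float, scaled_dimensions: int) -> float:
    """Distance d removed from the edge when `scaled_dimensions` of the
    wedge's dimensions shrink by one common factor and the volume falls
    to `volume_ratio` of the original."""
    scale = volume_ratio ** (1 / scaled_dimensions)
    return EDGE_CM * (1 - scale)


# One cut, as in the diagram: only the triangular cross-section shrinks.
one_cut = cut_distance(0.5, 2)

# A mathematically similar wedge: all three dimensions shrink.
similar = cut_distance(0.5, 3)

print(round(one_cut, 1), round(similar, 1))
```

Under that 8 cm assumption, the first value rounds to option B and the second to option A.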

The appearance of such nonsense on an exam prompts two questions. First, how did such nonsense get approved? Then, secondly, what did VCAA do once the nonsense was called out? I have been told an answer to the first question, but I don’t think I am permitted to tell that story. This post is an answer to the second question.

The exam was held on a Friday, and the following Monday one of the two teachers, let’s call her Joanne, emailed VCAA’s Mathematics Curriculum Manager, politely noting the contradiction between the diagram and the term “similar”. (The other teacher decided not to formally complain because VCAA.) The MCM replied the same day, indicating he had forwarded her email to VCAA’s Examinations Unit.

Two weeks later, because VCAA, Joanne received a reply from the Examinations Unit, from let’s call her Robyn:

Dear [Joanne],

Thank you for your email regarding multiple-choice question 9 in the Geometry and trigonometry module of Further Mathematics Examination 1.

When responding to a question, it is expected that students will read the question in its entirety. In Question 9 this includes the diagram of the wedge of cheese in addition to the words in the question. Students are twice directed to look at the diagram. It is clear from the diagram and the wording of the question that only onecut [sic] is made to the wedge of cheese. Given this information, the word ‘similar’ in the question is used in its natural language sense and not in the mathematical sense, as the one cut to the wedge does not change the length dimension of the prism.

Regards,
[Robyn]
VCE Examinations Unit

Unsurprisingly, Joanne was less than thrilled with this response. With no obvious next move, Joanne contacted the MAV, and me and Burkard. The MAV did nothing. I did something, although in the end my intervention probably made no difference to the outcome.

I consulted with a few of my usual consultants, and then emailed the Examinations Unit, with attention to Robyn. This was now three weeks after the exam. After noting the blatant flaw in the question, and arguing against Robyn’s strained defence of the question, I addressed VCAA’s unwillingness to face reality:

… What concerns me as much as the inclusion of such a question is the VCAA’s apparent reluctance to acknowledge failings. From my knowledge of past (in particular Mathematical Methods) exams, this appears to be somewhat of a systemic issue. No one expects the VCAA to handle the very difficult undertaking of VCE exams in a perfect manner. However, I believe we do have the right to expect the VCAA to acknowledge and to address the inevitable imperfections with a significantly greater degree of professionalism, openness and honesty.

A few days later, Robyn replied:

Dear Mr Ross,
Thank you for your email regarding Question 9 of the Geometry and trigonometry module of Further Mathematics Examination 2.
The VCAA is continuing to monitor student responses to this question, and all others.  If there is evidence of students being misled by the wording of the question, then we will take the appropriate action to ensure fairness to students.
Regards,
[Robyn]

I responded the next day:

Dear [Robyn],

I can’t imagine how you could regard your email reply to me as in any sense adequate. Please provide me with the name of your supervisor.

Regards, (Dr.) Marty Ross

(I said I was polite. It doesn’t mean I was a wimp.)

Robyn quickly replied, with the requested email address of her Manager, who turned out to be the Executive Director, Curriculum Division. Let’s call him Dr. David Howes.

That same day, and before I had had a chance to email David, he emailed me, offering to discuss the issue. After a polite “let’s discuss” exchange, David provided a substantial update:

Marty,

I should first apologise that I did not make clearer to you, through [Robyn]’s response earlier in the week, that we were in the process of analysing all the data from the Further Mathematics exam, and that we would be in contact  following the conclusion of that analysis.

It would have been pre-emptive for us to immediately respond to issues that were raised (and that we identified ourselves in the course of the marking process) in relation to Q9 in module 2 prior to having completed the kind of statistical analysis we undertake whenever any issues are identified about exam questions.

We have now completed that analysis. …

Our analysis of student responses to Q9 provides strong evidence that some students may have interpreted the term “similar” using its mathematical meaning rather than its natural language meaning, the latter being the intended usage. This being the case, we are going to accept both options A and B as correct answers for the purpose of awarding the one mark available for this question. In retrospect, the inclusion of the term “similar” clearly did not achieve the objective of making the intent of the question clearer. We will in future be even more rigorous in checking our exams to ensure that, wherever possible terms, the use of any term that may have an ambiguous meaning is avoided.

Thank you for raising this issue with us. I am always keen to engage in constructive discussion that enables us to keep identifying any areas in which we can improve our examinations, so am happy to discuss further any aspect of the above.

Best wishes,
David

In many respects, David’s reply was very good. Most importantly, he indicated that the grading of the question would be adjusted as reasonably as it could be: a mark for either of the two plausible answers. David acknowledged that the question was unclear (which was also later acknowledged in the exam report). He was more forthcoming with information than he was required to be, and he was apologetic for Robyn’s previous fobbing reply to me. David seemed genuinely appreciative of the criticism, and his offer to discuss matters further was definitely sincere. (For a couple of reasons, I declined.)

For all that, there was something truly maddening, and mad, about David’s reply. Fundamentally, David and/or the Examinations Unit seemed to have no clue how to evaluate the soundness of a mathematics exam question. A question cannot be primarily judged by “statistical analysis” of “student responses”; first and foremost, a question must be judged by the mathematical meaning, or meaninglessness, of the words used. As such, it should have taken VCAA half an hour, rather than the month they did, to conclude that their exam question was stuffed. Nothing in David’s reply indicates any understanding of this.

David also maintained an odd silence about Robyn’s original response to Joanne. That response came two weeks after the exam, and so presumably after VCAA “identified [the issues] ourselves in the course of the marking process”. As such, and anyway, Robyn’s response to Joanne was inexcusable. At some point I indicated to David that Joanne was owed an apology; my memory is that she received one.

This 2015 episode is not all bad; 2022 was way, way worse. But the episode also suggests that there is something systemically flawed in the way VCAA handles exam issues when they inevitably arise. The issue with the cheese question was fundamentally a mathematical issue, with fundamentally a mathematical answer. So why did the Mathematics Curriculum Manager appear to do nothing beyond handballing the question to the Examinations Unit? Did he tell the Examinations Unit that the question was stuffed? Did the Examinations Unit ask him? If so, what did he reply? There are no answers to these questions that reflect well upon VCAA.

Finally, there is The Cheese Story, Part One. As I wrote above, I don’t think I am permitted to tell Part One. I’ll just say, as it was told to me, Part One is black hilarious, much worse than Part Two.

14 Replies to “The Cheese Story, Part Two”

    1. A very good description of the response Joanne received.

      My impression is that VCAA does not think a question is mathematically defective so long as their psychometricians say there was reasonable correlation between student ability and student performance indicating that students were not likely to be disadvantaged. I wonder what would have happened to that single student in Queensland (https://mathematicalcrap.com/2023/10/25/qcaa-vcaa-and-the-tests-of-integrity/#more-24797) if QCAA operated like this.

      The percentage cohort responses for Options A and B were 21% and 41%. I can’t help but wonder how different your (Marty) email from Call Him Dr. David Howes might have been if the Option A response had been ‘significantly’ less (as calculated by the psychometricians).

      1. No need to be cute with the “psychometrician” thing. Yes, that’s a cheap stunt that VCAA loves to pull, but I don’t think they were pulling it here.

        But your “what if” pondering is spot on. The implication is that if fewer students had opted for the similar solution then VCAA would have denied any issue with the question, or at least enough of an issue to have changed the grading. It’s pretty clear that some such calculation was behind the misgrading of the 2022 complex question.

      2. Indeed BiB – that’s exactly what would have happened.

        This is a beautiful case study of modern bureaucracy in action and highlights trends over recent years.

        Firstly it highlights the difference between how a bureaucrat and how a ‘professional’ eg teacher/mathematician thinks. Bureaucrats operate in a grey ambiguous complex political world. When you make a mistake, firstly you hope no one has noticed and cover it up if possible – especially since mistakes can be used against you and the organisation (particularly with so much media and Twitter these days). That explains the first response.

        Secondly you check what effect it might have had. If no one was affected too much, that’s fine. That explains the second response. In a bureaucrat’s eyes, if people weren’t affected, then it doesn’t really matter – it’s an ambiguous world, where words shift in meaning constantly and are very malleable (the word ‘clarify’ is often used to fix mistakes).

        Once upon a time, the public service had content experts – a doctor would run the Health Dept, a former Principal the Education Dept etc etc. Now such people (doctors, teachers, engineers etc etc) are VERY few and far between at any level of the bureaucracy – at senior levels they are replaced by Deloitte types – consultants mouthing big words who know very little about the real world, but a lot about generic management and internal politics.

        Furthermore, once upon a time there were many independent statutory authorities. The last one to be subsumed into the public service was VicRoads. Governments have been centralising power for some time (particularly in Vic). VCAA is just a division of the Education Dept. There are a number of things called ‘Authority’ to give the impression of independence, but there is very little such independence these days.

        To have good exams, you need both good bureaucracy (to organise it all) and good content knowledge (because there is such a thing as right and wrong in maths). The linkages between the bureaucracy and the real world have significantly withered and relationships with academics are few and far between these days. One of the challenges will be to build structures that actually incorporate better linkages between the universities and exam development that are likely to stand the centrifugal pressure over time to centralise things. Other states seem to be able to do it better, so it seems to be possible.

        The ability of wider society to act as a check may also have been reduced through acceptance of Govt funding (eg my quick reading of MAV’s last annual report indicates they received over 20% of their revenue from Dept Education – for projects no doubt, but revenue none the less). And of course we all know about the decline in universities…..

  1. I briefly checked the 2015 VCE FM Exam 1 examination report and immediately spotted two further errors:

    Page 4 says “students needed to recognise that only two dimensions of the large wedge of cheese were HALVED”, but it should say “REDUCED” instead of “HALVED”.

    Page 5, d-8 should be replaced with 8-d. Oh well, what’s a factor of minus 1 between friends?

    1. Thanks, Terry Trevor. The d – 8 is very funny. I hadn’t noticed that.

      What’s your concern to use “reduced” rather than “halved”? I think I get it, but don’t think I agree.

  2. Sorry, Marty, but I don’t agree that it’s mangled beyond repair. There is a problem here but it is not in the use of the word “similar”. With the inclusion of a diagram they removed (almost?) all ambiguity and their only mistake was to include answers for two different interpretations of the same word. If they had not included a good answer for a mistaken interpretation no student would have been led to submit that wrong answer.

    They erred in giving full credit for a wrong answer. That might be a “nice” thing to do but it’s not honest. Mathematicians shun ambiguity and it is not cruel to pass this ideal on to students. Teach them to strive to answer exactly the same question as was asked.

    So on to the cheese question and all the controversy that it has engendered: I might be the only one but I see no serious problem with it except what I’ve already noted. That stuff about “half a second to see WHY the question is mangled beyond repair” – I have spent well more than half a second and can’t see the WHY. Everything goes back to two different meanings for one word, an everyday meaning and a special meaning, used here or there in some particular place. But that is not any problem. Does any mathematician get confused by words like GROUP, RING, FIELD, or . . . ? So what’s the big deal with SIMILAR?

    1. Because similar, when talking about shapes, and in math (where this question is), has a specific meaning. It is also a very common meaning, and, it is used a lot in high school geometry.

      So in other words, the question is asking students to do something unfair and impossible. To either ignore this obviously mathematical term, or to take it into account. Each choice leads to an answer.

      Do you not see this as problematic?

    2. Hi, Jack. My response is pretty much “what Glen said”, although I think the use of “similar” in this question is significantly worse than Glen suggests.

      I can elaborate if you wish, but I’m honestly puzzled by your comment. Why is it so difficult for you to see the issue with the use of the word? Why did you consider it required an essay to make your point?

  3. > and that we identified ourselves in the course of the marking process
    Unsurprising! One of my teachers is a VCAA Physics examiner, and I’d assume the process is similar for Physics and Maths. According to him, none of the examiners, including the chief examiner, are allowed to see the exam beforehand. The chief examiner creates the marking guide, so that means VCAA don’t create the marking guide before the exam.

    1. Thanks, Jay. You mean your teacher is a grader (I think what VCAA calls an assessor)?

      I don’t know the VCAA process. Ideally, each subject (not each exam) should have a Chief Examiner, who oversees everything, including the writing, the vetting and the creation of the marking scheme. That the worker ant graders, even a “chief grader” (if that exists), don’t see the exam beforehand is sensible. But if VCAA has it that people who have never previously seen the exam then create the marking scheme, that would be very nuts.

      Whatever the process, glitches can always arise. The issues are how to minimise the probability of glitches and what to do if and when a glitch arises. For both, VCAA is the model of what to not do.

      1. In mathematics, the exam is prepared by panel X. The marking scheme is written by Chief Assessor Y. I do not know if Y is given solutions from X and then writes a marking scheme to fit those solutions. Y ∉ X.
        The marking scheme gets ‘refined’ at the Assessor Training Day – each assessor gets exams to ‘practice’ marking and any potential issues with the marking scheme that emerge from this are discussed and resolved. Y has the final say.

        Each subject has its own way of doing things. I know that the Training Day for the Physics exam is on-line. I have heard many complaints about it this year, including the Y muting (without warning) assessors who were trying to discuss ‘irregularities’ in how some questions were to be marked, and being ignored in the chat feed and when their iconic hand was raised. The process sounded unreasonably over-controlling and uncollaborative.

        It was harder to ignore assessors when the Training Day was in person. Apparently there are no plans to return to that format because on-line affords greater opportunity to teachers in rural Victoria to be involved (and, I assume, get muted). Assessors I have talked with don’t know why the Training Day cannot be conducted using a ‘hybrid’ model of in-person and on-line.
