WitCH 72: High Significance

Here is one more from the 2019 Specialist Exam 2, once again courtesy of student PURJ (who is vetting the exams much better than the exam vetters). It consists of the last four parts of Q6, the final question on the exam, concerning a machine that packages noodles. The answers from the examination report appear below the questions.

c) \color{blue}\boldsymbol{H_0: \mu = 375, \quad H_1: \mu \neq 375}.

d) \color{blue}\boldsymbol{p=0.046}.

e) \color{blue}\mbox{\bf As }\boldsymbol{p<0.05 \mbox{ \bf reject } H_0.}

f) \color{blue}\boldsymbol{\mbox{\bf Pr}\left(\overline{X}<x_c\right)=0.025\ \Rightarrow \ x_c = 372.1}.   (01/11/21 – brackets fixed.)
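The answers to (d) and (e) can be sanity-checked numerically. What follows is a rough sketch, not the examiners' working. The post does not reproduce the question's data, so the sample mean of 372 and the standard error of 1.5 are assumed values, chosen only because they are consistent with the quoted p = 0.046.

```python
from statistics import NormalDist

MU_0 = 375.0          # mean under H0, from part (c)
ALPHA = 0.05          # significance level used in part (e)
SAMPLE_MEAN = 372.0   # ASSUMED: the sample mean is not given in the post
STD_ERROR = 1.5       # ASSUMED: standard deviation of the sample mean

# Standardise the sample mean under H0.
z = (SAMPLE_MEAN - MU_0) / STD_ERROR

# Two-tailed p-value, because H1 is "mu != 375".
p_value = 2 * NormalDist().cdf(-abs(z))

reject_h0 = p_value < ALPHA
print(f"z = {z:.2f}, p = {p_value:.3f}, reject H0: {reject_h0}")
```

With these assumed inputs the p-value rounds to the reported 0.046, which is below 0.05, so H0 is rejected as in (e).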

UPDATE (02/11/21)

OK, I mucked this one up. There is no crap here of which I am aware. (Well, it’s stats, …)

The issue is around part (f). The probability equation gives \boldsymbol{ x_c =372.06}, and the question is whether to round up or round down. The examination report correctly rounds up, which guarantees a value at which H0 is still not rejected. I got turned around and figured otherwise.
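For anyone wanting to see the rounding issue in numbers, here is a minimal sketch. It solves the probability equation from (f) with an inverse normal and then applies the decision rule to the two candidate roundings. The standard error of 1.5 is an assumption, not something stated in the post; it is used only because it reproduces the quoted 372.06.

```python
from statistics import NormalDist

MU_0 = 375.0      # mean under H0, from part (c)
STD_ERROR = 1.5   # ASSUMED: not stated in the post
TAIL = 0.025      # lower-tail probability from part (f)

# Solve Pr(X-bar < x_c) = 0.025 for x_c under H0.
x_c = NormalDist(mu=MU_0, sigma=STD_ERROR).inv_cdf(TAIL)


def h0_rejected(sample_mean: float) -> bool:
    """Lower-tail decision rule: reject H0 when the sample mean is below x_c."""
    return sample_mean < x_c


print(f"x_c = {x_c:.4f}")
for candidate in (372.0, 372.1):
    verdict = "rejected" if h0_rejected(candidate) else "not rejected"
    print(f"sample mean {candidate}: H0 {verdict}")
```

Rounding down to 372.0 lands below the critical value, so H0 would be rejected there; rounding up to 372.1 stays on the not-rejected side.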

I’m sorry for the screw up.

UPDATE (02/11/21)

Rule Number 1: There’s always crap; it’s just a question of where.

Thanks to SRK for pointing out the corresponding question, 6(e), on the 2018 Exam. In that case, the examination report indicates, without a word of explanation, that both the rounded up and rounded down answers were accepted. Then, the next year, it is implied, without a word of explanation, that only the correct rounding was accepted. These people are nuts.

23 Replies to “WitCH 72: High Significance”

      1. I think Terry means that x_c takes two different values as it stands; the parenthesis “)” should be moved further to the right, a minor typo.

        1. Ah, thanks, Christian. Fixed now. (There’s a substantive problem I have with what I’ve been told about this question, but I’ll leave the post as is for now.)

  1. My main issue with this problem is that you are doing a 2-tailed test.

    Surely, since the mean mass of the samples came in under the prediction, a 1-tail test would be the first instinct?

    Or am I being too simplistic and not really understanding statistics?

    1. RF, see the update to the post.

      With the 2-tail test thing, I think it is fine, although others are better placed to discuss the nuance. I think, before the sample is taken, one does not know whether any change in mean is likely to be an increase or a decrease; so, a 2-tail test makes sense.

      1. I’ve done work like this for a food manufacturer. You don’t want the packet to be “light on” because you want the customer to get at least what was paid for. You don’t want the packet to contain too much product because even little excesses mount up to a substantial loss of income. So a two-sided test can be justified.

        The following is not a criticism of the problem itself, but let us suppose that you carry out a significance test, and reject the null hypothesis. Rejecting H_0 is, in itself, not particularly useful information. A good book on this issue, by an Australian author, is Cumming, G. (2012). “Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis”. Routledge.

      2. Yes, unless you have reason to believe that the machine is not working properly one way or the other (e.g. you suspect that packets are over-weight for some reason), a two-tailed test should be used.

    2. RF, I believe the issue here is that you formulated your hypotheses after you collected your data. It is my understanding you first formulate the hypotheses then separately collect data to test the hypotheses.

      In this scenario, the machine is assumed to be repaired. As a result, we have no basis for assuming that the current noodle weight is above or below the historical mean so a two-sided test would be appropriate.

      We need to use different data to formulate our hypotheses. For example, if there were customer complaints that the noodle packets were light then this information could be used to justify a one-tailed test. We could then collect data to test the hypothesis that the noodle packets are underweight. The key idea here is that the data used to test the hypotheses wasn’t used to formulate the hypotheses.

        1. Thanks everyone for the clarification on this. I’ve only taught this part of the course once thanks to it being cut last year and… other developments that happen in schools… so I’m a bit rusty on the finer points (which don’t really seem to be tested anyway).

          As far as I’m concerned, a student should simply read that the question calls for a 2-tail test and then get on with the button pushing.

          Rounding has always been an issue in these exams, including in conditional probability questions (using the unrounded answer from a previous part will give a different answer to the next part compared to using the number written as the rounded answer).

          As with all these things, I know I COULD ask VCAA to clarify. Maybe I will when I’m next asked to teach the subject.

          1. Hi, RF. I’m glad my screw-up has been of assistance. Obviously the First Rule of VCE is to do whatever the question asks you to do. But, if the question doesn’t tell the student whether to use a 1-tail or 2-tail test, there may be an issue. See the discussion on this WitCH.

            1. I remember it well.

              I think the point about deciding the type of test after gathering data has irrefutable merit while also basically saying all VCAA questions on this topic are junk.

              Even though we could debate at length whether or not statistics (or networks, or Boolean algebra…) should be part of the Specialist Mathematics curriculum, the main game for teachers is navigating through the excrement to maximise student results.

              Such is the nature of the beast.

            2. Yeah, a classic case of VCAA explicitly DIScouraging correct methodology (the correct methodology is explained by tango). Because the VCAA writers simply do not understand statistics.

              RF, good luck with getting the clarification from VCAA. (And I hope circumstances will have you wanting to seek it sooner than later …) Maybe the new Mathematics Manager, whoever gets appointed, will be a much-needed and much over-due breath of fresh air.

              1. Thanks JF. We live in rather unusual times…

                Sometimes a new head makes a difference, sometimes the problems are so in-grained that it takes a bit more than one person to undo years of damage.

                Looking at Year 12 (or equivalent) papers from around the world, we are not alone in this struggle, which brings me back to the eternal question:

                WHY?

  2. Marty, I think there is crap here, in part f, not least because this issue of whether to round up or down appeared in the previous year (2018 Exam 2, 6e), and BOTH answers were accepted. I think there is a case to be made for either rounding up or down, depending upon how you read the question. The crap here is that despite the 2018 examiners being aware of the ambiguity in the 2018 question, and accepting both answers, the same ambiguity has appeared again in 2019 and there is no comment in the examiners' report on whether both answers were accepted. A student sitting this exam who wrote down 372.0 – believing that since it was accepted in 2018, it’ll be accepted again – would be rightly pissed off if the mark was withheld.

    1. Ah, that’s interesting. So, crap spanning the years? Thanks, SRK, I’ll check it out.

      I don’t see how the question is ambiguous, and how it is arguably legitimate to round in either direction.

    2. SRK, regardless of what VCAA has accepted in previous years, there is only one correct answer. And here’s why:

      The critical value is xc = 372.06, rounded to 2 dp (more accuracy can be used if required but it’s not necessary here).
      H0 is rejected if our sample has a mean less than xc.
      H0 is accepted, that is, not rejected, if our sample has a mean greater than xc.

      Sample mean rounded to 372.1: 372.1 > 372.06 therefore H0 is not rejected.
      Sample mean rounded to 372.0: 372.0 < 372.06 therefore H0 is rejected.

      So to 1 dp the smallest value of mean mass "for H0 to be not rejected" is 372.1. It's an open and shut case. Take a bow, VCAA. You got the right answer.

      But … VCAA has screwed up:

      1) A little bit: "smallest value of mean mass for H0 to be not rejected" is dumb wording. It's clearer to ask
      "smallest value of mean mass for H0 not to be rejected"
      Or perhaps the even clearer “smallest value of the mean for H0 to be accepted.”
      It's only a small thing, but I bet this stupid wording tripped students up.

      2) A lot: SRK, you're correct. No explanation is given in the 2018 Report as to WHY two answers were accepted (I assume it was a random (lol!) decision made for no good reason). Two answers were acceptable in 2018 but apparently only one answer – the correct answer – was acceptable in 2019. No stats in 2020, so the deciding vote is the 2021 exam …

      SRK, there’s no ambiguity in what the correct answer is, but there’s ambiguity in what VCAA will accept as a correct answer.

      So there is crap, Marty. Your only mistake was to think you'd made a mistake. There's ALWAYS crap.

      1. Sorry mea culpa: this mightn’t eliminate our disagreement, but I stupidly somehow read the critical value as something like 372.02…. But for the critical sample mean 372.0600… I agree – 372.1 is the only reasonable answer.

        If it had turned out that the critical sample mean was 372.02514…., what would you say in that case? I think in that case 372.0 is a reasonable answer: I found the critical value, then I rounded to one decimal place. Or does “correct to one decimal place” imply that the *rounded* value should also, if it were the sample mean, result in H0 not being rejected?

        This was the issue that arose in the 2018 version of this question, where the critical value is 146.5107… and the question asks students to give the answer “correct to two decimal places”. Both 146.51 and 146.52 were accepted.

        1. 146.51 < xc = 146.5107 therefore H0 is rejected.

          146.52 > xc = 146.5107 therefore H0 is not rejected.

          There is only ONE correct answer (in the sense of making a decision). It’s an open and shut case. It was dishonest and wrong of VCAA to say otherwise. So maybe in 2019 they were trying to set the record straight. We’ll see what record is played in 2021.
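The difference between "round to the nearest" and "round so that H0 is still not rejected" can be illustrated with a small sketch. It uses only the critical values quoted in this thread (the 2018 value as truncated above, and SRK's hypothetical 372.02514) and assumes, as in the comments, that H0 is rejected when the sample mean is below the critical value. The helper names are made up for illustration.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_HALF_UP


def round_nearest(x_c: str, step: str) -> Decimal:
    """Ordinary rounding of the critical value to the requested precision."""
    return Decimal(x_c).quantize(Decimal(step), rounding=ROUND_HALF_UP)


def smallest_not_rejected(x_c: str, step: str) -> Decimal:
    """Smallest value at the requested precision that is not below x_c."""
    return Decimal(x_c).quantize(Decimal(step), rounding=ROUND_CEILING)


cases = [
    ("146.5107", "0.01"),   # 2018 critical value as quoted above (truncated)
    ("372.02514", "0.1"),   # SRK's hypothetical critical value
]

for x_c, step in cases:
    nearest = round_nearest(x_c, step)
    safe = smallest_not_rejected(x_c, step)
    # A rounded value below x_c would itself lead to H0 being rejected.
    nearest_rejected = nearest < Decimal(x_c)
    print(f"x_c={x_c}: nearest={nearest} (rejected: {nearest_rejected}), "
          f"smallest not rejected={safe}")
```

In both cases ordinary rounding goes down and gives a value that would itself lead to rejection, while rounding up gives the smallest value at that precision for which H0 is not rejected.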
