(The title is very clever, but you have to think about it.)*

This one is the last part of the last question of 2022 Specialist Mathematics Exam 2 (not online). It sparked a lot of discussion on the exam post, but seems worth its own WitCH. The question is clearly a mess, but what was intended, and how to think about the mess is not so clear, at least to us. (We showed the question to a professor of statistics, whose first reaction was “Ow!” We’ve applied the smelling salts, and we should be in possession of the professor’s second reaction soon.)

For clarification, we are (kinda sorta) told in the question stem that the masses of the empty cans are normally distributed, but we are told nothing else relevant, other than what’s given below.

Go for it (again).

*) Proving that the title is not really that clever.

### UPDATE (27/10/23)

The exam is here and the report is here (Word, idiots). The report treats the Can mass and Total mass as independent, without explanation.

I’ve just shown this exam question and solution to a professor of statistics. Their summary: “Crazy!”

I think the question stem to part (e) might also be relevant:

“The equipment used to package the soft drink weighs each can after the can is filled. It is known from past experience that the masses of cans filled with the soft drink produced by the company are normally distributed with a mean of 406 grams and a standard deviation of 5 grams.”

I included that, didn’t I? Did I miss something?

OK, I think I know your point. I didn’t intend to imply that the preamble to (e) was irrelevant. I’ve made a short (not indicated) edit to clarify that. Thanks, John.

No problem. In fact, it was probably worth getting out in the open the need to refer back to that earlier question stem (located at the very top of the page). Possibly not obvious and easily over-looked, particularly under time pressure (and it was the last question on the paper).

Well, this has already been pointed out in the original, but the independence of the soft drink distribution and the empty can distribution is never mentioned in the question. So, one issue is that, in general, $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$, but covariance is not covered in high school, so students have to hope that the question assumes the random variables are independent of each other, which feels like poor practice. In reality, I don’t see why they’d be dependent on each other (unless you devise some strange scenario in which they are connected), and students may not be aware of what covariance is.

Even if independence is ‘obvious’, it still shouldn’t have to be assumed. And the court of intelligent public opinion is very clear: which variables to use, which to define, and which are independent is less than obvious.

Students shouldn’t have to worry about this; it should be clearly stated. The intent (or at least what appears to be the intent) of the question is already challenging enough without the extra cognitive load from things like this.

I’ll re-outline my thought process for the question.

Let T = total mass

C = can mass

And L = liquid mass

Now, as a purely algebraic relationship between masses, it is clear that,

The total mass is equal to the sum of the liquid mass and the can mass.

We are told that T is a normally distributed variable.

T~N(406,5^2)

We are also told that the mass of the can is a normally distributed variable.

C~N(15,0.25^2)

Now, we are told nothing of INDEPENDENCE.

Hence, it is unclear whether,

I) T and C are independent

And

II) L and C are independent

i) L’s distribution DEPENDS on the fact that T is distributed N(406,5^2)

Or,

ii) T’s distribution DEPENDED on how L was distributed.

i) Would imply that L=T-C

and

ii) Would imply T = L+ C

In case,

I) T’s distribution was determined first, and the distribution of L was made to suit.

II) L and C are independent random variables and the distribution of T is a consequence.

To delve into this further is to assume the motivations of the factory workers.

I will say as a high school student I know nothing of co-variance and this is my reasoning within the framework of what I have been taught and know.

* since T is normally distributed it implies that L must be normally distributed.

Hi Names, and thankyou for the effort you’ve taken in posting your thoughts. This helps a lot in trying to understand your fundamental difficulty. I will try and help …

We know everything about the distribution of T and C and nothing about the distribution of L. T and C are ‘given’ random variables, but L is a random variable we introduce and must define. So, for me, it makes sense to worry about whether T and C are independent (I think it’s reasonable to assume they are) and then conclude that we must define L = T – C and get its distribution (impossible to do btw unless T and C are assumed to be independent).

As for L = T – C => T = L + C. When you’re dealing with random variables, this sort of algebra doesn’t work (as I commented to Anonymous below). This is not obvious and is very counter-intuitive. I hope to have some time later tonight where I can think about and then post a simple example (involving discrete random variables) that can (hopefully) make what I’ve said plausible to you and maybe others.

VCAA has done all of us (especially the students who sat this exam) a great disservice by neglecting to provide any information about independence in this question. Deeper thinking about this question leads to many things outside the scope of the Specialist Maths course.

Thanks, John. Just to (hopefully) clarify your second paragraph: given T and C handed to us, we can (more or less) always define L = T – C. However, if T and C are not assumed independent then we (VCE students) cannot say anything (or at least enough) about L. Is that the gist?

Yes. That is the gist.

On the algebra of random variables:

If you take X and Y to be independent random variables with $X \sim U(0, 1)$ and $Y \sim U(0, 1)$, then define $Z = X + Y$, the resulting distribution should be a triangular distribution. If I’m not mistaken, the distribution should not be uniform… correct? (I think it may become a case of the Irwin-Hall distribution: https://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution) That being said, it’s not a good example seeing as you require convolutions.
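A quick numerical check may make the triangular shape plausible without any convolution. The sketch below assumes X and Y are independent and uniform on [0, 1] (the specific distributions are an assumption here) and estimates Pr(Z ≤ 0.5) by simulation. A triangular density on [0, 2] gives 0.125 for this probability, whereas a uniform density on [0, 2] would give 0.25.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 200_000

# Z = X + Y with X, Y independent U(0, 1) (assumed distributions)
samples = [random.random() + random.random() for _ in range(n)]
estimate = sum(z <= 0.5 for z in samples) / n

triangular = 0.5 ** 2 / 2  # exact Pr(Z <= 0.5) under the triangular density on [0, 2]
uniform = 0.5 / 2          # what Pr(Z <= 0.5) would be if Z were uniform on [0, 2]
print(estimate, triangular, uniform)
```

The estimate should sit close to the triangular value and well away from the uniform one.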

That’s a nice example, Sai. But it requires just as much faith at the Yr 12 level as the result it’s trying to make plausible. I would like to offer the example below, which only requires Maths Methods.

Let the discrete random value X have probability distribution

We note that and .

Let the discrete random value Y have probability distribution

We note that and .

Now define the random variable W = Y – X and assume that Y and X are independent. The probability distribution of W is

We note that and .

OK …. Now re-arrange W = Y – X to get Y = W + X.

The random variable Y defined in this way is not the same Y we started with! The probability distribution of W + X is clearly different since our new Y can take on the values , 0, 1 ,2 with non-zero probability. The interested reader can calculate the probability distribution of W + X (you will need to assume independence of W and X) and note that but , not .

We also note that the correct variance of W is not obtained by re-arranging $\mathrm{Var}(Y) = \mathrm{Var}(W) + \mathrm{Var}(X)$. Hopefully this is less surprising, less confusing and (perhaps even) ‘obvious’ now.

The (non-intuitive) algebra of random variables has struck!
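For anyone who wants to experiment, here is a minimal sketch of this kind of calculation. The two starting distributions are hypothetical stand-ins (they are not the tables from the example above), and the helper names `combine`, `mean` and `var` are made up for the illustration. It builds W = Y – X assuming independence, then builds W + X assuming independence again, and compares the result with the original Y.

```python
from fractions import Fraction
from itertools import product

def combine(p, q, op):
    """Distribution of op(A, B) for independent A ~ p and B ~ q."""
    out = {}
    for (a, pa), (b, pb) in product(p.items(), q.items()):
        out[op(a, b)] = out.get(op(a, b), 0) + pa * pb
    return out

def mean(p):
    return sum(v * pr for v, pr in p.items())

def var(p):
    m = mean(p)
    return sum((v - m) ** 2 * pr for v, pr in p.items())

half = Fraction(1, 2)
X = {0: half, 1: half}  # hypothetical distribution for X
Y = {1: half, 2: half}  # hypothetical distribution for Y

W = combine(Y, X, lambda y, x: y - x)        # W = Y - X, with Y and X independent
Y_new = combine(W, X, lambda w, x: w + x)    # "Y" = W + X, with W and X independent

print(mean(Y), mean(Y_new))         # the means agree
print(var(Y), var(W), var(Y_new))   # the variances do not
print(sorted(Y), sorted(Y_new))     # nor do the sets of possible values
```

In this toy case the re-arranged "Y" keeps the mean of the original Y but has three times its variance, and it takes values the original Y never does.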

What is beyond the Study Design are the calculations that show:

1) If we define W = Y – X, then W and Y (and W and X) are not independent.

2) If we define Y = W + X, then Y and W (and Y and X) are not independent.

The calculations that prove these results require knowing about covariance and a couple of theorems (*). Such stuff is typically taught in university mathematical probability and statistics subjects – beautiful subjects that should not be confused with the dross masquerading as mathematics in the Study Design (**).

* They can be posted if there’s sufficient interest.

** Probability in Specialist Maths could have been great, but that chance was lost from confidence intervals onwards.

I realise that with normal random variables, the ‘domain’ of the probability distribution of any linear combination stays the same as the individual random variables. However, the stuff I said about the variance remains true. Unfortunately, as alluded to by Sai, this can’t be demonstrated within the scope of the Study Design with an example using continuous random variables. Sai’s example is as ‘simple’ as it gets for continuous random variables and must be taken on faith in VCE. But hopefully the discrete example I’ve offered bolsters that faith to the point of belief.

Are the initial cans 375ml? It didn’t mention it anywhere. How do we know that?

If the weight of 1ml is 1.04g and a can is 15g… and there’s 375ml in a can this would be 405g?

What do we assume?

This question confused the sugar out of me… I had no idea what was going on in the final part? Nothing added up… likewise my students ..

We don’t know anything about the volume of the cans (good pickup).

My solution involved E(T)=E(C+L) and after some rearranging and substitution, E(L)= 391

Now,

1.04g -> 1mL

375×1.04 -> 375

390g -> 375ml

And we want Pr(L<390)

Students who also do chemistry would find that part easy.
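The arithmetic in this step can be sketched in a few lines. The figures are the ones quoted in this thread (1.04 g per mL, a 375 mL fill, means of 406 g and 15 g); the variable names are only for illustration. Note that the mean of L needs no independence assumption, since expectations always add.

```python
density = 1.04                 # grams per mL, as quoted above
label_volume = 375             # mL
mean_total, mean_can = 406, 15 # grams

threshold = density * label_volume    # mass of 375 mL of drink, in grams
mean_liquid = mean_total - mean_can   # E(L) = E(T) - E(C), no independence needed
print(threshold, mean_liquid)
```

So the event "less than 375 mL of drink" becomes "L < 390 g" with E(L) = 391 g, and the only thing left to settle is the variance of L.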

The snag in doing it like this is finding the correct variance for L. This is the bone of contention.

Unfortunately, the algebra of random variables is very different to ‘normal’ arithmetic-type algebra. For example, it’s not obvious (and in fact, it’s quite counter-intuitive) that

T = C + L is not equivalent to L = T – C.

In particular, calculating Var(T) = Var(C + L) and then re-arranging to get Var(L) is incorrect when the goal is to define the distribution of L by finding Var(L). Very confusing (for teachers too!)

Other things that may be less than obvious include:

If T = C + L, then T and C (or T and L) are not independent even though C and L are independent.

However, if L = T – C, then T and C can be independent but L and T (or L and C) are not independent.

All of this confusion is avoided in a question that clearly states which variables are independent. Such a statement not only enables the calculation to proceed, but also better guides the student to the ‘natural’ (or intended) random variable they must define and use from the get-go. And it helps students avoid the pit-fall of re-arranging random variables at some later stage.
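To make the two competing calculations concrete, here is a sketch of the variance of L under each independence assumption, using the figures Var(T) = 5^2 and Var(C) = 0.25^2 from the question. Which assumption was intended is exactly what the question fails to say.

```latex
% Case 1: T and C assumed independent, L defined by L = T - C
\mathrm{Var}(L) = \mathrm{Var}(T) + \mathrm{Var}(C) = 25 + 0.0625 = 25.0625

% Case 2: L and C assumed independent, T = L + C
\mathrm{Var}(T) = \mathrm{Var}(L) + \mathrm{Var}(C)
\;\Longrightarrow\;
\mathrm{Var}(L) = 25 - 0.0625 = 24.9375
```

In both cases E(L) = 406 – 15 = 391, so the two approaches differ only in the standard deviation used for L.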

Thanks, John.

Let’s take it as given that VCAA stuffed up (again). They obviously should have declared something was independent.

Let’s also take as given that we’re not going down the road of covariance. That can be a fruitful discussion later, but it doesn’t help now.

For me the natural place to start is to ask:

(A)

Are there two of the three variables that are naturally (in the real world) considered independent? Then (and only then), one can ask whether that natural assumption of independence permits us, together with the information provided in the exam, to answer the exam question.

Alternatively, one can ask:

(B)

What assumption of independence allows us to naturally/easily/at-all answer the exam question? Then (and only then) one can ask whether that assumption is (real world) natural.

It is always difficult to focus a blog discussion, but this one seems particularly hard, because different people are discussing along the lines of (A) or of (B), or distracting themselves and others by (justifiably) going back to bashing VCAA, or covarianting everything.

I agree with Marty that going along (A) and (B) are both legit. To me, their concurrence in the discussion is natural: his (A) is about “I want to answer this question”, while (B) is about “I want to fix this question [=make it mathematically rigorous]”. Personally, I would much prefer a question that is silly in terms of real nature (or real machinery, that is), as long as the mathematics is rigorous and in particular, that the necessary assumptions are all there, which in this case is not the case. But I understand that from a high school perspective (or even an applied perspective at uni), needs may differ.

At the risk of (real) side-tracking, and with no claim to originality whatsoever, I want to offer my view that this question shows once more how simple questions lead to not-quite-as-simple (i.e., the need to consider covariance) answers. To me, the crux lies in the notion that random variables have an “algebra” at all. I would find the term misleading; while it works for realisations of random variables (take full can and subtract can’s weight), it does not work for distributions of random variables. (Classic, albeit not simplest, example: $X$ normally distributed with mean zero; so is $-X$ by a simple mean and variance calculation, as these two parameters determine the distribution; then $X + (-X) = 0$ identically. Ugh.) Why the blather? Because I was once told that university students (not in mathematics, perhaps) had trouble distinguishing random variables / probability distributions on the one hand, and realisations thereof on the other. Who said that to me? A Monash professor.

One more note, to raise more mud still from this VCAA pond/quagmire: even knowing the covariance would not be enough; knowledge of the joint distribution (perhaps routinely taken as a normal distribution) would help us out.

Hi Christian. But what about this:

The weight X of people entering a lift in a building is a normal random variable with a given mean and variance. Find the probability that nine people in the lift will have a combined weight of less than whatever.

There is a definite ‘algebra of random variables’ required here and it is different from the ‘normal’ arithmetic-type algebra.

The ‘algebra of random variables’ provides rules for the symbolic manipulation of random variables. In this case it says that we must consider the random variable

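$X_1 + X_2 + \cdots + X_9$, where the $X_i$ are clones of $X$,

not the random variable $9X$. It says that $\mathrm{Var}(X_1 + X_2 + \cdots + X_9) = 9\,\mathrm{Var}(X)$, not $81\,\mathrm{Var}(X)$.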

The algebra of random variables provides rules for the symbolic manipulation of random variables (like in my above example), not the distribution of random variables. I don’t think anyone is saying that it applies to distributions.
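The lift example can be sketched with Python's `statistics.NormalDist`, whose `+` between two `NormalDist` objects models a sum of independent normals and whose `*` by a constant models scaling a single variable. The mean of 75 kg, standard deviation of 10 kg and the 700 kg limit are hypothetical numbers chosen only for the illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=75, sigma=10)  # hypothetical weight of one person, in kg

# Nine independent clones of X added together.
# NormalDist + NormalDist treats the two operands as independent,
# even when the same object appears on both sides.
total = X
for _ in range(8):
    total = total + X

scaled = X * 9  # one person's weight multiplied by nine

print(total.mean, scaled.mean)          # same mean
print(total.variance, scaled.variance)  # 9 Var(X) versus 81 Var(X)
print(total.cdf(700), scaled.cdf(700))  # so the probabilities differ
```

Both random variables have the same mean, but the sum of nine clones is far less spread out than nine times one person, which is the distinction the symbolic rules are there to protect.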

I haven’t formally learnt measure theory etc. so, to dumb things down a bit, is it fair to say that

we have a random variable X,

a cdf for X (from which it may or may not be possible to get a pdf when the cdf is continuous) that defines the distribution of X, and

a realisation of X, namely X < x or X = x (if X is discrete).

PS – An enjoyable read is the textbook "The Algebra of Random Variables" by M. D. Springer: https://www.ime.usp.br/~jmstern/wp-content/uploads/2020/11/Springer1979.pdf

Hi John,

Your ‘lift’ example is algebraic in one sense, in that it is solved by adding parameters (means and variances) to produce the parameters (mean and variance of the sum) that are needed to, say, compute the probability of that sum (that is the combined weight of people in the lift) being less than some number. Thus the question that is being answered here is about a distribution (namely that of the combined weight). Adding the parameters was the shortcut that was available thanks to (i) independence and (ii) the stability of the normal distribution under forming independent sums. That allowed us to avoid the toil of explicitly “adding”, in a sense, distributions, which – if we assume that we are dealing with densities for the distributions – runs under the name “convolution” of densities in analysis. A slightly nasty operation in general. What commenter Sai noted above may be paraphrased by saying that triangular densities arise from forming the convolution of uniform ones (the rigorous verification of which would be well within reach of high school maths).

The upshot, at least in my opinion, is that any symbolic manipulation of random variables is merely a bit of a playful game as long as it is not ultimately tied to distributions of some relevant random variables. In the ‘lift’ example, that tie was easy to make, due to the very strong assumption of independence. Strong, at least, in terms of the world of probability, although perhaps less so in the real-world context.

Thank you for the book, which at a cursory glance looks really worth a closer look and tackles interesting questions about functions of random variables. It uses tools and notions outside high school maths. Writing something like $X/Y$ in terms of distributions, for independent $X$ and $Y$, would usually be nightmarishly complicated. To be able to write something as a pithy quotient (which one may further manipulate) is nice, but to me, it does not allow one to speak already of an ‘algebra’ of random variables. But, if the point is mostly to say to the student: Look, random variables don’t obey algebraic relations the way numbers do, then I am probably OK with that for didactic reasons. All of this is of course just my personal take and not claimed to be original.

I second all of what you write in the second-last paragraph, with the one caveat that a realisation of X is one number which may or may not be less than, or equal to, some given $x$, say. The abstract probability space is usually not that important, except for some beautiful techniques such as the so-called coupling.

Sorry if I overshot here. HTH.

Thanks for your efforts John I really appreciate it. I will be sure to go through your explanations thoroughly.

As I have just finished my last exam I will be beering up to celebrate, and will get around to it soon.

Thanks

No problem. I’ve got skin in this game too – I’ll find what I’ve posted useful myself down the track! (So don’t think I’ve done it just for you!)

As for getting around to going through it – If it’s a matter of sooner or later, I suggest later. Much later. You and all your peers have been through the wringer these last 3 years and I would suggest the time should be stuck on beer o’clock for a while. (But take care, watch out for your mates and stay safe).

We need to calculate Pr(volume of drink less than 375 ml).

We have the conversion factor between volume of drink (ml) and mass of drink (g), which tells us that we need to calculate Pr(mass of drink less than 390 g).

And we have the distributions for both the mass of the can when filled and the mass of the can when empty, which is where the real fun starts ….

A lot of cognitive effort is required to try and understand what’s going on. Made significantly worse by a lack of clarity on which random variables are independent.

Hi,

I would expect the question to state the assumption of independent variables (or provide the covariance info)

and then calculate the Z value for Prob(X<390g) for around 42%

Steve R

BTW I would also assume the drink is not carbonated, otherwise other factors come into play that are outside the scope of the course

https://chemistry.stackexchange.com/questions/9067/what-is-the-carbon-dioxide-content-of-a-soda-can-or-bottle

Hi Steve. Covariance is not part of the course, so it’s “state the assumption of independent variables” or be damned.

Re: Carbonation … The biggest problem with VCAA embedding its questions in ‘real life’ contexts is that the ‘real life’ is rarely real (*)

BTW I heard that the drinks get delivered by stunt cars that have infinite initial acceleration (**)

* As for MCQ 4 ( https://mathematicalcrap.com/2022/11/09/posww-31-really/ ), it was embedded in an imaginary context that wasn’t real.

“Is this the real life? Is this just fantasy? Caught in a landslide, no escape from reality. Open your eyes, look up to the skies and see …”

** Nope, I’m never gonna let that one go:

I have come to expect problems with questions on probability and statistics in VCE mathematics examinations. When a new exam comes out, these are the questions I first go to.

In the case in point, the examiners should decide the purpose of the question. What is it that they want to assess? Quite often one can come up with straightforward questions that satisfy these criteria.

These are the questions I last go to, and preferably never go to.

My views on probability and statistics in the curriculum

2015-Vinculum-52(4)-stats

Excellent editorial.

Note that the current Australian Curriculum titles the stream “Statistics and Probability”, and the new Curriculum lists the “Statistics” stream before the “Probability” stream. This is no accident, and it is not a trivial matter. The ABS has a lot to answer for.

Dear Colleagues. Apologies to you all, particularly Marty, Name (*) and anonymous others. A veil has been lifted from my eyes … The question is even worse than I thought. Marty will undoubtedly explain in his own clear and inimitable way. But here’s my attempt at redemption:

You can calculate Pr(L < 390) where L is defined from L = T – C (my dogged solution, based on the prejudice of wanting an explicitly defined L in terms of T and C) or T = L + C (an equally valid but implicitly defined L).

The L's are different in each case (this has to be, the calculated variance is different in each case) but each definition is valid and reasonable.

So the calculations based on either L = T – C or T = L + C are valid. There's no way of 'rejecting' one or the other (except to use personal bias – which one seems more 'realistic').

Even if you're told which two variables are independent, that still doesn't make one or the other choice 'correct'. It only means there's a preferred choice purely from a calculating point of view.

There are two correct answers, each based on a different variance, which in turn is based on a valid choice. An assumption (or even a clear statement, had it been given) of independence simply makes the calculation of choice possible, it doesn’t mean the other choice must be rejected (**). Whether VCAA knows it or not, and whether VCAA likes it or not, VCAA must accept both answers. What a shemozzle.

* @Name – beer o’clock just got better. Your ‘lamented’ marks have returned. In fact, they never went missing.

** Although it does get rejected, because now it can’t be calculated.

In short: There are two different but valid ways of calculating an answer, and each way gives a different answer.

OK, I think we’re all finally on the same two pages. Now, does anyone wish to remark on my very clever title?

Marty, it’s the least you deserve for wading through crap that you particularly loathe. And maybe the title was far cleverer than you thought …

VCAA’s question is about cans. So we have cans – cannery. One meaning of row is a noisy argument or fight. So we have an argument, a fight, about cans. A fight against VCAA’s question. On the other hand …

Cannery Row is a novel by American author John Steinbeck. Cute. Clever.

But it’s even cuter …

The novel is set during the Great Depression. And I think we can agree that this question, along with all VCAA’s errors and blunders, causes a great depression among teachers.

But there’s even more, and here’s where the title gets funnily prescient …

The plot of Cannery Row is very simple: JF and his friends are to do something nice for their friend Marty, who has been good to them without asking for reward. JF hits on the idea that they should post helpful comments about a horrible VCAA stats question (they know how much Marty hates this statscrap). Unfortunately, the posts rage out of control, and Marty’s blog is ruined – and so is Marty’s mood. In an effort to return to Marty’s good graces, JF and the boys (well, JF) decide to post more comments – but make it work this time.

But there’s even more …. Cannery Row is the section of town in Monterey, California where the now closed fish canneries are located. There are probably dead fish. And dead fish stink, just like this VCAA question.

You got the joke in the title, so apology accepted. All good.

Wow, I’m almost convinced Marty spent most of his time on a clever title. I was surprised at the synopsis too!

Yep, the title takes some beating. And …

*Ahem* I should have acknowledged a somewhat lazy paraphrasing of the synopsis at Wikipedia.

Hi just thought about this but they wanted us to find the probability distribution of the amount of drink??????

Lets set up some parameters:

Filled Softdrink can ~ N(406, 5^2)

Empty Softdrink can ~ N(15, 0.25^2)

Number of 1ml Softdrink droplets, and its cumulative mass ~ unknown distribution, but supposedly normal, thus the normal distribution of the mass of actual soft drink that is in a can is: N(1*X, (StandardDeviation^2)*(X^2)) where *StandardDeviation* is the standard deviation of the distribution of the amount of softdrink droplets and X is the mean amount of droplets.

Thus, because of their given probability properties,

Var(X+bY) where b is a coefficient and X,Y are distributions = Var(X)+b^2Var(Y)

Thus Var(soft_drink_total_filled)=Var(Soft_drink_empty_can)+X^2*Var(Distribution_Of_Number_of_1ML_droplets)

therefore 5^2, or 25, must equal to 0.25^2+X^2*StandardDeviation^2

We see inherently there is a problem because this is unsolveable. Thus the question is extremely flawed and completely incorrect.

I wonder what the statistics professor you showed this to had to say afterwards. This question seems to be a real *can* of worms…

Foolishly, they tried to make sense of the thing. No definitive answer (since none can be given), but they seemed to lean towards independence of can and fluid at least being a natural assumption.

They were not impressed with the question, and more generally they regard VCE stats as pretty absurd.

one var is insignificant in comparison with the other

Yes, good point. But it depends who decides what is or is not significant. In this case …

There are two answers, differing in the fourth decimal place. Insignificant yes, except that the exam asks for an answer correct to four decimal places.

3 dec places, not 4

I get 0.420838 and 0.420642, which differ in the fourth decimal place when rounded to four decimal places.

required 3 dec places

one answer was by taking the small var as insignificant, how did you get the alternative answer?

Yes, you’re right. I went back and checked. Three decimal places was asked for. Sorry.

(I suppose I’m too used to the questions typically asking for four decimal places.

Maybe VCAA asked for 3 dp because they know there’s two approaches that would differ in the fourth …? (*) What’s the probability of that?)

Probability[X < 390, X ~ NormalDistribution[391, Sqrt[25.0625]]] // N

Probability[X < 390, X ~ NormalDistribution[391, Sqrt[24.9375]]] // N

But the rounding is moot. 3 dp, 4 dp, 100 dp … The actual probabilities are not the same, they depend on what assumptions are made. The question should have said somewhere which two random variables are independent.
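For readers without Mathematica, the same two calculations can be sketched in Python with `statistics.NormalDist`. The variable names are only illustrative; the first probability assumes T and C are independent (so the variances add), the second assumes L and C are independent (so Var(L) = Var(T) – Var(C)).

```python
from math import sqrt
from statistics import NormalDist

var_total, var_can = 5 ** 2, 0.25 ** 2
mean_liquid = 406 - 15

# T and C assumed independent, L = T - C: Var(L) = 25.0625
p_diff = NormalDist(mean_liquid, sqrt(var_total + var_can)).cdf(390)

# L and C assumed independent, T = L + C: Var(L) = 24.9375
p_sum = NormalDist(mean_liquid, sqrt(var_total - var_can)).cdf(390)

print(round(p_diff, 6), round(p_sum, 6))  # six decimal places
print(round(p_diff, 3), round(p_sum, 3))  # three decimal places, as the exam asked
```

This should reproduce the two six-decimal values quoted above, and it shows that they only collapse to a common answer once rounded to three decimal places.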

* You could argue that with appropriate rounding, the answers to the defective 2016 Maths Methods Exam 2 Section B Question 3 part (h) are the same regardless of the method used therefore it's all OK. Rounding lets us wallpaper over errors.

Probability[X < 390, X ~ NormalDistribution[391, Sqrt[25.0625]]] // N

This is based on the two random variables being independent, which is definitely false and cannot be considered as a possible answer.

The whole point of this post is that the question can only be answered if independence of two of the random variables is assumed by the student.

If you think such an assumption is false (*), then can you explain:

1) Why?

2) How should the question be answered? (Or do you advocate that a student must be brave enough to say that based on 1) the question cannot be answered?)

* Which is a separate issue.

Z=X+Y, Z is dependent on X, Y.

Y=Z-X, .: Z and X are dependent, and cannot be assumed to be independent.

Var(drink) approx= var(total)=25, because var(can) is insignif

.: the issue of dependency is avoided

Jesus.

Anonymous, what did you expect VCE students to do with this question? What do you think VCAA expected VCE students to do with this question?

Re: Z=X+Y, Z is dependent on X, Y.

Are you saying that X and Y are dependent? Why?

Re: Y=Z-X, .: Z and X are dependent.

Why?

Can you define your random variables Z, X and Y before answering.

in general, for independent X and Y random variables

if Z=X+Y, Z is dependent on X, Y.

Y=Z-X, .: Z and X are dependent as stated above

Z total

X can

The random variables in Z = X + Y have a different distribution to the random variables in Y = Z – X.

They might be called by the same names but they are different.

The Y in Z = X + Y is different to the Y in Y = Z – X (*)

Yes, if Z = X + Y then Z depends on X and Y.

If Y = Z – X, then the Y here is different to the Y above and cannot be used to draw any conclusion about the (in)dependence of Z and X.

The problem with the question is not that all the random variables are dependent, the problem is that independence must be assumed by the student in order to answer the question and there are different assumptions that can reasonably be made. VCAA should have clearly stated in the question which assumption to use (**).

* This has been discussed in previous comments.

** Then the only debate is whether or not the assumption is reasonable. VCAA is not known for its realistic assumptions or models.

Put in simple terms

The spread of the can weight is too small to affect the spread of the total weight, and it can be ignored when determining the spread of drink weight, so as to eliminate the dependency issue.

I’ll repeat my questions:

What did you expect VCE students to do with this question?

What do you think VCAA expected VCE students to do with this question?

The students are expected to see the insignificance of one variance and calculate the required probability using the approximate variance of drink.

VCAA expected students to be aware of the insignificance of one variance and to proceed with the calculation of the required probability.

No assumptions required.

Thank you. What makes you believe that is what VCAA expected? How would students be expected to know ahead of time (or at all) that the one variance is sufficiently insignificant to be ignored to the required number of decimal places?

The previous emails were my humble opinions.

Because there are no other yr12 methods available to calculate the probability, students cannot see the number of decimal places necessary to compare them.

.: they follow the instructions to write 3 decimal places, just like many other questions.

They did not need to know ahead of time.

Thanks again. Why not just apply Occam’s razor, and assume VCAA stuffed up (again), until there is some counter evidence?

While in his comment of 9 November, Marty said that we would (at that time) not be going down the road of covariance, I would like to offer, I believe with his kind permission, my way of how I would enter same road. The reason is *not* to make the question meaningful, but rather – and I gladly acknowledge this idea to Marty from communication outside of this blog – to show how the accuracy of the three-digits *cannot* be achieved if one lets a certain covariance (or correlation) roam free. The thread so far shows that there are other problems with this question, but I take the liberty to leave these aside.

Another heads up – or excuse: I have noted from commentators that covariance is not covered as part of the course. I hope that what is below is still of interest to some, even though it may require consulting sources (which are numerous). And another one: The two-dimensional normal distribution assumption made below does matter; indeed, one-dimensional normal variables may fail to have a two-dimensional normal distribution when concatenated to form a vector, even when they are uncorrelated. (Source, not needed in what follows: https://en.wikipedia.org/wiki/Normally_distributed_and_uncorrelated_does_not_imply_independent)

The above is why I doubt that the sketch below can be substantially whittled down further in its mathematical requirements without a degree of – pardon my French – bullsh•ing, though I shall be glad to be proven wrong.

A nice simple model with covariance is the bivariate normal distribution. We shall use it here. Measure everything in weight (grams) and define the random variables,

$F$ = (weight of) filled soft drink can,

$E$ = (weight of) empty soft drink can.

Clearly $F = E + L$, where $L$ is the weight of the liquid in the can. For clarity we still write $F - E$ instead of $L$, however. The question asks for the distribution function of $F - E$ evaluated at the threshold $x$ given in the question, that is, for $P(F - E \le x)$. We shall get the distribution function of $F - E$ as the one-dimensional marginal of the distribution of the two-dimensional vector $(F - E, E)$.

First, write $\sigma_E$ for the standard deviation of $E$ given in the question; the standard deviation of $F$ is $5$. The information from the question for the vector $(F, E)$ gives us one immediate “degree of freedom”, so to speak (I don’t use this term in the way it is used in statistics): one covariance, $c = \operatorname{Cov}(F, E)$, which is named in the attached screenshot. This is the knob we can turn as we like (alas, thanks to the bad question).

Now, for the covariance matrix of the vector $(F, E)$ to be valid at all, that is, to be what is called “positive semidefinite” (or even “positive definite”), its determinant $25\sigma_E^2 - c^2$ must be non-negative (or positive), which gives (for the semidefinite case) the condition $|c| \le 5\sigma_E$.

Now we use the assumption (!) that $(F, E)$ is normally distributed, and a linear transformation, to get the distribution of $(F - E, E)$. This can be done a bit more simply, but the crux is that the variance of $F - E$ is impacted on (to commit a Literary Offense and thus redeem myself with VCAA …) by $c$, as we see presently. At the end, we get that $F - E$, the first component of this vector, has the $N(391,\ 25 + \sigma_E^2 - 2c)$ distribution.

To understand the impact of $c$ in this expression, note that if we increase the variance of a normal variable while keeping the mean fixed, probability mass is squashed out to the left and the right of the mean (here equal to $391$). Hence the sought probability is a decreasing function of $c$. Inputting the extremes $c = \pm 5\sigma_E$ obtained earlier, we obtain the range of "answers" $\left[\Phi\!\left(\frac{x - 391}{|5 - \sigma_E|}\right),\ \Phi\!\left(\frac{x - 391}{5 + \sigma_E}\right)\right]$, where $\Phi$ is the standard normal distribution function, with the endpoints excluded if we want a positive definite (and not just positive semidefinite) covariance matrix. Thus we have found a range of answers that by far prohibits three-decimal-digit precision if pure can weight (without liquid) and full can weight (with liquid) are allowed to be correlated.
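The range argument above can be sketched numerically. This is only an illustration under the bivariate normal assumption: the filled-can mean of 406 and standard deviation of 5 come from the question stem, and the empty-can mean of 15 is implied by the liquid mean of 391. The function names, and the inputs `sd_empty = 1.0` and `threshold = 390.0`, are hypothetical placeholders chosen for the example, not figures taken from the exam.

```python
from math import sqrt
from statistics import NormalDist

MEAN_FILLED = 406.0  # grams, from the question stem
SD_FILLED = 5.0      # grams, from the question stem
MEAN_EMPTY = 15.0    # grams, implied by the liquid mean 391 = 406 - 15


def liquid_probability(threshold, sd_empty, cov):
    """P(F - E <= threshold) when (F, E) is bivariate normal with Cov(F, E) = cov."""
    variance = SD_FILLED**2 + sd_empty**2 - 2.0 * cov
    if variance <= 0.0:
        raise ValueError("degenerate or invalid covariance for this model")
    liquid = NormalDist(MEAN_FILLED - MEAN_EMPTY, sqrt(variance))
    return liquid.cdf(threshold)


def probability_range(threshold, sd_empty):
    """Smallest and largest probability as cov runs over [-5*sd_empty, 5*sd_empty]."""
    bound = SD_FILLED * sd_empty  # from the determinant condition |cov| <= 5 * sd_empty
    at_min_cov = liquid_probability(threshold, sd_empty, -bound)
    at_max_cov = liquid_probability(threshold, sd_empty, bound)
    return min(at_min_cov, at_max_cov), max(at_min_cov, at_max_cov)


# Hypothetical placeholder inputs (NOT the exam's values).
low, high = probability_range(threshold=390.0, sd_empty=1.0)
print(f"range of answers: [{low:.4f}, {high:.4f}]")
```

With these placeholder inputs the two endpoints do not agree to three decimal places; substituting the actual values from the exam is left to the reader.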

Sorry I forgot to name myself, the above post with computations is by me, Christian R

Thanks, Christian. I’ve added your name.

You are welcome, Marty. As I said in my email to you, I found this question instructive (assuming I did get on top of it – or better perhaps, on top of one of the ways it may be read).

I would like to correct a statement towards the end of my comment-proof, at the junction where we have established that $F - E \sim N(391,\ 25 + \sigma_E^2 - 2c)$, with $F$ and $E$ the filled and empty can weights, $\sigma_E$ the standard deviation of $E$, and $c = \operatorname{Cov}(F, E)$. The desired probability is, as stated in words earlier in the comment, $P(F - E \le x)$, with $x$ the threshold from the question. Note that $x < 391$, so that we are to the left of the tip of a normal density (bell) curve with mean 391, whatever its variance is. Increasing the variance while keeping the mean fixed at 391 will lead to the probability mass under the density curve being “spread out” away from the mean to either side, making the so-called “tails” on either side thicker, so that the foregoing probability will also increase. (This geometrical argument is, I believe, well within reach of VCE maths; drawing a picture may help.) Since $c$ enters the variance of $F - E$ with a negative sign, increasing that parameter will decrease the variance and thus decrease the probability that we seek. (If we had instead been to the right of the tip of the bell curve, that probability would instead have been an increasing function of $c$.) So this has nothing to do with monotonicity of distribution functions, at least not directly. Sorry for any confusion this may have caused.
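For readers who prefer calculus to pictures, the same monotonicity can be checked with a derivative. This is a sketch in notation assumed here, not necessarily that of the original comment: $F$ and $E$ are the filled and empty can weights, $\sigma_E$ is the standard deviation of $E$, $c = \operatorname{Cov}(F, E)$, $x$ is the threshold, and $\Phi$ and $\varphi$ are the standard normal distribution function and density.

```latex
\begin{align*}
\sigma(c) &= \sqrt{25 + \sigma_E^2 - 2c},
&
p(c) &= P(F - E \le x) = \Phi\!\left(\frac{x - 391}{\sigma(c)}\right), \\[4pt]
\frac{d\sigma}{dc} &= -\frac{1}{\sigma(c)},
&
\frac{dp}{dc} &= \varphi\!\left(\frac{x - 391}{\sigma(c)}\right)\,\frac{x - 391}{\sigma(c)^{3}} .
\end{align*}
```

Since $\varphi > 0$ and $\sigma(c) > 0$, the sign of $dp/dc$ is the sign of $x - 391$: the probability decreases in $c$ when $x < 391$ and increases in $c$ when $x > 391$.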

Of course one could simply insert the two values to conclude that we cannot have three digits precision. The point of the foregoing argument is that we want to show, via establishing a monotonicity, that this does exhaust the “range of answers” – at least if we do decide to stay in the (bivariate) normally distributed universe.

Please allow me to correct another, this time “verbal” error in my post, and to make a clarifying comment – with my apologies:

1) The final sentence ending, “… if we only allow for the total can weight and the weight of the liquid to be correlated.” should, by the definitions of the filled-can, empty-can and liquid weights, read instead, “… if we only allow for the total can weight (can plus liquid) and the weight of the can to be correlated.”

2) Even when two normally distributed random variables are uncorrelated, their pasting together may not yield a normally distributed vector. See https://en.wikipedia.org/wiki/Normally_distributed_and_uncorrelated_does_not_imply_independent
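Point 2) can be checked by simulation. The sketch below assumes one standard construction (not spelled out in the comment): $X$ is standard normal, $W$ is an independent fair sign $\pm 1$, and $Y = WX$. Then $Y$ is also standard normal and uncorrelated with $X$, yet $X + Y$ equals zero about half the time, which is impossible for a bivariate normal vector with these marginals. The function names are illustrative.

```python
import random


def simulate_pairs(n, seed=0):
    """Draw n pairs (X, Y) with X ~ N(0, 1) and Y = W * X, W an independent fair sign."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        w = rng.choice((-1.0, 1.0))
        pairs.append((x, w * x))
    return pairs


def sample_correlation(pairs):
    """Plain sample correlation coefficient of the two coordinates."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pairs) / n
    return cov / (var_x * var_y) ** 0.5


def fraction_zero_sum(pairs):
    """Share of pairs with X + Y exactly zero (this happens whenever W = -1)."""
    return sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)


pairs = simulate_pairs(20_000)
print("sample correlation:", sample_correlation(pairs))
print("fraction with X + Y == 0:", fraction_zero_sum(pairs))
```

The sample correlation should come out close to zero, while roughly half of the sums are exactly zero. A linear combination of a bivariate normal vector is itself normal (or constant), so it could not have an atom at zero together with a continuous part.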

With independence, it does work.

No worries. Will correct, but will be later in the day.