Hypothesis testing using p-values

Hypothesis testing is basically the science of making decisions by evaluating how likely an observed outcome is under a given assumption.

In simple words: if the null hypothesis $H_0$ is true, I expect things to behave in a certain way. If things do not behave in that way, I have evidence that $H_0$ is not true. However, even if things behave in the way I expected, it does not mean that $H_0$ is true. It only means that I don’t have enough evidence yet to say that $H_0$ is not true.

In the previous section, we talked about how to use critical values to conduct hypothesis testing. Critical values help one decide whether $H_0$ should be rejected: if $H_0$ is true, the test statistic should fall into the acceptance region rather than the rejection region.

In this section, we will talk about hypothesis testing using p-values. A p-value, short for “probability value,” is the probability, computed assuming the null hypothesis is true, of obtaining results as extreme as or more extreme than the observed results.

The procedure of hypothesis testing using p-values is not too different from using critical values:

Formulate hypotheses:

Null Hypothesis ($H_0$): This is the default or status quo hypothesis. It typically represents no effect or no difference.
Alternative Hypothesis ($H_A$): This is the hypothesis you want to test, often suggesting an effect or a difference.

Set the significance level $\alpha$.
Collect data and perform a statistical test, such as a t-test or chi-squared test, depending on the nature of the data and the research question.
Calculate the p-value: The p-value is calculated by the statistical test, and it represents the probability of observing data as extreme as or more extreme than the observed data if the null hypothesis is true. A small p-value suggests that the observed results would be unlikely if the null hypothesis were true, while a larger p-value suggests that the results are more consistent with the null hypothesis.
Make a decision: If the p-value is less than or equal to the chosen significance level $\alpha$, one can reject the null hypothesis in favor of the alternative hypothesis. If the p-value is greater than $\alpha$, the null hypothesis is not rejected.

Recall that a p-value quantifies the probability of obtaining results as extreme as or more extreme than the observed results when the null hypothesis is true. Therefore, a smaller p-value (e.g., $p < 0.05$ ) indicates stronger evidence against the null hypothesis, while a larger p-value suggests weaker evidence against the null hypothesis.
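The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the returned strings are made up for this example and do not come from any library.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the decision rule: reject H0 when the p-value is <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # A small p-value means the data would be unlikely if H0 were true.
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical p-values, used only to show both outcomes of the rule.
print(decide(0.02, 0.05))  # reject H0
print(decide(0.20, 0.05))  # fail to reject H0
```

Note that "fail to reject" is deliberately not worded as "accept": a large p-value only means there is not enough evidence against $H_0$.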

Now let’s dig into the details by reusing the examples in the previous section!

Fat in ground meat

Let’s recall the problem, which is as follows: I randomly bought 35 packages of meat from local stores and brought them home to scan. After calculating, I found that the sample mean of the fat content per package is $\bar{x} = 6.8$ grams and the sample standard deviation is $s = 2$ grams.

The test procedure:

Now let’s conduct a hypothesis-testing procedure using the p-value.
Here the sample size is $n = 35$, which is large enough for the sample mean to be approximately normally distributed (by the central limit theorem). Therefore, I’ll conduct the z-test for the mean (we usually use the z-test when the sample size is at least 30).

First, to set the null and alternative hypotheses, note that 3% of 200 grams is 6 grams. Therefore, my null hypothesis is

$H_0: \mu = 6$

because the claim I want to challenge is that, on average, the packages contain 3% fat. I think the packages contain more than 3% on average. Therefore, my alternative hypothesis is

$H_A: \mu > 6$

The test statistic is

$Z = \frac{\bar{x} - 6}{s / \sqrt{n}} = \frac{6.8 - 6}{2 / \sqrt{35}} = \frac{0.8 \times \sqrt{35}}{2} \approx 2.37$
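As a quick check of the arithmetic, here is a minimal Python sketch of the same calculation. The helper name `z_statistic` is an assumption made for this illustration; the numbers are the ones from the example.

```python
import math


def z_statistic(sample_mean, mu0, sample_sd, n):
    """z = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error


# Values from the ground meat example: x-bar = 6.8, mu0 = 6, s = 2, n = 35.
z = z_statistic(sample_mean=6.8, mu0=6.0, sample_sd=2.0, n=35)
print(round(z, 2))  # 2.37
```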

Again, I choose the significance level $\alpha$ of 0.05.

Now we calculate the p-value, which represents the probability of observing data as extreme as or more extreme than the observed data if the null hypothesis is true. Since the alternative hypothesis is $H_A: \mu > 6$, the p-value is the area under the standard normal curve to the right of the test statistic.

In this case, let $G$ denote a standard normal random variable. Then

$p\text{-value} = P(G > 2.37) = 0.009$

So the p-value is smaller than $\alpha = 0.05$. Therefore, at significance level 0.05, I can reject the null hypothesis.
However, note that if we had chosen the significance level $\alpha = 0.001$, then $p\text{-value} = 0.009 > \alpha$. Therefore, at the significance level $\alpha = 0.001$, we cannot reject the null hypothesis.
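The right-tail probability can be reproduced with Python's standard library. The sketch below uses the identity $P(G > z) = \frac{1}{2}\,\mathrm{erfc}(z / \sqrt{2})$ for a standard normal $G$; the helper name `upper_tail_p_value` is an assumption for this illustration.

```python
import math


def upper_tail_p_value(z):
    """P(G > z) for a standard normal G, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = 2.37  # test statistic from the ground meat example
p_value = upper_tail_p_value(z)
print(round(p_value, 3))  # 0.009

# The same p-value leads to different conclusions at different levels.
for alpha in (0.05, 0.001):
    verdict = "reject H0" if p_value <= alpha else "fail to reject H0"
    print(f"alpha = {alpha}: {verdict}")
```

Running this rejects $H_0$ at $\alpha = 0.05$ but not at $\alpha = 0.001$, matching the conclusions above.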

The Weight of Chocolate Bars

Imagine you work at a chocolate factory and are responsible for ensuring the quality of the chocolate bars. The factory claims that its chocolate bars have an average weight of 100 grams. However, you suspect that some of the machines might be overfilling the bars.

Your null hypothesis ( $H_0$ ) is that the chocolate bars indeed weigh 100 grams on average, and your alternative hypothesis ( $H_A$ ) is that they weigh more than 100 grams.

To test this, you randomly sample 30 chocolate bars and find that they have an average weight of $\bar{x} = 105 \text{ grams}$ and a sample standard deviation of $s = 2 \text{ grams}.$

So, to write everything in math terms, the null hypothesis is $H_0: \mu = 100$ because the claim I want to challenge is that, on average, each chocolate bar weighs 100 grams. I think each bar weighs more than 100 grams on average. Therefore, my alternative hypothesis is $H_A: \mu > 100.$

Now let us pick our significance level to be $\alpha = 0.05$. We should pick this before doing any further calculation to avoid biasing the conclusion because, as shown in the previous example, the conclusion can differ at different levels of $\alpha$.

The test statistic is $Z = \frac{\bar{x} - 100}{s / \sqrt{n}} = \frac{105 - 100}{2 / \sqrt{30}} = \frac{5 \times \sqrt{30}}{2} \approx 13.69$

In Figure 3 above, the orange dot region is the critical region (rejection region), which contains the values that are unlikely to occur if the null hypothesis is true. The critical value is the value that separates the normal region from the rejection region. Here, by the normal region I mean the region that the test statistic should fall into if $H_0$ is true. The area of the orange dot region is the significance level $\alpha$. (Note that the total area under the curve is 1, just as the probabilities of all events should sum to 1. The right tail, i.e., the orange dot region, represents the observations that are bigger than expected, i.e., the abnormal region.) Here we choose $\alpha = 0.05$, which for this right-tailed test leads to the critical value $z_{0.05} \approx 1.645$. So we consider all values that are bigger than $z_{0.05} \approx 1.645$ to be abnormal. Note that $Z \approx 13.69 > z_{0.05}$. The p-value tells the same story: $p\text{-value} = P(G > 13.69)$ is practically 0, which is far below $\alpha = 0.05$.
Therefore we reject the null hypothesis and conclude that on average each bar weighs more than 100 grams. Okay! That’s good for … the customers – the ones who don’t even know that they are lucky!
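For readers who want to verify the chocolate bar numbers, here is a small sketch using only Python's standard library. It assumes a right-tailed z-test, so the critical value is the 95th percentile of the standard normal distribution (about 1.645); the variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

alpha = 0.05
n, sample_mean, sample_sd, mu0 = 30, 105.0, 2.0, 100.0

# Test statistic: (x-bar - mu0) / (s / sqrt(n)).
z = (sample_mean - mu0) / (sample_sd / math.sqrt(n))

# Right-tailed critical value: the point with area alpha to its right.
critical_value = NormalDist().inv_cdf(1 - alpha)

# Right-tail p-value; erfc keeps precision for very large z,
# where 1 - cdf(z) would round to exactly 0.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}")                            # z = 13.69
print(f"critical value = {critical_value:.3f}")  # critical value = 1.645
print("reject H0" if p_value <= alpha else "fail to reject H0")  # reject H0
```

Both routes agree here: the test statistic is far beyond the critical value, and the p-value is far below $\alpha$.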
