Hypothesis testing using p-values

Hypothesis testing is the science of making decisions by evaluating how likely an observed outcome is under a given assumption.

In simple words: if the null hypothesis H_0 is true, I expect things to behave in a certain way. If things do not behave that way, I have evidence that H_0 is not true. However, even if things behave the way I expected, that does not prove that H_0 is true. It only means that I don’t yet have enough evidence to say that H_0 is false.

In the previous section, we talked about how to use critical values to conduct hypothesis testing. Critical values tell us whether H_0 should be rejected: if H_0 is true, the test statistic should fall into the acceptance region rather than the rejection region.

In this section, we will talk about hypothesis testing using p-values. A p-value, short for “probability value,” is a statistical measure that helps us quantify the probability of obtaining results as extreme as or more extreme than the observed results when the null hypothesis is true.

The procedure of hypothesis testing using p-values is not too different from the one using critical values:

  1. Formulate hypotheses:
  • Null Hypothesis (H_0): This is the default or status quo hypothesis. It typically represents no effect or no difference.
  • Alternative Hypothesis (H_A): This is the hypothesis you want to find evidence for, often suggesting an effect or a difference.
  2. Collect data and perform a statistical test, such as a t-test or chi-squared test, depending on the nature of the data and the research question.
  3. Set the significance level \alpha (ideally chosen before looking at the results).
  4. Calculate the p-value: The p-value is calculated by the statistical test, and it represents the probability of observing data as extreme as or more extreme than the observed data if the null hypothesis is true. A small p-value means that results like the observed ones would be unlikely if the null hypothesis were true, while a larger p-value means that the results are more consistent with the null hypothesis. If the p-value is less than or equal to the chosen significance level \alpha, one can reject the null hypothesis in favor of the alternative hypothesis. If the p-value is greater than \alpha, the null hypothesis is not rejected.

Recall that a p-value quantifies the probability of obtaining results as extreme as or more extreme than the observed results when the null hypothesis is true. Therefore, a smaller p-value (e.g., p < 0.05 ) indicates stronger evidence against the null hypothesis, while a larger p-value suggests weaker evidence against the null hypothesis.
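
The decision rule can be sketched in a few lines of Python. This is only an illustration under assumptions that go beyond the text: the test is right-tailed, the test statistic is standard normal under H_0, and the function names upper_tail_p_value and decide, as well as the value z = 2.0, are made up for the example.

```python
from math import erfc, sqrt


def upper_tail_p_value(z: float) -> float:
    """Return P(G > z) for a standard normal G (right-tailed p-value)."""
    # 0.5 * erfc(z / sqrt(2)) is the standard normal upper-tail probability.
    return 0.5 * erfc(z / sqrt(2))


def decide(p_value: float, alpha: float) -> str:
    """Apply the rule: reject H_0 when the p-value is at most alpha."""
    return "reject H_0" if p_value <= alpha else "fail to reject H_0"


# Hypothetical test statistic, chosen only to show the two functions together.
p = upper_tail_p_value(2.0)
print(f"p-value = {p:.4f} -> {decide(p, alpha=0.05)}")
```

Note that "fail to reject" is deliberately not phrased as "accept": a large p-value only means there is not enough evidence against H_0.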

Now let’s dig into the details by reusing the examples in the previous section!

Fat in ground meat

Let’s recall the problem, which is as follows: I randomly bought 35 packages of meat from local stores and brought them home to scan. After calculating, I found that the sample mean amount of fat per package is \bar{x} = 6.8 grams and the sample standard deviation is s = 2 grams.

The test procedure:

Now let’s conduct a hypothesis-testing procedure using the p-value.
Here the sample size is n = 35 , which is large enough for the sample mean to be approximately normally distributed (by the central limit theorem). Therefore, I’ll conduct the z-test for the mean. (We usually use the z-test when the sample size is at least 30.)
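
To get a feel for why a sample of 35 is usually considered large enough for a z-test, here is a small simulation sketch. It rests on an assumption that is not in the example: the individual measurements are drawn from a skewed Exponential(1) distribution, chosen purely for illustration. If the sample mean is roughly normal, about 95% of the standardized means should land within ±1.96.

```python
import random
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

n = 35                 # sample size, as in the meat example
replications = 10_000  # number of simulated samples
mu = sigma = 1.0       # mean and standard deviation of an Exponential(1) variable

inside = 0
for _ in range(replications):
    # Draw one sample of size n from the (hypothetical) skewed population.
    sample_mean = sum(random.expovariate(1.0) for _ in range(n)) / n
    # Standardize the sample mean using the known population mean and sd.
    z = (sample_mean - mu) / (sigma / sqrt(n))
    if abs(z) < 1.96:
        inside += 1

fraction_inside = inside / replications
print(f"share of standardized means within ±1.96: {fraction_inside:.3f}")
```

Even though the underlying population is far from normal, the share should come out close to 0.95, which is what the normal approximation predicts.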

First, to set the null and alternative hypotheses, note that 3% of 200 grams is 6 grams. Therefore my null hypothesis is

H_0: \mu = 6

because I want to reject the claim that on average the packages contain 3% fat. I think the packages contain more than 3% on average. Therefore my alternative hypothesis is

H_A: \mu > 6

The test statistic is

Z = \frac{\bar{x} - 6}{s / \sqrt{n}} = \frac{6.8 - 6}{2 / \sqrt{35}} = \frac{0.8 \times \sqrt{35}}{2} \approx 2.37
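
The arithmetic can be checked with a short Python sketch. The variable names are my own; the numbers are the ones from the example.

```python
from math import sqrt

n = 35       # number of packages
x_bar = 6.8  # sample mean of fat (grams)
s = 2.0      # sample standard deviation (grams)
mu_0 = 6.0   # mean under the null hypothesis (grams)

# The standard error is the estimated standard deviation of the sample mean.
standard_error = s / sqrt(n)
z = (x_bar - mu_0) / standard_error

print(f"standard error = {standard_error:.3f}, Z = {z:.2f}")
# prints: standard error = 0.338, Z = 2.37
```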

Again, I choose the significance level \alpha of 0.05.

Now we calculate the p-value, which represents the probability of observing data as extreme as or more extreme than the observed data if the null hypothesis is true. Since the alternative hypothesis is H_A: \mu > 6 , the p-value is the area under the standard normal curve to the right of the test statistic.

In this case, let G denote a standard normal random variable. Then

p\text{-value} = P(G > 2.37) \approx 0.009

So the p-value is smaller than \alpha = 0.05 . Therefore, at significance level 0.05, I can reject the null hypothesis.
However, note that if we chose the significance level \alpha = 0.001 , then p\text{-value} = 0.009 > \alpha . Therefore, at the significance level \alpha = 0.001 , we cannot reject the null hypothesis yet.
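
Here is a minimal Python sketch of this step, using NormalDist from the standard library's statistics module for the standard normal distribution. The variable names are illustrative, and z is taken rounded as in the text.

```python
from statistics import NormalDist

z = 2.37  # test statistic from the meat example, rounded as in the text

# Right-tailed p-value: area under the standard normal curve to the right of z.
p_value = 1 - NormalDist().cdf(z)
print(f"p-value = {p_value:.4f}")

# The same p-value leads to different conclusions at different levels.
decisions = {}
for alpha in (0.05, 0.001):
    decisions[alpha] = "reject H_0" if p_value <= alpha else "fail to reject H_0"
    print(f"alpha = {alpha}: {decisions[alpha]}")
```

The p-value itself does not depend on \alpha; only the decision does, which is why the level should be fixed before looking at the result.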

The Weight of Chocolate Bars

Imagine you work at a chocolate factory and are responsible for ensuring chocolate bars’ quality. The factory claims that their chocolate bars have an average weight of 100 grams. However, you suspect that some of the machines might be overfilling the bars.

Your null hypothesis (H_0) is that the chocolate bars indeed weigh 100 grams on average, and your alternative hypothesis (H_A) is that they weigh more than 100 grams.

To test this, you randomly sample 30 chocolate bars and find that they have an average weight of \bar{x} = 105 \text{ grams} and a sample standard deviation of s = 2 \text{ grams}.

So, to write everything in math terms, the null hypothesis is H_0: \mu = 100
because I want to reject the claim that on average each chocolate bar weighs 100 grams. I think each bar weighs more than 100 grams on average. Therefore, my alternative hypothesis is H_A: \mu > 100.

Now let us pick our significance level to be \alpha = 0.05. We should pick this before doing any further calculation to avoid biasing the conclusion because, as shown in the previous example, the conclusion can differ at different levels of \alpha.

The test statistic is Z = \frac{\bar{x} - 100}{s / \sqrt{n}} = \frac{105 - 100}{2 / \sqrt{30}} = \frac{5 \times \sqrt{30}}{2} \approx 13.69
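
As a sketch, the test statistic and the corresponding right-tailed p-value can be computed in Python as follows. The variable names are my own, and the p-value formula assumes the statistic is standard normal under H_0.

```python
from math import erfc, sqrt

n = 30         # number of chocolate bars sampled
x_bar = 105.0  # sample mean weight (grams)
s = 2.0        # sample standard deviation (grams)
mu_0 = 100.0   # mean weight under the null hypothesis (grams)

z = (x_bar - mu_0) / (s / sqrt(n))

# erfc keeps precision far out in the tail, where 1 - cdf(z) would round to 0.
p_value = 0.5 * erfc(z / sqrt(2))

print(f"Z = {z:.2f}, p-value = {p_value:.1e}")
```

A test statistic this far into the tail gives a p-value that is positive but vanishingly small, so it falls below any commonly used significance level.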

In Figure 3 above, the orange dotted region is the critical region (rejection region), which contains the values that are unlikely to occur if the null hypothesis is true. The critical value is the value that separates the normal region from the rejection region. Here, by the normal region I mean the region into which the test statistic should fall if H_0 is true. The area of the orange dotted region is the significance level \alpha. (Note that the total area under the curve is 1, just as the probabilities of all events should sum to 1. The right tail, i.e., the orange dotted region, represents the observations that are bigger than expected: the abnormal region.) Here we choose \alpha = 0.05, which for this right-tailed test leads to the critical value z_{0.05} \approx 1.645. So we consider all values that are bigger than z_{0.05} to be abnormal. Note that Z \approx 13.69 > z_{0.05}. Equivalently, in terms of the p-value, with G a standard normal random variable as before, p\text{-value} = P(G > 13.69) is practically 0, which is far below \alpha = 0.05.
Therefore we reject the null hypothesis and conclude that on average each bar weighs more than 100 grams. Okay! That’s good for … the customers – the ones who don’t even know that they are lucky!
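
For a right-tailed test, the critical value is the point of the standard normal curve with area \alpha to its right, i.e., the (1 - \alpha) quantile. The sketch below computes it with NormalDist from Python's standard library. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z = 13.69  # test statistic from the chocolate example

# Right-tailed critical value: the point with area alpha to its right.
critical_value = NormalDist().inv_cdf(1 - alpha)

# Rejecting when z exceeds the critical value matches rejecting when
# the right-tailed p-value is below alpha.
reject = z > critical_value
print(f"critical value = {critical_value:.3f}, reject H_0: {reject}")
# prints: critical value = 1.645, reject H_0: True
```

For comparison, the familiar value 1.96 leaves an area of 0.025 in one tail, which is why it appears in two-sided tests at \alpha = 0.05.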

