Hypothesis Testing - Statistics

 

1. What is Hypothesis Testing and when do we use it?

Hypothesis testing is a part of statistical analysis in which we use sample data to test an assumption made about a population parameter.

It is generally used when we want to compare:

  • a single group with an external standard
  • two or more groups with each other

A Parameter is a number that describes the whole population, whereas a Statistic is a number that describes a sample drawn from that population.
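To make the distinction concrete, here is a minimal sketch using only Python's standard library. The population of heights is simulated and entirely hypothetical; the point is only that the parameter is computed from every member of the population, while the statistic is computed from a sample.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice we rarely have access to the whole population, which is why we estimate the parameter from a statistic and test hypotheses about it.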

2. Terminology used

Null Hypothesis: The Null hypothesis is the statement that there is no statistically significant difference between the populations (or between a population and an external standard).

Alternative Hypothesis: The Alternative hypothesis states that there is a significant difference between the population parameters. The difference can be in either direction (greater or smaller). Basically, it is the opposite of the Null Hypothesis.

Note: H0 must always contain equality (=). Ha always contains a difference (≠, >, <).

For example, if we were to test the equality of the means (µ) of two groups:

for a two-tailed test, we define H0: µ1 = µ2 and Ha: µ1 ≠ µ2

for a one-tailed test, we define H0: µ1 = µ2 and Ha: µ1 > µ2 or Ha: µ1 < µ2

Level of significance: Denoted by alpha or α. It is the probability, fixed in advance, of wrongly rejecting a true Null Hypothesis (a Type I error). For example, if α = 5%, we accept a 5% risk of concluding that a difference exists when there is no actual difference.
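A small simulation can illustrate what α means. The sketch below assumes a two-tailed one-sample z-test with a known population standard deviation; the values of MU_0, SIGMA and N are made up for illustration. Every sample is drawn from a population where the Null Hypothesis is actually true, so each rejection is a false one, and the share of false rejections should land close to α.

```python
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the sketch is reproducible

ALPHA = 0.05
MU_0, SIGMA, N = 100.0, 15.0, 30  # hypothetical null mean, known sigma, sample size
N_EXPERIMENTS = 2_000

# Two-tailed cutoff of the standard normal for the chosen alpha.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

false_rejections = 0
for _ in range(N_EXPERIMENTS):
    # Draw a sample from a population where H0 is actually true.
    sample = [random.gauss(MU_0, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU_0) / (SIGMA / N ** 0.5)
    if abs(z) >= z_crit:
        false_rejections += 1

type_i_rate = false_rejections / N_EXPERIMENTS
print(f"observed false-rejection rate: {type_i_rate:.3f} (alpha = {ALPHA})")
```

The observed rate will not equal α exactly because it is itself estimated from a finite number of experiments.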

Critical Value: Denoted by C, it is the cutoff in the test statistic's distribution beyond which we reject the Null Hypothesis. The test statistic is compared against it.
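As a rough illustration, the critical value can be read off the distribution once α is chosen. The sketch below assumes the test statistic follows a standard normal distribution (a z-test); other tests use other distributions. The helper name z_critical is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha, tails=2):
    """Critical value of the standard normal for a given alpha.

    tails=2 splits alpha across both tails;
    tails=1 puts all of alpha in the upper tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)

print(round(z_critical(0.05, tails=2), 3))  # about 1.96
print(round(z_critical(0.05, tails=1), 3))  # about 1.645
```

Note that a smaller α pushes the critical value further into the tail, so stronger evidence is needed before rejecting the Null Hypothesis.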

Test Statistic: It is denoted by t, and its formula depends on the test that we run. It is the value, computed from the sample, that decides whether we reject or fail to reject the Null Hypothesis.

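p-value: It is the probability, assuming the Null Hypothesis is true, of obtaining a result at least as extreme as the observed test statistic. Equivalently, it is the proportion of samples drawn under the Null Hypothesis that would be at least that extreme. It is denoted by the letter p.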
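What counts as "extreme" depends on the Alternative Hypothesis. The following sketch assumes a z statistic with a standard normal distribution under the Null Hypothesis; the function name p_value_from_z and the value z = 1.8 are illustrative only.

```python
from statistics import NormalDist

def p_value_from_z(z, alternative="two-sided"):
    """Return the p-value for a z statistic under a standard normal null.

    alternative: "two-sided" (Ha: mu1 != mu2),
                 "greater"   (Ha: mu1 >  mu2),
                 "less"      (Ha: mu1 <  mu2).
    """
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Both tails count as extreme.
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":
        # Only the upper tail counts as extreme.
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        # Only the lower tail counts as extreme.
        return std_normal.cdf(z)
    raise ValueError(f"unknown alternative: {alternative!r}")

z = 1.8  # illustrative test statistic
for alt in ("two-sided", "greater", "less"):
    print(alt, round(p_value_from_z(z, alt), 4))
```

For the same positive test statistic, the two-sided p-value is twice the upper-tailed one, which is why the choice between a one-tailed and a two-tailed test should be made before looking at the data.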

3. Steps of Hypothesis testing

  1. Start with specifying Null and Alternative Hypotheses about a population parameter
  2. Set the level of significance (α)
  3. Collect sample data and calculate the Test Statistic and p-value by running a Hypothesis test suited to our data
  4. Draw a conclusion: Reject or Fail to Reject the Null Hypothesis
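The four steps can be sketched end to end as follows. This is only an illustration under stated assumptions: a two-tailed one-sample z-test, a known population standard deviation, and made-up sample values. The right test for real data depends on the data itself.

```python
from statistics import NormalDist, mean

# Step 1: hypotheses about the population mean (two-tailed).
#   H0: mu = 50    Ha: mu != 50
MU_0 = 50.0
SIGMA = 4.0  # assumed known population standard deviation

# Step 2: level of significance.
ALPHA = 0.05

# Step 3: sample data (made up), test statistic and p-value.
sample = [52.1, 49.8, 53.4, 51.0, 50.6, 54.2, 48.9, 52.7, 51.5, 53.0]
n = len(sample)
z = (mean(sample) - MU_0) / (SIGMA / n ** 0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 4: conclusion.
decision = "Reject H0" if p_value <= ALPHA else "Fail to reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f} -> {decision}")
```

With these made-up numbers the p-value is larger than α, so we fail to reject the Null Hypothesis; that is not the same as proving it true.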

4. Decision Rules

The two methods of concluding a Hypothesis test are the Test-statistic method and the p-value method.

In both methods, we start assuming the Null Hypothesis to be true, and then we reject the Null hypothesis if we find enough evidence.

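The decision rule for the Test-statistic method (stated for an upper-tailed test; for a two-tailed test, compare the absolute value of the test statistic instead):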

  • If test statistic < critical value: Fail to reject the null hypothesis.
  • If test statistic >= critical value: Reject the null hypothesis.

The decision rule for the p-value method:

  • If p-value > alpha: Fail to reject the null hypothesis (i.e. not significant result).
  • If p-value <= alpha: Reject the null hypothesis (i.e. significant result).
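Both decision rules lead to the same conclusion, which a short sketch can confirm. It assumes a two-tailed z-test, so the absolute value of the test statistic is compared with the critical value; the function names are hypothetical.

```python
from statistics import NormalDist

def decide_by_critical_value(z, alpha):
    """Two-tailed rule: reject H0 when |z| reaches the critical value."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return "reject H0" if abs(z) >= z_crit else "fail to reject H0"

def decide_by_p_value(z, alpha):
    """Two-tailed rule: reject H0 when the p-value is at most alpha."""
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Illustrative test statistics on both sides of the cutoff.
for z in (0.5, 1.5, 2.5, -3.0):
    print(z, decide_by_critical_value(z, 0.05), "|", decide_by_p_value(z, 0.05))
```

The two methods agree because the p-value drops below α exactly when the test statistic passes the critical value; the p-value simply carries more information about how strong the evidence is.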

 
