The p value is often used in applications to decide whether some improvement is significant or whether it was just random chance. Loosely speaking, it states how likely it is (under the assumption of a hypothesis $H_0$) to get results at least as extreme as the ones observed.
This is interesting for studies in medicine and similar scenarios. You have a drug and a placebo. You want to figure out if the drug is better than the placebo.
Hypothesis testing
In statistics, hypothesis testing works as follows:
- Define a statistical model: This means you have a sample space $\mathcal{X}$ and an assumption about the distribution of the data. For example, $\mathcal{X} = \mathbb{R}^n$ where $n \in \mathbb{N}$ is your sample size and $X_1, \dots, X_n \overset{\text{iid}}{\sim} \mathcal{N}(\mu, \sigma^2)$.
- Define a hypothesis and an alternative: For example, $H_0: \mu = 100$ and $H_1: \mu > 100$ might be the hypotheses if you want to check whether a drug increases the IQ.
- Control errors: You can make two errors. Either $H_0$ is true and you reject it (type I error), or $H_1$ is true but you don't reject $H_0$ (type II error). In statistics, the first error is usually the one that is controlled: the test is constructed such that the probability of the first error is at most some $\alpha \in (0, 1)$. Usually, $\alpha = 0.05$ or $\alpha = 0.01$ or even lower.
- Test statistic: You have a value which indicates something about the parameters in the hypotheses. For example, $T(X_1, \dots, X_n) = \frac{1}{n} \sum_{i=1}^{n} X_i$.
- Test statistic distribution: $T$ itself is a random variable and you can calculate its distribution (e.g. $T \overset{H_0}{\sim} \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right)$).
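- Calculate test decision: Hence you can calculate a $c \in \mathbb{R}$ such that $P_{H_0}(H_0 \text{ is rejected}) = P_{H_0}(T \geq c) \leq \alpha$. The rejection region does not have to be of the form $T \geq c$, but for a one-sided alternative like $H_1: \mu > 100$ it is.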
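The steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a one-sided test of $H_0: \mu = 100$ against $H_1: \mu > 100$ with a known standard deviation. The values `sigma=15`, `n=25` and the helper names `critical_value` and `reject_h0` are made up for this example.

```python
import math
from statistics import NormalDist


def critical_value(mu0: float, sigma: float, n: int, alpha: float) -> float:
    """Return c such that P_H0(T >= c) = alpha for the sample mean T."""
    # Under H0 the sample mean is distributed as N(mu0, sigma^2 / n).
    null_dist = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n))
    return null_dist.inv_cdf(1 - alpha)


def reject_h0(sample: list[float], mu0: float, sigma: float, alpha: float) -> bool:
    """Reject H0 if the observed sample mean is at least the critical value."""
    t_observed = sum(sample) / len(sample)
    return t_observed >= critical_value(mu0, sigma, len(sample), alpha)


# Hypothetical numbers: IQ-like scale with known sigma = 15 and n = 25.
c = critical_value(mu0=100, sigma=15, n=25, alpha=0.05)
print(f"Reject H0 if the sample mean is at least {c:.2f}")
```

With these assumed numbers the critical value is a bit below 105, so a sample mean of 106 would lead to rejecting $H_0$, while a sample mean of 101 would not.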
The p value
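Now note that you could also do it the other way round. Instead of fixing $\alpha$ first, you could take the observed sample $x$ and calculate

$$p^* = P_{H_0}(T(X) \geq T(x))$$

If $p^* \leq \alpha$, then $H_0$ can be rejected at level $\alpha$.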
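As a rough sketch, the p value for the sample mean can be computed like this. It assumes the same one-sided setting ($H_0: \mu = 100$, known standard deviation). The sample values, `sigma=15` and the function name `p_value` are invented purely for illustration.

```python
import math
from statistics import NormalDist


def p_value(sample: list[float], mu0: float, sigma: float) -> float:
    """One-sided p value P_H0(T(X) >= T(x)) for the sample mean T."""
    n = len(sample)
    t_observed = sum(sample) / n
    # Under H0 the sample mean is distributed as N(mu0, sigma^2 / n).
    null_dist = NormalDist(mu=mu0, sigma=sigma / math.sqrt(n))
    return 1 - null_dist.cdf(t_observed)


# Hypothetical measurements (n = 9, sample mean 112).
sample = [112, 98, 125, 107, 119, 103, 131, 95, 118]
p = p_value(sample, mu0=100, sigma=15)
print(f"p value: {p:.4f}")  # about 0.0082
```

Here the p value is below both $\alpha = 0.05$ and $\alpha = 0.01$, so $H_0$ would be rejected at either level.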
Interesting statements
I just came across the following statements, which I think are interesting enough to share. All of them are wrong.
The following was taken from a German statistics exam by Dr. Klar (KIT, WS 2013/2014).
Assume for the following that an experiment resulted in a p value of 0.01.
1. $H_0$ is certainly false.
2. $H_0$ is false with probability 0.01.
3. $H_1$ is certainly correct.
4. You can calculate the probability that $H_1$ is correct with the p value.
5. If one decides to reject $H_0$, then the p value is the probability of making the wrong decision.
6. The experimental result is reliable, meaning that if the experiment is repeated often, one would get a significant result in 99% of the cases.
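I'm not too sure whether the fifth statement (the p value as the probability of making the wrong decision when rejecting $H_0$) is really wrong.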
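To get a feeling for why the last statement in the list (a significant result in 99% of repetitions) does not follow, here is a small sketch. It rests on a strong assumption that is not part of the exam question: the test is a one-sided z-test with known variance, the repetition uses the same sample size, and the true effect is exactly as large as the observed one. The function name `replication_probability` is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution N(0, 1)


def replication_probability(p_observed: float, alpha: float) -> float:
    """Probability that a repetition is significant at level alpha.

    ASSUMPTION: the true effect equals the observed effect, so the
    standardized test statistic of a repetition is N(z_observed, 1).
    """
    z_observed = std_normal.inv_cdf(1 - p_observed)
    z_critical = std_normal.inv_cdf(1 - alpha)
    # P(Z >= z_critical) for Z ~ N(z_observed, 1)
    return 1 - std_normal.cdf(z_critical - z_observed)


prob = replication_probability(p_observed=0.01, alpha=0.05)
print(f"Chance of a significant repetition at alpha = 0.05: {prob:.2f}")
```

Under these assumptions the chance of a significant repetition at $\alpha = 0.05$ is roughly 75%, and at $\alpha = 0.01$ it is exactly 50%. Both are far from 99%, and the real value depends on the unknown true effect.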