PSY400 Research Methods and Analysis 4
Dr Joshua Adie
University of the Sunshine Coast | CRICOS Provider Number: 01595D
Week 12 Bayesian Approaches
Reading - covers undertaking these techniques in SPSS
Workshop Content Overview
- Alternatives to NHST - Bayesian approaches
- Bayes' Theorem
- Problems with NHST (frequentist) statistics
- The rationality of the Bayesian approach
- Weaknesses of the Bayesian approach
1. Probability
- Thomas Bayes' paper (published 1763) set out an equality among several probabilities, now known as Bayes' Theorem.
- Statistical inference - drawing conclusions from known sample data about a population for which we do not have data (e.g., in our sample of 1000 voters we find that 55% intend to vote for X; but what % of the total population of voters will vote for X?)
- NHST approach
  - Set up an H0 that the population percentage = 55%
  - Use the sample data to test whether H0 can be rejected
- Bayesian approach - make probability statements directly about the population percentage
Statistical inference - extrapolating from the known sample to a larger (unknown) population
- We can never be certain that the extrapolated conclusions are correct
- We deal with this uncertainty by using probabilities (significance level, confidence level, Bayesian prior and posterior probabilities)
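As a rough illustration of what a Bayesian statement about the population percentage can look like for the voter example, the sketch below approximates a posterior distribution on a grid. It assumes a uniform prior over the population proportion (an assumption added here, not something the slides specify) and that 550 of the 1000 sampled voters intend to vote for X. The function name `grid_posterior` and the grid size are hypothetical choices.

```python
import math

def grid_posterior(successes, n, grid_size=1001):
    """Grid approximation to the posterior of a proportion.

    Assumes a uniform prior and 0 < successes < n (assumptions for this sketch).
    """
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    log_like = []
    for t in thetas:
        if t == 0.0 or t == 1.0:
            # Proportions of exactly 0 or 1 are impossible when the data are mixed.
            log_like.append(float("-inf"))
        else:
            # Binomial log-likelihood (the constant term cancels on normalising).
            log_like.append(successes * math.log(t) + (n - successes) * math.log(1 - t))
    peak = max(log_like)
    weights = [math.exp(ll - peak) for ll in log_like]  # rescaled to avoid underflow
    total = sum(weights)
    return thetas, [w / total for w in weights]

thetas, post = grid_posterior(550, 1000)

# Posterior probability that more than half the population will vote for X.
prob_majority = sum(p for t, p in zip(thetas, post) if t > 0.5)
# Posterior mean of the population proportion.
post_mean = sum(t * p for t, p in zip(thetas, post))

print(round(post_mean, 3), round(prob_majority, 4))
```

Under these assumptions the posterior mean sits close to the sample proportion, and the output is a direct probability statement about the population percentage, which is the kind of statement NHST does not provide.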
Statistical Inference Choices
The choice between classical statistical inference (hypothesis testing and confidence interval estimation) and Bayesian statistics depends on how probabilities are interpreted:
- Empirical probabilities - probabilities seen as long-run relative frequencies or proportions (aka frequentist probabilities or objective probabilities).
  - An experiment is repeated many times, and each time one of several events occurs. As the number of repetitions becomes larger, the proportion of times a particular event occurs approaches the probability of that event.
  - e.g., a die-tossing experiment repeated a large number of times. The event that a 1 or 2 comes up occurs 1/3 of the time, and this leads us to say that the probability = 0.333. This view led to the Neyman-Pearson system of classical statistical inference with hypothesis testing and confidence interval estimation.
  - The significance level for the test of a statistical null hypothesis is an example of a relative frequency. If the null hypothesis is true and we study many different samples from the same population, the sample test statistic will still fall in the rejection region of the test for some of the samples (i.e., as determined by the α level). These are the samples for which we will erroneously reject the null hypothesis, and over the long run the proportion of times this happens is the significance level of the test (e.g., α = .05).
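The long-run idea in the die-tossing example can be sketched with a small simulation. This is only an illustration under the assumption of a fair six-sided die; the function name `proportion_one_or_two` and the seed are hypothetical choices.

```python
import random

random.seed(12345)  # fixed seed so the run is repeatable

def proportion_one_or_two(n_tosses):
    """Proportion of simulated fair-die tosses that come up 1 or 2."""
    hits = sum(1 for _ in range(n_tosses) if random.randint(1, 6) <= 2)
    return hits / n_tosses

# The proportion is noisy for small n and settles near 1/3 as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, round(proportion_one_or_two(n), 4))
```

With few tosses the proportion can be far from 1/3; it is only over many repetitions that the relative frequency approaches the probability.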
2. Subjective Probabilities
Subjective probabilities - the subjective view takes probabilities as personal measures of uncertainty, based on the available evidence. The subjectivist would also say that the probability = 0.333 that a die toss comes up 1 or 2, because of what is known about the physics of die tossing, the even distribution of mass in the die, and because in past tosses the proportion of 1s and 2s is one-third.
- But subjective probabilities can also be assigned to events that cannot be repeated.
  - e.g., what is the probability that a first-term Prime Minister (e.g., Anthony Albanese) will run for election again, based on the current political climate?
  - There is no empirical, experimental basis for the probability that the Prime Minister will run again, so it cannot be computed using classical statistical inference.
- Subjective probabilities are more general - they can be used to measure the uncertainty we have about single, unique events.
- At the same time, subjective probabilities are not unique. Two different people may have different information available to them about the Prime Minister's intention (e.g., the PM's partner vs a random person on the street), so their subjective probabilities are likely to differ on the question of whether the Prime Minister will run for a second term.
1.1 Classical Statistical Inference
Almost all statistical inference in Psychology makes use of hypothesis testing of one form or another. Classical statistical inference has some problematic features.
Use of Tail Probabilities
- Study whether a population is evenly divided between males and females: H0: probability = 0.5 that a randomly chosen person is female.
- Draw a random sample of 10 people and use a two-sided test.
  - probability that the sample contains 0 females (and 10 males) = 1/1024 | probability of 1 female (and 9 males) = 10/1024
  - probability of 9 females (and 1 male) = 10/1024 | probability that all 10 people are female = 1/1024
- If we reject H0 for 0, 1, 9, or 10 females, the significance level of the test (the sum of these probabilities, 22/1024) ≈ 0.02.
- Run the experiment and draw 10 people at random - find 9 females among the 10 people.
  - We report that H0 is rejected (p ≈ 0.02).
- But the significance level also includes the probabilities for 0, 1, or 10 females, and these are data that did not occur.
- Classical statistical inference theory is based on probabilities as long-run relative frequencies. The significance level tells us what will happen in the long run, if we draw a large number of samples. But our experiment uses only 1 sample.
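The tail probabilities in this example can be checked with exact binomial arithmetic. The sketch below uses exact fractions; the function name `prob_females` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction
from math import comb

def prob_females(k, n=10, p=Fraction(1, 2)):
    """Exact binomial probability of k females in a sample of n under H0."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

rejection_region = [0, 1, 9, 10]

# Fraction reduces automatically, so 10/1024 is displayed as 5/512.
for k in rejection_region:
    print(k, prob_females(k))

# Two-sided significance level: the sum over the whole rejection region.
alpha = sum(prob_females(k) for k in rejection_region)
print(alpha, float(alpha))
```

The sum is 22/1024, about 0.0215, of which only the 10/1024 for 9 females corresponds to the outcome that was actually observed.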
1.1 Classical Statistical Inference Interpretation
Interpretation of Confidence Intervals
- Classical statistical inference theory states that in the long run with data from many samples and therefore many
confidence intervals, a certain proportion of these intervals will contain the true parameter value while the remaining
intervals will not.
- Thus, the theory predicts what will happen in the long run, before any data are collected.
- The one confidence interval from our sample either contains or does not contain the true parameter value.
  - We do not know whether our single confidence interval belongs to the large set of confidence intervals that do contain the population parameter, or to the small set of intervals that do not.
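The long-run reading of confidence intervals can be sketched by simulating many samples and counting how many intervals contain the true value. The population mean, standard deviation, sample size, and the use of a known-SD z interval are all hypothetical assumptions chosen to keep the example short.

```python
import random
import statistics

random.seed(2024)  # fixed seed so the run is repeatable

TRUE_MEAN = 100.0  # hypothetical population mean
TRUE_SD = 15.0     # hypothetical population SD, treated as known
N = 25             # hypothetical sample size
Z_95 = 1.96        # critical value for a 95% interval

def interval_covers_truth():
    """Draw one sample, build a 95% z interval, and check whether it contains TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    half_width = Z_95 * TRUE_SD / N ** 0.5
    return mean - half_width <= TRUE_MEAN <= mean + half_width

n_intervals = 20_000
coverage = sum(interval_covers_truth() for _ in range(n_intervals)) / n_intervals
print(round(coverage, 3))
```

Across the whole set the coverage is close to 95%, but each individual call returns only True or False, mirroring the point that a single interval either contains the parameter or does not.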
1.2 Problems with NHST (Frequentist) Statistics
- The frequentist view of statistics starts from the assumption that probabilities are long-run relative frequencies.
  - A long-run relative frequency requires a collective (an indefinitely large series of events); the probability of some property (q) occurring is then the proportion of events in the collective with property q.
  - e.g., the probability of having black hair is the proportion of people in a well-defined collective (e.g., people living in Australia) who have black hair - the probability applies to the whole collective. But a person may belong to two different collectives that have different probabilities.
- Long-run relative frequencies do not apply to the truth of individual theories because theories are not collectives - theories are just true or false.
  - Given both a theory and a decision procedure, we can determine a long-run relative frequency with which certain data might be obtained, P(data | theory and decision procedure) [a vertical line - | - is read as "given"]
  - e.g., given a null hypothesis and a procedure that includes rejection if the t value exceeds 2, we can work out the frequency with which we would reject the null hypothesis.
- The logic of frequentist statistics is to adopt decision procedures with known long-term error rates (of false positives and false negatives) and then control those errors at acceptable levels. The error rate for false positives is called alpha (α), the significance level (typically .05), and the error rate for false negatives is called beta (β), where β = 1 - power.
- Thus, setting significance and power controls long-run error rates. An error rate can be calculated for decision procedures, not for individual experiments. An individual experiment is a one-time event, so it does not constitute a long-run set of events, but a decision procedure can in principle be considered to apply over an indefinitely long run of experiments.
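To make the link between a decision procedure and its long-run error rates concrete, the sketch below reuses a simple binomial rule (sample 10 people, reject H0 if 0, 1, 9, or 10 are female) instead of the t-value rule mentioned above. The alternative proportion of 0.8 used for the power calculation is a hypothetical value, and the helper names are illustrative only.

```python
import random
from fractions import Fraction
from math import comb

N = 10
REJECT = {0, 1, 9, 10}  # decision rule: reject H0 for these counts of females

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def rejection_probability(p_female):
    """Probability that the decision rule rejects H0 when the true proportion is p_female."""
    return sum(binom_pmf(k, N, p_female) for k in REJECT)

alpha = rejection_probability(Fraction(1, 2))  # false-positive rate when H0 is true
power = rejection_probability(Fraction(4, 5))  # hypothetical true proportion of 0.8
beta = 1 - power                               # false-negative rate

random.seed(7)  # fixed seed so the run is repeatable

def one_experiment_rejects(p_female):
    """Simulate one experiment of N people and apply the decision rule."""
    females = sum(random.random() < p_female for _ in range(N))
    return females in REJECT

# Long-run proportion of rejections when H0 is true.
runs = 50_000
long_run_rate = sum(one_experiment_rejects(0.5) for _ in range(runs)) / runs

print(float(alpha), float(power), float(beta), long_run_rate)
```

The simulated rejection rate approaches alpha only across many repetitions of the procedure, while any single simulated experiment simply rejects or does not. With this small sample the power against the hypothetical alternative is low (about 0.38), so beta is large.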
2. Bayes' Theorem
Probabilities of Data Given Theory vs. Theory Given Data
- The probability of a theory being true given data can be symbolized as: P(theory | data)
- This is the inverse of what frequentist statistics (NHST) provides - P(data | theory)
- The inference that P(data | theory) directly indicates the inverse, P(theory | data), cannot logically be made, but it commonly is.
  - e.g., the probability of being dead given that a shark has bitten one's head clean off: P(dead | head bitten clean off by shark) = 1. But the probability that a shark has bitten one's head clean off given that one is dead, P(head bitten off by shark | dead), is very close to 0, because most people die of other causes.
- This also applies to null hypotheses:
  - The significance value, a form of P(data | theory), does not by itself indicate the probability of the null, P(theory | data).
  - The p value obtained does not indicate the probability of the null.
- Example: you have a coin that is heavily weighted on one side so that it will land "heads" 60% of the time (because you like to cheat). Your friend wishes to test the null hypothesis that it is a fair coin at the 5% significance level. He throws it 5 times and gets 3 heads and 2 tails.
  - Assume H0 is heads = 50% for each toss; the probability of 3+ heads in 5 tosses is then 0.5. This is obviously not significant at the 5% level, so the null hypothesis is not rejected. BUT it would be incorrect to conclude that H0 has a 50% probability of being true (based on the p value), or a 95% probability (based on the significance level used).
  - Because the coin is weighted, H0 is false, and obtaining 3 heads out of 5 throws should not change your mind about that. You rationally do not assign H0 a probability of 50% (or 95%). When we directly infer a probability of H0 from a p value (or significance level), we violate the logic of frequentist statistics - these statistics do not directly test the probability of theories and hypotheses.
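Bayes' theorem states that P(theory | data) = P(data | theory) x P(theory) / P(data), so P(theory | data) depends on the prior probability P(theory) as well as on P(data | theory). The sketch below applies this to the coin example under a simplifying assumption that is not in the slides: only two candidate hypotheses are considered, a fair coin (heads = 0.5) and the weighted coin (heads = 0.6). The prior values and the helper names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of k heads in n tosses with heads probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# One-sided p value: P(3 or more heads in 5 tosses | fair coin).
p_value = sum(binom_pmf(k, 5, 0.5) for k in range(3, 6))

# Likelihood of the observed data (exactly 3 heads in 5) under each hypothesis.
like_fair = binom_pmf(3, 5, 0.5)
like_weighted = binom_pmf(3, 5, 0.6)

def posterior_fair(prior_fair):
    """P(fair | data) from Bayes' theorem, assuming only two hypotheses are possible."""
    numerator = like_fair * prior_fair
    evidence = numerator + like_weighted * (1 - prior_fair)
    return numerator / evidence

print("p value:", p_value)
# Hypothetical priors: an undecided friend (0.5) and the cheat who knows the coin (0.01).
for prior in (0.5, 0.01):
    print(prior, round(posterior_fair(prior), 4))
```

The p value is 0.5 regardless of what anyone believed beforehand, whereas the posterior probability of the fair-coin hypothesis moves with the prior. For someone who already knows the coin is weighted it stays very small, which is why the p value cannot be read as the probability of the null.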