SL 4.8 — Binomial Distribution

Content:
• Definition of the binomial distribution
• Probability formula P(X = k)
• Mean and variance of X ~ Bin(n, p)
• Real-life modelling situations

Guidance, clarification & links:
• Appropriate conditions for binomial model
• Use of technology to find binomial probabilities
• Mean = n p and Variance = n p (1 − p) (no formal proof needed)
• Linked to expected number of occurrences from SL 4.5

1. What is a Binomial Distribution?

A random variable X follows a binomial distribution when we count the number of “successes” in a fixed number of
repeated trials, each trial having only two possible outcomes (success / failure).
We write this as X ~ Bin(n, p), where:

n = number of trials (fixed in advance)
p = probability of success on each trial (constant)
Each trial is independent of the others
Each trial has only two outcomes: success (with probability p) or failure (with probability 1 − p)
X counts how many successes occur in n trials → X takes values 0, 1, 2, …, n

If any of these conditions is clearly broken (e.g. probability changes, trials not independent), the binomial model may not
be appropriate and another distribution should be considered.
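
To see how a broken condition changes the answer, the sketch below compares drawing items with replacement (the binomial conditions hold) against drawing without replacement (the probability of success changes from draw to draw). The batch of 10 items with 3 defectives and the 4 draws are made-up numbers chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical batch: 10 items, 3 defective. We draw 4 items and ask
# for the probability that none of them is defective.
total_items, defective, draws = 10, 3, 4
good = total_items - defective

# With replacement: each draw is independent with constant
# P(not defective) = 7/10, so X ~ Bin(4, 0.3) is a valid model.
p_none_with = Fraction(good, total_items) ** draws

# Without replacement: the chance of a good item changes after each draw
# (7/10, then 6/9, then 5/8, then 4/7), so the binomial model does not apply.
p_none_without = Fraction(1)
for i in range(draws):
    p_none_without *= Fraction(good - i, total_items - i)

print(float(p_none_with))     # 0.2401
print(float(p_none_without))  # 1/6, about 0.1667
```

The two answers differ noticeably here because the batch is small; the binomial value would be the wrong one to quote for the without-replacement situation.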

🌍 Real-World Connection:
Binomial models appear in:

Quality control: number of defective items in a batch
Medicine: number of patients responding positively to a treatment
Marketing: number of customers who buy after receiving an advert
Sports: number of successful penalty kicks out of n attempts

2. Binomial Probability Formula

For X ~ Bin(n, p), the probability that X takes the value k (i.e. exactly k successes) is

P(X = k) = C(n, k) p^k (1 − p)^{n − k} for k = 0, 1, 2, …, n

C(n, k) (also written nCk) is the number of different ways to choose which k trials are successes.
p^k gives the probability of those k successes.
(1 − p)^{n − k} gives the probability of the remaining n − k failures.
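
The formula can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name binomial_pmf and the values n = 5, p = 0.3 are illustrative choices, not part of the syllabus.

```python
from math import comb

def binomial_pmf(n: int, p: float, k: int) -> float:
    """Return P(X = k) for X ~ Bin(n, p)."""
    if not 0 <= k <= n:
        return 0.0  # X cannot take values outside 0, 1, ..., n
    # ways to place the k successes * P(k successes) * P(n - k failures)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Sanity check: the probabilities for k = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(5, 0.3, k) for k in range(6))
```

Summing over every possible value of k is a quick way to catch a wrong exponent or a swapped p and 1 − p.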

In IB exams you do not need to derive this formula, but you must know how to:

Write down the correct expression for P(X = k)
Interpret “at least k” and “no fewer than k” as P(X ≥ k), and “at most k” and “no more than k” as P(X ≤ k), then evaluate using a sum of binomial terms or a GDC


Example 1 – Exact probability

The probability that a machine produces a defective item is 0.1. In a batch of n = 8 items, let X be the number of defectives.
Find P(X = 2).

P(X = 2) = C(8, 2) (0.1)² (0.9)⁶ = 28 × 0.01 × 0.531441 = 0.14880… ≈ 0.149 (3 s.f.).
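
One way to sanity-check an answer like this is to simulate many batches and count how often exactly 2 defectives occur. The sketch below is an illustration only: the function name simulate_exactly_k, the seed, and the number of simulated batches are all assumptions made for the example.

```python
import random

def simulate_exactly_k(n, p, k, batches, seed=0):
    """Estimate P(X = k) for X ~ Bin(n, p) by repeated simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(batches):
        # Each item is defective independently with probability p.
        defectives = sum(rng.random() < p for _ in range(n))
        if defectives == k:
            hits += 1
    return hits / batches

# Example 1: batches of 8 items, each defective with probability 0.1.
estimate = simulate_exactly_k(n=8, p=0.1, k=2, batches=100_000)
print(estimate)
```

The estimate should land close to the value given by the formula, with small differences due to random variation.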

🟢 GDC Tip (Binomial):
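Many graphic display calculators have functions like binompdf(n, p, k) and binomcdf(n, p, k); the exact names and the order of the arguments vary by model, so check your own GDC.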

Use binompdf for P(X = k).
Use binomcdf for P(X ≤ k); for P(X ≥ k), use 1 − P(X ≤ k − 1).
Always define X (e.g. “Let X be the number of successes …”) before using these commands.
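
The same logic can be written out in Python to see what the two commands do. The functions below are stand-ins written for this illustration (they borrow the calculator command names but are not calculator code), applied to the Bin(8, 0.1) model from Example 1.

```python
from math import comb

def binompdf(n, p, k):
    """P(X = k) for X ~ Bin(n, p); imitates the calculator command."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomcdf(n, p, k):
    """P(X <= k): add up the terms for 0, 1, ..., k."""
    return sum(binompdf(n, p, j) for j in range(k + 1))

# Let X be the number of defectives in a batch of 8, X ~ Bin(8, 0.1).
n, p = 8, 0.1
at_most_2 = binomcdf(n, p, 2)        # P(X <= 2)
at_least_2 = 1 - binomcdf(n, p, 1)   # P(X >= 2) = 1 - P(X <= 1)
fewer_than_2 = binomcdf(n, p, 1)     # P(X < 2)  = P(X <= 1)
```

Note the shift from k to k − 1 in the “at least” case: forgetting it is the usual source of off-by-one errors.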

🧠 Examiner Tip:
Marks are often lost by:

Not stating X ~ Bin(n, p) before calculating probabilities
Using the wrong n or p (e.g. confusing success with failure)
Mistreating “at least / at most” — write the probability sum explicitly or show the GDC command clearly

3. Mean and Variance of X ~ Bin(n, p)

For a binomial random variable X ~ Bin(n, p):

Mean (expected value): E(X) = n p
Variance: Var(X) = n p (1 − p)
Standard deviation: σ = √[n p (1 − p)]

These results link directly to SL 4.5: if the probability of success is p and there are n trials, the expected number of successes is n p.
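
Although no formal proof is needed, the formulas can be checked numerically from the definitions E(X) = Σ k·P(X = k) and Var(X) = E(X²) − [E(X)]². The sketch below does this for the illustrative values n = 10 and p = 0.3 (chosen for this example only).

```python
from math import comb, isclose

def pmf(n, p, k):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3  # illustrative values

# Mean and variance computed directly from the probability distribution.
mean = sum(k * pmf(n, p, k) for k in range(n + 1))
variance = sum(k**2 * pmf(n, p, k) for k in range(n + 1)) - mean**2

print(isclose(mean, n * p))                # True
print(isclose(variance, n * p * (1 - p)))  # True
```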

Example 2 – Mean and variance

A basketball player scores a free throw with probability 0.75.
She takes 20 shots in a practice session. Let X be the number of successful shots (assume independence).

Model: X ~ Bin(20, 0.75)
E(X) = n p = 20 × 0.75 = 15 → on average she scores 15 shots.
Var(X) = n p (1 − p) = 20 × 0.75 × 0.25 = 3.75
σ = √3.75 ≈ 1.94 → typical deviation from the mean is about 2 shots.
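
The numbers in Example 2 can be reproduced in a few lines. As an extra illustration that goes slightly beyond the example, the sketch also computes the probability that X falls within 2 shots of the mean, which gives a feel for what “typical deviation” means here.

```python
from math import comb, sqrt

n, p = 20, 0.75  # Example 2: X ~ Bin(20, 0.75)

mean = n * p                # 15.0
variance = n * p * (1 - p)  # 3.75
sd = sqrt(variance)         # about 1.94

# Probability that X lands within 2 shots of the mean: P(13 <= X <= 17)
within_two = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(13, 18))
print(mean, variance, round(sd, 2), within_two)
```

The last value comes out at roughly 0.8, so in about four sessions out of five she would be expected to score between 13 and 17 shots.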

4. When is a Binomial Model Appropriate?

When reading a word problem, check:

Is there a fixed number of trials n?
Does each trial have only two outcomes (success / failure)?
Is the probability of success constant between trials?
Are outcomes of trials independent of each other?

If all answers are “yes”, then a binomial model is usually reasonable.

📝 Paper Strategy:
In explanation questions (“justify the use of a binomial model”), list the four key conditions
in short sentences. Examiners look for explicit reference to fixed n, independence, constant p, and two outcomes.

📐 Mathematical Connections (Pascal / Yang Hui):
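Binomial coefficients C(n, k) are the entries of Pascal’s triangle: C(n, k) is the entry in row n, position k (counting both from 0), so row n lists the coefficients used in the probabilities of Bin(n, p).
Historically, similar triangular arrays were studied by the Chinese mathematician Yang Hui in the 13th century, roughly four centuries before Pascal.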
This highlights how mathematical ideas develop in parallel across different cultures.

🔍 TOK Perspective:

How do we choose between different probability models (binomial vs. normal vs. Poisson)?
To what extent is a model “true”, and to what extent is it only a convenient approximation?
Does assigning a probability to rare events (e.g. system failures) change how society responds to risk?

🌐 Enrichment / EE Ideas:

Hypothesis testing using binomial models (e.g. testing if a coin or die is biased)
Comparing theoretical binomial predictions with experimental data
Investigating real-world data sets where binomial or related models appear

Mastering SL 4.8 means you can recognise when a situation is binomial, write X ~ Bin(n, p), use technology or formulae
to calculate probabilities, and interpret the mean and variance in context.