AHL 4.13 — Non-Linear Regression & Model Evaluation

Key Concept | Meaning / Formula
Non-linear Regression | Regression where the model is not a straight line; it may be quadratic, cubic, exponential, power or sinusoidal.
Least Squares Method | Chooses model parameters to minimise the sum of squared residuals, SSres.
Residual | Difference between the observed value and the model’s predicted value (observed − predicted).
Coefficient of Determination (R²) | R² = 1 − SSres/SStot, where SStot = Σ(observed − mean)². Measures the proportion of the variation explained by the model.

📌 Understanding Non-Linear Regression

  • Non-linear regression is used when data clearly does not follow a straight line — e.g., growth curves, oscillations, or power laws.
  • Your GDC will automatically fit regression curves (quadratic, cubic, exponential, power, logistic, sine). Choose the one that visually matches the scatter plot.
  • For a given model type, the “best-fit” curve is the one whose parameters give the smallest SSres, meaning predicted values match the observed data as closely as possible.
  • Different models may give similar R² values; justify your chosen model by context (growth? periodic motion? decay?).

[Figure: choosing between linear and non-linear regression. Source: Statistics By Jim]

📌 Sum of Squared Residuals (SSres)

  • Residual = observed − predicted. Positive residual → model underestimates; negative residual → model overestimates.
  • SSres = Σ(residual)². Squaring removes the sign and penalises large errors more heavily.
  • A small SSres means the model fits the data tightly; a large SSres means poor fit.
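  • SSres depends on the units and the number of data points, so it can only compare models fitted to the same dataset. Even then, a model with more parameters (e.g., cubic vs quadratic) will usually have a smaller SSres, so SSres alone does not settle which model type is more appropriate.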
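
The residual and SSres definitions above can be checked by hand or with a few lines of code. The sketch below is illustrative only: the observed and predicted values are invented, and the function names are not from any GDC or library.

```python
def residuals(observed, predicted):
    """Residual = observed - predicted, one per data point."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


def ss_res(observed, predicted):
    """Sum of squared residuals."""
    return sum(r ** 2 for r in residuals(observed, predicted))


# Hypothetical values, invented for illustration.
observed = [2.0, 4.1, 8.3, 15.9]
predicted = [2.2, 4.0, 8.0, 16.4]

for y, y_hat, r in zip(observed, predicted, residuals(observed, predicted)):
    # Positive residual: model underestimates; negative: model overestimates.
    verdict = "underestimates" if r > 0 else "overestimates"
    print(f"observed={y}, predicted={y_hat}, residual={r:+.2f} -> model {verdict}")

print(f"SSres = {ss_res(observed, predicted):.2f}")
```

Here the squared residuals are approximately 0.04, 0.01, 0.09 and 0.25, so SSres is approximately 0.39.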

🌍 Real-World Connection

  • Economists model cost curves using power and exponential regressions.
  • Scientists use R² to measure goodness of fit in physics labs and biological experiments.

📌 Understanding R² — Coefficient of Determination

  • R² = 1 − SSres/SStot, where SStot = Σ(observed − mean)² is the total variation of the observed values about their mean. R² measures the proportion of that total variation explained by the model.
  • If SSres = 0, then R² = 1 → perfect fit (rare and usually unrealistic for real data).
  • R² does NOT confirm that the chosen model is appropriate: a misleading model may still have a high R².
  • Compare models using (1) context, (2) realism and (3) residual plot shape, not only R².
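
As a minimal sketch of the formula R² = 1 − SSres/SStot, the function below computes both sums directly. The data values are hypothetical and the function name is an assumption for illustration, not GDC output.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SSres / SStot."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Hypothetical values, invented for illustration.
observed = [3.0, 5.0, 9.0, 15.0]   # mean = 8, SStot = 25 + 9 + 1 + 49 = 84
predicted = [2.0, 6.0, 9.0, 15.0]  # SSres = 1 + 1 + 0 + 0 = 2

print(round(r_squared(observed, predicted), 4))  # 1 - 2/84
print(r_squared(observed, observed))             # perfect fit: SSres = 0
```

With these numbers R² = 1 − 2/84 ≈ 0.976, and feeding the observed values back in as predictions gives SSres = 0 and R² = 1.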

📌 Example Questions

Example 1 — Choosing the Best Non-Linear Model

A dataset shows rapid initial growth that then slows down. Evaluate whether an exponential or power regression is more suitable using:

  • Scatter plot shape (an increasing, concave-down curve suggests a power model with exponent between 0 and 1; exponential growth is concave up).
  • Comparison of R² values.
  • Interpretation in context: many biological systems follow power laws.
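
One way to explore the comparison in Example 1 is sketched below. It rests on several assumptions: the data are invented, both models are fitted by least squares on log-transformed data (a common linearisation, which may not match how a particular GDC fits), and R² is then computed on the original scale, so the values need not agree with calculator output.

```python
import math


def fit_line(xs, ys):
    """Least-squares straight line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Hypothetical data: fast growth at first, then slowing.
xs = [1, 2, 4, 8, 16]
ys = [2.0, 2.9, 4.0, 5.6, 8.1]
ln_ys = [math.log(y) for y in ys]

# Exponential model y = a * e^(b x): ln y is linear in x.
b_exp, ln_a_exp = fit_line(xs, ln_ys)
exp_pred = [math.exp(ln_a_exp) * math.exp(b_exp * x) for x in xs]

# Power model y = a * x^b: ln y is linear in ln x.
b_pow, ln_a_pow = fit_line([math.log(x) for x in xs], ln_ys)
pow_pred = [math.exp(ln_a_pow) * x ** b_pow for x in xs]

r2_exp = r_squared(ys, exp_pred)
r2_pow = r_squared(ys, pow_pred)
print(f"exponential: b = {b_exp:.3f}, R^2 = {r2_exp:.3f}")
print(f"power:       b = {b_pow:.3f}, R^2 = {r2_pow:.3f}")
```

For this invented dataset the power model has an exponent close to 0.5 and the higher R², which agrees with the concave-down shape. The numerical comparison still needs to be backed by a contextual argument.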

Example 2 — Computing SSres

Given data points and predicted values from a cubic model, compute each residual, square them, and sum to find SSres. Compare with a quadratic model to determine which fits better.
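
The procedure in Example 2 can be sketched as follows. The data points and both sets of coefficients are hypothetical stand-ins for GDC output; they were chosen for easy arithmetic and are not the result of an actual regression.

```python
def ss_res(xs, ys, model):
    """Sum of squared residuals for a model given as a function of x."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data points.
xs = [0, 1, 2, 3, 4]
ys = [1.0, 2.1, 8.9, 28.2, 64.8]


# Hypothetical models standing in for calculator output.
def cubic(x):
    return x ** 3 + 1


def quadratic(x):
    return 6 * x ** 2 - 8 * x + 2


ss_cubic = ss_res(xs, ys, cubic)
ss_quadratic = ss_res(xs, ys, quadratic)

print(f"cubic SSres     = {ss_cubic:.2f}")
print(f"quadratic SSres = {ss_quadratic:.2f}")
print("smaller SSres:", "cubic" if ss_cubic < ss_quadratic else "quadratic")
```

Both models are evaluated on the same dataset, so their SSres values are directly comparable: about 0.10 for the cubic against 22.50 for the quadratic here.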

🧠 Examiner Tip

  • Always justify your chosen regression model using both numerical (R²) and contextual reasoning.
  • Residual plots should look random — patterns mean the model is inappropriate.
  • Do NOT alter or prematurely round calculator output; report the model coefficients with the values shown on the GDC.