This section discusses how sample data can be used to estimate the values of population parameters.
Point estimation and interval estimation are the two forms of population parameter estimation based on sample data.
8.1 Point Estimation
Point estimation provides a single best guess for the population parameter.
The statistical properties of point estimators (unbiasedness, efficiency, consistency) are beyond the scope of this book.
| Population Parameter | Point Estimator |
|---|---|
| Population Mean | Sample Mean |
| Population Variance | Sample Variance |
| Population Proportion | Sample Proportion |
Note: Each point estimator provides the best single-value estimate of its corresponding population parameter, based on a random sample.
Question:
A zoologist collected data on the body weight (in kg) of 6 randomly selected adult cheetahs from a wildlife reserve.
The recorded weights are as follows:
42, 47, 39, 45, 44, 43
Additionally, out of these 6 cheetahs, 4 were identified as healthy based on veterinary examination.
Tasks:
Estimate the population mean body weight of adult cheetahs.
Estimate the population variance of body weight.
Estimate the population proportion of healthy cheetahs.
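The three tasks map directly onto the estimators in the table above. The following is a minimal sketch in R, assuming the six weights and the count of 4 healthy animals given in the question; the variable names are illustrative only.

```r
# Body weights (kg) of the 6 sampled cheetahs, taken from the question
weights <- c(42, 47, 39, 45, 44, 43)

# Point estimate of the population mean: the sample mean
mean_hat <- mean(weights)

# Point estimate of the population variance: the sample variance
# (var() divides by n - 1, not n)
var_hat <- var(weights)

# Point estimate of the population proportion: the sample proportion
n_healthy <- 4
prop_hat <- n_healthy / length(weights)

round(c(mean = mean_hat, variance = var_hat, proportion = prop_hat), 3)
#>       mean   variance proportion
#>     43.333      7.467      0.667
```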
8.2 Interval Estimation
While a point estimate gives a single best guess for a population parameter, it does not indicate how reliable that estimate is. Interval estimation provides a range of plausible values within which the true population parameter is likely to lie.
The general form of a confidence interval (CI) is:
$$\text{Point Estimate} \pm \text{Margin of Error}$$
Confidence Interval for the Population Mean ($\mu$)

| Condition | Confidence Interval Formula | Distribution Used |
|---|---|---|
| Population standard deviation $\sigma$ known | $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ | Standard normal ($z$) |
| $\sigma$ unknown, population normal | $\bar{x} \pm t_{\alpha/2,\,df} \frac{s}{\sqrt{n}}$ | Student's $t$-distribution ($df = n - 1$) |
| $\sigma$ unknown, large sample | $\bar{x} \pm t_{\alpha/2,\,df} \frac{s}{\sqrt{n}}$ or $\bar{x} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$ | Student's $t$-distribution or standard normal distribution |
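The formulas for the mean can be sketched as one small R function. `mean_ci()` is a hypothetical helper written for illustration, not part of base R. It uses the $z$-based formula when a known $\sigma$ is supplied and the $t$-based formula otherwise. The data are the cheetah weights from Section 8.1, and the value `sigma = 3` is invented purely to show the known-$\sigma$ case.

```r
# Hypothetical helper: confidence interval for a population mean
mean_ci <- function(x, conf = 0.95, sigma = NULL) {
  n <- length(x)
  alpha <- 1 - conf
  if (!is.null(sigma)) {
    # sigma known: critical value from the standard normal distribution
    crit <- qnorm(1 - alpha / 2)
    se <- sigma / sqrt(n)
  } else {
    # sigma unknown: critical value from Student's t with df = n - 1
    crit <- qt(1 - alpha / 2, df = n - 1)
    se <- sd(x) / sqrt(n)
  }
  # point estimate plus/minus margin of error
  mean(x) + c(lower = -1, upper = 1) * crit * se
}

weights <- c(42, 47, 39, 45, 44, 43)  # cheetah weights (kg) from Section 8.1

mean_ci(weights)             # sigma unknown: t-based interval
mean_ci(weights, sigma = 3)  # sigma = 3 is an assumed value, for illustration only
```

With `sigma` left unspecified, the result should agree with the interval reported by `t.test(weights)`.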
Confidence Interval for the Population Proportion ($\theta$)
Use exact (Clopper–Pearson) or Wilson interval methods
Binomial-based
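Both interval methods named here are available in base R. As a sketch, `binom.test()` reports the exact (Clopper–Pearson) interval, and `prop.test()` with `correct = FALSE` reports the Wilson score interval. The counts are the 4 healthy cheetahs out of 6 from Section 8.1, used only as a small-sample illustration.

```r
x <- 4  # number of healthy cheetahs (from Section 8.1)
n <- 6  # sample size

# Exact (Clopper-Pearson) interval
exact_ci <- binom.test(x, n, conf.level = 0.95)$conf.int

# Wilson score interval (no continuity correction).
# prop.test() warns about the chi-squared approximation for samples this
# small; the warning concerns the test, so it is suppressed here.
wilson_ci <- suppressWarnings(
  prop.test(x, n, conf.level = 0.95, correct = FALSE)$conf.int
)

exact_ci
wilson_ci
```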
Example:
A zoologist is studying the body length (in cm) of a rare frog species in a rainforest. She randomly captures 10 frogs and measures their lengths:
Sample data (body lengths in cm):
7.8, 8.2, 7.5, 8.0, 7.9, 8.1, 7.6, 8.3, 7.7, 8.0
Assume body lengths are normally distributed. Construct a 95% confidence interval (CI) for the population mean.
Interpretation:
We are 95% confident that the true mean body length of this frog species in the rainforest lies between 7.72 cm and 8.10 cm.
This does not mean that 95% of the frogs have lengths in this range. It refers to the population mean.
If we repeated this sampling many times, about 95% of the calculated confidence intervals would contain the true mean.
R code:
# Sample data (frog body lengths in cm)
frog_lengths <- c(7.8, 8.2, 7.5, 8.0, 7.9, 8.1, 7.6, 8.3, 7.7, 8.0)
# Compute 95% confidence interval using t-distribution
t_test_result <- t.test(frog_lengths, conf.level = 0.95)
# Display results
t_test_result
One Sample t-test
data: frog_lengths
t = 96.159, df = 9, p-value = 7.215e-15
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
7.723916 8.096084
sample estimates:
mean of x
7.91
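The repeated-sampling interpretation given above can be checked with a short simulation. This is only a sketch: the "true" population mean and standard deviation below are assumed values chosen to resemble the frog data, not quantities known from the study.

```r
set.seed(1)        # for reproducibility

true_mean <- 7.9   # assumed population mean (hypothetical)
true_sd <- 0.26    # assumed population standard deviation (hypothetical)
n <- 10            # same sample size as the frog example
n_reps <- 5000     # number of simulated samples

# For each simulated sample, build a 95% t-interval and record
# whether it contains the assumed true mean
covered <- replicate(n_reps, {
  x <- rnorm(n, mean = true_mean, sd = true_sd)
  ci <- t.test(x, conf.level = 0.95)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Proportion of intervals that contain the true mean; should be close to 0.95
mean(covered)
```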
Question 1
Construct a 95% confidence interval for the average wing length of hummingbirds. Assume that wing lengths are normally distributed.
Question 2
A zoologist studies a population of butterflies in a forest. She randomly captures 120 butterflies and finds that 78 of them have blue wings.
Estimate the population proportion of blue-winged butterflies with a 95% confidence interval. Assume the sample is random and large enough for the normal approximation.
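One way to approach Question 2 is the normal-approximation (Wald) interval, $\hat{\theta} \pm z_{\alpha/2} \sqrt{\hat{\theta}(1 - \hat{\theta})/n}$, where $\hat{\theta}$ is the sample proportion. The sketch below assumes that this is the interval the question intends; the variable names are illustrative.

```r
x <- 78   # butterflies with blue wings
n <- 120  # butterflies captured

theta_hat <- x / n                            # sample proportion
z <- qnorm(0.975)                             # critical value for 95% confidence
se <- sqrt(theta_hat * (1 - theta_hat) / n)   # estimated standard error

# point estimate plus/minus margin of error
wald_ci <- theta_hat + c(lower = -1, upper = 1) * z * se
round(wald_ci, 3)
#> lower upper
#> 0.565 0.735
```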