STA 113 2.0 Descriptive Statistics

Measures of Central Tendency

Dr. Thiyanga S. Talagala
Department of Statistics, Faculty of Applied Sciences
University of Sri Jayewardenepura, Sri Lanka

Data

s1 s2 s3 s4
-0.4941825 0.9915941 1.2651757 -0.4547021
-0.6687905 0.7621325 0.4623908 -0.5227580
-0.5469909 0.4472162 0.1096497 -0.9385280
-0.6231902 0.9652850 1.6850990 -1.0626077
1.4458192 0.7953235 0.1734091 -0.7810612
1.4321517 0.9403680 0.2576630 -0.9400142
-1.7834318 0.2037412 0.9451751 -0.4921403
-0.1310000 0.5811571 0.0841218 -0.9899774
-1.1957376 0.5004076 0.9543987 -0.6680712
0.4298630 0.6224225 0.2428342 -0.2485180
0.6814681 0.3328808 0.8645305 -0.7541138
-0.7584425 0.6059351 0.8742853 -0.6532270
0.4903058 0.5572273 0.0927645 -1.1676951
-0.0359587 0.9757127 1.4798731 -1.0630715
-0.4126305 0.3127026 0.6411980 -0.9250913
-0.6958339 0.8282239 0.6666132 -1.2498425
1.1344751 0.5325348 0.1791371 -0.7756105
-0.2479113 0.2439792 0.5884629 -0.8957835
0.3880169 0.0597912 0.5377590 -1.2037738
-0.6955792 0.7714712 0.7566644 -0.3658032
-1.2602048 0.8607332 0.0237823 -0.8675267
-0.4121627 0.9078947 0.3648342 -1.3747030
-1.0600587 0.5853123 0.0991288 -0.9682996
-0.6901896 0.8314007 0.1282657 -0.8799173
-0.1512943 0.7785955 0.3227742 -1.3436420
-0.8817142 0.2555937 0.2120927 -0.8572044
2.3139477 0.3352818 3.4199927 -0.7016565
2.6254204 0.9125926 0.3246409 -1.2752869
0.0997495 0.1433173 0.5174605 -0.3899073
-0.6003863 0.5621560 0.8610009 -0.8642701
0.7316799 0.7373602 0.2427046 -1.1415947
0.5253369 0.4902503 2.0961923 -0.7712001
0.2979311 0.4586207 0.2769770 -0.1074875
34 0.1707726 0.9417774 0.5032580 -0.6044107
35 -0.3198814 0.8077437 0.0181263 -1.2421812
36 -0.8186460 0.6444308 6.8161623 -1.1468929
37 -0.0072054 0.8448965 0.3052239 -0.8844985
38 -0.4512637 0.3424624 1.3661048 -1.3386497
39 -0.1925807 0.5625122 0.9218917 -0.8877711
40 2.2657184 0.6736529 0.2075009 -0.4708897
41 -0.9951849 0.8909363 5.0017890 -0.9994745
42 -0.2856625 0.8033804 0.0968602 -0.7467907
43 0.2191878 0.1782850 1.5306568 -0.7682978
44 -1.0367488 0.8343284 0.3886233 -1.3388111
45 1.1718172 0.2599477 2.9639756 -0.7148029
46 0.1918960 0.5094981 0.5536213 -0.7586514
47 0.5286750 0.7987479 0.4308663 -0.9896018
48 1.5910981 0.6943393 0.6255345 -1.1681160
49 -1.1722861 0.9982673 0.4587839 -0.9371935
50 0.1934595 0.6716770 0.3116617 -1.1103634
51 -1.4356298 0.9698931 5.7776384 0.8623060
52 -0.7890743 0.5939492 0.0224797 1.2137335
53 -0.6106055 0.0478229 3.0745683 0.9983251
54 -2.3119511 0.5605901 0.8132757 0.6957828
55 0.8667858 0.3635374 0.2700010 0.6795177
56 0.4041022 0.7116650 0.6794256 0.8919491
57 2.0842797 0.6958857 0.3890802 1.4396962
58 -1.6350715 0.4293465 0.9316933 1.5557211
59 -1.0041166 0.7481887 2.2529575 0.9289430
60 -0.1833006 0.8444459 0.3561461 1.2954135
61 -0.6370463 0.6549463 1.5700440 1.4239294
62 0.7891126 0.9183704 3.4331633 0.9424039
63 -1.9343009 0.8069813 1.7880099 1.0407747
64 1.0142296 0.9055865 0.5375650 0.0505981
65 1.9339288 0.2774429 0.4135352 1.1525336
66 0.0475422 0.9947034 1.3488308 1.2897422
67 0.2263924 0.8236884 0.8586223 0.4717407
68 -0.4959669 0.5652228 1.1851333 1.1189474
69 0.8229532 0.8178838 0.1777389 0.1525808
70 0.9298399 0.9716167 2.0837673 1.1283459
71 -0.8396403 0.6955667 1.1255832 0.8254749
72 -2.6643294 0.5633113 2.6622062 0.4479653
73 1.6035095 0.3289823 0.7328378 1.1938157
74 -0.7712137 0.6223501 3.6241769 1.0612159
75 -0.3840300 0.3932550 0.1783265 0.9074435
76 0.4037572 0.8684061 0.0584704 0.7707421
77 -0.2177293 0.7424245 1.3926143 0.7044549
78 -0.0787400 0.8584837 0.1558730 0.7731302
79 -1.3780882 0.4996357 0.5496975 1.1174710
80 -0.4246498 0.9272462 0.0421114 1.0224212
81 0.3751303 0.2122582 0.0489879 1.2401247
82 1.0816647 0.8919946 5.6415263 1.1950053
83 -0.1589649 0.4317767 0.2723074 0.6052308
84 0.2382120 0.1271924 0.0322854 0.6628394
85 0.7576892 0.7173759 0.5761504 0.7795326
86 -0.9829056 0.3897945 0.4854600 0.8800581
87 0.3364830 0.4760740 2.5864094 1.1548429
88 -0.2949718 0.9238953 1.3136616 0.5118688
89 0.7969451 0.6609144 7.4977078 1.2616219
90 -0.6483374 0.3351259 1.1174526 0.1745040
91 -0.3334201 0.9155915 0.2567393 1.0479585
92 -0.9726304 0.6654774 0.3236723 0.8446453
93 -0.8009399 0.2512914 1.4986115 0.4152249
94 -1.2797944 0.9317872 2.0137554 0.8929735
95 -1.0862999 0.5974472 0.4877782 0.3485280
96 -1.1646197 0.9956436 0.0835399 0.6462357
97 -0.0201359 0.7754757 0.3588838 1.6385065
98 -0.4628470 0.6599176 2.7117010 0.7595622
99 -0.4063630 0.4857695 0.5379405 0.8379056
100 -0.0119749 0.8521035 1.3446080 0.5883827

Histogram

Histogram (binwidth = .1)

Histogram (binwidth = 2)

What is Descriptive Statistics?

  • Describe the data

  • Descriptive statistics involves summarizing and organizing the data so they can be easily understood

  • Does not attempt to make inferences from the sample to the whole population

Measures

  • Measures of central tendency

  • Measures of variability (Measures of dispersion/ Measures of spread)

  • Measures of Shape: Kurtosis, Skewness

Measures of Central Tendency

  • Used to identify the center of a data distribution.

  • Each measure describes a whole set of data with a single value that represents the center of its distribution.

  • One number that best summarizes the entire set of measurements.

  • Measures of central tendency: mode, median, mean, weighted mean, harmonic mean, geometric mean, quadratic mean

Let’s look at how to compute measures of central tendency for ungrouped and grouped data.

In-class demo

Ungrouped Data

Mode

  • The most frequently occurring value in a set of data.

  • Can be used to determine which category occurs most frequently

    • Example 1: Determine the mode for the following numbers.

    2, 4, 8, 4, 6, 2, 7, 8, 4, 3, 8, 9, 4, 3, 10, 21, 4

  • The mode can be determined for qualitative data as well as quantitative data.

    • Example 2: A group of 10 people were asked about their favorite shoe color.

    Black, Blue, Brown, White, Black, Black, Black, Brown, Brown, Black
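Both examples above can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics.multimode`, which returns every value tied for the highest count; the variable names are illustrative only.

```python
from statistics import multimode

# Example 1: quantitative data
numbers = [2, 4, 8, 4, 6, 2, 7, 8, 4, 3, 8, 9, 4, 3, 10, 21, 4]

# Example 2: qualitative data (favorite shoe colors)
colors = ["Black", "Blue", "Brown", "White", "Black",
          "Black", "Black", "Brown", "Brown", "Black"]

# multimode returns a list, so ties would show up as several entries
number_modes = multimode(numbers)
color_modes = multimode(colors)

print(number_modes)  # [4]       (4 occurs five times)
print(color_modes)   # ['Black'] (Black occurs five times)
```

The same call works for numbers and for category labels because the mode only needs counts, not arithmetic.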

Your turn

Determine the mode for the following numbers

Question 1

0.5, 0.1, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7, 0.6, 0.2, 0.3, 0.1, 0.8, 0.7

Question 2

2, 4, 6, 8, 10, 12, 14, 16

Important facts

  1. Unimodal - only 1 mode

  2. Bimodal - 2 modes

  3. Multimodal - more than 2 modes

  4. No mode: There is no mode when all observed values appear the same number of times in a data set.
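The four cases above can be sketched as a small classifier. The function name `describe_modes` and the sample lists are hypothetical, chosen only for illustration. The sketch assumes a non-empty data set and treats a data set with a single distinct value as unimodal, which is a judgment call rather than something stated in the slides.

```python
from collections import Counter

def describe_modes(values):
    """Return (label, modes) for a non-empty list of values."""
    counts = Counter(values)
    highest = max(counts.values())
    # All values that share the highest frequency
    modes = sorted(v for v, c in counts.items() if c == highest)

    # Every distinct value appears equally often: no mode
    # (assumption: a single repeated value still counts as one mode)
    if len(counts) > 1 and len(modes) == len(counts):
        return "no mode", []
    if len(modes) == 1:
        return "unimodal", modes
    if len(modes) == 2:
        return "bimodal", modes
    return "multimodal", modes

# Hypothetical data sets, one per case
print(describe_modes([1, 2, 2, 3]))           # ('unimodal', [2])
print(describe_modes([1, 1, 2, 2, 3]))        # ('bimodal', [1, 2])
print(describe_modes([1, 1, 2, 2, 3, 3, 4]))  # ('multimodal', [1, 2, 3])
print(describe_modes([1, 2, 3]))              # ('no mode', [])
```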

Applications: qualitative data example

  • A shoe manufacturing company conducted a market survey to understand consumer preferences for shoe colors. The mode is the color that appears most frequently in the dataset. The company can use this information to decide which shoe color to produce in larger quantities.

Applications: quantitative data example

  • A shoe manufacturing company surveyed its customers to find the most popular shoe size. Suppose the mode of the reported shoe sizes is Size 9. Because Size 9 is the most common size, the company can prioritize producing it to meet consumer demand.

Median

  • The middle value in an ordered array of numbers.

  • For an array with an odd number of observations, the median is the middle number.

  • For an array with an even number of observations, the median is the mean of the two middle numbers.

Steps in calculating median

  1. Arrange the data in an ordered array of numbers.

  2. Count the number of observations. Suppose there are \(n\) observations.

  3. Locate the middle value of the ordered array as follows.

Median formula when n is odd

\(\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation}\)

Median formula when n is even

\(\text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ observation} + \left(\frac{n}{2} + 1\right)^{\text{th}} \text{ observation}}{2}\)

Important: The median requires data measured on at least an ordinal scale.
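The steps and the two formulas above can be sketched in Python as follows. The function name `median_by_steps` and the sample values are hypothetical and are not taken from the exercises. Note that the formulas count observations from 1, while Python lists are indexed from 0.

```python
def median_by_steps(values):
    """Median of a non-empty list of numbers, following the steps above."""
    ordered = sorted(values)   # Step 1: arrange the data in order
    n = len(ordered)           # Step 2: count the observations
    if n == 0:
        raise ValueError("median of an empty data set is undefined")

    # Step 3: locate the middle value
    if n % 2 == 1:
        # ((n + 1) / 2)-th observation, which is list index n // 2
        return ordered[n // 2]
    # Mean of the (n / 2)-th and (n / 2 + 1)-th observations,
    # which are list indices n // 2 - 1 and n // 2
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

# Hypothetical data
print(median_by_steps([7, 3, 9, 1, 5]))  # 5   (n is odd)
print(median_by_steps([7, 3, 9, 1]))     # 5.0 (n is even: (3 + 7) / 2)
```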

Your turn

Determine the median for the following numbers.

Q1:

214, 215, 216, 105, 109, 8, 50, 1000, 150

Q2:

2, 3, 10, 11, 50, 5, 8, 9, 10, 5

Mean (Arithmetic mean)

\[\mu = \frac{\sum_{i=1}^{N} x_i}{N}\]

Population mean: \(\mu\)

Population size: \(N\)

\[\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\]

Sample mean: \(\bar{x}\)

Sample size: \(n\)
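The sample mean formula translates directly into code: add up the observations and divide by how many there are. The sketch below uses a hypothetical function name `sample_mean` and made-up values, and compares the result with the standard library's `statistics.mean`.

```python
from statistics import mean

def sample_mean(xs):
    """Arithmetic mean: sum of the observations divided by their count n."""
    n = len(xs)
    if n == 0:
        raise ValueError("mean of an empty data set is undefined")
    return sum(xs) / n

# Hypothetical sample of n = 6 observations
sample = [4, 8, 15, 16, 23, 42]

x_bar = sample_mean(sample)
print(x_bar)                   # 18.0  (108 / 6)
print(x_bar == mean(sample))   # True
```

The population mean uses the same arithmetic; only the interpretation changes, since the list then holds all \(N\) values in the population rather than a sample of size \(n\).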

Your turn

Determine the mean for the following numbers.

Q1:

214, 215, 216, 105, 109, 8, 50, 1000, 150

Q2:

2, 3, 10, 11, 50, 5, 8, 9, 10, 5

Your turn

The following table shows the number of insects observed on a farm over the course of a week. Calculate the mean, median, and mode of the data.

Day          Number of Insects
Monday       45
Tuesday      50
Wednesday    42
Thursday     55
Friday       48
Saturday     53
Sunday       47

Your turn

The following table shows the number of insects observed on a farm over the course of the month of April. Calculate the mean, median, and mode of the number of insects observed during this period.

Number of Insects    Number of Days
45                   5
50                   6
42                   3
55                   3
48                   2
53                   1
47                   2
40                   8
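When data are given as a frequency table like this one, each value counts once for every time it occurs, so the mean becomes \(\bar{x} = \frac{\sum f_i x_i}{\sum f_i}\), where \(f_i\) is the frequency of value \(x_i\). The sketch below illustrates the idea on a small hypothetical table (not the exercise data); the dictionary `freq` and the other names are assumptions made for illustration.

```python
from statistics import median, multimode

# Hypothetical frequency table: value -> number of times it was observed
freq = {10: 2, 20: 3, 30: 1}

# Total number of observations: the sum of the frequencies
n = sum(freq.values())

# Mean: each value is weighted by its frequency
weighted_mean = sum(value * count for value, count in freq.items()) / n

# Expanding the table back into raw observations lets the usual
# median and mode rules apply unchanged
expanded = [value for value, count in freq.items() for _ in range(count)]

table_median = median(expanded)
table_modes = multimode(expanded)

print(n)             # 6
print(table_median)  # 20.0
print(table_modes)   # [20]
```

Expanding the table is fine for small counts; for large frequencies, the median can instead be located from cumulative frequencies without building the full list.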