STA 113 2.0 Descriptive Statistics

Frequency Distributions

Dr. Thiyanga S. Talagala
Department of Statistics, Faculty of Applied Sciences
University of Sri Jayewardenepura, Sri Lanka

Marks

Here are the marks for 50 students:

27 37 57 91 20 90 94 66 63 6
21 18 69 38 77 50 72 99 38 78
93 21 65 13 27 39 1 38 87 34
48 60 49 19 83 67 79 11 72 41
82 65 78 55 53 79 2 48 73 69

Ordered array

Ordered sequence of raw data

6 7 8 10 12 14 20 24 24 25
26 29 32 33 33 33 34 35 39 40
41 41 43 44 46 46 48 48 48 52
60 64 65 66 71 71 76 77 78 78
80 81 84 86 86 88 88 89 91 96

An ordered list makes it easier to find the highest and lowest values and see the range.
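As a minimal sketch of this idea, the snippet below takes the ordered array shown above and reads off the lowest value, the highest value and the range. The variable names (`marks`, `ordered`, `data_range`) are illustrative choices, not part of the original slides.

```python
# Ordered array copied from the table above (50 marks).
marks = [
    6, 7, 8, 10, 12, 14, 20, 24, 24, 25,
    26, 29, 32, 33, 33, 33, 34, 35, 39, 40,
    41, 41, 43, 44, 46, 46, 48, 48, 48, 52,
    60, 64, 65, 66, 71, 71, 76, 77, 78, 78,
    80, 81, 84, 86, 86, 88, 88, 89, 91, 96,
]

# sorted() returns a new list in ascending order; the input here is
# already ordered, so this step simply confirms the arrangement.
ordered = sorted(marks)

lowest = ordered[0]       # first element of an ordered array
highest = ordered[-1]     # last element of an ordered array
data_range = highest - lowest

print(lowest, highest, data_range)  # → 6 96 90
```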

Frequency Distribution

A summary table in which data are arranged into ordered classes or categories to determine the number of observations belonging to each class.

Constructing a frequency distribution

Step 1: Arrange data into an ordered array.

6 7 8 10 12 14 20 24 24 25
26 29 32 33 33 33 34 35 39 40
41 41 43 44 46 46 48 48 48 52
60 64 65 66 71 71 76 77 78 78
80 81 84 86 86 88 88 89 91 96

Step 2: Decide on the number of classes (k)

Sturges’s Rule

\[\text{Number of classes} = 1+3.3\log_{10}(N)\]

Here \(N\) is the total number of observations.

Example:

\[ 1+3.3\log_{10}(50) \approx 6.607\]

\[\text{Number of classes} = 7 \quad \text{(rounded to a whole number)}\]
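The calculation can be checked with a few lines of Python. This is a sketch only: the function name `sturges_classes` is hypothetical, and rounding up with `math.ceil` is an assumption (rounding to the nearest whole number gives the same answer for this example).

```python
import math

def sturges_classes(n):
    """Return the raw value of 1 + 3.3*log10(n) and the whole number of classes.

    Rounding up is an assumption made for this sketch.
    """
    raw = 1 + 3.3 * math.log10(n)
    return raw, math.ceil(raw)

raw_k, k = sturges_classes(50)
print(round(raw_k, 3), k)  # → 6.607 7
```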

Step 3: Calculate the class width/ class size/ class length (c)

\[\text{width of interval} = \frac{\text{range}}{k}\]

\[\text{range} = \text{maximum} - \text{minimum}\]

Example

\[\frac{96-6}{7} \approx 12.86\]

\[c \approx 13\]
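The same arithmetic as a short sketch. Rounding the width up (rather than to the nearest integer) is an assumption here; it guarantees that 7 classes of width 13 cover every mark from 6 to 96.

```python
import math

minimum, maximum = 6, 96   # smallest and largest marks in the ordered array
k = 7                      # number of classes chosen in Step 2

data_range = maximum - minimum
raw_width = data_range / k

# Assumption: round up so that k classes are wide enough to cover the data.
c = math.ceil(raw_width)

print(round(raw_width, 2), c)  # → 12.86 13
```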

Step 4: Calculate lower class limits

\[6\]

\[6+13 = 19\]

\[19+13=32\]

\[32+13=45\]

\[45+13=58\]

\[58+13=71\]

\[71+13=84\]

Step 5: Calculate class limits/ class intervals

CI
6-18
19-31
32-44
45-57
58-70
71-83
84-96
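Steps 4 and 5 can be sketched together: start at the minimum, add the class width repeatedly to get the lower limits, then take each upper limit as one unit below the next lower limit. The "one unit" gap assumes the marks are whole numbers; the variable names are illustrative.

```python
minimum = 6   # smallest mark
c = 13        # class width from Step 3
k = 7         # number of classes from Step 2

# Lower class limits: minimum, minimum + c, minimum + 2c, ...
lower_limits = [minimum + i * c for i in range(k)]

# Assumption: marks are integers, so each upper limit is
# (next lower limit - 1), i.e. lower limit + c - 1.
class_intervals = [(lo, lo + c - 1) for lo in lower_limits]

for lo, hi in class_intervals:
    print(f"{lo}-{hi}")
```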

Step 6: Prepare a tally sheet

CI      Tally             Frequency
6-18    ||||| |           6
19-31   ||||| |           6
32-44   ||||| ||||| ||    12
45-57   ||||| |           6
58-70   ||||              4
71-83   ||||| |||         8
84-96   ||||| |||         8
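The tally can be reproduced programmatically by assigning each mark to the class interval that contains it. This is a minimal sketch using the ordered array and the class intervals from the steps above; `frequencies` and the loop structure are illustrative.

```python
marks = [
    6, 7, 8, 10, 12, 14, 20, 24, 24, 25,
    26, 29, 32, 33, 33, 33, 34, 35, 39, 40,
    41, 41, 43, 44, 46, 46, 48, 48, 48, 52,
    60, 64, 65, 66, 71, 71, 76, 77, 78, 78,
    80, 81, 84, 86, 86, 88, 88, 89, 91, 96,
]
class_intervals = [(6, 18), (19, 31), (32, 44), (45, 57),
                   (58, 70), (71, 83), (84, 96)]

# One counter per class, in the same order as class_intervals.
frequencies = [0] * len(class_intervals)
for x in marks:
    for i, (lo, hi) in enumerate(class_intervals):
        if lo <= x <= hi:          # both class limits are inclusive
            frequencies[i] += 1
            break

for (lo, hi), f in zip(class_intervals, frequencies):
    print(f"{lo}-{hi}", f)
```

The frequencies should add up to the number of observations, which is a quick check that no mark was missed or counted twice.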

Step 7: Class Boundaries

CI Boundaries Frequency
6-18 5.5-18.5 6
19-31 18.5-31.5 6
32-44 31.5-44.5 12
45-57 44.5-57.5 6
58-70 57.5-70.5 4
71-83 70.5-83.5 8
84-96 83.5-96.5 8

To find class boundaries

  1. Subtract the first upper class limit from the second lower class limit and divide the difference by 2. Here, \((19-18)/2 = 0.5\).

  2. Using the value calculated in step 1,

    • subtract it from all of the lower class limits

    • add it to all of the upper class limits.
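The two-step procedure above can be sketched as follows, using the class intervals from Step 5. The name `gap` for the half-difference is an illustrative choice.

```python
class_intervals = [(6, 18), (19, 31), (32, 44), (45, 57),
                   (58, 70), (71, 83), (84, 96)]

# Step 1: half the gap between the first upper limit and the second lower limit.
first_upper = class_intervals[0][1]
second_lower = class_intervals[1][0]
gap = (second_lower - first_upper) / 2

# Step 2: subtract from every lower limit, add to every upper limit.
boundaries = [(lo - gap, hi + gap) for lo, hi in class_intervals]

print(gap)            # → 0.5
print(boundaries[0])  # → (5.5, 18.5)
```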

Idea Behind Class Boundaries

CI Boundaries Frequency
6-18 [5.5, 18.5) 6
19-31 [18.5, 31.5) 6
32-44 [31.5, 44.5) 12
45-57 [44.5, 57.5) 6
58-70 [57.5, 70.5) 4
71-83 [70.5, 83.5) 8
84-96 [83.5, 96.5) 8

Size or width of a class interval (c)

\[c= \text{difference between successive lower class limits/class boundaries}\]

or

\[c= \text{difference between successive upper class limits/class boundaries}\]

or

\[c= \text{difference between the upper class boundary and the lower class boundary of the same class}\]

Example

\[c=18.5-5.5=31.5-18.5=13\]

Step 8: Class mark/ class midpoint

\[\text{class mark}=\frac{\text{upper limit} +\text{lower limit}}{2}\]

Example

\[\text{class mark}=\frac{6+18}{2} = 12\]

Class mark/ class midpoint

CI Boundaries Mid_point Frequency
6-18 5.5-18.5 12 6
19-31 18.5-31.5 25 6
32-44 31.5-44.5 38 12
45-57 44.5-57.5 51 6
58-70 57.5-70.5 64 4
71-83 70.5-83.5 77 8
84-96 83.5-96.5 90 8
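The Mid_point column can be reproduced by applying the class mark formula to every class interval. A minimal sketch, with illustrative variable names:

```python
class_intervals = [(6, 18), (19, 31), (32, 44), (45, 57),
                   (58, 70), (71, 83), (84, 96)]

# Class mark = (upper limit + lower limit) / 2 for each class.
class_marks = [(lo + hi) / 2 for lo, hi in class_intervals]

print(class_marks)  # → [12.0, 25.0, 38.0, 51.0, 64.0, 77.0, 90.0]
```

Using the class boundaries instead of the class limits gives the same midpoints, because the boundaries move the same distance outwards on both sides.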

Your turn

Compute

  1. Cumulative-frequency distribution

  2. Percentage cumulative distributions

  3. Relative frequency distribution

  4. Relative cumulative frequency distribution

Histogram

Frequency Polygons

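To close the polygon on the horizontal axis, an empty class (frequency 0) is added before the first class and after the last class.

CI       Lower_Boundary  Upper_Boundary  Mid_point  Frequency
(none)   -7.5            5.5             -1         0
6-18     5.5             18.5            12         6
19-31    18.5            31.5            25         6
32-44    31.5            44.5            38         12
45-57    44.5            57.5            51         6
58-70    57.5            70.5            64         4
71-83    70.5            83.5            77         8
84-96    83.5            96.5            90         8
(none)   96.5            109.5           103        0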
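The points of the frequency polygon are the (midpoint, frequency) pairs, with one zero-frequency class added at each end. The sketch below only builds the list of points; passing `xs` and `ys` to a plotting library of your choice is left out, and the variable names are illustrative.

```python
midpoints = [12, 25, 38, 51, 64, 77, 90]
frequencies = [6, 6, 12, 6, 4, 8, 8]
c = 13  # class width

# Add one empty class below the first class and one above the last class,
# each a full class width away from the neighbouring midpoint.
xs = [midpoints[0] - c] + midpoints + [midpoints[-1] + c]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))
print(points[0], points[-1])  # → (-1, 0) (103, 0)
```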

Cumulative Frequency Distribution

The cumulative frequency of a class is the total frequency of all values less than its upper class boundary.

Marks Cumulative_Frequency
5.5 0
18.5 6
31.5 12
44.5 24
57.5 30
70.5 34
83.5 42
96.5 50
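The cumulative frequencies in this table are running totals of the class frequencies. A minimal sketch using `itertools.accumulate` from the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

upper_boundaries = [18.5, 31.5, 44.5, 57.5, 70.5, 83.5, 96.5]
frequencies = [6, 6, 12, 6, 4, 8, 8]

# Running total of the frequencies, one value per upper class boundary.
cumulative = list(accumulate(frequencies))

# No observation lies below the first lower boundary (5.5), so start at 0.
less_than = [(5.5, 0)] + list(zip(upper_boundaries, cumulative))

for boundary, cf in less_than:
    print(boundary, cf)
```

The last cumulative frequency equals the total number of observations (50), which serves as a quick check.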

The Ogive (or Cumulative Frequency Polygon)

  • Less than Ogive

  • Greater than Ogive

Less than Ogive

First of all, we have to convert the frequency distribution into a less than cumulative frequency distribution.

Boundaries Frequency
5.5-18.5 6
18.5-31.5 6
31.5-44.5 12
44.5-57.5 6
57.5-70.5 4
70.5-83.5 8
83.5-96.5 8

Marks Cumulative_Frequency
less than 18.5 6
less than 31.5 12
less than 44.5 24
less than 57.5 30
less than 70.5 34
less than 83.5 42
less than 96.5 50

Less than Ogive

Your turn

Plot the greater than ogive.

First of all, we have to convert the frequency distribution into a greater than cumulative frequency distribution.

Boundaries Frequency
5.5-18.5 6
18.5-31.5 6
31.5-44.5 12
44.5-57.5 6
57.5-70.5 4
70.5-83.5 8
83.5-96.5 8

Marks Cumulative_Frequency
greater than or equal to 5.5 50
greater than or equal to 18.5 44
greater than or equal to 31.5 38
greater than or equal to 44.5 26
greater than or equal to 57.5 20
greater than or equal to 70.5 16
greater than or equal to 83.5 8
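The greater than cumulative frequencies can be obtained by starting from the total and subtracting each class frequency in turn. A minimal sketch with illustrative variable names:

```python
lower_boundaries = [5.5, 18.5, 31.5, 44.5, 57.5, 70.5, 83.5]
frequencies = [6, 6, 12, 6, 4, 8, 8]

remaining = sum(frequencies)   # all 50 observations are >= 5.5
greater_equal = []
for boundary, f in zip(lower_boundaries, frequencies):
    greater_equal.append((boundary, remaining))
    remaining -= f             # drop the class that starts at this boundary

for boundary, cf in greater_equal:
    print(f"greater than or equal to {boundary}: {cf}")
```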

Your turn

9 -20 25 27 -6 14 13 2 7 18
18 4 7 -1 2 4 4 4 -1 5
5 2 26 -1 23 12 21 -5 5 11
4 13 8 30 14 7 6 18 26 16
17 14 23 16 18 16 16 16 14 -19
9 -3 10 -10 3 -10 9 11 -6 16
8 11 6 25 9 3 20 -8 -5 34
-1 2 10 20 15 13 16 -1 20 4

Construct

  1. Frequency distribution

  2. Histogram

  3. Polygon

  4. Cumulative frequency distribution

  5. Ogive