Introduction to Statistics
Methods to find the mean of grouped data.
Introduction to Statistics — Chapter 14, Class 10 Mathematics
Chapter 14 — Statistics in Class 10 Mathematics (CBSE, Telangana & Andhra Pradesh syllabus) revisits and extends the statistics you studied in earlier classes. In Class 9 you learned how to collect, organise, and represent data using graphs. This chapter takes you further — to analysing data by calculating three key measures of central tendency: the Mean, the Median, and the Mode, particularly for grouped (classified) data where individual values are not listed one by one.
This introductory lesson focuses entirely on the Mean — what it is, how it works for simple (ungrouped) data, and how to extend the concept to handle grouped data efficiently using three different calculation methods: the Direct Method, the Assumed Mean Method, and the Step Deviation Method. By the end of this lesson you will understand not only how to get the answer, but why each method works — and when each one is more convenient to use.
What Is the Mean?
The mean — often called the average — is the single value that best represents an entire collection of data. You calculate it by adding all the observed values together and then dividing that sum by how many observations you have. This spreads the total equally across every observation, giving one fair "representative" value for the whole dataset.
Mean = Sum of all observations ÷ Number of observations
This formula works perfectly when data is ungrouped — that is, when you have a raw list of individual values. Let's see two worked examples from the textbook, both showing this formula in action.
Worked Example 1 — Cinema Attendance Over Nine Days
A manager at a small movie theatre recorded the number of people who attended each day over nine consecutive days. The daily attendance figures were:
| Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Attendance | 81 | 89 | 92 | 85 | 93 | 62 | 85 | 105 | 90 |
The manager wants to know the average daily attendance over these nine days so he can compare it with neighbouring theatres.
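The manager's question can be answered by applying the formula above directly to the nine attendance figures. The following is a minimal Python sketch of that calculation; the variable names are illustrative and do not come from the textbook.

```python
# Daily attendance for the nine days listed in the table above.
attendance = [81, 89, 92, 85, 93, 62, 85, 105, 90]

total = sum(attendance)          # sum of all observations
count = len(attendance)          # number of observations
mean_attendance = total / count  # Mean = sum / number of observations

print(f"Mean daily attendance = {total} / {count} = {mean_attendance:.2f}")
# prints: Mean daily attendance = 782 / 9 = 86.89
```

On average, then, roughly 87 people attended each day.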
Worked Example 2 — Virat Kohli's Test Centuries (2012–2019)
The number of Test cricket centuries scored by Virat Kohli in each year from 2012 to 2019 was: 3, 2, 4, 2, 4, 5, 5, 2. Notice that several values repeat — for instance, the value 2 appears three times, and 4 appears twice. When data has repeating values, you can make your calculation more efficient by grouping identical values together and counting how many times each appears (its frequency).
| Centuries (xᵢ) | Frequency (fᵢ) — number of years | fᵢ × xᵢ |
|---|---|---|
| 2 | 3 | 6 |
| 3 | 1 | 3 |
| 4 | 2 | 8 |
| 5 | 2 | 10 |
| Total | Σfᵢ = 8 | Σfᵢxᵢ = 27 |
This leads us to the general formula for the mean when values have frequencies, which applies to both ungrouped data with repeated values and to grouped (classified) data:
Mean = Σfᵢxᵢ / Σfᵢ
where xᵢ = each observation value, fᵢ = its frequency
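Applied to the centuries table, the formula gives Mean = 27 ÷ 8 = 3.375 centuries per year. The sketch below rebuilds the frequency table from the raw list and checks that the frequency formula agrees with the simple "sum ÷ count" formula. It assumes Python's standard `collections.Counter` for the counting step; the variable names are hypothetical.

```python
from collections import Counter

# Centuries per year, 2012 to 2019, as listed in the example.
centuries = [3, 2, 4, 2, 4, 5, 5, 2]

# Build the frequency table: each distinct value x_i mapped to its frequency f_i.
freq = Counter(centuries)

sum_f = sum(freq.values())                     # Σf_i
sum_fx = sum(f * x for x, f in freq.items())   # Σf_i x_i
mean_centuries = sum_fx / sum_f                # Σf_i x_i / Σf_i

print(sum_f, sum_fx, mean_centuries)  # prints: 8 27 3.375

# The ungrouped formula gives the same value.
assert mean_centuries == sum(centuries) / len(centuries)
```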
Three Methods to Find the Mean of Grouped Data
When data is grouped into class intervals (for example, "wages from Rs. 470–480"), individual values within each class are not known — only the class range and the number of observations (frequency) in that class. To find the mean of such grouped data, the textbook introduces three methods. All three give the same answer, but they differ in how much arithmetic you need to do:
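- Direct Method: Mean = Σfᵢxᵢ / Σfᵢ. Uses the class mark xᵢ directly. Simple, but the products can be large.
- Assumed Mean Method: Mean = a + Σfᵢdᵢ / Σfᵢ, where dᵢ = xᵢ − a. Subtracts an assumed mean a first, so the numbers to multiply are smaller.
- Step Deviation Method: Mean = a + (Σfᵢuᵢ / Σfᵢ) × h, where uᵢ = (xᵢ − a) / h. Divides the deviations by the class width h, giving the smallest numbers and the easiest arithmetic.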
All three methods are demonstrated on the same dataset — daily wages of workers — so you can see exactly how the calculations compare and confirm that every method produces the identical answer of Rs. 503.50.
Class mark (xᵢ) = (Upper class limit + Lower class limit) ÷ 2
For the class 470–480: xᵢ = (480 + 470) ÷ 2 = 950 ÷ 2 = 475
Method 1 — Direct Method (Fully Worked)
Daily wage data for 20 workers is grouped into six class intervals. In the Direct Method you calculate the class mark for every interval, multiply it by the class frequency, sum up all those products, and divide by the total frequency.
| Daily Wages (Rs.) | Frequency (fᵢ) | Class Mark (xᵢ) | fᵢ × xᵢ |
|---|---|---|---|
| 470 – 480 | 2 | 475 | 950 |
| 480 – 490 | 3 | 485 | 1455 |
| 490 – 500 | 4 | 495 | 1980 |
| 500 – 510 | 2 | 505 | 1010 |
| 510 – 520 | 5 | 515 | 2575 |
| 520 – 530 | 4 | 525 | 2100 |
| Total | Σfᵢ = 20 | — | Σfᵢxᵢ = 10070 |
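From the totals in the table, Mean = Σfᵢxᵢ / Σfᵢ = 10070 ÷ 20 = Rs. 503.50. The same steps can be sketched in Python as follows. The function name `direct_mean` and the tuple layout are assumptions made for illustration only.

```python
# (lower limit, upper limit, frequency) for each wage class in the table above.
wage_classes = [
    (470, 480, 2),
    (480, 490, 3),
    (490, 500, 4),
    (500, 510, 2),
    (510, 520, 5),
    (520, 530, 4),
]


def direct_mean(classes):
    """Mean of grouped data by the Direct Method: Σ(f * x) / Σf."""
    sum_f = 0
    sum_fx = 0.0
    for lower, upper, f in classes:
        x = (lower + upper) / 2  # class mark = midpoint of the class
        sum_f += f
        sum_fx += f * x
    return sum_fx / sum_f


print(f"{direct_mean(wage_classes):.2f}")  # prints: 503.50
```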
Method 2 — Assumed Mean Method (Fully Worked)
To reduce the size of the numbers you work with, you pick one class mark as a convenient reference value called the assumed mean (a) — usually the class mark in the middle of the data. Here, a = 495 is chosen. You then calculate how far each class mark is from this assumed mean, calling that deviation dᵢ = xᵢ − a. Because the deviations are small (−20, −10, 0, 10, 20, 30), the multiplications become much simpler.
| Daily Wages (Rs.) | fᵢ | xᵢ | dᵢ = xᵢ − 495 | fᵢ × dᵢ |
|---|---|---|---|---|
| 470 – 480 | 2 | 475 | −20 | −40 |
| 480 – 490 | 3 | 485 | −10 | −30 |
| 490 – 500 | 4 | 495 | 0 | 0 |
| 500 – 510 | 2 | 505 | +10 | +20 |
| 510 – 520 | 5 | 515 | +20 | +100 |
| 520 – 530 | 4 | 525 | +30 | +120 |
| Total | Σfᵢ = 20 | — | — | Σfᵢdᵢ = 170 |
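From the totals in the table, Mean = a + Σfᵢdᵢ / Σfᵢ = 495 + 170 ÷ 20 = 495 + 8.5 = Rs. 503.50. A minimal Python sketch of the same calculation follows; the function name `assumed_mean_method` is hypothetical, and the class marks are entered by hand from the table.

```python
class_marks = [475, 485, 495, 505, 515, 525]  # x_i
frequencies = [2, 3, 4, 2, 5, 4]              # f_i


def assumed_mean_method(marks, freqs, a):
    """Mean of grouped data by the Assumed Mean Method: a + Σ(f * d) / Σf."""
    sum_f = sum(freqs)
    # d_i = x_i - a is the deviation of each class mark from the assumed mean.
    sum_fd = sum(f * (x - a) for x, f in zip(marks, freqs))
    return a + sum_fd / sum_f


print(f"{assumed_mean_method(class_marks, frequencies, a=495):.2f}")
# prints: 503.50
```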
Method 3 — Step Deviation Method (Fully Worked)
When the class intervals are all equal in width (here every class is 10 rupees wide, so h = 10), you can simplify the arithmetic even further by dividing each deviation by h. The result is called uᵢ = dᵢ / h = (xᵢ − a) / h. Because h is a common factor, the numbers you multiply together become the smallest of all three methods (−2, −1, 0, 1, 2, 3 in this example).
| Daily Wages (Rs.) | fᵢ | xᵢ | dᵢ = xᵢ − 495 | uᵢ = dᵢ / 10 | fᵢ × uᵢ |
|---|---|---|---|---|---|
| 470 – 480 | 2 | 475 | −20 | −2 | −4 |
| 480 – 490 | 3 | 485 | −10 | −1 | −3 |
| 490 – 500 | 4 | 495 | 0 | 0 | 0 |
| 500 – 510 | 2 | 505 | +10 | +1 | +2 |
| 510 – 520 | 5 | 515 | +20 | +2 | +10 |
| 520 – 530 | 4 | 525 | +30 | +3 | +12 |
| Total | Σfᵢ = 20 | — | — | — | Σfᵢuᵢ = 17 |
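From the totals in the table, Mean = a + (Σfᵢuᵢ / Σfᵢ) × h = 495 + (17 ÷ 20) × 10 = 495 + 8.5 = Rs. 503.50. The sketch below mirrors these steps in Python. The function name `step_deviation_method` is an assumption for illustration, and the result is formatted to two decimals to sidestep floating-point noise.

```python
class_marks = [475, 485, 495, 505, 515, 525]  # x_i
frequencies = [2, 3, 4, 2, 5, 4]              # f_i


def step_deviation_method(marks, freqs, a, h):
    """Mean of grouped data by the Step Deviation Method: a + (Σ(f * u) / Σf) * h."""
    sum_f = sum(freqs)
    # u_i = (x_i - a) / h scales each deviation down by the class width.
    sum_fu = sum(f * ((x - a) / h) for x, f in zip(marks, freqs))
    # The factor h is multiplied back in at the end.
    return a + (sum_fu / sum_f) * h


result = step_deviation_method(class_marks, frequencies, a=495, h=10)
print(f"{result:.2f}")  # prints: 503.50
```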
Comparing All Three Methods at a Glance
All three methods always produce the same answer. Their only difference is the level of arithmetic involved — the step deviation method is the most efficient when class widths are equal, but all three are valid approaches accepted in board exams.
| Feature | Direct Method | Assumed Mean | Step Deviation |
|---|---|---|---|
| Formula | Σfᵢxᵢ / Σfᵢ | a + Σfᵢdᵢ / Σfᵢ | a + (Σfᵢuᵢ / Σfᵢ) × h |
| Extra variable | None | dᵢ = xᵢ − a | uᵢ = (xᵢ − a) / h |
| Numbers to multiply | Large (475 × 2 …) | Small (−20 × 2 …) | Smallest (−2 × 2 …) |
| Works when class widths differ? | Yes | Yes | Convenient only with equal widths (all dᵢ must share the factor h) |
| Answer for this dataset | 503.50 | 503.50 | 503.50 |
Common Mistakes to Avoid
- Using raw class limits instead of the class mark: For grouped data, you must always use the midpoint (class mark xᵢ), never the upper or lower boundary of the class. Using an endpoint gives a wrong answer.
- Choosing an assumed mean that is not a class mark: In the Assumed Mean and Step Deviation methods, take a to be one of the class marks in the table, preferably one near the middle. The formulas remain valid for any value of a, but a class mark keeps the dᵢ values small and, when class widths are equal, makes every uᵢ a whole number.
- Using the Step Deviation method when class widths are unequal: The method saves work only because every dᵢ is divided by the same h. If class widths differ, the dᵢ values usually share no convenient common factor, so the uᵢ become fractions and the method loses its advantage. Use the Direct or Assumed Mean method instead.
- Sign errors in the dᵢ column: Deviations above the assumed mean are positive, below it are negative. A sign error in one row will change Σfᵢdᵢ and give a wrong final answer.
- Forgetting to multiply Σfᵢuᵢ by h at the end: In the Step Deviation method, the factor h must be multiplied back in the final step. Forgetting this is the most common exam error for this method.
- Treating Σfᵢ incorrectly: Always confirm that Σfᵢ equals the total number of observations in the dataset, not just the number of classes. Here Σfᵢ = 20 workers, not 6 (the number of class intervals).
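Several of these mistakes can be caught with quick sanity checks. The following Python sketch, using the wage data from this lesson, confirms that Σfᵢ counts workers rather than classes, that any class mark chosen as the assumed mean gives the same result, and that leaving out the final multiplication by h gives a visibly wrong value. The helper name `step_deviation_mean` is hypothetical.

```python
class_marks = [475, 485, 495, 505, 515, 525]  # x_i
frequencies = [2, 3, 4, 2, 5, 4]              # f_i
h = 10                                        # common class width

# Σf_i is the number of workers (20), not the number of classes (6).
n = sum(frequencies)
assert n == 20 and len(class_marks) == 6


def step_deviation_mean(a):
    """Step Deviation mean for the wage data, using assumed mean a."""
    sum_fu = sum(f * (x - a) / h for x, f in zip(class_marks, frequencies))
    return a + (sum_fu / n) * h


# Every class mark used as the assumed mean gives the same mean.
results = [round(step_deviation_mean(a), 2) for a in class_marks]
print(results)  # prints: [503.5, 503.5, 503.5, 503.5, 503.5, 503.5]

# Forgetting to multiply by h leaves the answer far too close to a.
wrong = 495 + 17 / 20
print(f"{wrong:.2f}")  # prints: 495.85
```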
What This Lesson Prepares You For
This introduction to Mean is the foundation for the rest of Chapter 14. The same style of frequency table is used in all subsequent topics: calculating the Median of grouped data (using the median formula with cumulative frequencies) and finding the Mode of grouped data (using the modal class and the mode formula). You will also encounter ogives (cumulative frequency graphs), which provide a visual way to read off the median from grouped data.
The concept of class mark introduced here connects back to the data handling topics in the Class 9 Statistics chapter. The careful algebraic manipulation used in the Assumed Mean and Step Deviation methods reinforces skills from Polynomials and Algebraic Expressions.