Introduction to Statistics

Methods to find the mean of grouped data.


Introduction to Statistics — Mean & Methods for Grouped Data

Direct Method, Assumed Mean Method & Step Deviation Method — CBSE, Telangana & Andhra Pradesh Syllabus

Introduction to Statistics — Chapter 14, Class 10 Mathematics

Chapter 14 — Statistics in Class 10 Mathematics (CBSE, Telangana & Andhra Pradesh syllabus) revisits and extends the statistics you studied in earlier classes. In Class 9 you learned how to collect, organise, and represent data using graphs. This chapter takes you further — to analysing data by calculating three key measures of central tendency: the Mean, the Median, and the Mode, particularly for grouped (classified) data where individual values are not listed one by one.

This introductory lesson focuses entirely on the Mean — what it is, how it works for simple (ungrouped) data, and how to extend the concept to handle grouped data efficiently using three different calculation methods: the Direct Method, the Assumed Mean Method, and the Step Deviation Method. By the end of this lesson you will understand not only how to get the answer, but why each method works — and when each one is more convenient to use.

💡 Why statistics matter in real life: Every time you see an exam class average, a city's average temperature, or a company's average monthly sales figure, you are looking at a mean. Statistics is the mathematical language used to summarise large amounts of data into a few meaningful numbers that anyone can quickly understand and compare.

What Is the Mean?

The mean — often called the average — is the single value that best represents an entire collection of data. You calculate it by adding all the observed values together and then dividing that sum by how many observations you have. This spreads the total equally across every observation, giving one fair "representative" value for the whole dataset.

Mean = Sum of all observations ÷ Number of observations

This formula works perfectly when data is ungrouped — that is, when you have a raw list of individual values. Let's see two worked examples from the textbook, both showing this formula in action.

Worked Example 1 — Cinema Attendance Over Nine Days

A manager at a small movie theatre recorded the number of people who attended each day over nine consecutive days. The daily attendance figures were:

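| Day | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Attendance | 81 | 89 | 92 | 85 | 93 | 62 | 85 | 105 | 90 |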

The manager wants to know the average daily attendance over these nine days so he can compare it with neighbouring theatres.

Step-by-Step Solution
Mean of ungrouped cinema attendance data
Number of observations (n) = 9
Sum = 81 + 89 + 92 + 85 + 93 + 62 + 85 + 105 + 90 = 782
Mean = Sum ÷ n = 782 ÷ 9 = 86.888… ≈ 86.9
Interpretation: On average, about 87 people (86.9 to one decimal place) attended the cinema each day over the nine-day period. Notice that on Day 6 only 62 people came, well below average, while Day 8 had 105, well above. The mean balances all these highs and lows into one representative figure.
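
The same arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: Python is not part of the textbook, and the variable names are chosen here for readability.

```python
# Daily attendance figures from Worked Example 1
attendance = [81, 89, 92, 85, 93, 62, 85, 105, 90]

n = len(attendance)       # number of observations
total = sum(attendance)   # sum of all observations
mean = total / n          # Mean = Sum / Number of observations

print(n, total, round(mean, 1))   # 9 782 86.9
```

Since 782 ÷ 9 = 86.888…, the value to one decimal place is 86.9, which rounds to about 87 people per day.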

Worked Example 2 — Virat Kohli's Test Centuries (2012–2019)

The number of Test cricket centuries scored by Virat Kohli in each year from 2012 to 2019 was: 3, 2, 4, 2, 4, 5, 5, 2. Notice that several values repeat — for instance, the value 2 appears three times, and 4 appears twice. When data has repeating values, you can make your calculation more efficient by grouping identical values together and counting how many times each appears (its frequency).

| Centuries (xᵢ) | Frequency (fᵢ), number of years | fᵢ × xᵢ |
|---|---|---|
| 2 | 3 | 6 |
| 3 | 1 | 3 |
| 4 | 2 | 8 |
| 5 | 2 | 10 |
| Total | Σfᵢ = 8 | Σfᵢxᵢ = 27 |
Step-by-Step Solution
Mean using frequency notation
Mean = Σfᵢxᵢ ÷ Σfᵢ = 27 ÷ 8 = 3.375 centuries per year

This leads us to the general formula for the mean when values have frequencies, which applies to both ungrouped data with repeated values and to grouped (classified) data:

Mean = Σfᵢxᵢ / Σfᵢ
where xᵢ = each observation value, fᵢ = its frequency
📌 Key insight: When you sum all the fᵢ values (Σfᵢ), you get the total number of observations n. And Σfᵢxᵢ is simply the total sum of all values, exactly like the numerator in the basic mean formula. So this frequency formula is not a new idea — it is just a more compact way of writing the same calculation.
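
To see that the frequency formula and the basic formula really are the same calculation, here is a small Python sketch. It uses `collections.Counter` from the standard library to build the frequency table. The variable names are illustrative and not taken from the textbook.

```python
from collections import Counter

# Yearly values from Worked Example 2
centuries = [3, 2, 4, 2, 4, 5, 5, 2]

freq = Counter(centuries)                      # maps each value x_i to its frequency f_i
sum_f = sum(freq.values())                     # sum of f_i = number of observations
sum_fx = sum(x * f for x, f in freq.items())   # sum of f_i * x_i

mean_from_table = sum_fx / sum_f                    # frequency formula
mean_from_list = sum(centuries) / len(centuries)    # basic formula

print(sorted(freq.items()))              # [(2, 3), (3, 1), (4, 2), (5, 2)]
print(sum_f, sum_fx, mean_from_table)    # 8 27 3.375
```

Both `mean_from_table` and `mean_from_list` come out as 3.375, because grouping equal values only changes how the sum is written, not its total.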

Three Methods to Find the Mean of Grouped Data

When data is grouped into class intervals (for example, "wages from Rs. 470–480"), individual values within each class are not known — only the class range and the number of observations (frequency) in that class. To find the mean of such grouped data, the textbook introduces three methods. All three give the same answer, but they differ in how much arithmetic you need to do:

1. Direct Method
Mean = Σfᵢxᵢ / Σfᵢ

Uses the class mark xᵢ directly. Simple but multiplications can be large.

2. Assumed Mean
Mean = a + Σfᵢdᵢ / Σfᵢ

Subtracts an assumed mean a first. Smaller numbers to multiply.

3. Step Deviation
Mean = a + (Σfᵢuᵢ / Σfᵢ) × h

Divides deviations by class width h. Smallest numbers — easiest arithmetic.

All three methods are demonstrated on the same dataset — daily wages of workers — so you can see exactly how the calculations compare and confirm that every method produces the identical answer of Rs. 503.50.

📌 Class mark (midpoint): For grouped data, you cannot know the exact value of each observation within a class interval. The standard assumption is that all values in a class cluster around the midpoint of that class, called the class mark:
Class mark (xᵢ) = (Upper class limit + Lower class limit) ÷ 2
For the class 470–480: xᵢ = (480 + 470) ÷ 2 = 950 ÷ 2 = 475

Method 1 — Direct Method (Fully Worked)

Daily wage data for 20 workers is grouped into six class intervals. In the Direct Method you calculate the class mark for every interval, multiply it by the class frequency, sum up all those products, and divide by the total frequency.

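| Daily Wages (Rs.) | Frequency (fᵢ) | Class Mark (xᵢ) | fᵢ × xᵢ |
|---|---|---|---|
| 470 – 480 | 2 | 475 | 950 |
| 480 – 490 | 3 | 485 | 1455 |
| 490 – 500 | 4 | 495 | 1980 |
| 500 – 510 | 2 | 505 | 1010 |
| 510 – 520 | 5 | 515 | 2575 |
| 520 – 530 | 4 | 525 | 2100 |
| Total | Σfᵢ = 20 | | Σfᵢxᵢ = 10070 |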
Direct Method — Final Calculation
Mean daily wage of workers
Mean = Σfᵢxᵢ ÷ Σfᵢ = 10070 ÷ 20 = Rs. 503.50
Result: The mean daily wage of the workers is Rs. 503.50. You get this by working directly with the actual class marks (475, 485 … 525) — hence the name "Direct Method". The arithmetic is straightforward but the products (950, 1455, 1980 …) are large numbers, which makes this method a little more error-prone by hand.
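
The Direct Method translates almost line for line into code. The sketch below is illustrative only: the function name `direct_mean` and the tuple layout (lower limit, upper limit, frequency) are assumptions made here, not something the textbook defines.

```python
# (lower limit, upper limit, frequency) for each class in the wage table
classes = [
    (470, 480, 2),
    (480, 490, 3),
    (490, 500, 4),
    (500, 510, 2),
    (510, 520, 5),
    (520, 530, 4),
]

def direct_mean(classes):
    """Mean of grouped data by the Direct Method: sum(f * x) / sum(f)."""
    sum_f = 0
    sum_fx = 0
    for lower, upper, f in classes:
        x = (lower + upper) / 2   # class mark (midpoint of the class)
        sum_f += f
        sum_fx += f * x
    return sum_fx / sum_f

print(direct_mean(classes))   # 503.5
```

The loop builds the same two totals as the table (Σfᵢ = 20 and Σfᵢxᵢ = 10070) before dividing.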

Method 2 — Assumed Mean Method (Fully Worked)

To reduce the size of the numbers you work with, you pick one class mark as a convenient reference value called the assumed mean (a) — usually the class mark in the middle of the data. Here, a = 495 is chosen. You then calculate how far each class mark is from this assumed mean, calling that deviation dᵢ = xᵢ − a. Because the deviations are small (−20, −10, 0, 10, 20, 30), the multiplications become much simpler.

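| Daily Wages (Rs.) | fᵢ | xᵢ | dᵢ = xᵢ − 495 | fᵢ × dᵢ |
|---|---|---|---|---|
| 470 – 480 | 2 | 475 | −20 | −40 |
| 480 – 490 | 3 | 485 | −10 | −30 |
| 490 – 500 | 4 | 495 | 0 | 0 |
| 500 – 510 | 2 | 505 | +10 | +20 |
| 510 – 520 | 5 | 515 | +20 | +100 |
| 520 – 530 | 4 | 525 | +30 | +120 |
| Total | Σfᵢ = 20 | | | Σfᵢdᵢ = 170 |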
Assumed Mean Method — Final Calculation
Mean = a + Σfᵢdᵢ / Σfᵢ  (where a = 495)
Mean = 495 + (170 ÷ 20) = 495 + 8.5 = Rs. 503.50 ✓
🔍 Why does this work? Because dᵢ = xᵢ − a, when you compute Σfᵢdᵢ you are finding "how much the actual total differs from what it would be if every observation equalled a". Adding that correction back to a gives you the true mean. You can choose any class mark as a — the result will always be the same. Choosing a middle class mark minimises the size of dᵢ values, making arithmetic easier.
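
The claim that any class mark can serve as a is easy to test. The following Python sketch applies the Assumed Mean formula once for every class mark in the wage table. The function name `assumed_mean_method` is a label chosen here for illustration.

```python
# Class marks and frequencies from the wage table
class_marks = [475, 485, 495, 505, 515, 525]
frequencies = [2, 3, 4, 2, 5, 4]

def assumed_mean_method(class_marks, frequencies, a):
    """Mean = a + sum(f * d) / sum(f), where d = x - a."""
    deviations = [x - a for x in class_marks]
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))
    sum_f = sum(frequencies)
    return a + sum_fd / sum_f

# Try every class mark in turn as the assumed mean
results = {a: assumed_mean_method(class_marks, frequencies, a) for a in class_marks}
for a, mean in results.items():
    print(a, mean)
```

Every line of output shows the same mean, 503.5. Only the size of the intermediate numbers changes: with a = 495 the correction term is 170 ÷ 20 = 8.5, while with a = 475 it is 570 ÷ 20 = 28.5.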

Method 3 — Step Deviation Method (Fully Worked)

When the class intervals are all equal in width (here every class is 10 rupees wide, so h = 10), you can simplify the arithmetic even further by dividing each deviation by h. The result is called uᵢ = dᵢ / h = (xᵢ − a) / h. Because h is a common factor, the numbers you multiply together become the smallest of all three methods (−2, −1, 0, 1, 2, 3 in this example).

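| Daily Wages (Rs.) | fᵢ | xᵢ | dᵢ = xᵢ − 495 | uᵢ = dᵢ / 10 | fᵢ × uᵢ |
|---|---|---|---|---|---|
| 470 – 480 | 2 | 475 | −20 | −2 | −4 |
| 480 – 490 | 3 | 485 | −10 | −1 | −3 |
| 490 – 500 | 4 | 495 | 0 | 0 | 0 |
| 500 – 510 | 2 | 505 | +10 | +1 | +2 |
| 510 – 520 | 5 | 515 | +20 | +2 | +10 |
| 520 – 530 | 4 | 525 | +30 | +3 | +12 |
| Total | Σfᵢ = 20 | | | | Σfᵢuᵢ = 17 |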
Step Deviation Method — Final Calculation
Mean = a + (Σfᵢuᵢ / Σfᵢ) × h  (a = 495, h = 10)
Mean = 495 + (17 ÷ 20) × 10 = 495 + (17 ÷ 2) = 495 + 8.5 = Rs. 503.50 ✓
🔍 Why does this work? Because uᵢ = dᵢ / h, multiplying Σfᵢuᵢ by h at the end "puts back" the h factor that was divided out. This is essentially the Assumed Mean Method but with an extra scaling step that makes the column values (−2, −1, 0, 1, 2, 3) extremely small and easy to multiply. This method is only valid when all class widths are equal.
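
The "putting back" argument can also be written as a short derivation. This is a sketch in the lesson's own symbols, with one extra symbol not used above: x̄ (written `\bar{x}`) stands for the mean.

```latex
\begin{align*}
x_i &= a + h\,u_i \qquad \text{(rearranging } u_i = (x_i - a)/h \text{)} \\
\bar{x} &= \frac{\sum f_i x_i}{\sum f_i}
         = \frac{\sum f_i\,(a + h\,u_i)}{\sum f_i} \\
        &= \frac{a \sum f_i + h \sum f_i u_i}{\sum f_i} \\
        &= a + \left( \frac{\sum f_i u_i}{\sum f_i} \right) \times h
\end{align*}
```

With a = 495, h = 10, Σfᵢuᵢ = 17 and Σfᵢ = 20, the last line gives 495 + 0.85 × 10 = 503.5. Setting h = 1 in the same derivation turns uᵢ into dᵢ and recovers the Assumed Mean formula.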

Comparing All Three Methods at a Glance

All three methods always produce the same answer. Their only difference is the level of arithmetic involved — the step deviation method is the most efficient when class widths are equal, but all three are valid approaches accepted in board exams.

| Feature | Direct Method | Assumed Mean | Step Deviation |
|---|---|---|---|
| Formula | Σfᵢxᵢ / Σfᵢ | a + Σfᵢdᵢ / Σfᵢ | a + (Σfᵢuᵢ / Σfᵢ) × h |
| Extra variable | None | dᵢ = xᵢ − a | uᵢ = (xᵢ − a) / h |
| Numbers to multiply | Large (475 × 2 …) | Small (−20 × 2 …) | Smallest (−2 × 2 …) |
| Works when class widths differ? | Yes | Yes | No (equal widths only) |
| Answer for this dataset | 503.50 | 503.50 | 503.50 |
Symbols used:
  • xᵢ: Class mark (midpoint)
  • fᵢ: Class frequency
  • a: Assumed mean
  • h: Class width (step)
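
As a side-by-side check of the comparison above, this Python sketch runs all three formulas on the wage data. It is an illustration only. The variable names are chosen here, with a = 495 and h = 10 taken from the worked examples.

```python
class_marks = [475, 485, 495, 505, 515, 525]   # x_i
frequencies = [2, 3, 4, 2, 5, 4]               # f_i
a = 495   # assumed mean
h = 10    # class width

sum_f = sum(frequencies)

# Direct Method: sum(f * x) / sum(f)
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))
direct = sum_fx / sum_f

# Assumed Mean Method: a + sum(f * d) / sum(f), with d = x - a
sum_fd = sum(f * (x - a) for f, x in zip(frequencies, class_marks))
assumed = a + sum_fd / sum_f

# Step Deviation Method: a + (sum(f * u) / sum(f)) * h, with u = (x - a) / h
sum_fu = sum(f * ((x - a) / h) for f, x in zip(frequencies, class_marks))
step = a + (sum_fu / sum_f) * h

print(sum_fx, sum_fd, sum_fu)
print(direct, assumed, step)
```

The three column totals shrink from 10070 to 170 to 17, while the three results agree at 503.5 (up to floating-point rounding). That is the whole trade-off between the methods.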

Common Mistakes to Avoid

  • Using raw class limits instead of the class mark: For grouped data, you must always use the midpoint (class mark xᵢ), never the upper or lower boundary of the class. Using an endpoint gives a wrong answer.
  • Choosing an awkward assumed mean: In the Assumed Mean and Step Deviation methods, take a to be one of the class marks in the table, preferably one near the middle. The formula gives the same mean for any value of a, but a class mark keeps the dᵢ values small and round, which is the whole point of the method.
  • Using the Step Deviation method when class widths are unequal: The step deviation formula relies on dividing every dᵢ by the same h. If class widths differ, there is no single h to use — this method simply does not apply.
  • Sign errors in the dᵢ column: Deviations above the assumed mean are positive, below it are negative. A sign error in one row will change Σfᵢdᵢ and give a wrong final answer.
  • Forgetting to multiply Σfᵢuᵢ by h at the end: In the Step Deviation method, the factor h must be multiplied back in the final step. Forgetting this is the most common exam error for this method.
  • Treating Σfᵢ incorrectly: Always confirm that Σfᵢ equals the total number of observations in the dataset, not just the number of classes. Here Σfᵢ = 20 workers, not 6 (the number of class intervals).
Board exam alert: In Telangana, Andhra Pradesh and CBSE board papers, the Statistics questions on Mean almost always involve grouped data with a neatly constructed frequency table. You will typically be asked to find the mean using a specific method, or sometimes to show two methods and verify that they give the same answer. Know all three methods thoroughly.

What This Lesson Prepares You For

This introduction to Mean is the foundation for the rest of Chapter 14. The same style of frequency table is used in all subsequent topics: calculating the Median of grouped data (using the median formula with cumulative frequencies) and finding the Mode of grouped data (using the modal class and the mode formula). You will also encounter ogives (cumulative frequency graphs), which provide a visual way to read off the median from grouped data.

The concept of class mark introduced here connects back to the data handling topics in the Class 9 Statistics chapter. The careful algebraic manipulation used in the Assumed Mean and Step Deviation methods reinforces skills from Polynomials and Algebraic Expressions.

📐 Board Exam Tip (CBSE, Telangana & AP): A 5-mark question on finding the mean of grouped data — typically using the Assumed Mean or Step Deviation method — appears in almost every Class 10 board exam. Practise building the full frequency table with all columns neatly labelled, computing the totals carefully, and writing the substitution step clearly before reaching the final answer. Examiners award marks at every step of the working, so never skip a line.