Introduction to Statistics

Methods to find the mean of grouped data.

Introduction to Statistics — Mean & Methods for Grouped Data

Direct Method, Assumed Mean Method & Step Deviation Method — CBSE, Telangana & Andhra Pradesh Syllabus

Introduction to Statistics — Chapter 14, Class 10 Mathematics

Chapter 14 — Statistics in Class 10 Mathematics (CBSE, Telangana & Andhra Pradesh syllabus) revisits and extends the statistics you studied in earlier classes. In Class 9 you learned how to collect, organise, and represent data using graphs. This chapter takes you further — to analysing data by calculating three key measures of central tendency: the Mean, the Median, and the Mode, particularly for grouped (classified) data where individual values are not listed one by one.

This introductory lesson focuses entirely on the Mean — what it is, how it works for simple (ungrouped) data, and how to extend the concept to handle grouped data efficiently using three different calculation methods: the Direct Method, the Assumed Mean Method, and the Step Deviation Method. By the end of this lesson you will understand not only how to get the answer, but why each method works — and when each one is more convenient to use.

💡 Why statistics matter in real life: Every time you see an exam class average, a city's average temperature, or a company's average monthly sales figure, you are looking at a mean. Statistics is the mathematical language used to summarise large amounts of data into a few meaningful numbers that anyone can quickly understand and compare.

What Is the Mean?

The mean — often called the average — is the single value that best represents an entire collection of data. You calculate it by adding all the observed values together and then dividing that sum by how many observations you have. This spreads the total equally across every observation, giving one fair "representative" value for the whole dataset.

Mean = Sum of all observations ÷ Number of observations

This formula works perfectly when data is ungrouped — that is, when you have a raw list of individual values. Let's see two worked examples from the textbook, both showing this formula in action.

Worked Example 1 — Cinema Attendance Over Nine Days

A manager at a small movie theatre recorded the number of people who attended each day over nine consecutive days. The daily attendance figures were:

Day	1	2	3	4	5	6	7	8	9
Attendance	81	89	92	85	93	62	85	105	90

The manager wants to know the average daily attendance over these nine days so he can compare it with neighbouring theatres.

Step-by-Step Solution

Mean of ungrouped cinema attendance data

Number of observations (n) = 9
Sum = 81 + 89 + 92 + 85 + 93 + 62 + 85 + 105 + 90 = 782
Mean = Sum ÷ n = 782 ÷ 9 ≈ 86.9

✅ Interpretation: On average, about 87 people (86.9 to one decimal place) attended the cinema each day over the nine-day period. Notice that only 62 people came on Day 6, well below average, while Day 8 drew 105, well above. The mean balances all these highs and lows into one representative figure.
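
The same arithmetic can be checked with a few lines of Python. This is only an illustrative sketch; the variable names (`attendance`, `total`, `mean`) are chosen here and do not come from the textbook.

```python
# Daily attendance figures from Worked Example 1
attendance = [81, 89, 92, 85, 93, 62, 85, 105, 90]

n = len(attendance)        # number of observations
total = sum(attendance)    # sum of all observations
mean = total / n           # Mean = Sum / n

print(n, total, round(mean, 1))   # prints: 9 782 86.9
```

Rounding to one decimal place is a presentation choice; the unrounded quotient is 86.888...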

Worked Example 2 — Virat Kohli's Test Centuries (2012–2019)

The number of Test cricket centuries scored by Virat Kohli in each year from 2012 to 2019 was: 3, 2, 4, 2, 4, 5, 5, 2. Notice that several values repeat — for instance, the value 2 appears three times, and 4 appears twice. When data has repeating values, you can make your calculation more efficient by grouping identical values together and counting how many times each appears (its frequency).

Centuries (xᵢ)	Frequency (fᵢ) — number of years	fᵢ × xᵢ
2	3	6
3	1	3
4	2	8
5	2	10
Total	Σfᵢ = 8	Σfᵢxᵢ = 27

Step-by-Step Solution

Mean using frequency notation

Mean = Σfᵢxᵢ ÷ Σfᵢ = 27 ÷ 8 = 3.375 centuries per year

This leads us to the general formula for the mean when values have frequencies, which applies to both ungrouped data with repeated values and to grouped (classified) data:

Mean = Σfᵢxᵢ / Σfᵢ
where xᵢ = each observation value, fᵢ = its frequency

📌 Key insight: When you sum all the fᵢ values (Σfᵢ), you get the total number of observations n. And Σfᵢxᵢ is simply the total sum of all values, exactly like the numerator in the basic mean formula. So this frequency formula is not a new idea — it is just a more compact way of writing the same calculation.
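
To see that the frequency formula and the basic formula really are the same calculation, the short sketch below builds the frequency table from the raw list and computes the mean both ways. The names `freq`, `sum_f`, and `sum_fx` are illustrative choices, not textbook notation.

```python
from collections import Counter

# Centuries per year, 2012 to 2019, from Worked Example 2
centuries = [3, 2, 4, 2, 4, 5, 5, 2]

# Build the frequency table: each distinct value mapped to its count
freq = Counter(centuries)

sum_f = sum(freq.values())                      # Σfᵢ, total observations
sum_fx = sum(x * f for x, f in freq.items())    # Σfᵢxᵢ, total of all values

mean_from_table = sum_fx / sum_f
mean_from_raw = sum(centuries) / len(centuries)

print(sum_f, sum_fx, mean_from_table, mean_from_raw)   # prints: 8 27 3.375 3.375
```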

Three Methods to Find the Mean of Grouped Data

When data is grouped into class intervals (for example, "wages from Rs. 470–480"), individual values within each class are not known — only the class range and the number of observations (frequency) in that class. To find the mean of such grouped data, the textbook introduces three methods. All three give the same answer, but they differ in how much arithmetic you need to do:

1. Direct Method

Mean = Σfᵢxᵢ / Σfᵢ

Uses the class mark xᵢ directly. Simple but multiplications can be large.

2. Assumed Mean

Mean = a + Σfᵢdᵢ / Σfᵢ

Subtracts an assumed mean a first. Smaller numbers to multiply.

3. Step Deviation

Mean = a + (Σfᵢuᵢ / Σfᵢ) × h

Divides deviations by class width h. Smallest numbers — easiest arithmetic.

All three methods are demonstrated on the same dataset — daily wages of workers — so you can see exactly how the calculations compare and confirm that every method produces the identical answer of Rs. 503.50.

📌 Class mark (midpoint): For grouped data, you cannot know the exact value of each observation within a class interval. The standard assumption is that all values in a class cluster around the midpoint of that class, called the class mark:
Class mark (xᵢ) = (Upper class limit + Lower class limit) ÷ 2
For the class 470–480: xᵢ = (480 + 470) ÷ 2 = 950 ÷ 2 = 475
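
As a small sketch, the class-mark rule can be applied to every interval of the wage dataset used in this lesson. Representing each class as a `(lower, upper)` pair is an assumption made here purely for illustration.

```python
# Class intervals of the daily-wage dataset, as (lower limit, upper limit)
classes = [(470, 480), (480, 490), (490, 500),
           (500, 510), (510, 520), (520, 530)]

# Class mark = (upper class limit + lower class limit) / 2
class_marks = [(lower + upper) / 2 for lower, upper in classes]

print(class_marks)   # prints: [475.0, 485.0, 495.0, 505.0, 515.0, 525.0]
```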

Method 1 — Direct Method (Fully Worked)

Daily wage data for 20 workers is grouped into six class intervals. In the Direct Method you calculate the class mark for every interval, multiply it by the class frequency, sum up all those products, and divide by the total frequency.

Daily Wages (Rs.)	Frequency (fᵢ)	Class Mark (xᵢ)	fᵢ × xᵢ
470 – 480	2	475	950
480 – 490	3	485	1455
490 – 500	4	495	1980
500 – 510	2	505	1010
510 – 520	5	515	2575
520 – 530	4	525	2100
Total	Σfᵢ = 20	—	Σfᵢxᵢ = 10070

Direct Method — Final Calculation

Mean daily wage of workers

Mean = Σfᵢxᵢ ÷ Σfᵢ = 10070 ÷ 20 = Rs. 503.50

✅ Result: The mean daily wage of the workers is Rs. 503.50. You get this by working directly with the actual class marks (475, 485 … 525) — hence the name "Direct Method". The arithmetic is straightforward but the products (950, 1455, 1980 …) are large numbers, which makes this method a little more error-prone by hand.
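
The Direct Method table translates almost line for line into code. The function name `direct_mean` and its arguments are hypothetical, introduced only to mirror the table columns.

```python
def direct_mean(classes, freqs):
    """Mean of grouped data by the Direct Method: Σfᵢxᵢ / Σfᵢ."""
    marks = [(lo + hi) / 2 for lo, hi in classes]        # class marks xᵢ
    sum_f = sum(freqs)                                   # Σfᵢ
    sum_fx = sum(f * x for f, x in zip(freqs, marks))    # Σfᵢxᵢ
    return sum_fx / sum_f


wage_classes = [(470, 480), (480, 490), (490, 500),
                (500, 510), (510, 520), (520, 530)]
wage_freqs = [2, 3, 4, 2, 5, 4]

print(direct_mean(wage_classes, wage_freqs))   # prints: 503.5
```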

Method 2 — Assumed Mean Method (Fully Worked)

To reduce the size of the numbers you work with, you pick one class mark as a convenient reference value called the assumed mean (a) — usually the class mark in the middle of the data. Here, a = 495 is chosen. You then calculate how far each class mark is from this assumed mean, calling that deviation dᵢ = xᵢ − a. Because the deviations are small (−20, −10, 0, 10, 20, 30), the multiplications become much simpler.

Daily Wages (Rs.)	fᵢ	xᵢ	dᵢ = xᵢ − 495	fᵢ × dᵢ
470 – 480	2	475	−20	−40
480 – 490	3	485	−10	−30
490 – 500	4	495	0	0
500 – 510	2	505	+10	+20
510 – 520	5	515	+20	+100
520 – 530	4	525	+30	+120
Total	Σfᵢ = 20	—	—	Σfᵢdᵢ = 170

Assumed Mean Method — Final Calculation

Mean = a + Σfᵢdᵢ / Σfᵢ (where a = 495)

Mean = 495 + (170 ÷ 20) = 495 + 8.5 = Rs. 503.50 ✓

🔍 Why does this work? Because dᵢ = xᵢ − a, when you compute Σfᵢdᵢ you are finding "how much the actual total differs from what it would be if every observation equalled a". Adding that correction back to a gives you the true mean. You can choose any class mark as a — the result will always be the same. Choosing a middle class mark minimises the size of dᵢ values, making arithmetic easier.
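
The claim that the choice of a does not change the answer is easy to test. The sketch below, with the hypothetical helper `assumed_mean_method`, repeats the calculation for three different class marks.

```python
class_marks = [475, 485, 495, 505, 515, 525]   # xᵢ
freqs = [2, 3, 4, 2, 5, 4]                     # fᵢ


def assumed_mean_method(marks, freqs, a):
    """Mean = a + Σfᵢdᵢ / Σfᵢ, where dᵢ = xᵢ - a."""
    deviations = [x - a for x in marks]                       # dᵢ
    sum_fd = sum(f * d for f, d in zip(freqs, deviations))    # Σfᵢdᵢ
    return a + sum_fd / sum(freqs)


# Try the first, a middle, and the last class mark as the assumed mean
for a in (475, 495, 525):
    print(a, assumed_mean_method(class_marks, freqs, a))
# prints:
# 475 503.5
# 495 503.5
# 525 503.5
```

With a = 475 or a = 525 the deviations are larger (up to 50 in size), which is why a middle class mark is the more comfortable choice by hand.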

Method 3 — Step Deviation Method (Fully Worked)

When the class intervals are all equal in width (here every class is 10 rupees wide, so h = 10), you can simplify the arithmetic even further by dividing each deviation by h. The result is called uᵢ = dᵢ / h = (xᵢ − a) / h. Because h is a common factor, the numbers you multiply together become the smallest of all three methods (−2, −1, 0, 1, 2, 3 in this example).

Daily Wages (Rs.)	fᵢ	xᵢ	dᵢ = xᵢ − 495	uᵢ = dᵢ / 10	fᵢ × uᵢ
470 – 480	2	475	−20	−2	−4
480 – 490	3	485	−10	−1	−3
490 – 500	4	495	0	0	0
500 – 510	2	505	+10	+1	+2
510 – 520	5	515	+20	+2	+10
520 – 530	4	525	+30	+3	+12
Total	Σfᵢ = 20	—	—	—	Σfᵢuᵢ = 17

Step Deviation Method — Final Calculation

Mean = a + (Σfᵢuᵢ / Σfᵢ) × h (a = 495, h = 10)

Mean = 495 + (17 ÷ 20) × 10 = 495 + (17 ÷ 2) = 495 + 8.5 = Rs. 503.50 ✓

🔍 Why does this work? Because uᵢ = dᵢ / h, multiplying Σfᵢuᵢ by h at the end "puts back" the h factor that was divided out. This is essentially the Assumed Mean Method with an extra scaling step that makes the column values (−2, −1, 0, 1, 2, 3) very small and easy to multiply. The method is convenient only when every dᵢ shares the common factor h, which is guaranteed when all class widths are equal.
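
The "puts back" argument can be written out as a short derivation. It uses only the symbols already defined in this lesson, except that the mean is written as \bar{x}, a notation assumed here for brevity.

```latex
\begin{align*}
u_i = \frac{x_i - a}{h} \quad &\Longrightarrow \quad x_i = a + h\,u_i \\
\sum f_i x_i &= \sum f_i \,(a + h\,u_i) = a \sum f_i + h \sum f_i u_i \\
\bar{x} = \frac{\sum f_i x_i}{\sum f_i} &= a + h \cdot \frac{\sum f_i u_i}{\sum f_i}
\end{align*}
```

Setting h = 1 makes uᵢ equal to dᵢ, and the last line reduces to the Assumed Mean formula.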

Comparing All Three Methods at a Glance

All three methods always produce the same answer. Their only difference is the level of arithmetic involved — the step deviation method is the most efficient when class widths are equal, but all three are valid approaches accepted in board exams.

Feature	Direct Method	Assumed Mean	Step Deviation
Formula	`Σfᵢxᵢ / Σfᵢ`	`a + Σfᵢdᵢ / Σfᵢ`	`a + (Σfᵢuᵢ / Σfᵢ)×h`
Extra variable	None	dᵢ = xᵢ − a	uᵢ = (xᵢ − a) / h
Numbers to multiply	Large (475 × 2 …)	Small (−20 × 2 …)	Smallest (−2 × 2 …)
Works when class widths differ?	Yes	Yes	No — equal widths only
Answer for this dataset	503.50	503.50	503.50

Symbols used:
xᵢ = Class mark (midpoint)
fᵢ = Class frequency
a = Assumed mean
h = Class width (step)
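
As a cross-check on the comparison table, the following sketch runs all three formulas on the wage data. It uses Python's `fractions.Fraction` so that the comparison is exact rather than subject to floating-point rounding; the variable names are illustrative.

```python
from fractions import Fraction

classes = [(470, 480), (480, 490), (490, 500),
           (500, 510), (510, 520), (520, 530)]
freqs = [2, 3, 4, 2, 5, 4]
a, h = 495, 10                      # assumed mean and class width

marks = [Fraction(lo + hi, 2) for lo, hi in classes]   # class marks xᵢ
n = sum(freqs)                                         # Σfᵢ

# Direct Method: Σfᵢxᵢ / Σfᵢ
direct = sum(f * x for f, x in zip(freqs, marks)) / n

# Assumed Mean Method: a + Σfᵢdᵢ / Σfᵢ, with dᵢ = xᵢ - a
assumed = a + sum(f * (x - a) for f, x in zip(freqs, marks)) / n

# Step Deviation Method: a + (Σfᵢuᵢ / Σfᵢ) × h, with uᵢ = (xᵢ - a) / h
step = a + sum(f * (x - a) / h for f, x in zip(freqs, marks)) / n * h

print(float(direct), float(assumed), float(step))   # prints: 503.5 503.5 503.5
```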

Common Mistakes to Avoid

Using raw class limits instead of the class mark: For grouped data, you must always use the midpoint (class mark xᵢ), never the upper or lower boundary of the class. Using an endpoint gives a wrong answer.
Choosing an assumed mean that is not a class mark: In the Assumed Mean and Step Deviation methods, take a to be one of the class marks in the table, ideally one near the middle. The formulas still give the correct mean for other values of a, but a value that is not a class mark produces awkward dᵢ values (and fractional uᵢ values), which invites arithmetic slips.
Using the Step Deviation method when class widths are unequal: The step deviation formula relies on dividing every dᵢ by the same h. If class widths differ, there is no single class width h to divide by, so use the Direct Method or the Assumed Mean Method instead.
Sign errors in the dᵢ column: Deviations above the assumed mean are positive, below it are negative. A sign error in one row will change Σfᵢdᵢ and give a wrong final answer.
Forgetting to multiply Σfᵢuᵢ by h at the end: In the Step Deviation method, the factor h must be multiplied back in the final step. Forgetting this is the most common exam error for this method.
Treating Σfᵢ incorrectly: Always confirm that Σfᵢ equals the total number of observations in the dataset, not just the number of classes. Here Σfᵢ = 20 workers, not 6 (the number of class intervals).
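
One way to guard against the unequal-width mistake is to check the table before choosing a method. The helper below is a hypothetical sketch, not part of any textbook procedure: it returns the shared class width h, or raises an error when the widths differ.

```python
def common_class_width(classes):
    """Return the class width h shared by all classes, or raise ValueError."""
    widths = {upper - lower for lower, upper in classes}   # distinct widths
    if len(widths) != 1:
        raise ValueError(f"class widths differ: {sorted(widths)}")
    return widths.pop()


wage_classes = [(470, 480), (480, 490), (490, 500),
                (500, 510), (510, 520), (520, 530)]

print(common_class_width(wage_classes))   # prints: 10
```

For a table such as 0–10, 10–30, 30–40 the function raises `ValueError`, signalling that the Direct or Assumed Mean method is the safer choice.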

⛔ Board exam alert: In Telangana, Andhra Pradesh, and CBSE board papers, the Statistics questions on the mean almost always involve grouped data with a neatly constructed frequency table. You will typically be asked to find the mean using a specific method, or sometimes to use two methods and verify that they give the same answer. Know all three methods thoroughly.

What This Lesson Prepares You For

This introduction to Mean is the foundation for the rest of Chapter 14. The same style of frequency table is used in all subsequent topics: calculating the Median of grouped data (using the median formula with cumulative frequencies) and finding the Mode of grouped data (using the modal class and the mode formula). You will also encounter ogives (cumulative frequency graphs), which provide a visual way to read off the median from grouped data.

The concept of class mark introduced here connects back to the data handling topics in the Class 9 Statistics chapter. The careful algebraic manipulation used in the Assumed Mean and Step Deviation methods reinforces skills from Polynomials and Algebraic Expressions.

📐 Board Exam Tip (CBSE, Telangana & AP): A 5-mark question on finding the mean of grouped data — typically using the Assumed Mean or Step Deviation method — appears in almost every Class 10 board exam. Practise building the full frequency table with all columns neatly labelled, computing the totals carefully, and writing the substitution step clearly before reaching the final answer. Examiners award marks at every step of the working, so never skip a line.