Introduction to Statistics

Statistics and measures of central tendency.

Class 9 Mathematics · Chapter 9

Statistics — Introduction

Data, types of data, frequency distribution tables, tally marks, inclusive & exclusive classes — everything you need for CBSE, Telangana & Andhra Pradesh board exams.

What is Statistics?

Statistics is the branch of mathematics that deals with the collection, organisation, presentation, analysis, and interpretation of numerical information. In Class 9, Chapter 9 introduces you to the foundational ideas of statistics — starting from what data actually means, all the way to building frequency distribution tables. These concepts form the backbone of the Statistics chapters in Class 10 and beyond, and are regularly tested in CBSE, Telangana, and Andhra Pradesh board examinations.

Whenever a teacher records attendance, a scientist measures temperatures, or a government publishes census numbers, they are working with data. Statistics gives us the tools to make sense of all that information.

Definition
Data

Facts or figures — whether numerical or otherwise — that are collected with a definite purpose are called data. Examples include marks scored by students, heights of players, daily temperatures, or monthly rainfall.

Primary Data vs Secondary Data

The first important classification in statistics is based on how the data was originally collected. This determines whether it is primary or secondary data, a distinction that appears as a 1-mark question in many board exams.

Primary Data

Data collected directly by the investigator for a specific purpose for the first time.

  • Freshly gathered, original information
  • Collected through surveys, interviews, experiments, or direct observation
  • Usually more reliable, and specific to the investigator's need

Example: A teacher personally measuring the heights of all students in her class right now.

Secondary Data

Data collected from a source that has already recorded it — not gathered fresh by the investigator.

  • Sourced from registers, books, websites, government records, newspapers
  • Already processed or compiled by someone else
  • Quicker to obtain but may not perfectly match the investigator's need

Example: Using school registers from 2001–2010 to study enrollment trends.

Practice: Identify Primary or Secondary?

The textbook "Do This" activity asks you to classify these two situations. Think carefully — the key question is: who collected the data and when?

Situation | Type | Reason
Collection of enrollment data of students in your school from 2001 to 2010 | Secondary | The data was already recorded in school registers by someone else in the past
Height of students in your class recorded by the physical education teacher | Secondary | If you are using data already recorded by the PE teacher, it is secondary for you; the PE teacher's own act of measuring was primary collection

💡 Tip: The same data can be primary for the person who originally collected it and secondary for anyone else who later uses it. Context matters!

Raw Data and Range

When data is collected but has not yet been arranged or organised in any way, it is called raw data. Consider the marks scored by 15 students in a mathematics test out of 100:

85, 92, 78, 46, 88, 93, 71, 69, 84, 77, 91, 82, 76, 89, 95

This unordered list is raw data. It is hard to draw conclusions directly from it. The first step is to find the range, which tells you how spread out the data is.

Formula
Range of Data

Range = Maximum value − Minimum value
Here: Range = 95 − 46 = 49

Once arranged in ascending order, the data becomes much easier to read:

46, 69, 71, 76, 77, 78, 82, 84, 85, 88, 89, 91, 92, 93, 95

Question | Answer | Explanation
What is the range? | 49 | 95 (max) − 46 (min) = 49
What is the middle value (8th value)? | 84 | After arranging 15 values in order, the 8th is the middle
How many students scored more than 80? | 9 students | Values above 80: 82, 84, 85, 88, 89, 91, 92, 93, 95

📌 The middle value of an arranged dataset is called the median — a concept you will study in detail in the Statistics exercises.
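
The three answers above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the middle-value shortcut assumes an odd number of values, as in this example.

```python
# Marks of the 15 students, copied from the lesson (out of 100).
marks = [85, 92, 78, 46, 88, 93, 71, 69, 84, 77, 91, 82, 76, 89, 95]

# Range = maximum value - minimum value.
data_range = max(marks) - min(marks)

# Arrange in ascending order; with 15 values the middle one is the 8th,
# which sits at index 7 because Python counts from 0.
ordered = sorted(marks)
middle_value = ordered[len(ordered) // 2]

# Count the students who scored more than 80.
above_80 = sum(1 for m in marks if m > 80)

print(data_range, middle_value, above_80)  # 49 84 9
```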

Frequency Distribution Tables — Organising Data

When there are many data points (like marks of 50 students), writing them all out is messy and unhelpful. The solution is to count how many times each value appears — this count is called the frequency — and record it in a frequency distribution table using tally marks.

Consider the marks of 50 students in a test out of 10:

5,8,6,4,2,5,4,9,10,2,1,1,3,4,5,8,6,7,10,2,1,1,3,4,4,5,8,6,7,10,2,8,6,4,2,5,4,9,10,2,1,1,3,4,5,8,6,4,5,8

Step 1: Ungrouped Frequency Distribution (Individual Values)

Each distinct mark is listed, and tally marks are used to count how many students scored that mark. This gives an ungrouped frequency distribution table, also called a table of weighted observations.

Marks   Tally Marks   Number of Students (Frequency)
1       ||||/ |       6
2       ||||/ |       6
3       |||           3
4       ||||/ ||||    9
5       ||||/ ||      7
6       ||||/         5
7       ||            2
8       ||||/ |       6
9       ||            2
10      ||||          4
Total                 50

📌 How to draw tally marks: Each stroke represents one count. Every fifth stroke is drawn diagonally across the previous four (written here as ||||/), making it easy to count in groups of 5.
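
The same counting can be sketched in Python using the standard library's `collections.Counter`. The `tally` helper and its plain-text notation (`||||/` for a bundle of five) are assumptions made for this illustration; they are not part of the textbook.

```python
from collections import Counter

# Marks of the 50 students, copied from the lesson (out of 10).
marks = [5, 8, 6, 4, 2, 5, 4, 9, 10, 2, 1, 1, 3, 4, 5, 8, 6, 7, 10, 2,
         1, 1, 3, 4, 4, 5, 8, 6, 7, 10, 2, 8, 6, 4, 2, 5, 4, 9, 10, 2,
         1, 1, 3, 4, 5, 8, 6, 4, 5, 8]


def tally(count):
    """Hypothetical helper: '||||/' per full bundle of five, then single strokes."""
    bundles, singles = divmod(count, 5)
    parts = ["||||/"] * bundles
    if singles:
        parts.append("|" * singles)
    return " ".join(parts)


# Counter maps each distinct mark to the number of times it appears.
frequency = Counter(marks)

for mark in sorted(frequency):
    print(f"{mark:>5}  {tally(frequency[mark]):<12}  {frequency[mark]:>2}")
print(f"Total  {'':<12}  {sum(frequency.values()):>2}")
```

The final line prints the total of all frequencies, which should equal the number of students (50).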

Step 2: Grouped Frequency Distribution (Class Intervals)

With 10 different marks, the ungrouped table has 10 rows, which is still manageable. But if the marks ranged from 1 to 100, an ungrouped table could need up to 100 rows, which is impractical. The solution is to group the data into class intervals and count the frequency within each group.

Marks (Class Interval) | Number of Students (Frequency)
1 – 3 | 15
4 – 6 | 21
7 – 10 | 14
Total | 50

This is called a Grouped Frequency Distribution Table. It summarises the data compactly and makes patterns much easier to see. Here, the largest group of students (21 out of 50) scored in the 4–6 range.

Key Insight: Grouping reduces the number of rows but gives up the exact values. You trade precision for clarity. In board exams, you must be able to construct both types of tables.
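
As a rough check of the grouped table, the sketch below counts how many marks fall in each of the three class intervals used in this lesson. Both limits of each class are treated as included; the variable names are illustrative.

```python
# Marks of the 50 students, copied from the lesson (out of 10).
marks = [5, 8, 6, 4, 2, 5, 4, 9, 10, 2, 1, 1, 3, 4, 5, 8, 6, 7, 10, 2,
         1, 1, 3, 4, 4, 5, 8, 6, 7, 10, 2, 8, 6, 4, 2, 5, 4, 9, 10, 2,
         1, 1, 3, 4, 5, 8, 6, 4, 5, 8]

# Class intervals from the lesson's grouped table (both limits included).
classes = [(1, 3), (4, 6), (7, 10)]

# For each class, count the marks lying between its lower and upper limit.
grouped = {
    (low, high): sum(1 for m in marks if low <= m <= high)
    for low, high in classes
}

for (low, high), freq in grouped.items():
    print(f"{low} - {high}: {freq}")
print("Total:", sum(grouped.values()))
```

Notice that the grouped counts can be rebuilt from the raw marks, but the raw marks cannot be rebuilt from the grouped counts. That is the loss of precision described above.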

Inclusive Classes vs Exclusive Classes

When we write class intervals in a grouped frequency table, there are two important formats — and confusing them is one of the most common mistakes in Class 9 Statistics board exams.

Inclusive Classes

Classes written as 30–39, 40–49, 50–59, ...

  • Both the lower and upper limits are included in the class
  • Classes do not overlap — 39 belongs to 30–39, and 40 belongs to 40–49
  • Best suited for discrete data (like whole-number marks or counts)

Example: Orange weights recorded to the nearest gram: 30–39 g, 40–49 g, 50–59 g, …

Exclusive Classes

Classes written as 30–40, 40–50, 50–60, ...

  • The upper limit is excluded from the class — it belongs to the next class
  • Classes appear to overlap (40 is the upper limit of 30–40 and the lower limit of 40–50) but by convention, 40 goes into 40–50
  • Best suited for continuous data (like heights, weights, temperatures)

Example: 30–40 includes every value from 30 up to, but not including, 40 (so 39 and 39.9 are in, but 40 is not). 40–50 starts at 40.
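
The difference between the two formats comes down to one comparison sign. The two function names below are made up for this sketch; they simply restate the rules above as conditions.

```python
def in_inclusive_class(value, lower, upper):
    """Inclusive class such as 30-39: both limits belong to the class."""
    return lower <= value <= upper


def in_exclusive_class(value, lower, upper):
    """Exclusive class such as 30-40: the upper limit belongs to the next class."""
    return lower <= value < upper


print(in_inclusive_class(39, 30, 39))    # True: 39 belongs to 30-39
print(in_inclusive_class(40, 30, 39))    # False: 40 belongs to 40-49
print(in_exclusive_class(40, 30, 40))    # False: 40 is excluded from 30-40
print(in_exclusive_class(40, 40, 50))    # True: 40 goes into 40-50
print(in_exclusive_class(39.9, 30, 40))  # True: still below the upper limit
```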

Class Boundaries — Converting Inclusive to Exclusive

Consecutive inclusive classes such as 30–39 and 40–49 leave a gap between them: a value like 39.5 falls in neither class. To bridge these gaps, we use class boundaries:

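Lower boundary = Lower limit − 0.5
Upper boundary = Upper limit + 0.5
(Here 0.5 is half of the gap of 1 between the upper limit of one class and the lower limit of the next.)
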
Inclusive Class | Class Boundaries (Exclusive)
20 – 29 | 19.5 – 29.5
30 – 39 | 29.5 – 39.5
40 – 49 | 39.5 – 49.5
50 – 59 | 49.5 – 59.5
60 – 69 | 59.5 – 69.5
70 – 79 | 69.5 – 79.5
80 – 89 | 79.5 – 89.5
90 – 99 | 89.5 – 99.5
100 – 109 | 99.5 – 109.5
110 – 119 | 109.5 – 119.5

🤔 Common Question: Where does 49.5 go?

In the boundaries above, 49.5 appears as the upper boundary of the 39.5–49.5 class and the lower boundary of the 49.5–59.5 class. There seems to be a conflict!

Convention: By standard rule, a value that falls exactly on a class boundary is placed in the higher class. So 49.5 belongs to 49.5–59.5, not to 39.5–49.5.
Exam Mistake Alert: Students often confuse inclusive and exclusive classes and apply the wrong boundaries. Remember — the form of the class interval tells you the type: 30–39 is inclusive; 30–40 is exclusive.
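
The conversion to class boundaries, and the rule for a value that lands exactly on a boundary, can be sketched as follows. The function names and the `gap` parameter are assumptions for illustration; the lesson itself only uses a gap of 1, which gives the adjustment of 0.5.

```python
def class_boundaries(lower_limit, upper_limit, gap=1):
    """Widen an inclusive class by half the gap between consecutive classes."""
    half_gap = gap / 2
    return lower_limit - half_gap, upper_limit + half_gap


def find_class(value, boundaries):
    """Return the class containing value; a boundary value goes to the higher class."""
    for lower, upper in boundaries:
        if lower <= value < upper:  # upper boundary excluded
            return (lower, upper)
    return None  # value lies outside every class


inclusive_classes = [(20, 29), (30, 39), (40, 49), (50, 59)]
boundaries = [class_boundaries(lo, hi) for lo, hi in inclusive_classes]

print(boundaries)
# [(19.5, 29.5), (29.5, 39.5), (39.5, 49.5), (49.5, 59.5)]
print(find_class(49.5, boundaries))  # (49.5, 59.5)
```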

How to Build a Grouped Frequency Distribution Table

Follow these steps every time you are given raw data and asked to construct a grouped frequency distribution table — a very common 3-mark or 4-mark question in Telangana, AP, and CBSE board exams.

1. Find the Range: Range = Maximum value − Minimum value. This tells you the total span that your classes must cover.
2. Decide the Class Interval Length: Choose a convenient number (5, 10, 20, etc.) so that you get roughly 5–10 classes covering the full range.
3. List the Classes: Write them in order, either inclusive (30–39) or exclusive (30–40), consistently throughout.
4. Tally the Data: Go through each data point one by one and place a tally mark in the correct class row. Every 5th tally is a diagonal cross-stroke.
5. Count the Tallies: Convert each set of tally marks into a number (the frequency). Verify: all frequencies must add up to the total number of data points.
💡 Board Exam Tip (Telangana & AP): Always include a "Total" row at the bottom of your frequency table and check that the sum equals the total number of data points. This shows the examiner that your table is complete and helps you avoid losing marks.
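
The five steps can be put together in one short Python sketch. It is an illustration under stated assumptions, not a prescribed method: the function name `grouped_frequency_table`, the parameters `start` and `width`, and the choice of exclusive classes starting at 40 with length 10 are all made up for this example. The data is the list of 15 marks used earlier in this lesson.

```python
def grouped_frequency_table(data, start, width):
    """Build exclusive classes [start, start + width), ... and count each one."""
    if start > min(data):
        raise ValueError("start must not be greater than the smallest value")

    # Step 1: find the range.
    data_range = max(data) - min(data)

    # Steps 2 and 3: list classes of the chosen length until the maximum is covered.
    classes = []
    lower = start
    while lower <= max(data):
        classes.append((lower, lower + width))
        lower += width

    # Steps 4 and 5: place every value in its class and count.
    table = {c: 0 for c in classes}
    for value in data:
        for low, high in classes:
            if low <= value < high:  # upper limit excluded
                table[(low, high)] += 1
                break

    # Verification: the frequencies must add up to the number of data points.
    if sum(table.values()) != len(data):
        raise ValueError("frequencies do not add up to the number of values")
    return data_range, table


marks = [85, 92, 78, 46, 88, 93, 71, 69, 84, 77, 91, 82, 76, 89, 95]
data_range, table = grouped_frequency_table(marks, start=40, width=10)

print("Range:", data_range)
for (low, high), freq in table.items():
    print(f"{low} - {high}: {freq}")
print("Total:", sum(table.values()))
```

With these choices the marks fall into six classes, from 40–50 up to 90–100, and the total comes to 15, matching the number of students.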

Chapter 9 Introduction — Key Terms at a Glance

Term | Meaning | Example
Data | Facts/figures collected for a purpose | Marks of students, daily rainfall
Primary Data | Collected fresh by the investigator | Teacher measuring heights today
Secondary Data | Already recorded by someone else | School register, census report
Raw Data | Unorganised, unprocessed data | 85, 92, 78, 46, 88, … (as collected)
Range | Max value − Min value | 95 − 46 = 49
Frequency | Number of times a value/class appears | Mark 4 appeared 9 times
Inclusive Class | Both limits included; non-overlapping | 30–39, 40–49, 50–59, …
Exclusive Class | Upper limit excluded; overlapping form | 30–40, 40–50, 50–60, …
Class Boundary | Adjusted limits bridging inclusive gaps | 30–39 becomes 29.5–39.5

What This Introduction Prepares You For

The concepts introduced here (data types, tally marks, frequency tables, and class intervals) are the building blocks for the whole of Chapter 9. In the exercises that follow, you will use grouped frequency distribution tables to draw histograms and frequency polygons (Exercise 9.1) and to calculate measures of central tendency like mean, median, and mode.

In Class 10, the same frequency table format is used in Statistics Chapter 14 to compute the mean using the assumed mean method and to draw cumulative frequency curves (ogives). Getting these fundamentals right in Class 9 makes Class 10 statistics significantly easier.

For Telangana and Andhra Pradesh SSC board exams, the introduction section of Statistics typically contributes 1-mark definition questions and 2-mark "identify primary or secondary data" problems. Understanding the difference between inclusive and exclusive classes can also earn you marks in table-construction questions.
