We use these measures of central tendency whenever we need a representative value for a dataset (distribution), for example to compare samples with each other or against a benchmark value. Ex: comparing the performance of two student groups on a module, or comparing the employee turnover rate with a benchmark value set by the management.
Now, which measure of central tendency should you use: the mean, the median, the mode, or all three?
- If the distribution is of a categorical variable such as gender, hair colour or religion, use the mode as the representative value.
- If it is of a quantitative variable such as height, weight or marks for a module, you can use either the median or the mean.
- If that quantitative dataset is skewed or contains outliers (extreme values), go for the median.
Ex: Let’s take a data set of the distance (in kilometers) from home to office for a sample of employees of XYZ company: 1, 11, 20, 20, 20, 30, 178. You can see that 178 Km does not fit the pattern of the other observations, which all lie between 1 Km and 30 Km. So 178 Km can be called an outlier. The distribution is also positively skewed. Find the mean, median and mode of this data set.
Mean = 40 Km
Median = 20 Km
Mode = 20 Km
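If you want to check these three values yourself, here is a minimal sketch using Python's built-in `statistics` module. The variable names (`distances_km` and so on) are only illustrative and not part of the original example.

```python
from statistics import mean, median, mode

# Distances (Km) from home to office for the sample of XYZ employees
distances_km = [1, 11, 20, 20, 20, 30, 178]

mean_km = mean(distances_km)      # sum of values divided by their count: 280 / 7
median_km = median(distances_km)  # middle value of the sorted data (4th of 7)
mode_km = mode(distances_km)      # most frequent value (20 appears three times)

print("Mean   =", mean_km, "Km")
print("Median =", median_km, "Km")
print("Mode   =", mode_km, "Km")
```

The sum of the seven distances is 280, so the mean works out to 280 / 7 = 40 Km, while the middle and most frequent value are both 20 Km.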
The mean is twice the median and the mode. Which measure gives the best impression of the central tendency of this data set? If you use the mean here, you would say that these employees come from faraway places, as the average distance is 40 Km. But the mean is this high only because of a single record (178 Km). If you base your conclusions on the median or mode instead, you would say they travel a distance that is manageable daily, as those values are just 20 Km.
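To see how much that single 178 Km record pulls the mean, you can recompute both measures with and without it. This is only an illustrative sketch: dropping the record is done here to show the effect, not as a recommended way to treat outliers, and the variable names are assumptions.

```python
from statistics import mean, median

distances_km = [1, 11, 20, 20, 20, 30, 178]

# Same data with the 178 Km record left out (for comparison only)
without_outlier = [d for d in distances_km if d != 178]

mean_all = mean(distances_km)            # 280 / 7
mean_trimmed = mean(without_outlier)     # 102 / 6
median_all = median(distances_km)        # middle value of 7 observations
median_trimmed = median(without_outlier) # average of the two middle values

print("Mean with / without 178 Km:  ", mean_all, "/", mean_trimmed)
print("Median with / without 178 Km:", median_all, "/", median_trimmed)
```

The mean drops from 40 Km to 17 Km once the one record is left out, while the median stays at 20 Km either way. That stability is why the median is the safer choice for this data set.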
Now you know when to use the mean, median and mode, right? The whole idea is summarized below.