
Cluster Analysis: Scaling

In geometry, all dimensions are equally important. A distance of 2 units along the X axis counts the same as a distance of 2 units along the Y axis. It does not matter what the unit of measurement is, as long as it's the same for both X and Y.

But what if X is measured in yards, Y is measured in centimeters, and Z is measured in nautical miles? A difference of 1 in Z is now equivalent to a difference of 185,200 in Y or about 2,025 in X. Clearly, they must all be converted to a common scale before distances make any sense.
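As a rough illustration of the point above, the sketch below computes the distance between two made-up points, first on the raw numbers and then after converting every axis to meters. The points and the helper names `to_meters` and `euclidean` are assumptions chosen for this example; only the unit definitions (1 yard = 0.9144 m, 1 nautical mile = 1,852 m) are standard.

```python
import math

# Conversion factors to meters (standard unit definitions).
METERS_PER_YARD = 0.9144
METERS_PER_CM = 0.01
METERS_PER_NAUTICAL_MILE = 1852.0


def to_meters(point):
    """Convert a (yards, centimeters, nautical miles) point to meters on every axis."""
    x_yd, y_cm, z_nmi = point
    return (
        x_yd * METERS_PER_YARD,
        y_cm * METERS_PER_CM,
        z_nmi * METERS_PER_NAUTICAL_MILE,
    )


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Two hypothetical points: X in yards, Y in centimeters, Z in nautical miles.
a = (0.0, 0.0, 0.0)
b = (100.0, 100.0, 1.0)

raw_distance = euclidean(a, b)                           # mixes units, so it is meaningless
common_distance = euclidean(to_meters(a), to_meters(b))  # in meters

print(f"raw: {raw_distance:.1f}, in meters: {common_distance:.1f}")
```

On the raw numbers, the single nautical mile in Z barely registers next to the 100 yards and 100 centimeters. Once everything is in meters, Z accounts for almost the entire distance.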

Now take another example, one that is closer to the reality of business analytics. What if we have 3 variables: Income, Number of cars owned, and Age? These 3 variables measure very different things, and thus have very different scales. If we perform cluster analysis on this data, differences in income will most likely dominate the other 2 variables simply because income values are numerically much larger. In most practical cases, all these different variables need to be converted to one scale in order to perform meaningful analysis.
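To make the income example concrete, here is a minimal sketch using z-score standardization (subtract the column mean, divide by the column standard deviation). This is only one common scaling method and is not necessarily the one the video recommends. The three customers and the helper names `zscore_columns` and `euclidean` are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical customers: (income, cars owned, age). All values are made up.
customers = [
    (30000.0, 1.0, 25.0),
    (32000.0, 3.0, 60.0),
    (90000.0, 1.0, 27.0),
]


def zscore_columns(rows):
    """Rescale each column to mean 0 and standard deviation 1.

    Assumes every column has some spread (a nonzero standard deviation).
    """
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for row in rows
    ]


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


scaled = zscore_columns(customers)

# Distances from the first customer to the other two, before and after scaling.
raw_to_second = euclidean(customers[0], customers[1])
raw_to_third = euclidean(customers[0], customers[2])
scaled_to_second = euclidean(scaled[0], scaled[1])
scaled_to_third = euclidean(scaled[0], scaled[2])

print(f"raw:    {raw_to_second:.1f} vs {raw_to_third:.1f}")
print(f"scaled: {scaled_to_second:.2f} vs {scaled_to_third:.2f}")
```

On the raw data, the first customer looks closest to the second purely because their incomes are similar, even though they differ sharply in age and cars owned. After scaling, all three variables contribute on comparable terms, and the first customer ends up closer to the third.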

This video talks about the concept of scaling and various methods of scaling used in analytics.

