Introduction

A star schema is a traditional database schema in which a central fact table is connected to a set of surrounding dimension tables, so that a diagram of it resembles a star. It consists of fact tables and dimension tables. Fact tables hold numeric measures, whereas dimension tables give context to those measures.

It is very simple in structure. A star schema can also be defined as a fact table that is surrounded by dimension tables. The dimensions are kept in denormalized form, which means they carry a lot of data redundancy, and this can sometimes cause data integrity and storage issues as well.

Star schemas are used in analytical and reporting applications because a typical query needs only a few joins. They can also involve several layers of summarization to retrieve information faster. The star schema is widely used as an approach to developing data warehouses and dimensional data marts.

In this article let us look at:

  1. Fact Tables
  2. Dimension Table
  3. Characteristics
  4. Advantages
  5. Disadvantages
  6. Star Schema And Snowflake Schema
  7. Star Schema Vs Snow Flake Schema

1. Fact Tables

The whole purpose of star schema data warehousing and OLAP (Online Analytical Processing) systems is to analyze data and derive meaningful information from it. Fact tables are the tables that hold the data to be analyzed. Because the analysis depends on them, the numeric values stored in a fact table are its most important content. A fact table in a star schema has two main kinds of columns: facts (the measures) and foreign keys to the dimensions.

The fact table can also be defined as the central table of a star schema, surrounded by dimension tables that each have a primary key. The fact table references those primary keys through foreign keys, and together the foreign keys define the grain of the fact table. The grain is very important because it determines what a single row represents and helps to identify each row uniquely. Alongside the keys, the fact table holds the measure columns that give the data its meaning.
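The two kinds of columns and the idea of a grain can be sketched with Python's built-in sqlite3 module. The table and column names (fact_sales, dim_date, dim_product, dim_store) and the sample rows are hypothetical and chosen only for illustration. The composite primary key is one possible way of stating the grain, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT, city TEXT);

-- Assumed grain: one row per product, per store, per day.
CREATE TABLE fact_sales (
    date_key    INTEGER NOT NULL REFERENCES dim_date(date_key),
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    store_key   INTEGER NOT NULL REFERENCES dim_store(store_key),
    quantity    INTEGER NOT NULL,  -- fact (measure)
    revenue     REAL    NOT NULL,  -- fact (measure)
    PRIMARY KEY (date_key, product_key, store_key)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO dim_date VALUES (20240105, '2024-01-05', '2024-01')")
conn.execute("INSERT INTO dim_product VALUES (1, 'Rice 1kg', 'Grains')")
conn.execute("INSERT INTO dim_store VALUES (1, 'Main Street', 'Springfield')")
conn.execute("INSERT INTO fact_sales VALUES (20240105, 1, 1, 3, 7.50)")

# A second row with the same date, product, and store violates the grain.
try:
    conn.execute("INSERT INTO fact_sales VALUES (20240105, 1, 1, 2, 5.00)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print("duplicate rejected:", duplicate_rejected)
```

In this sketch the foreign keys say which dimension rows a fact belongs to, and the primary key over those foreign keys makes the database reject a second row at the same grain.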

To understand this, we could take an e-commerce website as a real-life example, but a simpler scenario works as well. Let us say that a person has opened a new grocery store in an area and has hired a few employees. Once the business starts generating sales, the owner would like to know about the transactions made, the employees’ work, the kinds of products sold in the last month, and the revenue generated.

Whether the revenue exceeds the employees’ salaries and other expenses indicates whether the business is making a profit or a loss. The fact table can answer these questions with its two kinds of columns: the facts and the foreign keys to the dimensions.
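The grocery store questions can be sketched as queries against a small fact table. Everything here is an assumption made for illustration: the table names, the sample figures, and the monthly_costs dictionary that stands in for salaries and expenses. The month is stored directly on the fact row to keep the sketch short; a fuller design would use a date dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_employee (employee_key INTEGER PRIMARY KEY, employee_name TEXT);
CREATE TABLE fact_sales (
    sale_month   TEXT,
    product_key  INTEGER REFERENCES dim_product(product_key),
    employee_key INTEGER REFERENCES dim_employee(employee_key),
    revenue      REAL
);
INSERT INTO dim_product  VALUES (1, 'Rice'), (2, 'Milk');
INSERT INTO dim_employee VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO fact_sales VALUES
    ('2024-01', 1, 1, 1200.0),
    ('2024-01', 2, 1,  650.0),
    ('2024-01', 2, 2,  400.0),
    ('2024-02', 1, 2,  300.0);
""")

# Hypothetical costs for the month being analyzed.
monthly_costs = {"salaries": 1700.0, "other_expenses": 300.0}

# Which products were sold last month, and for how much?
revenue_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    WHERE f.sale_month = '2024-01'
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

# How much did each employee sell?
revenue_by_employee = conn.execute("""
    SELECT e.employee_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_employee e ON e.employee_key = f.employee_key
    WHERE f.sale_month = '2024-01'
    GROUP BY e.employee_name
    ORDER BY e.employee_name
""").fetchall()

# Does revenue exceed salaries and expenses?
total_revenue = conn.execute(
    "SELECT SUM(revenue) FROM fact_sales WHERE sale_month = '2024-01'"
).fetchone()[0]
profit = total_revenue - sum(monthly_costs.values())

print(revenue_by_product, revenue_by_employee, profit)
```

Each question uses the same pattern: join the fact table to one dimension through its foreign key, then sum a measure.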

2. Dimension Table

It contains the descriptive attributes that make the facts meaningful. Fact tables hold only the keys of the dimension tables, whereas dimension tables hold the descriptive details. For example, for an employee of a company, the dimension table stores information such as the office name, the phone number, and the team the employee works for.
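A minimal sketch of the employee example, assuming a hypothetical dim_employee table and made-up values: the fact rows carry only employee_key, and the join pulls the descriptive details from the dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read result columns by name

conn.executescript("""
CREATE TABLE dim_employee (
    employee_key  INTEGER PRIMARY KEY,
    employee_name TEXT,
    office_name   TEXT,
    phone         TEXT,
    team          TEXT
);
-- The fact table stores only the key and the measure.
CREATE TABLE fact_sales (
    employee_key INTEGER REFERENCES dim_employee(employee_key),
    revenue      REAL
);
INSERT INTO dim_employee VALUES (7, 'Asha', 'Head Office', '555-0100', 'Retail');
INSERT INTO fact_sales VALUES (7, 120.0), (7, 80.0);
""")

row = conn.execute("""
    SELECT e.employee_name, e.office_name, e.team, SUM(f.revenue) AS total_revenue
    FROM fact_sales f
    JOIN dim_employee e ON e.employee_key = f.employee_key
    GROUP BY e.employee_key, e.employee_name, e.office_name, e.team
""").fetchone()

print(row["employee_name"], row["office_name"], row["team"], row["total_revenue"])
```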

Two words, ‘slice’ and ‘dice’, are often heard together. To report the outcome of a particular task or a company’s performance, such as profit or loss, we need to slice the data. Slicing is like filtering: it fixes one dimension to a single value. Dicing restricts two or more dimensions at once and yields a smaller cube of data, which is usually aggregated afterwards.
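In common OLAP usage, a slice fixes one dimension to a single value and a dice restricts several dimensions at once, and both are usually followed by aggregation. The sketch below shows the two operations as SQL filters over a tiny star schema. The tables and figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales  (date_key INTEGER, product_key INTEGER, store_key INTEGER, revenue REAL);

INSERT INTO dim_date    VALUES (1, '2024-01'), (2, '2024-02');
INSERT INTO dim_product VALUES (1, 'Grains'), (2, 'Dairy');
INSERT INTO dim_store   VALUES (1, 'Springfield'), (2, 'Shelbyville');
INSERT INTO fact_sales  VALUES
    (1, 1, 1, 100.0), (1, 2, 1, 60.0), (1, 1, 2, 40.0),
    (2, 1, 1, 70.0),  (2, 2, 2, 30.0);
""")

# Shared join of the fact table to all three dimensions.
JOINED = """
FROM fact_sales f
JOIN dim_date d    ON d.date_key = f.date_key
JOIN dim_product p ON p.product_key = f.product_key
JOIN dim_store s   ON s.store_key = f.store_key
"""

# Slice: fix ONE dimension (month) to a single value, then aggregate.
slice_rows = conn.execute(
    "SELECT p.category, SUM(f.revenue) " + JOINED +
    "WHERE d.month = ? GROUP BY p.category ORDER BY p.category",
    ("2024-01",),
).fetchall()

# Dice: restrict SEVERAL dimensions (month and city) to get a smaller sub-cube.
dice_rows = conn.execute(
    "SELECT p.category, SUM(f.revenue) " + JOINED +
    "WHERE d.month = ? AND s.city = ? GROUP BY p.category ORDER BY p.category",
    ("2024-01", "Springfield"),
).fetchall()

print("slice:", slice_rows)
print("dice: ", dice_rows)
```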

3. Characteristics

Following are the features of a star schema. 

  • The star schema is specially designed for data warehouse modeling.
  • The fact table is located in the center, surrounded by the dimension tables, which gives the schema the shape of a star.
  • The dimension tables in a star schema are fully denormalized.
  • It is flexible: the design can be extended easily throughout the development cycle.
  • It mirrors in the back-end design how end-users might think about and use the data.
  • It reduces the complexity of metadata for both developers and end-users.

4. Advantages

  • Star schema is the simplest and easiest design for developers
  • It optimizes navigation through the database
  • Star schema is the easiest design for end-users to use
  • It is well suited to query processing
  • Customers can easily understand a well-designed star schema
  • Customers can easily analyze large multidimensional data sets

5. Disadvantages

  • The model can become complex when it contains multiple fact tables
  • Redundant data in the denormalized dimension tables makes the schema harder to manage
  • Dimension tables in a star schema can be very large
  • In some scenarios, the size of the fact tables is significantly reduced
  • A change to an item is made once in the dimension table rather than many times in the fact tables, but a value that is repeated across many dimension rows must be updated in every one of them
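The redundancy that comes with denormalized dimensions can be illustrated with a small sketch. The dim_product table and its rows are hypothetical. Because the category name is repeated on every product row, renaming one category touches several rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimension: the category name is repeated on every product row.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
INSERT INTO dim_product VALUES
    (1, 'Rice',  'Grains'),
    (2, 'Wheat', 'Grains'),
    (3, 'Oats',  'Grains'),
    (4, 'Milk',  'Dairy');
""")

# Renaming a single category has to rewrite every row that repeats it.
cur = conn.execute(
    "UPDATE dim_product SET category = ? WHERE category = ?",
    ("Cereals and Grains", "Grains"),
)
rows_touched = cur.rowcount

print("rows updated for one category rename:", rows_touched)
```

If one of those rows were missed, the same category would appear under two names, which is the kind of integrity issue that redundancy can cause.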

6. Star Schema And Snowflake Schema

The snowflake schema is very similar to the star schema. The only exception is the dimension tables: in the snowflake design, they are normalized. The star design is fully denormalized, whereas the snowflake design can have normalized dimension tables. We can turn a star schema into a more complex snowflake design by normalizing a dimension table into several tables.
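The difference can be sketched by modeling the same product dimension both ways. All table names and rows are hypothetical. The snowflake version moves the category into its own table, so the same question needs one more join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: one denormalized product dimension.
CREATE TABLE star_dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT);

-- Snowflake: the category is normalized into its own table.
CREATE TABLE snow_dim_category (category_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE snow_dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES snow_dim_category(category_key));

CREATE TABLE fact_sales (product_key INTEGER, revenue REAL);

INSERT INTO star_dim_product  VALUES (1, 'Rice', 'Grains'), (2, 'Wheat', 'Grains'), (3, 'Milk', 'Dairy');
INSERT INTO snow_dim_category VALUES (10, 'Grains'), (20, 'Dairy');
INSERT INTO snow_dim_product  VALUES (1, 'Rice', 10), (2, 'Wheat', 10), (3, 'Milk', 20);
INSERT INTO fact_sales        VALUES (1, 100.0), (2, 50.0), (3, 60.0);
""")

# Star: a single join reaches the category name.
star_result = conn.execute("""
    SELECT p.category_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN star_dim_product p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
""").fetchall()

# Snowflake: the same question needs an extra join through the category table.
snowflake_result = conn.execute("""
    SELECT c.category_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN snow_dim_product p  ON p.product_key = f.product_key
    JOIN snow_dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()

print(star_result, snowflake_result)
```

Both queries return the same totals. The star version repeats 'Grains' on two rows, while the snowflake version stores it once.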

7. Star Schema Vs Snow Flake Schema

Criterion     | Star Schema                              | Snowflake Schema
Normalization | Pure de-normalized dimension tables      | Normalized dimension tables
Query         | Simple                                   | Complex
Maintenance   | More maintenance due to more redundancy  | Less maintenance due to less redundancy

Conclusion

A star schema turns business process data into facts, which hold measurable, quantitative data about a business. A star schema whose fact table is surrounded by a very large number of dimensions is sometimes called a centipede schema.

