Myths surrounding Dimensional Models

I came across an interesting article dispelling myths around dimensional models and thought I would discuss them here.

Myth 1: Dimensional models are only for summary-level data – This myth is the result of an ill-conceived design, or of planning a design that caters only to certain applications requiring a narrow set of query results. Dimensional models should be queryable from the lowest grain to the highest possible and should therefore contain the full set of detail data. There is practically no limit on how much data dimensional model tables can hold. Summary-level data should complement the granular data, never replace it. The amount of data stored in a dimensional model should be driven by business requirements and processes.
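To illustrate the point, here is a minimal sketch in Python with sqlite3 (the table, columns, and sample rows are all hypothetical): sales are kept at the individual transaction grain, and the monthly summary is derived from those granular rows on demand rather than stored in their place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fact table kept at the lowest grain: one row per sale line.
conn.execute("""
    CREATE TABLE fact_sales (
        sale_date  TEXT,
        product_id INTEGER,
        store_id   INTEGER,
        quantity   INTEGER,
        amount     REAL
    )
""")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
    [
        ("2023-01-05", 1, 10, 2, 19.98),
        ("2023-01-17", 1, 10, 1, 9.99),
        ("2023-02-03", 2, 11, 5, 24.95),
    ],
)

# Summary-level data is computed from the granular rows, not stored
# instead of them, so any other grain remains queryable as well.
for row in conn.execute("""
    SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS total
    FROM fact_sales
    GROUP BY month
"""):
    print(row)
```

Because the detail rows are retained, the same table can answer daily, per-store, or per-product questions without any redesign.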

Myth 2: Dimensional models are departmental rather than enterprise-wide – Dimensional models should be designed around business processes, and the data captured and collected can then be used by different departments across the company. They should not be designed separately and solely for a single business unit, but built for use across different groups within the company. For example, sales data can be stored centrally in a dimensional model and then used in different data marts designed and developed for different groups such as pricing, promotions, and sales strategy.
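As a rough sketch of that idea (all names and figures are made up), the snippet below shows one centrally stored fact table serving two different groups, each deriving its own view from the same rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One centrally stored, process-oriented fact table (hypothetical names).
conn.execute("""
    CREATE TABLE fact_sales (
        product_id INTEGER,
        on_promo   INTEGER,   -- 1 if sold under a promotion
        quantity   INTEGER,
        amount     REAL
    )
""")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 0, 2, 19.98), (1, 1, 4, 31.96), (2, 0, 1, 4.99)],
)

# The pricing group derives the average realized unit price per product...
pricing = conn.execute("""
    SELECT product_id, SUM(amount) / SUM(quantity) AS avg_unit_price
    FROM fact_sales GROUP BY product_id
""").fetchall()

# ...while the promotions group measures promotional revenue from the
# very same centrally stored rows.
promos = conn.execute("""
    SELECT on_promo, SUM(amount) AS revenue
    FROM fact_sales GROUP BY on_promo
""").fetchall()

print(pricing)
print(promos)
```

Neither group needed its own copy of the data; the process-centered design is what makes the model usable across departments.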

Myth 3: Dimensional models are not scalable – This is completely untrue; dimensional models are extremely scalable. All the major database vendors have built their products to accommodate billions of rows of data. However large a normalized model's data and relationships may be, they can be translated into a dimensional model because the two share the same logical content.
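A small sketch of that translation, with hypothetical sqlite3 tables: normalized customer, city, and region tables are joined once into a single flat dimension table that carries the same logical content.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized source tables (names are hypothetical).
    CREATE TABLE region   (region_id INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE city     (city_id INTEGER PRIMARY KEY, city_name TEXT,
                           region_id INTEGER REFERENCES region(region_id));
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT,
                           city_id INTEGER REFERENCES city(city_id));
    INSERT INTO region VALUES (1, 'West');
    INSERT INTO city VALUES (1, 'Seattle', 1);
    INSERT INTO customer VALUES (100, 'Acme Corp', 1);

    -- The same logical content flattened into a single dimension table.
    CREATE TABLE dim_customer AS
    SELECT c.customer_id, c.name, ci.city_name, r.region_name
    FROM customer c
    JOIN city ci  ON ci.city_id = c.city_id
    JOIN region r ON r.region_id = ci.region_id;
""")
print(conn.execute("SELECT * FROM dim_customer").fetchall())
```

Nothing is lost in the flattening; the row count and the facts it describes are unchanged, which is why scale is no harder here than in the normalized form.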

Myth 4: Dimensional models cannot be integrated – Dimensional attributes are highly reusable, and the ETL process often transforms and replicates them into other databases in whatever form is needed. Dimensional models integrate very well with other databases when they serve as a centralized source of master data, with ETL feeds replicating and sourcing the data for different applications.
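The sketch below illustrates such a feed under assumed names: a master product dimension in a central database is extracted, reshaped into the target's format, and loaded into a downstream application's own table.

```python
import sqlite3

# Hypothetical central warehouse holding the master product dimension.
hub = sqlite3.connect(":memory:")
hub.execute(
    "CREATE TABLE dim_product (product_id INTEGER, name TEXT, category TEXT)"
)
hub.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")],
)

# Hypothetical downstream application database fed by the ETL process.
app = sqlite3.connect(":memory:")
app.execute("CREATE TABLE products (id INTEGER, label TEXT)")

# Extract the conformed attributes, transform them to the target's
# shape, and load them into the application's table.
rows = hub.execute(
    "SELECT product_id, name || ' (' || category || ')' FROM dim_product"
)
app.executemany("INSERT INTO products VALUES (?, ?)", rows)

print(app.execute("SELECT * FROM products").fetchall())
```

Because every consuming application is fed from the same dimension, the attribute values stay consistent everywhere they land.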