Data Warehouse And OLAP Part-2

Gopal Khadka
6 min readMay 26, 2024

--

Data Warehouse

If you have not read the previous blog, refer here.

Table of Contents

  1. Multi-Dimensional Data
  2. Data Cubes
  3. OLAP
    3.1 Concept
    3.2 Operations
  4. OLAP Servers
    4.1 ROLAP
    4.2 MOLAP
    4.3 HOLAP

Multi-Dimensional Data

It is a way of representing the data from the warehouse in more interactive manner for better and easy understanding using multiple attributes of the table. In the above figure, tabular representation of the data includes attributes like pid, timeid, locid and sales which are then presented as cube in multidimensional representation.

This model just gives the logical view of the data, not the physical view. Physically, data can be stored in any data structures like table, multidimensional arrays, objects, etc. This model of data representation helps to query and interact with data more easily. Unlike graph which shows data in 2D, it goes beyond that.

Data Cube

Data cube is a multidimensional data model used in data mining. It stores summarized and aggregated data values in the form of the cube, which can then be queried in much faster manner than a regular data table.

The data values in the data cube are pre-computed, which makes it an optimal solution for query. When we think of the data cube, it is usually a 3D representation, but it can have any no of dimensions. Data Cube is primarily used in OLAP (discussed below) for performing the online analytical processing in the warehouse.

OLAP

OLAP stands for Online Analytical Processing. It is done on multidimensional data (data cubes) in the data warehouse. It performs analytical operations on data cubes from multiple perspectives.

Rather than querying the database and tables each time for analysis, it performs trend analysis and estimation on aggregated and summarized data cubes where values are pre-computed. Unlike OLAP, OLTP (Online Transaction Processing) is done on the database, usually by performing queries like CRUD operations. The measure factor for effectiveness of the OLAP is its speed.

OLAP Operations

1. Roll up

It is a process of climbing up the dimension hierarchy. It is like zooming out the data cube. If we move up the hierarchy of the location, we get to countries from cities. If we move up again, we get to continents from countries.

In the below example, we sum the given data by rolling up the location from cities to countries. The sum of New York and Chicago makes up the USA and Vancouver and Toronto makes up Canada. Similarly, we can add up two quarters to see the data cube in semi-annual perspective.

https://www.javatpoint.com/olap-operations

2. Drill Down

It is a process of climbing down the dimension hierarchy. It is like zooming in the data cube. If we move down the hierarchy of the location, we get to cities from countries. In other words, it is the reverse of the roll up operation.

https://www.javatpoint.com/olap-operations

In the above example, we can see that time dimension is drilled down, i.e. quarters are converted into months. Let me ask you a question now?
In the roll up operation, we sum the related values to get the zoomed out value. What do we do here? How do we find the individual values from the final sum value? Do we use mean method or any other method?

The answer is no. It is simple. We know that data warehouse is built on top of the database. The data cube is built on top of the data warehouse. All values are recorded on the database if not on the warehouse. So you can easily query the database or usually the data warehouse to gain those individual values.

3. Slice

It is a process of slicing down the multiple values of dimension. It is like slicing bread out of a loaf. We can slice down in any axis (locations, time, types).

https://www.javatpoint.com/olap-operations

In the above figure, we have slice down the data cube for the first quarter only (Q1). Now, we can clearly see the location and item types for the first quarter. Notice that this process converts a 3D data cube into 2D flat representation.

4. Dice

Unlike the slice operation where it selects the whole set of data values for the single dimension, the dice operation selects the subset of data values for multiple dimensions. It is like cutting a cube from all axes to get a smaller cube.

https://www.javatpoint.com/olap-operations

In the above picture, we can see that we get a much smaller cube after the dice operation. We performed the dice operation on the data cube to select only two of each of cities, item types and quarters.

5. Pivot

This operation is also called rotation. It changes the orientation of the data by swapping the rows and columns of the data. It helps to view the data from the new perspective for better analysis and comparison.

In the above example, the pivot operation changes the axes of the data representation.

OLAP Servers

http://www.sql-datatools.com/2016/07/Data-Warehouse-olap-rolap-molap-holap.html

OLAP servers are specialized software systems designed to efficiently manage and analyse the multidimensional data model for faster and effective performance, enabling users to perform complex analytical data operations in the large data set.

Its main purpose is to provide the users with scalable and high performant platform for data analysis and data warehouse operations like slicing, dicing, and more.

ROLAP

ROLAP stands for Relational OLAP. This server works with the data stored in Relational DBMS. RDBMS also helps to store and handle the data in warehouse. The multidimensional data model is built on top of the relational databases. Some of its examples are:

  • Microsoft SQL Server Analysis Services (SSAS) in ROLAP mode
  • Oracle OLAP
  • IBM Cognos ROLAP Server

It provides flexible and cost-effective way to handle the data analysis as it leverages use of the existing relational databases. They are often integrated with BI (Business Intelligence) tools to perform complex analysis to make the informed decision in the business.

Even though it is scalable and cost-effective, it provides slow performance compared to other specialized OLAP servers and can’t handle high no of concurrent analytical workloads.

MOLAP

MOLAP stands for Multidimensional OLAP. These server use data cubes for multidimensional data representation. In the data cube, data is pre-computed, summarized and aggregated for faster query performance. This is what makes it faster than ROLAP. They are known for their rapid responses, easy use and ability to handle complex workloads.

This makes them use a lot more storage than ROLAP for pre-computation process. This process can be quite resource-intensive. Examples of MOLAP servers include IBM Cognos TM1, Oracle Essbase, and SAP BW/4HANA.

HOLAP

HOLAP stands for Hybrid OLAP. Hybrid OLAP is a mixture of both ROLAP and MOLAP. It offers fast computation of MOLAP and higher scalability of ROLAP. HOLAP uses two databases. They use a relational database as the underlying data store, but also maintain a multidimensional cache for frequently accessed data.

  • For queries that can be satisfied by the pre-aggregated data in the multidimensional cache, HOLAP servers can provide fast response times.
  • For queries that require accessing data not stored in the cache, HOLAP servers will fall back to the relational database, potentially with some performance impact.

HOLAP intends to find the balance between performance of MOLAP and flexibility and scalability of ROLAP. Since it uses both kinds of server, it can be complex to implement. Examples of HOLAP servers include Microsoft SQL Server Analysis Services (SSAS) in HOLAP mode and IBM Cognos OLAP.

--

--

Gopal Khadka
Gopal Khadka

Written by Gopal Khadka

Aspiring coder and writer, passionate about crafting compelling stories through code and words. Merging creativity and technology to connect with others.

No responses yet