Wednesday, September 25, 2013

Data Warehouse / ETL (Extract, Transform and Load) Concepts

In this post, we will go through data warehouse concepts like OLAP & OLTP System, Origin of DSS System, what is data warehouse, different architecture of data warehouse systems, dimension modelling, fact and dimension tables, star and snow flake schems. I believe that the best way to learn any concept is to ask questions like 'Why this came into existence?' , 'How things are evolved for particular system?' ,'Why this can be done by this way only?' . So, guys please post the questions whatever comes in your mind while going through this topic.

OLTP (Online Transactional Processing System) / Operational System -

  • OLTP systems are the systems which contain current operational data used to control and run fundamental business task . Example : ATM Machine, Airline Reservation system etc.
  • They are characterized by a large number of short online transactions (INSERT, DELETE, UPDATE) .
  • The main significance for OLTP systems is put on very fast processing and maintain data integrity in multi-access environment.
  • Queries of OLTP systems are typically simple and return very few records.
  • The structure of OLTP systems are highly normalized (usually 3NF) which means tables and fields are organized to minimize data redundancy and data dependency.
  • OLTP data is very critical to run the business (data loss may lead to monetary loss or legal liability) , so backup and recovery is very important here.

OLAP (Online Analytical Processing System ) / DSS (Decision Support System) -

  • OLAP systems are the systems which contain historical data used to analyse the trend, support strategic business decision and present the data on demand by customer, third party customer or legal entities .Example : MIS Systems, Airline DWH System which collects last 10 years data about flight reservation.
  • They are chracterized by relatively low volume of transactions.
  • In OLAP systems, there is aggregated historical data stored in multi-dimensional schemas (usually star schema) .OLAP systems are used for data mining as well.
  • OLAP emphasis is on retrieval and it organizes result data of adhoc queries in reasonable amount of time.
  • Queries of OLAP systems are very complex and involve aggregation of records.
  • The sructure of OLAP systems are de-normalized with fewer tables which will help in faster retrieval of data using complex queries.
  • OLAP system is updated by periodic process that works in stand-alone mode (consistency can be assured through update process).
  • Instead of regular backup some environment may consider smply reloading OLTP data as a recovery method.

                                                             COPYRIGHT © 2013 TECH BRAINS