What is a Data Warehouse?
By vinayakgole
A data ware house is exactly what the name suggests. Synonymous with a warehouse where supplies are stored and inventories maintained, a data warehouse stores data and Meta data maintained. A data warehouse helps organizations in decision making.
The concept of a data warehouse was formed by Barry Devlin and Paul Murphy in the 1980s. The concept attempted to address the problems associated with flow of data from operational systems to decision support systems. TeraData was one of the first companies to come out with a database specially designed for data warehousing. A data warehouse is designed mainly for queries whereas a transactional system is optimized to data manipulation operations. Over the years there have been many significant developments in the fields of design of data warehouses but all of them follow the Codd rules.
A data warehouse is basically made up of three parts: 1. The source systems from where the source data comes from. 2. The ETL (Extraction, Transform and Load) process which cleans the source data and loads it into the appropriate tables. 3. Decision support systems and Business Intelligence Systems which report on the stored data.
Design plays a pivotal role in developing a data warehouse. The better the design, the better is the performance. There are different methodologies for designing, the most significant being the normalized view and dimensional modeling.
1. Normalized view.
The data is mostly stored in a highly normalized table where data never changes. New records are appended periodically.
The advantage is maintaining only a single table or a small set of tables at the most. But disadvantage is storage. Since data is redundant, a large storage has to be in place and updated from time to time.
2. Dimensional modeling
The data in this type of modeling is stored in two types of tables: Fact table and Dimension Tables. The facts or the measurable components are stored in the fact tables. These are also the frequently changing values. The static components of the data are stored in tables called dimension tables. A typical data warehouse modeled in this structure has one fact and many dimensions. The dimensions are joined to the fact tables using keys which are common to both the kind of tables. Dimensions are not joined with each other.
The Extraction process extracts data from source data files which are sent to these systems. The Transformation process transforms the data into the desired format and the Load process loads the data into the data warehouse tables, mostly the fact tables. There are various tools available for the same like Ab-Initio, Informatica and TalenD which is open source.
Once the data is loaded there are reporting mechanisms which are specially designed for data warehousing. The prominent players in this filed are Business Objects, Cognos and Hyperion.
Over the years, these reporting mechanisms have matured to the phenomenon of Business Intelligence Systems and Decision Support Systems. These systems allow analysis of the historical data stored in the data warehouse and are able to predict the future trends. Business Intelligence Systems offer a bird's eye view of the data which can then be drilled down to go deeper to a lower granularity.
Evolution of databases and the phenomenon of SOA (Service Oriented Architecture) have posed a threat to the traditional designs of databases. Data ware houses have traditionally never supported changes in data. Also real time data was not stored in a warehouse. But modern concepts like the active data warehouse and the operational data store has support the data changes and also has real time data. Modern Databases have support more space and there has been an exponential rise in the database speed. Warehouse design has over the years managed to merge these two diverse systems, transactional and the analytical.
Data warehouses are able to offer the management of an organization a detailed as well as summarized view of the data. They also aid in business decisions by analyzing the data for trends. Warehouses thus become a very prized possession for organizations.
Comments
No comments yet.