The need to collect, store and analyze large amounts of data continues to grow in importance for businesses across all industries. Marketing agencies have led the pack in processing large amounts of data in order to gain consumers' attention in a crowded space. However, virtually every other area is beginning to see the value in collecting and processing data. From self-driving vehicles to outcomes-based healthcare, data is becoming the most important part of product offerings. More importantly, business models that are able to monetize their data are seen as the future of big data[1].
Data Warehouse – the what and how of it
A Data Warehouse is a central repository that stores current and historical data from different sources. It can be used for performing analytics to gain insights into any number of areas. A Data Mart is a subset of the Data Warehouse and is typically targeted at a specific team or product. Whereas a Data Warehouse has division-wide scope, the information in a Data Mart pertains to a single department or product line. A department can own its Data Mart, including all the hardware, software and data, which lets each department isolate the use, manipulation and development of its data.
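The relationship between the two can be sketched in a few lines. The example below is only an illustration, assuming a single in-memory SQLite database as the warehouse and a view as the Data Mart; the table name `warehouse_sales`, the view name `marketing_mart` and the sample rows are all hypothetical. A real Data Mart may instead live on its own hardware and software, as described above.

```python
import sqlite3

# Hypothetical warehouse: one table holding facts from every department.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "department TEXT, product TEXT, sale_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?, ?)",
    [
        ("marketing", "campaign-a", "2023-01-15", 1200.0),  # historical row
        ("marketing", "campaign-b", "2024-02-01", 800.0),   # current row
        ("healthcare", "clinic-x", "2024-02-03", 450.0),
    ],
)

# Hypothetical Data Mart: the subset of the warehouse that one department uses,
# modeled here as a view for simplicity.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT product, sale_date, amount "
    "FROM warehouse_sales WHERE department = 'marketing'"
)

# The department queries only its own slice of the data.
mart_rows = conn.execute(
    "SELECT product, amount FROM marketing_mart ORDER BY product"
).fetchall()
mart_total = conn.execute("SELECT SUM(amount) FROM marketing_mart").fetchone()[0]
```

Here `mart_rows` contains only the marketing records, while the warehouse table still holds the data for every department.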
In building a comprehensive Data Warehouse, there can be several different approaches used:
Top down – The holistic picture of data sources and usage is laid out, and a company-wide initiative builds the Data Warehouse first, before the Data Marts, thus making the Data Marts dependent on the Data Warehouse. This approach tends to take longer and forces departments to compromise on their needs.
Bottom up – Each business area or department manages its own data storage and reporting needs, which are then fed into a Data Warehouse when the business is ready to take on a more holistic picture of the entire enterprise. Data Marts are delivered faster and generally meet the exact needs of their end users, but this approach also makes the Data Warehouse more complex.
Mixed – A mixed approach includes a company-wide initiative to map out the Data Warehouse, but starts development on the Data Marts first. This method tries to strike a balance between meeting end-user reporting needs in the Data Marts and keeping the Data Warehouse from becoming overly complex.
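To make the bottom-up trade-off concrete, here is a minimal sketch of feeding independently built Data Marts into a shared warehouse. Everything in it is an assumption for illustration: the mart contents, the column names, the `COLUMN_MAPS` table and the `load_into_warehouse` helper are hypothetical. It shows one source of the added complexity, namely that each mart needs its own mapping onto the common schema.

```python
# Hypothetical department marts, each designed on its own with its own column names.
sales_mart = [
    {"cust": "C1", "revenue_usd": 120.0},
    {"cust": "C2", "revenue_usd": 80.0},
]
support_mart = [
    {"customer_id": "C1", "tickets": 3},
]

# Assumed per-mart mapping from local column names to the shared warehouse schema.
COLUMN_MAPS = {
    "sales": {"cust": "customer_id", "revenue_usd": "revenue"},
    "support": {"customer_id": "customer_id", "tickets": "ticket_count"},
}


def load_into_warehouse(marts):
    """Rename each mart's columns to the warehouse schema and tag the source."""
    warehouse = []
    for source, rows in marts.items():
        mapping = COLUMN_MAPS[source]
        for row in rows:
            record = {mapping[col]: value for col, value in row.items()}
            record["source"] = source
            warehouse.append(record)
    return warehouse


warehouse = load_into_warehouse({"sales": sales_mart, "support": support_mart})
```

In a top-down or mixed approach, the shared schema would be agreed before the marts are built, so most of this per-mart mapping would not be needed.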
Why have a Data Warehouse?
Unless your company currently leads the pack in big data innovations, you likely will not have a top-down Data Warehouse mandate. This does not mean efficient data storage and reporting isn’t needed. 90% of the data in the world today has been created in the last two years alone[2].
Being ready to handle large amounts of data at the department level is the first step in being ready to leverage the benefits big data has to offer. As the organization starts to leverage data across the enterprise, the departments with a comprehensive Data Warehouse will be the first to provide value beyond their own boundaries. Having already benefited from that data is the icing on the cake.