Data Warehouse Automation vs Automated Data Preparation

Are you considering whether to choose data warehouse automation or automated data preparation? This 5-minute blog post will help you decide.


You’ve decided to invest in either data warehouse automation (DWA) or automated data preparation (data prep).


It’s a smart decision – you’re poised to enjoy significant productivity gains in delivering your data management solution.


This short overview of DWA and data prep will help you focus your selection process, so you choose the product that best meets your current and future requirements.


Let’s start with 2 definitions:


Data warehouse automation defined

Professional association TDWI defines data warehouse automation as a way to gain efficiencies and improve effectiveness in data warehousing processes.


It’s much more than simply automating the development process. It encompasses all of the core processes of data warehousing including design, development, testing, deployment, operations, impact analysis and change management.


(It’s a wonderful definition, and we couldn’t put it better ourselves.)


Data preparation defined

Data preparation covers a range of processing activities that transform data sources into a format, quality and structure suitable for further analytical or operational processing.


So how do you decide which type of product to choose?

The short answer is that it depends on your requirements: both what you need today and how those needs may evolve in future.


  • Question 1: Is your project strictly limited to a single output, star schema-based data warehouse development using clean, reliable and relatively simple data?
  • Question 2: Will the data only ever be used for reporting or BI, and no other purpose?


If you answered ‘yes’ to both of these questions, the narrower scope of a DWA tool is likely to serve you well. But beware: most DWA tools lack the flexibility to process complex data sources, and they can’t chain together complex transformations to produce outputs other than data warehouse schemas.


So here are 2 more questions:


  • Question 3: Is it possible your requirements will grow beyond mildly complex data?
  • Question 4: Is it possible that at some point you’ll need to produce output beyond just a data warehouse or data mart?

If you answered ‘yes’ to either of these questions, you’re likely better off looking at an automated data prep product. Data prep products give you the flexibility to handle both requirements, whereas DWA tools quickly run out of steam in either situation.

However, there’s a caveat. By definition, data prep tools don’t have target output structures, so if you ever need a data warehouse or data mart in future, a data prep tool alone will fall short. If you proceed with automated data prep without DWA, you’re likely to find yourself in one of these undesirable situations:


  1. You’ll need to build your star schemas and cubes from the ground up, which is the very work you bought an automation tool to avoid
  2. Worse yet, you may find yourself in the market for another product, a DWA tool, that’s able to support your new aspirations

Either situation means you will have wasted valuable time and money on this deployment.


Take a long-term view when you make your decision

If there’s any chance your data will grow in complexity or your outputs will change, I advise you to look at products that handle both DWA and data prep (like our enterprise data management platform, Data Academy).


This gives you flexibility, ensures you’re investing in a sustainable solution and delivers a lower total cost of ownership.
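
If it helps to see the reasoning in one place, the four questions in this post can be sketched as a small Python function. This is only an illustration, not part of any product: the names Requirements and recommend, and the field names, are made up for this example. The last branch (requirements that already exceed DWA today, with no expected change) is an assumption, because the post doesn’t cover that case explicitly.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical answers to the four questions in this post."""
    single_star_schema_on_clean_data: bool  # first pair, question 1
    reporting_or_bi_only: bool              # first pair, question 2
    data_may_grow_complex: bool             # second pair, first question
    outputs_may_go_beyond_warehouse: bool   # second pair, second question


def recommend(req: Requirements) -> str:
    """Return the product category to shortlist (illustrative only)."""
    fits_dwa_today = (
        req.single_star_schema_on_clean_data and req.reporting_or_bi_only
    )
    may_outgrow_dwa = (
        req.data_may_grow_complex or req.outputs_may_go_beyond_warehouse
    )

    if may_outgrow_dwa:
        # Data prep alone has no warehouse target structures, so the
        # long-term advice is a product that covers both.
        return "combined DWA and data prep"
    if fits_dwa_today:
        return "DWA"
    # Assumption: needs already exceed DWA's scope and are not expected
    # to change, so data prep is the closer fit.
    return "data prep"


# Example: a simple BI project today, but the data may become more complex.
print(recommend(Requirements(True, True, True, False)))
```

Note that the check for future change comes first: a ‘yes’ to either of the second pair of questions outweighs a perfect DWA fit today, which is the long-term view argued for above.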