What is a Data Warehouse?
The academic definition…
“A data warehouse is a repository storing integrated information for efficient querying and analysis. Information is extracted from heterogeneous sources as it is generated or updated. The information is then translated into a common data model and integrated with existing data in the warehouse. When a user query is submitted to the warehouse, the required information is already there, with inconsistencies and differences already resolved. This makes it much easier and more efficient to run queries over data that originally came from different sources.”
In laymans terms…
“A data warehouse is simply a single database which brings together data from different operational database systems which could be on different database platforms. Data warehouse data sources can include large text file extracts rather than a database.
The data from these systems is merged, cleansed, summarized and optimized purely for producing reports. It’s sole purpose is for producing reports”
Sample data sources – Customer data in sales, CRM, Leads, BACS, Telephony systems.
Common Misconceptions
A data warehouse should not be used as a consolidation tool back to the operational system. Data does not equal information that is of use to your business.
Frustrations of Business Management
Do any of these phrases sound familiar in your organisation?
“We have mountains of data in this company, but we can’t access it.”
“We have all this data, but I have no information.”
“You’ve got to make it easy for business people to get at the data directly.”
“Just show me what is important.”
“It drives me crazy to have two people present the same business metrics at a meeting, but with different numbers.”
“We want people to use information to support more fact-based decision making.”
Goals of a Data Warehouse
We design data warehouses and BI solutions to resolve these issues and provide knowledge about their business. The goals of every data warehouse are as follows –
- The contents of the data warehouse must be understandable. Data has to be intuitive and obvious to the business user. Usually accessed with a user friendly web interface. Typically a BI application.
- Data must be carefully assembled from a variety of sources around the organization, cleansed, quality assured, and released only when it is fit for user consumption.
- Information from one business process should match with information from another. If two performance measures have the same name, then they must mean the same thing. Conversely, if two measures don’t mean the same thing, then they should be labeled differently. Consistent information means high-quality information.
- Organizational change is inevitable over time. Changes to user needs, business conditions, data, and technology. The data warehouse must be designed to handle this inevitable change. Changes to the data warehouse should be graceful, meaning that they don’t invalidate existing data or applications. The existing data and applications should not be changed or disrupted when the business community asks new questions or new data is added to the warehouse.
- A Data Warehouse is a huge asset which will include sensitive information on company performance. Information you don’t want competitors to see. Data access must also be restricted within an organization so that each user is only allowed to see what you want them to see.
- The data warehouse must have the right data in it to support decision making. There is only one true output from a data warehouse: the decisions that are made after the data warehouse has presented its evidence.
- If the business community has not embraced the data warehouse and continued to use it actively six months after training, then is has failed the acceptance test.
Operational Reporting vs Data Warehouse Reporting
Data for Management Information purposes usually falls into two categories
Operational Data
Data collected to handle an organization’s ongoing business. E.g. Daily transactions or invoices. Call stats.
Informational Data
Data collected, organized and made available for analysis. E.g. Aggregated sales information by day, week, month, quarter, year, financial year. Split by division, region, team. Analyze by product line.
Operational reports are typically scripts or reporting tools which run queries directly on an operational database. The database running a business application. Typically daily reports showing what has happened today or yesterday. Results of these reports are usually exported to Excel spreadsheets creating Excel empires.
On the otherhand, a properly designed data warehouse and BI solution provides a consistent, summarized archive of data on company performance. It exists on it’s own independent database therefore not affecting operational database performance.
Operational Reporting Pros and Cons
Pros…
- Reports show the latest data up to the point when the report is executed.
Cons…
- Reports are slow if trying to select a large amount of data.
- Users of the operational system may complain. “The system is suddenly very slow. My screen isn’t responding”
- Your transactional databases are not configured to return large amounts of due to database settings and the underlying structure of how data is held. Transactional databases configured specifically to run the application efficiently.
- Requires expert knowledge of the database and the SQL query language which means your Management Information (MI) reporting relies on a select few. No ad-hoc reporting capability.
- The data is volatile and constantly changing. Report results may differ day to day.
- Reports held and maintained by a small number of people, resource intensive, not easy to access reports, prone to human error.
Data Warehouse Benefits
- A simple design therefore easy to understand and queries execute faster. Makes company data much more accessible using BI tools.
- Presents a single consistent view of company data. Single definition of company metrics and KPIs.
- No impact on the operational database which runs your business. Report execution much faster.
- Quickly create ad-hoc reports with no specialist knowledge or reliance on IT professionals.
- Easy user access to information.
- Superior insight into your business. Easily spot trends for improved decision making leading to more advanced competitive capabilities.
Data Warehousing Applications
A data warehouse can also provide further insight into your business. Other applications of a data warehouse solution include –
- Judging Advertising Campaign Effectiveness
- Customer Retention
- Risk Management
- Fraud Detection
- Customer Profiling
- Market Basket Analysis
- Customer Profitability Analysis
- Consultant Efficiency
Looking for assistance in your Data Warehouse project using Oracle technologies? Contact Us.
Alabra Consulting with over 10 years experience developing ETL routines for the Oracle DBMS.