Friday, October 9, 2009

Data Quality

How often, after the implementation of a Business Intelligence (BI) Project, have you heard that the business users do not feel the data is reliable, credible, and consistent enough to meet their analysis and reporting needs? Unfortunately, that is too often the response from a client after a BI Project is implemented. As pointed out by Ralph Kimball in his book “The Data Warehouse Toolkit”, the business community must accept the data warehouse if it is to be deemed successful. The other goals of the data warehouse are:

1. The data warehouse must make an organization’s information easily accessible
2. The data warehouse must present the organization’s information consistently
3. The data warehouse must be adaptive and resilient to change
4. The data warehouse must be a secure bastion that protects our information
5. The data warehouse must serve as the foundation for improved decision making

He further states that you can have the most technically sound data warehouse, but if the business community does not accept the data warehouse as adding value, it is a failed program. One of the major reasons that a data warehouse is not accepted by the business community is the perception that the data the users access in the data warehouse is of poor quality. There are many reasons why the data quality is perceived as poor, but one of the ways to discover the actual quality of the data is to conduct a data analysis in the early phase of the project. I discussed this briefly last week.

So where does data quality begin? Many point to the Database Administrator or the Information Technology staff as the cause of poor data quality. However, since the data is a corporate asset, the responsibility for data quality belongs to the whole organization. Many corporations are starting to recognize this, as some organizations have begun Master Data Management and Customer Data Integration processes. Some of the most prominent causes of poor data quality are:

1. Movement from centralized data systems to distributed data systems
2. Poor implementation of purchased package data systems
3. Silo implementation of purchased package data systems
4. Lack of data edits for inputting data into data systems
5. Lack of having a system of record for corporate entities
6. Not viewing data entities from a corporate perspective

So what can be done in the short term to help corporations implement Business Intelligence until they can get Master Data Management and Data Quality processes implemented? Some of the short-term steps that can be taken:

1. Begin the data analysis process early in the program to help determine the quality and consistency of the data
2. Work with the business users to find short term solutions to the data quality issues
3. Work with the business users to determine data edits for the data elements
4. Work with the business users to determine the definition and calculation of major data metrics
5. Work with the business users to determine a system of record for the major entities
6. Work with the business users to determine major data hierarchies
7. Work with the business users to determine an acceptable level of data quality for the project within the current BI Program
8. Work with the business users to develop a data repository for a corporate definition of data elements and metrics
9. Work with the business users to determine an acceptable level of analysis and reporting requirements
10. Keep the business users involved in all phases of the project development lifecycle
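
Steps 1 and 3 above (early data analysis and data edits) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the column names, the list of valid state codes, and the helper names `profile_column` and `failed_edits` are all hypothetical, and in a real project the edit rules would come from the business users rather than from the developer.

```python
from collections import Counter

# Hypothetical sample rows standing in for an extract from a source system.
ROWS = [
    {"customer_id": "C001", "state": "TX", "annual_revenue": "125000"},
    {"customer_id": "C002", "state": "tx", "annual_revenue": ""},
    {"customer_id": "C002", "state": "Texas", "annual_revenue": "-50"},
    {"customer_id": "", "state": "OK", "annual_revenue": "98000"},
]


def profile_column(rows, column):
    """Return basic profile statistics for one column of the extract."""
    values = [row.get(column, "") for row in rows]
    filled = [v for v in values if v.strip()]
    counts = Counter(filled)
    return {
        "rows": len(values),
        "missing": len(values) - len(filled),
        "distinct": len(counts),
        # Values that occur more than once (a problem for a key column).
        "duplicated_values": sorted(v for v, n in counts.items() if n > 1),
    }


def is_non_negative_number(value):
    """True if the text parses as a number greater than or equal to zero."""
    try:
        return float(value) >= 0
    except ValueError:
        return False


# Hypothetical data edits; the actual rules are assumed to be agreed
# with the business users.
EDITS = {
    "customer_id": lambda v: bool(v.strip()),
    "state": lambda v: v in {"TX", "OK"},
    "annual_revenue": is_non_negative_number,
}


def failed_edits(rows, edits):
    """Return (row_index, column) pairs for every value that fails its edit."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column, check in edits.items()
        if not check(row.get(column, ""))
    ]


if __name__ == "__main__":
    for name in ("customer_id", "state", "annual_revenue"):
        print(name, profile_column(ROWS, name))
    print("failed edits:", failed_edits(ROWS, EDITS))
```

A profile like this gives the business users something concrete to react to: the missing and duplicated customer IDs point toward the system-of-record discussion in step 5, and the count of failed edits is one possible way to express the acceptable level of data quality in step 7.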

Data quality is an issue facing all data warehouse and Business Intelligence Programs. It should be addressed early in the BI Program and resolved in the short term with input from the business users. The business users are the ones who have to perceive the BI Program as adding value; otherwise it will not be successful. They have to be involved in all phases of the BI Program to understand the data issues and to develop short-term solutions to the data quality and definition problems until a corporate Master Data Management and Data Quality Program can be implemented.
