|8:30 - 9:00||Continental Breakfast|
|9:00 - 9:15||Welcoming Remarks|
|9:15 - 10:15||Presentation|
|10:15 - 10:30||Break|
|10:30 - 11:45||Presentation Cont.|
|11:45 - 12:00||Questions and Wrap-Up|
|12:00||Lunch at a restaurant (to be announced at the meeting)|
Data Quality: Issues You May Not Have Thought About by Michael Scofield
1. The data life cycle is a way of looking at how data flows from observations of reality to methods of storage, expression, manipulation, aggregation, and finally towards useful information. Where can the quality break down along that path?
2. The scope of data assets and stewardship requires that we recognize all the data in the enterprise: latent data, data that flows among enterprise units, incoming data from external sources (which are architecturally different and out of our control), and data we export. After creating an inventory of all the enterprise data, we must decide what is important. Where is the DQ pain?
3. What do we mean by data quality? We will quickly look at sub-facets (Danette McGilvray calls them “dimensions”) such as presence, scope, reasonableness, validity, precision, and accuracy, each distinct from the others. When we talk about DQ, we must be precise in our terminology.
4. Data quality improvement requires, first of all, making production data visible. We will look at some easy ways to do that without expensive tools. Data profiling can even be done in native SQL (if you have the stamina), and simple graphical techniques are also very useful.
5. Production data is constantly changing as enterprises morph. How can we establish a surveillance system which will alert us when significant changes warn of DQ issues? This would apply to latent data, but even more to imported data where we may be injured by changes in scope, architecture, or quality.
6. The data-ization of knowledge is the most often ignored issue in DQ. When we try to cram inherently ambiguous information (such as patient condition) into codes and tabular data structures, we often strip off important “qualifying” adjectives such as textual expressions of approximation and/or reliability. The granite gravestone in a cemetery gives us an excellent example. We will look at both discrete and linear ambiguity in data.
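As a rough illustration of items 4 and 5, the sketch below profiles one column with plain SQL and then compares two profiles to flag a shift in the null rate. It is not taken from the presentation: the `patient` table, the `birth_year` column, the function names, and the 5% tolerance are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def profile_column(conn, table, column):
    """Profile one column with plain SQL: rows, nulls, distinct values, min, max."""
    # Identifiers cannot be bound as SQL parameters, so they are interpolated
    # here; pass only trusted table and column names.
    sql = f'''
        SELECT COUNT(*),
               SUM(CASE WHEN "{column}" IS NULL THEN 1 ELSE 0 END),
               COUNT(DISTINCT "{column}"),
               MIN("{column}"),
               MAX("{column}")
        FROM "{table}"
    '''
    n_rows, n_null, n_distinct, min_value, max_value = conn.execute(sql).fetchone()
    return {
        "n_rows": n_rows,
        "n_null": n_null or 0,  # SUM over an empty table yields NULL
        "n_distinct": n_distinct,
        "min_value": min_value,
        "max_value": max_value,
    }


def null_rate_alert(baseline, current, tolerance=0.05):
    """Return True when the null rate moved by more than `tolerance` (hypothetical rule)."""
    def rate(profile):
        return profile["n_null"] / profile["n_rows"] if profile["n_rows"] else 0.0
    return abs(rate(current) - rate(baseline)) > tolerance


def demo():
    """Build a tiny hypothetical table, profile it, load more rows, profile again."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patient (id INTEGER, birth_year INTEGER)")
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?)",
        [(1, 1950), (2, None), (3, 1987), (4, 1987)],
    )
    before = profile_column(conn, "patient", "birth_year")

    # Simulate a later load in which the source stopped supplying birth_year.
    conn.executemany("INSERT INTO patient VALUES (?, ?)", [(5, None), (6, None)])
    after = profile_column(conn, "patient", "birth_year")

    conn.close()
    return before, after, null_rate_alert(before, after)


if __name__ == "__main__":
    before, after, alert = demo()
    print("before:", before)
    print("after: ", after)
    print("null-rate alert:", alert)
```

Storing each profile with a timestamp and comparing the latest one against an earlier baseline is one simple way to approximate the surveillance idea; a real system would likely track more measures than the null rate.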
Michael Scofield is a popular speaker and writer in the fields of data management, data quality, and data warehousing. He holds an adjunct faculty position at Loma Linda University in the Department of Health Information Management. He is the recipient of the 2008 DAMA International (Data Mgmt. Assn.) Community Award for his contribution to the data management community. He was also a 2007 nominee for the DAMA Award for Professional Achievement.
Mr. Scofield is also a frequent speaker on topics of satellite imagery interpretation and emergency communications. His career has included education and private industry in areas of data quality, decision-support systems, data warehousing, and data management. His articles appear in DM Review, the B-Eye Newsletter, InformationWeek magazine, the Northern California Oracle User Group Journal, the IBI Systems Journal, and other professional journals. He has spoken to over 140 professional audiences for organizations such as Data Management Association chapters, European Metadata Conferences, information quality conferences, The Data Warehousing Institute, Oracle User Groups, the Institute of Internal Auditors, the Association of Government Accountants, Quality Assurance Association chapters, the Association for Computing Machinery, and other professional and civic audiences. His humor has also been published in the L.A. Times and other journals.