Data Talks, Episode 26
Data Talks is Exago’s podcast on business intelligence, analytics, and application software. This month, we’re joined by Aaron Fuller, Director of Data and Business Intelligence at CodeGreen Solutions, who explains how small and midsize businesses can manage their data quality today with the tools already available to them. We learn where to start, who on your team to entrust with data management responsibilities, and what to do about the issues you uncover.
Segment 1: Data Quality Management for SMBs
(2:05) How common is poor data quality control?
(9:32) Six Sigma and process standards.
(16:25) Third-party applications and their impact on data quality.
(17:56) ETL services.
(20:25) How to actually fix issues once you’ve identified them.
(23:42) Best practices for where to start.
(24:35) Overcoming a million-dollar data mistake.
Segment 2: What We Are Nerding Out About
(26:46) Aaron: Systems Concepts in Action: A Practitioner’s Toolkit by Bob Williams and Richard Hummelbrunner
(29:37) Chris: The Pragmatic Programmer: Your Journey to Mastery by David Thomas and Andrew Hunt
(41:25) Nicole: Notion’s database feature
“I think that there are pockets of organizations that do really well at this, and what I find is it’s organizations that are very oriented around certain kinds of businesses where it’s built into their culture — accounting firms, by their nature, are very good at data quality.” (2:38)
“It’s about realizing what your business rules are when it comes to data quality. What are the things that you hope to use your data for, and what level of data quality is acceptable? Because really we’re not ever, or usually not, after perfect data quality. Perfect data quality is usually not attainable, so what we want is data that’s high enough quality to be used for the purposes that it’s intended for.” (3:54)
“…To what degree should the data quality system be tied in with the ETL process or separate from it? That’s an interesting question for any data architect, and I don’t actually have a firm opinion. I’ve done it both ways, and I think that it can work both ways.” (19:26)
“I would say that the best practice is to engage someone from the business side and to identify things that are mission-critical to the business and/or have caused major problems for the business in recent times.” (24:11)
“Data is guilty until it’s proven innocent, frankly.” (25:50)
Aaron Fuller is Director of Data and Business Intelligence at CodeGreen Solutions. Prior to that, he spent over ten years as Principal at Superior Data Strategies, a team of data architects, analysts, and developers dedicated to helping businesses tackle data management problems. Aaron regularly teaches courses on data quality and data warehousing with TDWI and has spoken on topics ranging from data modeling to data governance. You can find him on LinkedIn.