Our Automated Data Quality Management System for Publicis Media

This case study details how we built an automated data quality management system to address trust issues with the data and to maximize both Publicis Media's return on investment in a centralized data platform and the quality of its BI and analytics activities. Discover how we achieved a 50-70% reduction in issue detection and mitigation time while regularly profiling 100% of the implemented data.

Key Challenges

Our client is one of the largest multinational advertising service providers and media groups, offering a variety of communication services. As part of its performance analyses, the company collects large amounts of data from various advertising platforms. All of this valuable information is stored and transformed in a single data platform that aims to provide actionable data to each business team.

However, our team of Data Engineers working on this data platform noticed some technical and functional quality problems, which could only be detected and corrected manually. When there are data quality issues, there are inevitably trust issues with the results of the analytics, reporting and data science activities used by the business teams.

In this context, our client wanted an automated quality assurance solution that would:

Our Approach

In order to address our client’s challenges, we:

  1. Analyzed the current situation, including the data quality checks already performed, and collected all the stakeholders’ requirements (from both technical and business teams).
  2. Evaluated the potential for automation and generalization of existing quality assurance initiatives.
  3. Designed a quality assurance module that fits into the existing platform infrastructure and complements it with important components. We worked on three quality gates: a first check on the raw data, a second check after the harmonization phase and a final check on the data to be delivered/provisioned to the teams/data products.
  4. Implemented these technical and functional quality checks for initial data sources and projects to provide an MVP.
  5. Created generalized quality levels (depending on the data sources and the expected quality delivered by the source) that can be communicated and understood easily.
  6. Implemented new roles (e.g. Quality Assurance Product Owner, Owners for the specific quality checks, Data Stewards) and processes (e.g. RACI matrix) complementing the quality assurance solution.
  7. Implemented a reporting system in the form of dashboards and notifications for monitoring the data quality based on the checks.
  8. Scaled the automated quality assurance solution to further data sources and other projects.
  9. Coached and trained the business and technical teams in the use and operation of the solution. This enabled us to collect new input to further improve the whole quality assurance system in place.
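
To make the three quality gates (step 3) and the generalized quality levels (step 5) more concrete, here is a minimal sketch in plain Python. It is not the implementation delivered to the client, which relied on Great Expectations on Databricks. The field names (`campaign_id`, `spend`), the individual checks, the level names (gold/silver/bronze) and the 50% threshold are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable

# A check takes a list of row dicts and returns True when the rule holds.
Check = Callable[[list], bool]


@dataclass
class GateResult:
    gate: str
    passed: int
    total: int

    @property
    def pass_rate(self) -> float:
        # A gate without checks is treated as fully passed.
        return self.passed / self.total if self.total else 1.0


def not_empty(rows: list) -> bool:
    return len(rows) > 0


def no_missing(field: str) -> Check:
    def check(rows: list) -> bool:
        return all(r.get(field) is not None for r in rows)
    return check


def non_negative(field: str) -> Check:
    def check(rows: list) -> bool:
        return all(r.get(field) is not None and r[field] >= 0 for r in rows)
    return check


def unique(field: str) -> Check:
    def check(rows: list) -> bool:
        values = [r.get(field) for r in rows]
        return len(values) == len(set(values))
    return check


# Hypothetical checks per gate: raw data, harmonized data, data to be delivered.
GATES = {
    "raw": [not_empty, no_missing("campaign_id")],
    "harmonized": [no_missing("spend"), non_negative("spend")],
    "delivery": [unique("campaign_id")],
}


def run_gate(gate: str, rows: list) -> GateResult:
    checks = GATES[gate]
    passed = sum(1 for check in checks if check(rows))
    return GateResult(gate, passed, len(checks))


def quality_level(result: GateResult) -> str:
    # Hypothetical mapping from pass rate to an easily communicated level.
    if result.pass_rate == 1.0:
        return "gold"
    if result.pass_rate >= 0.5:
        return "silver"
    return "bronze"


if __name__ == "__main__":
    sample = [
        {"campaign_id": "a", "spend": 10.0},
        {"campaign_id": "b", "spend": -1.0},  # negative spend fails one check
    ]
    for gate in GATES:
        result = run_gate(gate, sample)
        print(gate, f"{result.passed}/{result.total}", quality_level(result))
```

In a setup like this, the per-gate results could feed the dashboards and notifications mentioned in step 7, and new data sources (step 8) would only need their own entries in the gate configuration.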

Benefits

Teams involved in this project

A Product Owner and a Data Engineer collaborated with our client on this project for almost one year.

Technologies and Partners

Technologies used for this project: Databricks, Great Expectations, Python and Azure Cloud.