Data Quality

A high level of data quality is one of the basic prerequisites for sound business decisions.

Data quality is a multidimensional construct that is determined by several measurement variables. The individual processes, use cases, users and systems in each organization determine which dimensions are relevant to data quality in each data set.

In general, a high level of data quality is the basis for true data intelligence and thus a fundamental success factor for all data-driven business processes and models. Increasing your data quality creates the optimal conditions for smart decision-making processes and top performance in the digital age.

Check out our tool for optimizing your data quality:

DATAROCKET Core

The prelude to a successful data management project

Performing a data analysis is a suitable way to start a master data management project. It gives you an initial overview of the current quality level of your data and allows you to plan further steps based on this knowledge.

Our approach to data analysis:

We proceed on a field-by-field basis. Data quality is determined using pipelines in which you can define individual quality criteria and calculation paths. We carry out individual quality audits for each of our customers. To build up customer-specific sets of data quality criteria faster and more efficiently, we use templates developed in-house, including data quality rule templates for creditor data, debtor data and SAP material data.
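
To illustrate the idea, a minimal sketch of such a field-based rule pipeline is shown below. The field names, rules and sample records are invented for the example and do not reflect DATAROCKET's actual rule templates or configuration.

```python
# Minimal sketch of a field-based data quality pipeline.
# Field names, rules and records are hypothetical examples.

creditor_rules = {
    # field name -> list of (rule name, check function)
    "name": [("not_empty", lambda v: bool(v and str(v).strip()))],
    "postal_code": [("five_digits", lambda v: isinstance(v, str) and v.isdigit() and len(v) == 5)],
    "iban": [("not_empty", lambda v: bool(v)),
             ("country_prefix", lambda v: isinstance(v, str) and v[:2].isalpha())],
}

records = [
    {"name": "ACME GmbH", "postal_code": "10115", "iban": "DE89370400440532013000"},
    {"name": "", "postal_code": "1011", "iban": "DE00"},
]

def run_pipeline(records, rules):
    """Check every field rule against every record and return pass rates per rule."""
    results = {}
    for field, checks in rules.items():
        for rule_name, check in checks:
            passed = sum(1 for r in records if check(r.get(field)))
            results[f"{field}.{rule_name}"] = passed / len(records)
    return results

for rule, rate in run_pipeline(records, creditor_rules).items():
    print(f"{rule}: {rate:.0%} of records pass")
```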

Phase 1:
Development of measurement and analysis structure

  • Design and implementation of a DQ system

  • Requirements elicitation methodology

  • Structured measurement framework (DQ system)

  • Technical specifications for attribute maintenance

  • Company-wide standards for ensuring data quality

A data quality system (regulatory framework) is created to support employees in implementing and complying with the defined standards and rules.

Phase 2:
Implementing the Data Quality Measurement

  • Select relevant attributes

  • Define quality criteria

  • Establish duplicate definition rules

  • Determine key performance indicators to measure data quality

  • Determine required data exports

Result: The structured analysis measures the data quality in the company and is a prerequisite for the long-term implementation of quality criteria.
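
As an illustration of such key performance indicators, the hypothetical snippet below computes a completeness and a uniqueness KPI over a tiny record set; the attribute names and the 95 % target are assumptions made for the example.

```python
# Hypothetical data quality KPIs over a small record set: completeness of a
# required attribute and uniqueness of an identifier. Names and the 95 % target
# are invented examples.

records = [
    {"material_no": "M-100", "description": "Hex bolt"},
    {"material_no": "M-101", "description": ""},
    {"material_no": "M-100", "description": "Washer"},
]

TARGET = 0.95  # example quality target

completeness = sum(1 for r in records if r["description"]) / len(records)
uniqueness = len({r["material_no"] for r in records}) / len(records)

for name, value in [("completeness(description)", completeness),
                    ("uniqueness(material_no)", uniqueness)]:
    status = "meets target" if value >= TARGET else "below target"
    print(f"{name} = {value:.0%} ({status})")
```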

Benefits and results of measuring Data Quality

1. Check for duplicates

  • Identify duplicates based on specified criteria.

  • Use intelligent matching to identify similarities.

  • Merge duplicates by selecting relevant information.

  • Ability to manually review and correct potential duplicates.
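
The intelligent matching mentioned above can be approximated with a simple similarity comparison. The sketch below uses Python's standard difflib as a stand-in for a real matching engine; the record fields and the 0.85 threshold are assumptions, not DATAROCKET settings.

```python
from difflib import SequenceMatcher

# Hypothetical customer records; the 0.85 threshold is an example value.
records = [
    {"id": 1, "name": "Müller GmbH", "city": "Berlin"},
    {"id": 2, "name": "Mueller GmbH", "city": "Berlin"},
    {"id": 3, "name": "Schmidt AG", "city": "Hamburg"},
]

def similarity(a, b):
    """Fuzzy similarity of two records based on name and city."""
    key_a = f"{a['name']} {a['city']}".lower()
    key_b = f"{b['name']} {b['city']}".lower()
    return SequenceMatcher(None, key_a, key_b).ratio()

THRESHOLD = 0.85
candidates = []
for i, a in enumerate(records):
    for b in records[i + 1:]:
        score = similarity(a, b)
        if score >= THRESHOLD:
            candidates.append((a["id"], b["id"], score))

for id_a, id_b, score in candidates:
    print(f"potential duplicate: record {id_a} <-> record {id_b} (score {score:.2f})")
```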

2. Data Quality check

  • Data validation: Checks for validity according to defined rules.

  • Plausibility check: Check for logical consistency and detect implausible value combinations.

  • Completeness check: Check for the presence of all required fields and the absence of empty values.

  • Consistency check: Compare with other data sources to detect inconsistencies.

  • Purge functions: Automatically cleanse data, add missing values, correct invalid values, adjust formatting, and remove duplicates.
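
A minimal sketch of how such validation, plausibility and completeness checks could be expressed, with invented field names and rules rather than DATAROCKET's actual configuration:

```python
# Hypothetical record-level checks for a material record.
# Required fields, pattern and plausibility rule are invented for illustration.

REQUIRED_FIELDS = ["material_no", "net_weight", "gross_weight"]

def check_record(record):
    """Return a list of human-readable findings for one record."""
    findings = []

    # Completeness: all required fields present and non-empty
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            findings.append(f"completeness: '{field}' is missing")

    # Validation: material number must match the defined pattern
    material_no = record.get("material_no", "")
    if material_no and not material_no.startswith("M-"):
        findings.append("validation: material_no does not match pattern 'M-...'")

    # Plausibility: net weight must not exceed gross weight
    net, gross = record.get("net_weight"), record.get("gross_weight")
    if net is not None and gross is not None and net > gross:
        findings.append("plausibility: net_weight exceeds gross_weight")

    return findings

for finding in check_record({"material_no": "X-17", "net_weight": 2.5, "gross_weight": 2.0}):
    print(finding)
```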

3. Optimization

  • Analyze data to identify weaknesses and patterns of data quality issues.

  • Provide metrics and assessments to quantify and evaluate current data quality.

  • Identify root causes of data quality problems.

  • Provide specific data cleansing and optimization recommendations.
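
To hint at how weaknesses and patterns can be identified, the hypothetical sketch below simply counts rule violations per field and rule so that the most problematic attributes surface first; the findings are invented examples.

```python
from collections import Counter

# Hypothetical findings, e.g. collected from earlier quality checks.
findings = [
    ("postal_code", "invalid_length"),
    ("postal_code", "invalid_length"),
    ("postal_code", "missing"),
    ("phone", "missing"),
    ("iban", "invalid_checksum"),
]

# Count violations per (field, rule) so the weakest attributes surface first.
for (field, rule), count in Counter(findings).most_common():
    print(f"{field}: {rule} x{count}")
```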

Cleansing

Optimized data quality through data cleansing

Data cleansing is usually the first step in restoring a correct data basis as the foundation for improving data quality. Detecting and eliminating duplicates plays a decisive role, as does establishing validation rules for measuring data quality and monitoring the success of these measures.

The duplicate detection, which you can perform with our master data management software DATAROCKET Core, checks the entire data set and finds entries that refer to the same business object but contain different information. In a process called data harmonization, these entries are merged into one comprehensive, meaningful data record – the Golden Record.

Golden Record

DATAROCKET Core acts as a hub in a company’s data landscape and as such accesses heterogeneous data sources. The data records from these sources are extracted, consolidated and then merged into Golden Records. A Golden Record, or single point of truth, is a master data record that combines the relevant attributes from all data sources.
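
Conceptually, the consolidation can be sketched as a simple survivorship rule: for each attribute, take the newest non-empty value across sources. The source systems, fields and precedence rule below are assumptions for illustration only.

```python
# Hypothetical records for the same business object from different systems.
sources = [
    {"system": "CRM", "updated": "2024-03-01",
     "name": "ACME GmbH", "phone": "", "city": "Berlin"},
    {"system": "ERP", "updated": "2024-05-20",
     "name": "ACME GmbH", "phone": "+4930123456", "city": ""},
]

def build_golden_record(records, fields):
    """Per attribute, prefer the newest non-empty value (simple survivorship rule)."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    return {
        field: next((r[field] for r in newest_first if r.get(field)), None)
        for field in fields
    }

print(build_golden_record(sources, ["name", "phone", "city"]))
# -> {'name': 'ACME GmbH', 'phone': '+4930123456', 'city': 'Berlin'}
```

In practice, survivorship rules are usually more nuanced, weighing source trust, recency and completeness per attribute rather than applying a single rule to all fields.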

Not only the elimination of duplicates but also the correction of other issues improves data quality, for example:

  • Plausibility violations (e.g. the net weight must always be less than the gross weight of an article)

  • Violated fill levels and limit values such as minimum and maximum values (e.g. postal codes with a fixed number of digits)

  • Missing standards for date formats, addresses or phone numbers
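
A small sketch of what such standardization corrections might look like in code; the target formats and field names are illustrative assumptions, not DATAROCKET rules.

```python
from datetime import datetime
import re

def standardize(record):
    """Apply simple, hypothetical standardization rules to one record."""
    fixed = dict(record)

    # Date format: normalize 'DD.MM.YYYY' to ISO 'YYYY-MM-DD'
    try:
        fixed["created_at"] = datetime.strptime(
            fixed.get("created_at", ""), "%d.%m.%Y").date().isoformat()
    except ValueError:
        pass  # already ISO or unparseable; leave for manual review

    # Phone number: keep digits and a leading '+', drop separators
    fixed["phone"] = re.sub(r"(?!^\+)\D", "", fixed.get("phone", ""))

    # Postal code: flag values that do not have the expected five digits
    fixed["postal_code_valid"] = bool(re.fullmatch(r"\d{5}", fixed.get("postal_code", "")))

    return fixed

print(standardize({"created_at": "03.01.2024",
                   "phone": "+49 (30) 123-456",
                   "postal_code": "1011"}))
```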

Data Cleansing with DATAROCKET Core

Automated data cleansing

One or more previously defined rules are applied automatically, resulting in updated data (bulk update).

Data cleansing workflow

A workflow is executed in the software, and manual corrections are made to the data based on the results.

Mass update (bulk upload)

A new file with clean data is uploaded to update the data set.
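
In principle, an automated bulk update of this kind can be pictured as in the hypothetical sketch below: a previously defined rule is applied to every record and the number of changed records is reported. The rule and the data are examples, not DATAROCKET functionality.

```python
# Hypothetical cleansing rule: normalize country codes across the data set.
records = [
    {"id": 1, "country": "de"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": " fr "},
]

def normalize_country(record):
    """Return a cleaned copy of the record; the original stays untouched."""
    cleaned = dict(record)
    cleaned["country"] = record["country"].strip().upper()
    return cleaned

updated = [normalize_country(r) for r in records]
changed = sum(1 for old, new in zip(records, updated) if old != new)
print(f"bulk update changed {changed} of {len(records)} records")
```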

Long-term data quality improvement

DATAROCKET Core offers real-time measurements for continuous monitoring of data quality in your company: the innoscale MDM software measures the current data quality directly in your systems and monitors it continuously. The measurement provides results for the following quality criteria, among others (a sketch of such a measurement follows after the list):

  • Timeliness and age of the data

  • Consistency, validity, accuracy, completeness and uniqueness

  • Frequency of change
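
To give an idea of what such a measurement can look like, the sketch below computes the age of each record from a last-modified date and reports the share that is older than a threshold; the field name and the 365-day threshold are assumptions.

```python
from datetime import date

# Hypothetical records with a last-modified date per entry.
records = [
    {"id": 1, "last_modified": date(2024, 11, 2)},
    {"id": 2, "last_modified": date(2022, 5, 14)},
    {"id": 3, "last_modified": date(2023, 8, 30)},
]

MAX_AGE_DAYS = 365  # example threshold for "outdated"
today = date.today()

ages = [(today - r["last_modified"]).days for r in records]
outdated = sum(1 for age in ages if age > MAX_AGE_DAYS)

print(f"average age: {sum(ages) / len(ages):.0f} days")
print(f"outdated records (> {MAX_AGE_DAYS} days): {outdated / len(records):.0%}")
```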