Cleansing

Data cleansing involves deleting or flagging incorrect or empty entries in your data set. In consultation with you, we create a process tailored to your needs.
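As a rough illustration of the "delete or flag" choice, the following minimal sketch checks each record against a set of field rules. The function names, the `issues` field, and the validator rules are hypothetical and chosen only for this example; an actual process would be defined together with you.

```python
def find_issues(record, validators):
    """Return a list of problems found in one record (illustrative rules only)."""
    issues = []
    for field, is_valid in validators.items():
        value = record.get(field)
        if value is None or not str(value).strip():
            issues.append(f"{field}: empty")
        elif not is_valid(str(value).strip()):
            issues.append(f"{field}: invalid")
    return issues


def cleanse(records, validators, mode="mark"):
    """Either flag problematic records (mode="mark") or drop them (mode="delete")."""
    if mode not in ("mark", "delete"):
        raise ValueError("mode must be 'mark' or 'delete'")
    result = []
    for record in records:
        issues = find_issues(record, validators)
        if mode == "delete":
            if not issues:
                result.append(dict(record))
        else:
            # Keep every record, but attach the list of issues found.
            result.append({**record, "issues": issues})
    return result


records = [
    {"street": "Musterstrasse", "nr": "7"},
    {"street": "", "nr": "7"},
    {"street": "Musterstrasse", "nr": "x"},
]
# Assumed rules: any non-empty street is accepted, a house number must start with a digit.
validators = {"street": lambda v: True, "nr": lambda v: v[0].isdigit()}

marked = cleanse(records, validators, mode="mark")
kept = cleanse(records, validators, mode="delete")
```

Flagging keeps every record available for manual review, while deleting yields a smaller set that contains only entries passing all rules.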

Normalization

Normalization brings different forms of the same data into a uniform format. It can be applied to both text and numeric attributes, and it is the first step in creating a clean data set for analytics, databases, and customer inventories. The granularity of the normalization is adjusted to your requirements so that the output matches what you need.

| Input data: Street | Input data: Nr. | Normalized data: Street | Normalized data: Nr. |
|---|---|---|---|
| Musterstr. 7 | | Musterstrasse | 7 |
| Musterstrase | 7 | Musterstrasse | 7 |
| muster-strasse 7 | 7 | Musterstrasse | 7 |
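The example rows above could be normalized with a few simple rules. This is only a sketch under stated assumptions: the function name `normalize_address` is hypothetical, and the rules (split off a trailing house number, drop hyphens and spaces, expand or correct the street suffix to "strasse") are tuned to these three rows. Real street names with legitimate hyphens or other suffixes would need a more careful rule set.

```python
import re


def normalize_address(street, nr=""):
    """Normalize a street/number pair using rules assumed for this example."""
    text = street.strip()
    # Split a trailing house number (e.g. "7" or "7a") off the street field.
    match = re.match(r"^(.*?)[\s,]+(\d+\s*[a-zA-Z]?)$", text)
    if match:
        text, found_nr = match.group(1), match.group(2)
        # Prefer the explicit number column; fall back to the number found in the street.
        nr = nr.strip() or found_nr
    # Assumption: hyphens and spaces inside the street name are noise.
    name = re.sub(r"[\s\-]+", "", text).lower()
    # Expand the abbreviation "str." and correct the misspelling "strase".
    name = re.sub(r"(str\.?|strase|straße|strasse)$", "strasse", name)
    return name.capitalize(), nr.strip().replace(" ", "")


rows = [("Musterstr. 7", ""), ("Musterstrase", "7"), ("muster-strasse 7", "7")]
normalized = [normalize_address(street, nr) for street, nr in rows]
# Each row becomes ("Musterstrasse", "7")
```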

Deduplication

This cleaning step ensures that there are no duplicate entries in your data. As with normalization, we discuss the procedure with you in detail so that duplicate detection meets the accuracy you require.
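One way to picture an adjustable detection accuracy is a similarity threshold. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. The function names and the threshold values are assumptions for illustration, not a description of the procedure actually used.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def deduplicate(entries, threshold=0.9):
    """Keep the first entry of every group whose similarity reaches the threshold."""
    unique = []
    for entry in entries:
        if not any(similarity(entry, kept) >= threshold for kept in unique):
            unique.append(entry)
    return unique


entries = ["Musterstrasse 7", "musterstrasse 7", "Musterstrase 7", "Bahnhofstrasse 1"]

strict = deduplicate(entries, threshold=1.0)   # only exact matches (ignoring case) are merged
lenient = deduplicate(entries, threshold=0.9)  # the misspelled variant is merged as well
```

A higher threshold removes fewer entries but risks leaving near-duplicates in place; a lower threshold catches more variants but may merge entries that are actually distinct.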

Matching

Matching, also known as record linkage, is used to merge data from different systems or files. This gives you a unified view of your data. In this process, we can also assign uniform keys so that identical records can be identified across sources.
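The idea of assigning uniform keys can be sketched as follows. The field names `name` and `city`, the source labels, and the exact-match comparison on simplified values are all assumptions made for this example; practical record linkage usually combines normalization with fuzzier comparison.

```python
def match_key(record):
    """Build a comparison key from the fields assumed to identify a record."""
    parts = (record["name"], record["city"])
    # Lowercase and keep only letters and digits so formatting differences are ignored.
    return tuple("".join(ch for ch in part.lower() if ch.isalnum()) for part in parts)


def link_records(*sources):
    """Merge records from several sources and give matching records the same key."""
    keys = {}
    linked = []
    for source_name, records in sources:
        for record in records:
            # Reuse the key of an already seen match, otherwise assign the next number.
            uniform_key = keys.setdefault(match_key(record), len(keys) + 1)
            linked.append({**record, "source": source_name, "key": uniform_key})
    return linked


crm = [{"name": "Muster AG", "city": "Bern"}]
billing = [
    {"name": "muster ag", "city": "BERN"},
    {"name": "Beispiel GmbH", "city": "Zürich"},
]

linked = link_records(("crm", crm), ("billing", billing))
```

Records that share a key refer to the same entity, so the merged list can be grouped by `key` to obtain one unified view per entity.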

For Swiss company addresses, we offer the option of checking and matching your data against the UID register. You then receive the current company data and the UID (company identification number).

How we proceed

A detailed preliminary discussion is essential for data preparation. Together with you, we go through the data set and discuss the necessary steps. In this way, you receive a solution customized to your needs.