In this article we are going to see how to use the Fuzzy Lookup transformation in SSIS. Unlike the regular Lookup transformation, which uses an equi-join to find exact matches in a reference table, the Fuzzy Lookup transformation uses fuzzy matching to return close matches. It is useful when a large share of the incoming data is dirty (misspelled, inconsistently formatted, or duplicated) and has to be cleaned up before it is made available to other systems.
Take, for example, a package that fetches details from the customer table and passes them on to other systems. If a customer name is slightly misspelled, an exact lookup would miss the record, yet we still need to process it. Fuzzy Lookup handles this case: it accepts matches whose similarity meets a configured threshold, so records that would otherwise be missed are still processed. Let's jump in and see the steps to configure it.
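To make the idea of matching against a threshold concrete, here is a minimal Python sketch. It is not how SSIS computes similarity internally; it uses Python's standard `difflib.SequenceMatcher` as a stand-in scoring function, and the names `similarity`, `best_match`, and `reference_names`, as well as the sample data, are assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a score between 0.0 and 1.0 (stand-in for a fuzzy similarity)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name, reference, threshold=0.8):
    """Return (reference value, score) for the closest match,
    or (None, score) when the best score is below the threshold."""
    scored = [(similarity(name, ref), ref) for ref in reference]
    score, ref = max(scored)
    return (ref, score) if score >= threshold else (None, score)


# Hypothetical reference data standing in for the customer table.
reference_names = ["Jonathan Smith", "Maria Garcia", "Wei Chen"]

# A misspelled name still finds its reference row.
match, score = best_match("Jonathon Smith", reference_names)
print(match, round(score, 2))

# A name with no close counterpart is rejected by the threshold.
print(best_match("Zzz Qqq", reference_names)[0])
```

Raising the threshold makes the lookup stricter (fewer but safer matches); lowering it accepts more distant matches at the risk of false positives.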
You can also read my other articles on SSIS from here.
Follow steps 1 to 3 of my first article to open BIDS and create an Integration Services project. Once the project is created, we can see how to use the Fuzzy Lookup transformation. Open the project and drag and drop the Fuzzy Lookup transformation and a source provider, as shown in the image below.
The red cross icons on the components indicate that they are not configured yet. Let's configure them in the coming sections. First, configure the source provider as shown in the image below.
Now that the source provider is configured, we have data to process in our package. What we are looking for here is the corrupted data, for example repeated values or anything that goes against the business policy. Next, let's configure the Fuzzy Lookup as shown in the screen below.
Configure each tab as shown below:
Here we have the option to create a new index or to use an existing one. Fuzzy Lookup normally builds a match index on the reference table and uses it to find close matches for each incoming row. If a match index already exists for the table, we can reuse it instead of creating a new one, which avoids the cost of rebuilding it and helps performance.
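The following Python sketch illustrates why a prebuilt index helps: once the reference values are broken into small pieces and indexed, each incoming value only has to be compared against the few rows that share pieces with it, not against the whole table. The q-gram scheme used here is an assumption chosen for illustration; this article does not describe the internal layout of the SSIS match index, and the names `qgrams`, `build_index`, and `candidates` are hypothetical.

```python
from collections import defaultdict


def qgrams(text, q=3):
    """Split a string into overlapping q-character pieces, padded with '#'."""
    pad = "#" * (q - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def build_index(reference, q=3):
    """Build once: map each q-gram to the ids of the reference rows containing it."""
    index = defaultdict(set)
    for row_id, value in enumerate(reference):
        for gram in qgrams(value, q):
            index[gram].add(row_id)
    return index


def candidates(value, index, q=3, min_shared=2):
    """Reuse the index: return ids of rows sharing at least min_shared q-grams."""
    counts = defaultdict(int)
    for gram in qgrams(value, q):
        for row_id in index.get(gram, ()):
            counts[row_id] += 1
    return {row_id for row_id, n in counts.items() if n >= min_shared}


reference = ["Jonathan Smith", "Maria Garcia", "Wei Chen"]
index = build_index(reference)                # the expensive step, done once
print(candidates("Jonathon Smith", index))    # only row 0 needs a full comparison
```

Building the index is the costly part, which is why reusing a stored one across package runs is attractive when the reference table rarely changes.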
The above image shows which input column is mapped to which reference column, and therefore which column is used for the fuzzy comparison.
The above screen shows the advanced settings for the Fuzzy Lookup transformation, such as the similarity threshold, which controls how close a record must be to count as a match.
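Besides the similarity threshold, the Advanced tab also lets you cap the number of matches returned per lookup. As a rough sketch of how these two settings interact, the Python example below uses the standard library's `difflib.get_close_matches`, whose `n` and `cutoff` arguments play analogous roles. The function `lookup`, its parameter names `max_matches` and `similarity_threshold`, and the sample data are assumptions for illustration only; the scores are difflib's, not those SSIS would produce.

```python
from difflib import get_close_matches


def lookup(value, reference, max_matches=1, similarity_threshold=0.8):
    """Return up to max_matches reference values whose similarity to
    value is at least similarity_threshold, best match first."""
    # Compare case-insensitively, then map back to the original spelling.
    lowered = {ref.lower(): ref for ref in reference}
    hits = get_close_matches(
        value.lower(), list(lowered), n=max_matches, cutoff=similarity_threshold
    )
    return [lowered[h] for h in hits]


# Hypothetical reference values.
reference = ["Jon Smith", "John Smith", "Joan Smyth", "Maria Garcia"]

print(lookup("Jhon Smith", reference, max_matches=1))   # single best match
print(lookup("Jhon Smith", reference, max_matches=2))   # two closest matches
print(lookup("Jhon Smith", reference, max_matches=5,
             similarity_threshold=0.99))                # too strict: no matches
```

Allowing more than one match per lookup is useful when a person reviews the candidates afterwards; a single match with a high threshold suits fully automated cleanup.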
After finishing the configuration, your screen looks like the image below:
When you execute the package (press F5), your screen looks like the image below, which indicates that the package executed successfully.
So in this article we have seen how to use the Fuzzy Lookup transformation and the key configurations needed to put it to use.