Overview

Introduction

The Implementing Data Flow module provides you with the instruction and server hardware to develop your hands on skills in the defined topics. This module includes the following exercises:

  • Data Flow Task with Transformations
  • Adding and Configuring a Lookup Transformation
  • Using Fuzzy Lookup Transformation
  • Adding and Configuring Data Correction Transformation
  • Optimizing Integration Services Packages

Exercise 1 - Data Flow Task with Transformations

In this exercise, you will learn the following in Microsoft SQL Server 2012:

  • Create a Data Flow task with Conditional Split up transformations
  • Create a Data Flow task with Lookup transformation
  • Create a Data Flow task with Fuzzy Lookup transformation
  • Adding and configuring a Data Correction transformation

Exercise 2 - Adding and Configuring a Lookup Transformation

You can use the Look up transformation to derive new column value based on the existing set of values in input columns. This transformation helps you access additional information in a related table. The data in the related table is based on values in common columns. In other words, this transformation looks for values in a reference dataset using exact match. The reference data set can be the result of a SQL query, an existing or a new table, an existing view or even a cache file. You need to use an OLE DB data connection or a Cache connection for this transformation based on the type of the dataset.

In this task, you will create a Data Flow task to extract customer information from a text file. You will use a Lookup transformation to look up zip code information from a reference table in the AdventureWorks2012 database for the matching city and state columns in the text file. You will then combine the zip code information along with other customer information and load the same into the respective table in the AdventureWorks2012.

Exercise 3 - Using Fuzzy Lookup Transformation

The Fuzzy Lookup transformation is similar to Lookup transformation except that this transformation uses fuzzy matching instead of exact matching to locate the match between the value in an input column and the value in the reference table. This transformation has a single input and a single output and returns records that have one or more close matches in the reference table.

In this task, you will create a table named FuzzyLookUpSource that will hold the names of certain business entities. You will then use the Fuzzy Lookup transformation to reference the values in the name columns of the Person.Person table in the AdventureWorks2012 database for matching them against the column values in the FuzzyLookUpSource table and find the similarity threshold based on the match.

Exercise 4 - Adding and Configuring Data Correction Transformation

You can use DQS Cleansing transformation for correcting data from a connected data source to maintain data integrity. This transformation is a knowledge driven data quality service. To use this transformation, you need to first create a knowledge base applicable to the data in the data sources that needs to be cleaned. You can either build this knowledge base through knowledge discovery wherein you obtain samples of your data, or by manually adding the relevant data to the knowledge base. You can also use external knowledge from Windows Azure Data Market and configure your knowledge base. You need to store the knowledge within the context of data domains. You can create and maintain knowledge base through DQS client. Once the knowledge base is created, you can configure the DQS to use the created Knowledge Base for correcting the incorrect data in the connected data source.

In this task, you will use a Flat File Source adapter to read incorrect data from a text file. You will then use the DQS Cleansing transformation and connect it with the Flat File Source adapter. You will then create a Knowledge Base in a Data Quality Knowledge Domain relevant to the data in the text file using the DQS Client. You will then configure the DQS Cleansing transformation to connect with the created Knowledge Base to enable it to correct the data in the connected Flat File Source.

Exercise 5 - Optimizing Integration Services Packages

In this exercise, you will learn the following in Microsoft SQL Server 2012:

Optimizing an integration services package for speed of execution

Comprehensive Learning

See the full benefits of our immersive learning experience with interactive courses and guided career paths.