Importing Data Into Master Data Services 2012 – Part 2

Welcome to Part 2 of a 3-part series on using the new Entity-Based Staging Process in Master Data Services (MDS) 2012.

Part 1:  Overview of the Staging Process

Part 2:  Loading a Model Where All Attributes Come From a Source System

Part 3:  Loading a Model Where Some Attributes Are Maintained Directly in MDS

Sample Model

First, a few quick words about the sample model we’re going to use.  It’s a simple Account Model which has 3 entities.  The Account Type and Account Class entities exist solely to populate the primary Account entity.

Conceptually the Account Model looks like this:

     image

The Account Entity in MDS looks like this:

     image

The remainder of this blog entry discusses loading the Account model at the leaf (most detailed) level.  All attributes, for all entities, come from a source system.

Leaf Member Staging Tables

To support the Account model, 3 staging tables were created by MDS when the entities and attributes were set up:

     image

Recall that the Account Class and Account Type are domain-based attributes, used as follows:

  • Account Type:  Contains a distinct list of Account Type code and name – used as the lookup table for the main Account entity.
  • Account Class:  Contains a distinct list of Account Class code and name – used as the lookup table for the main Account entity.
  • Account:  Contains the accounts, plus the mapped values of Account Type and Account Class. 

For purposes of this example, we are going to assume all of the attributes in these entities come from a source system external to MDS.  (Note:  Part 3 of this series looks at how to handle it if one or more attributes are maintained directly in MDS – i.e., the data doesn’t exist in a source system anywhere.)

SSIS Package

Following is an example of the flow for the SQL Server Integration Services (SSIS) package.  Note that in a Production ETL environment a few things will be more dynamic, but the purpose of this example is to be simple & straightforward.

     image

 

image

Step 1:  Truncates each of the 3 staging tables.  This is to eliminate the data from the last time this process ran.

     image

 

imageStep 2:  Data flow to populate each entity in the model.

  image

In your source query, make sure you pull back the code field for the domain-based attributes rather than the name field.  For the Account entity, we’ll only populate Account Type and Account Class fields with the code value.  However, if we were populating the Account Class or Account Type entity, we would populate both code and name.

The derived columns include:

  • ImportType of 2 (which means updates can be accepted vs. a failure).  There are 6 codes to choose from to control the behavior.  The list of ImportType codes can be found here:  http://msdn.microsoft.com/en-us/library/ee633854.aspx
  • ImportStatus_id of 0 (which means each record is ready).
  • BatchTag which I chose to create from an ETLExecution_id (i.e., a unique code used for ETL job management) plus a string which refers to this entity.  The reason the string is appended is so that each BatchTag is unique for every data flow.  MDS will generate an error if > 1 of the same BatchTag executes at the same time.  Therefore, each of the 3 data flows in my example append a different string so they are allowed to run in parallel for speed.

     image

The data conversions are only needed because my source data was non-Unicode, and MDS is Unicode.

     image

Mapping of fields into the Leaf Member Staging Table looks like this:

     image

The Code can be ignored if you specified for the Code to be created automatically when the entity was set up (a new feature in MDS 2012).

More information about populating the Leaf Member Staging Table can be found here:  http://msdn.microsoft.com/en-us/library/ee633854.aspx 

image

Step 3:  Load the MDS Model.  With this step, we’re going to be invoking the MDS-provided stored procedure that takes the data out of staging and moves it into the actual model. 

Each staging table has its own stored procedure.  The stored procedure name will be in the format of udp_EntityName_Leaf (unless a different name was chosen when the entity was created). 

     image

Syntax is as follows:

DECLARE @RC int
DECLARE @VersionName nvarchar(50)
DECLARE @LogFlag int
DECLARE @BatchTag nvarchar(50)

SET @VersionName = N'VERSION_1'
SET @LogFlag = 1
SET @BatchTag = (? + ' Account')

EXECUTE @RC = [stg].[udp_Account_Leaf]
   @VersionName
  ,@LogFlag
  ,@BatchTag
GO


Note that the BatchTag needs to match what was specified in the Step 2 data flow.  I’ve set mine to be unique per entity so the batches can be executed in parallel.  The ? in the screen shot above signifies I’m using a parameter (which is mapped to a variable which contains the ETLExecution_id); this helps the BI team managing nightly loads know which run this was associated to.

A LogFlag of 1 specifies logging is enabled.  The VersionName also needs to be specified – a slight complication if you do a lot of versioning in your environment.

Execution of this step to load the model is what correlates to the Integration Management page in MDS:

     image

More information about loading the MDS Model can be found here: http://msdn.microsoft.com/en-us/library/hh231028.aspx

image

Step 4:  Validation of the Model.  Until validation is performed, the new inserts and updates are there but in a “waiting validation” status.  If any Business Rules have been defined for the model, they will be applied when Validation is run.

The Validation applies to the model as a whole, so it only needs to be done once regardless of how many entities we just loaded.

     image

Syntax is as follows:

DECLARE @ModelName nVarchar(50) = 'Account Model'
DECLARE @Model_id int
DECLARE @UserName nvarchar(50)=?
DECLARE @User_ID int
DECLARE @Version_ID int

SET @User_ID =  (SELECT ID 
                 FROM  mdm.tblUser u
                 WHERE u.UserName = @UserName)

SET @Model_ID = (SELECT Model_ID
                 FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                 WHERE Model_Name = @ModelName)

SET @Version_ID = (SELECT MAX(ID)
                   FROM mdm.viw_SYSTEM_SCHEMA_VERSION
                   WHERE Model_ID = @Model_ID)

EXECUTE mdm.udpValidateModel @User_ID, @Model_ID, @Version_ID, 1

More information about validating the Model can be found here:  http://msdn.microsoft.com/en-us/library/hh231023.aspx 

Execution of this step changes the yellow question marks to green check marks:

     image

imageThe 5th and final step I have in my package is to check if any errors occurred and then alert the BI Team and/or Data Steward if needed.  This is done by querying the views provided by MDS for each entity. 

In my process, I just do a simple query that checks if 1 or more error records exist and if so, send an email.  Note the ResultSet is set to Single row.  The total count is passed to a variable.  If the value of the variable is >=1, the next step of generating an email will be kicked off.

     image

The syntax is as follows:

;WITH CTE_CheckEachTable AS
(
SELECT Count_MDSErrors = COUNT(*)
FROM stg.viw_Account_MemberErrorDetails

UNION ALL

SELECT Count_MDSErrors = COUNT(*)
FROM stg.viw_Account_Class_MemberErrorDetails

UNION ALL

SELECT Count_MDSErrors = COUNT(*)
FROM stg.viw_Account_Type_MemberErrorDetails

)

SELECT Count_MDSErrors = SUM(Count_MDSErrors)
FROM CTE_CheckEachTable

More information on viewing errors can be found here:  http://msdn.microsoft.com/en-us/library/ff486990.aspx 

Out Of Scope

The above process excludes the following:

  • Changing the code for an existing record (i.e., usage of the New Code field in the staging table).
  • Deactivating or deleting an existing record (i.e., via usage of specific ImportType codes).

Finding More Information

Part 1 of this series (Overview of the Staging Process)

Part 3 of this series (discusses Loading a Model Where Some Attributes Are Maintained Directly in MDS)