Author Archives: Greeny Dangerous

Pimp my Data Warehouse: MDS-parametrized dimension tables

Every once in a while we need a “classification dimension” table in a data warehouse. For example, suppose we have a basic sales fact table with a column like “(Sold)Quantity” that tells how many items were sold within the sales transaction. The data in the column varies between 1 and 10 000 (there is no absolute maximum value, but let's fix one here to keep things clear). Users often want to have different classifications depending on the Quantity column, for example:

  • 1-10: Tiny sales
  • 11-50: Small sales
  • 51-100: Medium
  • 101-500: Huge
  • 501-nnn: Enormous

Correspondingly, our dimension table D_SALES_SIZE_CLASS would look like this:

CLASS_ID CLASS_NAME
1 Tiny
2 Small
3 Medium
4 Huge
5 Enormous

The problem often is that if we go by the book we need to create a reference id column, e.g. “SALES_SIZE_CLASS_ID”, in the fact table and then reload the whole sales fact table from scratch. In big environments this can be a problem because the reloading could take hours and you might need to arrange a downtime window for the production DW. An alternative is to use a computed (virtual) column with a CASE expression to derive the correct value for SALES_SIZE_CLASS_ID; that way we avoid reloading the fact table.

But a more comprehensive way of doing this is not to alter the sales fact table at all but to do all the magic in the dimension table, by generating as many rows in the dimension table as there are distinct values in the Quantity column of the sales fact table. The fact table then joins to the dimension directly on the quantity value. Like this:

QuantityAmount CLASS_ID CLASS_NAME
1 1 Tiny
2 1 Tiny
3 1 Tiny
4 1 Tiny
5 1 Tiny
6 1 Tiny
7 1 Tiny
8 1 Tiny
9 1 Tiny
10 1 Tiny
11 2 Small
12 2 Small
13 2 Small
14 2 Small
...
48 2 Small
49 2 Small
50 2 Small
51 3 Medium
52 3 Medium
53 3 Medium
...
9999 5 Enormous
10000 5 Enormous

Whenever the sales classification requirements change, we only need to modify the dimension table rather than load the possibly enormous sales fact table once again from scratch.
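To make the idea concrete, here is a minimal sketch of the join, run against an in-memory SQLite database from Python rather than a real DW. The fact table name F_SALES and its columns are made up for the illustration; only D_SALES_SIZE_CLASS and its columns come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- F_SALES is a hypothetical, heavily simplified sales fact table.
    CREATE TABLE F_SALES (SALE_ID INTEGER PRIMARY KEY, QUANTITY INTEGER NOT NULL);
    CREATE TABLE D_SALES_SIZE_CLASS (
        QuantityAmount INTEGER PRIMARY KEY,
        CLASS_ID INTEGER NOT NULL,
        CLASS_NAME TEXT NOT NULL
    );
    INSERT INTO F_SALES (SALE_ID, QUANTITY) VALUES (1, 3), (2, 12), (3, 51);
    -- Only the quantities needed for the demo are loaded here.
    INSERT INTO D_SALES_SIZE_CLASS VALUES
        (3, 1, 'Tiny'), (12, 2, 'Small'), (51, 3, 'Medium');
""")

# The fact table joins to the dimension on the raw quantity value,
# so no SALES_SIZE_CLASS_ID column is needed in the fact table.
QUERY = """
    SELECT f.SALE_ID, d.CLASS_NAME
    FROM F_SALES AS f
    JOIN D_SALES_SIZE_CLASS AS d ON d.QuantityAmount = f.QUANTITY
    ORDER BY f.SALE_ID
"""
before = conn.execute(QUERY).fetchall()

# Reclassify quantity 12 as Tiny: only the dimension row is touched.
conn.execute(
    "UPDATE D_SALES_SIZE_CLASS SET CLASS_ID = 1, CLASS_NAME = 'Tiny' "
    "WHERE QuantityAmount = 12"
)
after = conn.execute(QUERY).fetchall()
print(before)
print(after)
```

The second query result reflects the new classification although no row in F_SALES was changed.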

Making it customizable and accessible by end-users with MDS

After it is implemented in the DW, why not make it customizable by end users so they can change the classification whenever there is a need? For that, MDS once again comes in handy. Traditionally one would bury this deep in the ETL logic, and whenever there was a need for change you needed an IT specialist to modify the ETL scripts. Now we just need to set up a special parameter entity in MDS and add some useful attributes like this:

Sales classification parameter entity in MDS

Create a new entity PARAM_SALES_SIZE_CLASS with attributes MIN_VALUE and MAX_VALUE to define the limit values for each class. These are very easily editable by end users. Just make sure that the value ranges don’t overlap each other.
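Since the ranges are maintained by hand, it may be worth checking them before they are used. The following is only a sketch in Python, assuming the entity members have already been read into memory; the SizeClass type and the find_overlaps helper are hypothetical names, not part of MDS.

```python
from itertools import combinations
from typing import NamedTuple


class SizeClass(NamedTuple):
    """One member of PARAM_SALES_SIZE_CLASS (hypothetical in-memory form)."""
    class_id: int
    name: str
    min_value: int
    max_value: int


def find_overlaps(classes):
    """Return (name, name) pairs whose inclusive [min, max] ranges overlap."""
    return [
        (a.name, b.name)
        for a, b in combinations(classes, 2)
        if a.min_value <= b.max_value and b.min_value <= a.max_value
    ]


classes = [
    SizeClass(1, "Tiny", 1, 10),
    SizeClass(2, "Small", 11, 50),
    SizeClass(3, "Medium", 51, 100),
]
print(find_overlaps(classes))  # no overlapping pairs in this data

# A typo in MIN_VALUE (10 instead of 11) makes Tiny and Small overlap.
broken = [classes[0], SizeClass(2, "Small", 10, 50), classes[2]]
print(find_overlaps(broken))
```

The pairwise comparison is quadratic, which should be fine for the handful of classes a parameter entity like this typically holds.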

SQL script to generate the sales classification dimension table rows

Then it is time for some SQL magic. We generate 10 000 rows (or any other value that represents the biggest possible sales quantity size) and assign the class levels by using the minimum and maximum values defined.

SQL query results ready for loading to dimension table

This will generate 10 000 rows that are ready to be inserted into our dimension table (e.g. D_SALES_SIZE_CLASS). We now have a fully customizable dimension table that can be edited whenever the business users want to and when they need a different point of view for their analysis.
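The original script was written for SQL Server and is not reproduced here. As a rough sketch of the same technique, the following runs a comparable query against an in-memory SQLite database from Python: a recursive CTE produces the numbers 1 to 10 000 and a range join assigns the class. The upper limit of 10 000 for the Enormous class is an assumption standing in for "nnn".

```python
import sqlite3

MAX_QUANTITY = 10_000  # assumed biggest possible sales quantity

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the MDS parameter entity, using the ranges from the example.
    CREATE TABLE PARAM_SALES_SIZE_CLASS (
        CLASS_ID INTEGER PRIMARY KEY,
        CLASS_NAME TEXT NOT NULL,
        MIN_VALUE INTEGER NOT NULL,
        MAX_VALUE INTEGER NOT NULL
    );
    INSERT INTO PARAM_SALES_SIZE_CLASS VALUES
        (1, 'Tiny', 1, 10),
        (2, 'Small', 11, 50),
        (3, 'Medium', 51, 100),
        (4, 'Huge', 101, 500),
        (5, 'Enormous', 501, 10000);
""")

# Generate one row per possible quantity and look up its class by range.
rows = conn.execute(
    """
    WITH RECURSIVE numbers(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM numbers WHERE n < ?
    )
    SELECT n AS QuantityAmount, p.CLASS_ID, p.CLASS_NAME
    FROM numbers
    JOIN PARAM_SALES_SIZE_CLASS AS p
      ON n BETWEEN p.MIN_VALUE AND p.MAX_VALUE
    ORDER BY n
    """,
    (MAX_QUANTITY,),
).fetchall()

print(len(rows))
print(rows[0], rows[10], rows[-1])
```

Because it is an inner join, a quantity that falls into a gap between ranges would silently drop out of the result, which is one more reason to validate the parameter values first.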

TM1 dimension hierarchy levels not showing up in Cognos BI Report Studio

Still being a “TM1 freshman”, I find myself facing some mysterious problems, especially when it comes to the integration between TM1 and Cognos BI. Recently we made updates to a customer’s production environment. We copied new dimension and cube files and the associated }-prefixed control cube files from the test environment. The cubes were also used by many reports in Cognos BI and everything else seemed fine, but the hierarchy levels were missing:

Dimension levels not showing up in Report Studio

Normally updating the reporting package from Framework Manager does the trick, but this time it wasn’t enough. It’s a relief to have a community like TM1 Forum where hundreds of professionals gather and offer their help. A quick search there revealed that there is a special control cube called }HierarchyProperties, where you can choose whether to publish the dimensions with level information or not:

}HierarchyProperties cube

So I thought that adding the level names there and once again refreshing the reporting package would do the trick, but there were still no levels showing up. Going through more threads I discovered that there is also a special TI function called RefreshMDXHierarchy() that one should execute after making changes in the control cube. Finally, that missing function call did the trick and the hierarchy levels appeared in Report Studio! One more thing to remember in the future …

Edit 19.9.2013: Make sure you first run RefreshMDXHierarchy(' ') and then RefreshMDXHierarchy('<your_dimension_name>').

Edit 18.11.2013: IBM released a new technote regarding this same matter. According to it, one must also check the following things:

  • Determine whether CQM or DQM (Compatible Query Mode or Dynamic Query Mode) is being used in BI.
  • If using CQM, navigate to CognosInstallDIR\data\cqe\RTModels and remove everything within it (folders and files) so that there is an empty RTModels folder.
  • If using DQM, then navigate to CognosInstallDIR\data\mfw4j\PMCs and remove everything within it (folders and files) so there is an empty PMCs folder.

NOTE that you will need to stop BI to perform the actions above.
Alternatively, try republishing the package via FM.


Managing hierarchies being used in IBM Cognos TM1 – precautions

TM1 doesn’t like duplicate names in hierarchies

IBM Cognos TM1 is an excellent tool when it comes to planning, budgeting and forecasting. It also has hierarchy management capabilities but often you want to import hierarchies and data from an enterprise MDM solution such as Master Data Services (MDS) in SQL Server.

When importing hierarchies and data to TM1 one must be careful with the business codes, ids and names of the entities being imported, since TM1 has an interesting “limitation” (or feature, depending on whom you ask): all the elements within a dimension must have not only a unique id but also a unique business code and a unique name. It means that the following cost center source data example cannot be imported because the elements have duplicate names (although the business codes differ):

Type          Code  Name            Parent
Consolidated  1000  Administration
Leaf          1     General         1000
Leaf          2     Purchasing      1000
Consolidated  2000  Production
Leaf          3     General         2000

To prevent situations like this and avoid problems when importing data to TM1, one must be sure that there are no duplicate names or codes in the source data. In MDS this situation can be handled nicely by using MDS business rules.
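Outside MDS, the same check can be sketched in a few lines of Python. This is only an illustration on the sample rows above; the duplicate_names helper is a made-up name, and it compares names exactly as written.

```python
from collections import Counter

# (type, code, name, parent) rows from the cost center example
cost_centers = [
    ("Consolidated", "1000", "Administration", None),
    ("Leaf", "1", "General", "1000"),
    ("Leaf", "2", "Purchasing", "1000"),
    ("Consolidated", "2000", "Production", None),
    ("Leaf", "3", "General", "2000"),
]


def duplicate_names(rows):
    """Return the names used by more than one element, leaf or consolidated."""
    counts = Counter(name for _type, _code, name, _parent in rows)
    return sorted(name for name, count in counts.items() if count > 1)


print(duplicate_names(cost_centers))
```

Note that the check runs across leaf and consolidated rows together, so a name shared between the two kinds is reported as well.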

Deploy Business Rules to prevent duplicates

In MDS you can define business rules to automatically keep your data in shape and also to apply automatic processing to it if wanted. Now we want to prevent duplicate names in the cost center data (codes are automatically unique so we don’t need to think about them).

Force the cost center names to be unique

Now that the business rule is defined and published we can apply it to the data itself:

Apply the business rule to existing data

As we can see two cost centers have the same name “General” and that violates the business rule definition.

Two cost centers have the same name

The red flag indicates that there is a problem with the data and it needs to be fixed. You can also configure notification emails to be sent to business users when validation fails, so that someone can fix the issues right away before they cause errors in downstream systems such as TM1.

After the names are fixed and the business rules are applied again, the situation is OK: the data meets the quality needs of TM1 and it’s ready to go!

No duplicates exist - the data is good to go

Cost center hierarchy after fixing the data

Extra notes

Although we can prevent duplicate names within leaf members and within consolidated members independently, the business rules cannot be used to prevent them across both simultaneously: you can still have a leaf member named “General” and a consolidated member named “General”. And that’s also bad for TM1.

In most cases the hierarchy data goes first into a data warehousing system before it’s loaded to TM1. If that’s the case you can brute force the uniqueness of the names by storing both the consolidated and leaf members into a single table and then defining a unique constraint on the name column. That will keep your data well-formed. It’s not the most beautiful way to do it but a very effective one. Let’s hope that we can tackle this situation also in the future versions of MDS.
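As a sketch of that brute-force approach, the snippet below uses an in-memory SQLite database from Python. The table name COST_CENTER_MEMBER and its columns are invented for the illustration; a real DW would have its own naming and a different database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Leaf and consolidated members share one table; NAME must be unique overall.
conn.execute("""
    CREATE TABLE COST_CENTER_MEMBER (
        CODE TEXT PRIMARY KEY,
        MEMBER_TYPE TEXT NOT NULL,
        NAME TEXT NOT NULL UNIQUE
    )
""")
conn.execute(
    "INSERT INTO COST_CENTER_MEMBER VALUES ('1000', 'Consolidated', 'General')"
)

try:
    # Same name on a leaf member: the unique constraint rejects the row.
    conn.execute("INSERT INTO COST_CENTER_MEMBER VALUES ('1', 'Leaf', 'General')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

member_count = conn.execute("SELECT COUNT(*) FROM COST_CENTER_MEMBER").fetchone()[0]
print(rejected, member_count)
```

The failing load then has to be handled somewhere in the ETL, for example by logging the rejected row and sending it back for correction in MDS.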


RSAAD – “Reporting Services as a Dashboard”

Some might think that SQL Server Reporting Services (SSRS) is only for creating static and standardized reports with simple charts, as in the following picture:

This may be the most common impression of SSRS, but I can tell you that it is not the case. Thanks to the acquisition of the Dundas chart components, SSRS provides fully parametrized, nice-looking and fresh gauge panels and other chart components that can be used to create dashboards without investing in platforms such as SharePoint PerformancePoint Services, as they can be used straight from the Reporting Services portal just like any other report. Here is one example from my projects, a weather conditions dashboard made with SSRS:

Weather conditions dashboard made with SSRS

Once in a while you hear business decision-makers say that they would like some simple dashboards where they can see at a glance how the business is performing. The problem often is that people think they need to make (big?) investments in specific dashboard tools or bigger platforms that provide the dashboard capabilities. That is not the case, because with the latest SSRS gauge panels and charts you can provide fresh insights into your data and get the dashboard look and feel. They can also be easily integrated into existing sites and portals.

SSRS Gauges collection

I’m not telling you to get rid of or totally ignore products like SharePoint PerformancePoint Services, but often you can get nice results without them. So, next time you hear someone say that one needs to invest lots of money to get dashboard capabilities, first look at what SSRS has to offer and see if it already fits their needs.


Master Data Services 2012 is a must – Huge productivity improvements

Compared to the 2008R2 version you can now save as much as 75% of development time.

Microsoft just released the new SQL Server 2012 version. There are lots of new features included, but one major improvement concerns the Master Data Services (MDS) application. It has now moved to its second major release and it ships with the SQL Server 2012 Enterprise and Business Intelligence licenses. I won’t go into much detail, but what I want to point out here is that the new 2012 version of MDS has remarkable improvements when it comes to user experience and the productivity of development work.

The biggest and maybe the most important improvement is the new Excel add-in that can be used to modify data but also to create new entities and to import new data from scratch. Where the 2008R2 version relies only on the (clumsy) database staging tables, in 2012 one can now use Excel to import batches of new data. This is of course good news for new users that are already familiar with Excel. The second big improvement is the new Silverlight-based web user interface. It has been completely renewed and I can tell that now it is really ready for end users. Unfortunately the initial 2008R2 version still suffers from nasty bugs and the usability is quite poor.

What I will now show you is a comparison between MDS2008R2 and MDS2012 and how they differ from each other when importing new data and creating entities.

Creating new entities and importing data

Let’s get started: I have a simple Product structure that I grabbed from the AdventureWorks sample database. It consists of 5 entities stored in separate csv files:

  • ProductCategory
  • ProductSubCategory
  • ProductModel
  • Color
  • Product

What we will do next is:

  • Create a new Model to store the entities
  • Create 5 entities and the corresponding attributes + relationships: Product – SubCategory – Category, Product – Color and Product – ProductModel
  • Import data into entities
  • Create a derived hierarchy ProductCategory – ProductSubCategory – Product

Master Data Services 2008 R2

Create entities manually

In the old version we have to do it the hard way: first create the new entities in the web UI and then import the data via database staging tables using a proper ETL tool. I preferred SSIS.

This phase took me approximately 6 minutes to complete. Now I have the entity structure ready in MDS so it’s time to import some data.

Import data by using SSIS and the MDS staging tables

This is the most time-consuming part of the process. You have to create a separate data flow for each entity and you also must do some unpivoting of the data when importing entity attributes. After each data flow you must go to the Import/Export page in the web UI and start the importing process manually. (OK, you can do all this automatically by using the web service calls and all that stuff, but the manual approach is still the fastest way at this point.)

Importing the attributes for each entity is a bit tricky as you need to unpivot the source data into separate rows.
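For readers unfamiliar with unpivoting, here is a small Python sketch of the transformation itself, independent of SSIS. The sample product row and the unpivot helper are made up for the illustration; the point is only that one wide row becomes one (member code, attribute name, attribute value) row per attribute.

```python
def unpivot(rows, key="Code"):
    """Turn each wide row into (code, attribute name, attribute value) rows."""
    for row in rows:
        for attribute, value in row.items():
            if attribute != key:
                yield (row[key], attribute, value)


# Hypothetical source row, as it might come out of a csv file.
products = [
    {"Code": "P-1", "Name": "Sample product", "Color": "Red"},
]

attribute_rows = list(unpivot(products))
for attribute_row in attribute_rows:
    print(attribute_row)
```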

After a while and some serious SSIS-work we have 5 entities and a nice Product hierarchy set up in MDS:

The bad thing is that it took almost 30 minutes to accomplish all this. With all the work that you have to do in the MDS web UI we end up with approximately 40 minutes of work in total.

Master Data Services 2012

Create entities and import data

Now we’re talking. With MDS2012 you don’t actually have to separate the creating entities and importing data phases because now you can do them both at the same time! Starting with ProductCategory entity we just import the csv data into our Excel sheet and then connect to the MDS server and hit the Create Entity – button on the MDS ribbon.

Then we just choose the right columns for code and name usage and we’re ready. As easy as that!

Handle domain-based attributes

The first thing that came to my mind when hearing about the new Excel add-in was whether it was capable of dealing with domain-based attributes. And guess what, it sure is. Like we did with the ProductCategory entity, we do the same for ProductSubCategory. Notice here that there is a relation between SubCategory and Category and we have to handle it correctly as well. That can be managed by using the Attribute Properties functionality in the MDS ribbon.

Choose Constrained list (Domain-based) as the attribute type and populate values straight from the recently created ProductCategory entity:

Nice and easy! We have now successfully created the ProductCategory and ProductSubCategory entities and formed a relationship between them. All this in about 3 minutes. After repeating the same steps for the rest of the entities (ProductModel, Color and Product) we end up with the same result as in 2008R2 but in only 10 minutes of development time. That saves roughly 75% of the time compared to the process in MDS2008R2.


Phase            MDS 2008R2  MDS 2012
Create model     1 min       1 min
Create entities  6 min       -
Import data      30 min      10 min
Other            4 min       -
Total            41 min      11 min

Extra mentions about MDS2012

When it comes to automating data imports, the staging tables still come into the picture in MDS 2012 as well. They are now more user friendly, since you don’t have to import entity rows and the corresponding attributes into separate tables and you don’t have to unpivot the columns into rows in the tblStgMemberAttribute table. In MDS2012 there is a separate staging table for each entity and it maps 1:1 to the entity definition.

What really completes the whole package is the equally brand new Data Quality Services (DQS) application that works nicely together with MDS. More about DQS later …

Summary

When Microsoft launched the initial version of MDS in May 2010 it was a classic “first version” of the (recently acquired) product: missing features and minor bugs here and there. Now the second major release really finds its place in the hearts of users and developers, since there are lots of really good improvements that make it a better product.

So, if you are planning to start a fresh MDS project my honest advice is: don’t start with 2008R2, do it with SQL Server 2012.

-GD