Bringing in new source system to existing MDM

Infosphere Master Data Management theory and best practices

hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am


Post by hitmanthesilentassasin »

Hi,

We have implemented a registry-style MDM for one of our clients with a few source systems. Now the client wants to add more source systems to the existing ones. However, the client doesn't want the EIDs to change. I can't use the delta processing method because it runs on Clover ETL, which takes very long to load and consumes too many resources. I can't use the batch loader because it is part of the Advanced Edition. Does anyone know of another method to bring in the new source system?

Thanks!!
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

The newer versions of MDM typically include a bundled license for using DataStage (and possibly QualityStage too; I don't recall) with MDM, up to a certain PVU license value. I'm not sure what version of MDM started bundling such licenses. We started MDM with version 10.1 and it was included, but we already had it up and running for many years prior. Not sure if this helps, but it may be an option worth checking into further.
Choose a job you love, and you will never have to work a day in your life. - Confucius
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

qt_ky wrote: The newer versions of MDM typically include a bundled license for using DataStage (and possibly QualityStage too; I don't recall) with MDM, up to a certain PVU license value. I'm not sure what version of MDM started bundling such licenses. We started MDM with version 10.1 and it was included, but we already had it up and running for many years prior. Not sure if this helps, but it may be an option worth checking into further.
Yes, I am aware of DS as another possibility. However, that would be a secondary option for the client, because DS is another application that would need to run alongside MDM and share its resources.

I am using memput to load it for now. Do you have any thoughts on loading a large volume of data without taking too much time?
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I am not aware of any restriction on sharing the server resources. You may want to double check with the vendor you purchased MDM from about that. My interpretation was that MDM was bundled with a PVU license for using a certain edition of DataStage. You should be able to install that on separate, dedicated hardware that does not exceed the PVU limit. I do not know of any better or faster ETL tool than DataStage.

As far as bulk loading into MDM, there are a few options documented in the MDM Knowledge Center. The one I have seen done involved building a specially formatted hybrid text file having a single MDM XML transaction per line that the MDM batch processor would consume. I do not think of it as being extremely fast because MDM has to parse out all that XML, so it seemed to have a lot of overhead associated with it, a tradeoff for the flexibility.
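A minimal sketch of the "one XML transaction per line" file layout described above. The element names (`Transaction`, `SourceCode`, `MemberId`) are hypothetical placeholders and not the real MDM transaction schema; only the one-document-per-line shape is taken from the post.

```python
import io
import xml.etree.ElementTree as ET


def to_single_line_xml(record: dict) -> str:
    """Serialize one record as XML on a single line.

    "Transaction" and the field names are placeholders, NOT the real
    MDM schema; substitute the documented request format.
    """
    txn = ET.Element("Transaction")
    for field, value in record.items():
        child = ET.SubElement(txn, field)
        child.text = str(value)
    xml = ET.tostring(txn, encoding="unicode")
    # Collapse embedded line breaks so each transaction stays on one line.
    return " ".join(xml.splitlines())


def write_transactions(records, stream) -> int:
    """Write one XML transaction per line; return the number written."""
    count = 0
    for record in records:
        stream.write(to_single_line_xml(record) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample records from two source systems.
    sample = [
        {"SourceCode": "CRM", "MemberId": 1001},
        {"SourceCode": "ERP", "MemberId": 2002},
    ]
    buffer = io.StringIO()
    write_transactions(sample, buffer)
    print(buffer.getvalue(), end="")
```

Writing to any file-like object keeps the sketch testable in memory; for a real load you would pass an open file instead of `io.StringIO`.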
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I also recall another method discussed on the MDM site which involved directly loading the base tables. That would be the fastest way to do an initial bulk load; however, I seriously doubt it would be recommended for any subsequent loads, because you would need to turn on the history triggers and then route everything after that through all the MDM service layers as transactions.

The only other thing I can think to mention, and this may vary according to how you licensed MDM, is that in our case, there are no hardware or PVU limits on our MDM implementation. If we need more horsepower, we can simply add more cores and memory to the MDM servers dynamically at any time.
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

The concern here wasn't about the license but about MDM and DS sharing resources. I did the initial load using the bulk load with unl files, which was pretty fast to process.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

What is the sharing concern exactly? Not clear.
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

MDM 11 added Information Server for DQ to the license. MDM 11.3 upgraded it to Information Server Enterprise Edition, so you get every product in the suite. The usage limits are 480 PVUs in production, only 2 authorized users, and only patterns where MDM is the source or target.
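As a rough illustration of what the 480 PVU limit means for sizing: PVU consumption is the number of processor cores multiplied by the per-core PVU rating. The rating of 70 PVUs per core used below is an assumption for illustration only; look up the actual rating for your processor in IBM's PVU table.

```python
PVU_LIMIT = 480             # production limit quoted in the post
ASSUMED_PVUS_PER_CORE = 70  # assumption: depends on the processor model


def total_pvus(cores: int, pvus_per_core: int) -> int:
    """PVUs consumed by a server: cores times the per-core rating."""
    return cores * pvus_per_core


def max_cores_within_limit(pvu_limit: int, pvus_per_core: int) -> int:
    """Largest whole number of cores that stays within the PVU limit."""
    return pvu_limit // pvus_per_core


if __name__ == "__main__":
    cores = max_cores_within_limit(PVU_LIMIT, ASSUMED_PVUS_PER_CORE)
    used = total_pvus(cores, ASSUMED_PVUS_PER_CORE)
    print(f"{cores} cores use {used} of {PVU_LIMIT} PVUs")
```

Under that assumed rating, one more core would push the server over the limit, which is why the per-core rating matters when choosing hardware for the bundled license.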

The best uses of IIS EE to support MDM are InfoSphere Discovery for source-to-target mapping discovery on new sources, Business Glossary to publish the MDM elements as a catalog and display MDM data lineage, and DataStage for bulk load and delta data processing.

The installation quandary is whether you install IIS EE onto the same server as MDM - where they compete for CPU and RAM - or procure a new server and pay the ongoing support costs for that server. It's better to keep IIS EE and MDM on different servers, at least in production. DataStage might be faster if it runs the MemPut function as parallel streams via the new MDM stage. It could end up being many times faster than Clover on a 480 PVU box.

The older MDM 10 setup installed Clover ETL straight onto the MDM server because it had a lighter installation footprint; however, it is also less scalable and less powerful as a result.
hitmanthesilentassasin
Participant
Posts: 150
Joined: Tue Mar 13, 2007 1:17 am

Post by hitmanthesilentassasin »

I definitely agree with you, Vincent. As a workaround, I tried mocking parallel processing and frequent commit intervals using Clover ETL, and the turnaround time dropped to a third of the original time. I haven't tested it with DS. I'll try it and compare the difference. I hope it's not as sluggish as Clover is.
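For anyone wanting to try the same idea outside Clover, here is a minimal sketch of parallel streams with a commit interval. It is not the Clover graph or the MDM API: `open_session`, `put`, and `commit` are hypothetical stand-ins for whatever interface actually performs the memput-style write, and `InMemorySession` only exists so the sketch can run on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


class InMemorySession:
    """Stand-in for a real load session (hypothetical interface)."""

    def __init__(self, sink: list):
        self.sink = sink
        self.pending: list = []

    def put(self, record) -> None:
        self.pending.append(record)

    def commit(self) -> None:
        self.sink.extend(self.pending)
        self.pending = []


def partition(records: Sequence, streams: int) -> List[list]:
    """Split records round-robin into `streams` partitions."""
    return [list(records[i::streams]) for i in range(streams)]


def load_partition(records, open_session: Callable, commit_interval: int) -> int:
    """Load one partition on its own session; return the commit count."""
    session = open_session()
    pending = 0
    commits = 0
    for record in records:
        session.put(record)
        pending += 1
        if pending == commit_interval:
            session.commit()
            commits += 1
            pending = 0
    if pending:  # flush the final partial batch
        session.commit()
        commits += 1
    return commits


def parallel_load(records, open_session, streams=4, commit_interval=500) -> int:
    """Run one loader per partition in parallel; return total commits."""
    parts = partition(records, streams)
    with ThreadPoolExecutor(max_workers=streams) as pool:
        futures = [
            pool.submit(load_partition, part, open_session, commit_interval)
            for part in parts
        ]
        return sum(future.result() for future in futures)


if __name__ == "__main__":
    loaded: list = []
    commits = parallel_load(
        list(range(2000)), lambda: InMemorySession(loaded), streams=4
    )
    print(f"{len(loaded)} records loaded in {commits} commits")
```

Round-robin partitioning is the simplest choice here; if records for the same member must be processed in order, partitioning by member key instead would keep them in one stream.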

Thanks for your reply!!