Hi,
I am working as a DataStage architect for a French firm. We are putting in a DataStage framework from a local supplier and, in the process, moving a small number of high-volume Server jobs to PX.
The quality of the jobs and data is variable and, knowing that PX is fussier in its treatment of data than Server, I am wondering if there is any mileage in using Information Analyzer to profile the data and Metadata Workbench to view the lineage.
Given that we have to redo the jobs anyway, it might help us with conversion.
What are your experiences of using these two products?
Do you have any views on any other IBM products that simplify the design/redesign process you have been through?
Thanks
Colin
PS. I have posted this to the MDW system forum too for the MDW users
Is Information Analyzer actually any use ?
Colin Larcombe
-------------------
Certified IBM Infosphere Datastage Developer
Whether it is of great help, little help or no help depends on what you want to use it for.
I use it to profile the data: it gives the data architect some insight for data modelling, and it shows me, as the DS architect, what kinds of situations and exceptions I need to handle when designing the jobs.
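To make that concrete, here is a minimal hand-rolled sketch (in Python, with made-up data) of the kind of column profiling Information Analyzer automates: null counts, cardinality and observed types are exactly the facts that tell you which exceptions a PX job must handle. The function name and sample column are hypothetical, not part of any IBM API.

```python
# Minimal sketch of the column profiling Information Analyzer automates.
# Null counts, distinct values and observed types reveal the exceptions
# a job must handle. profile_column and customer_age are hypothetical.
from collections import Counter

def profile_column(values):
    """Summarise one column: nulls, cardinality and observed types."""
    nulls = sum(1 for v in values if v is None or v == "")
    non_null = [v for v in values if v not in (None, "")]
    types = Counter(type(v).__name__ for v in non_null)
    return {
        "count": len(values),
        "nulls": nulls,
        "distinct": len(set(non_null)),
        "types": dict(types),
    }

# A column mixing numbers, strings and missing values -- exactly the
# sort of variability PX is fussier about than Server.
customer_age = [34, 41, None, "unknown", 29, "", 34]
print(profile_column(customer_age))
# -> {'count': 7, 'nulls': 2, 'distinct': 4, 'types': {'int': 4, 'str': 1}}
```

A real profiling run would also infer formats and value distributions, but even this level of summary is enough to flag the mixed-type column before a PX job rejects it at runtime.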
Priyadarshi Kunal
Genius may have its limitations, but stupidity is not thus handicapped.
You don't have much to lose by configuring and running the data lineage - it may give you some clues about how jobs share common data such as shared tables and files. I don't think lineage will help you work out the logic of a job as well as reading the jobs themselves. Lineage can show how the data flows, but not what is done to it in terms of transforms, constraints and user-defined SQL, and it will be completely blank on BASIC routines.
I would think about using InfoSphere Discovery to help reverse engineer what is happening to the data. Information Analyzer will give you table and column analysis and help you work out what data types are really in the data and what missing or default values are present. If you were to run Discovery against the source data and against the target data and perform overlap analysis and transformation discovery it would try to work out the mappings for you - and you can load those mappings into FastTrack/DataStage.
Whenever you have a populated source and a populated target, Discovery is very good at cross-system profiling.
The main use for Information Analyzer in your scenario is whether you would use the DQ Rules stage for some common data quality checks - such as a valid email address. Information Analyzer's rules interface may replace some of the old BASIC routines and provide re-usable checks shared between jobs.
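The shape of such a re-usable rule - a named check applied value by value, returning pass/fail counts - can be sketched roughly like this in Python. The regex is deliberately simple and the helper names are hypothetical; Information Analyzer's built-in rule definitions are much richer than this.

```python
# Hypothetical sketch of a re-usable data quality rule of the kind an
# Information Analyzer rule (or an old BASIC routine) encodes: a named
# check applied value by value, returning pass/fail counts.
import re

# A deliberately simple email pattern -- real rules are stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def valid_email(value):
    """Return True if the value looks like an email address."""
    return bool(EMAIL_RE.match(value or ""))

def run_rule(name, check, values):
    """Apply one named check to a column and summarise the result."""
    failures = [v for v in values if not check(v)]
    return {"rule": name, "checked": len(values), "failed": len(failures)}

emails = ["colin@example.fr", "no-at-sign", None, "a@b.co"]
print(run_rule("valid_email", valid_email, emails))
# -> {'rule': 'valid_email', 'checked': 4, 'failed': 2}
```

Because the rule is just a named check plus a summary, the same definition can be bound to different columns in different jobs, which is the re-use Vincent is pointing at.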
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney