White Paper on DataStage Enterprise Edition


ecclesr
Premium Member
Posts: 260
Joined: Sat Apr 05, 2003 7:12 pm
Location: Australia

Post by ecclesr »

Hi

I am just starting the process of creating a short position paper on DataStage Enterprise Edition for my manager.

We are currently not an Enterprise Edition site.

I would like to try to cover the following in the paper:
- Platforms that it runs on
- An example design of a job before and after conversion to a parallel job design
- Effort to migrate and convert to parallel job designs
- Some of the issues people have encountered in such a migration
- Benefits

I have searched the Ascential library, but found that a frustrating exercise.

Any content or pointers that members can provide would be much appreciated, and would help me put something on paper on this topic.

You can forward to my email address

Thanking you in advance

Ross Ecclesfield
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Firstly, you are on Windows, which means you cannot use Enterprise Edition until version 7.5.1 is released at the start of next year. That will be the first version of parallel jobs that runs on Windows platforms.

Secondly, you need multiple CPUs to get any benefit, preferably four or more. You also need a very high data volume to justify the move. You may be able to remove your current bottlenecks with a more efficient design and multiple instance jobs.
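
To put rough numbers on the second point, here is a back-of-envelope sketch in Python. It is not DataStage code. It is a simple Amdahl-style model, and every figure in it (the per-row cost, a fixed startup cost for the parallel job, the fraction of work that cannot be parallelised) is an assumption chosen for illustration, not a measurement.

```python
def serial_runtime(rows, per_row_seconds=0.0001):
    """Estimated seconds for a single-threaded job (assumed per-row cost)."""
    return rows * per_row_seconds


def parallel_runtime(rows, cpus, per_row_seconds=0.0001,
                     startup_seconds=20.0, serial_fraction=0.1):
    """Estimated seconds for a partitioned job.

    startup_seconds and serial_fraction are hypothetical values:
    a fixed cost to start the parallel job, and the share of the
    work that cannot be spread across CPUs.
    """
    work = rows * per_row_seconds
    serial_part = work * serial_fraction
    parallel_part = work * (1 - serial_fraction) / cpus
    return startup_seconds + serial_part + parallel_part


def speedup(rows, cpus):
    """Ratio of serial to parallel runtime; above 1 means parallel wins."""
    return serial_runtime(rows) / parallel_runtime(rows, cpus)


if __name__ == "__main__":
    for rows in (100_000, 10_000_000, 1_000_000_000):
        for cpus in (1, 2, 4, 8):
            print(f"rows={rows:>13,} cpus={cpus}: speedup {speedup(rows, cpus):.2f}x")
```

With these made-up figures, a 100,000-row job comes out slower in parallel than in serial because the startup cost dominates, while a billion-row job on four CPUs comes out roughly three times faster. Real numbers will depend entirely on your hardware and job design.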

Thirdly, compare the move to Enterprise with a move to RTI. This might get you the scalability you are looking for without a big hardware or development effort.

Now to answer your questions. DataStage Enterprise is certified for Unix and Linux platforms. At the start of next year there will be versions for Windows and Mainframe Unix System Services. I usually view the platforms from e.services but it's currently down.

For example jobs, go to Ascential devnet and look through the upload/download lists.

The effort to migrate is considerable: you are rewriting all server jobs as parallel jobs. You may be able to limit this work by migrating only those jobs that are a bottleneck and handle a lot of data, and then running a combination of server and parallel jobs.

Details of the benefits of the parallel architecture are at http://www.ascential.com/products/platf ... ility.html
This page has further links to Ascential Grid Computing, ETL Benchmark etc.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

A white paper on PX, its differences from server, and related issues was written almost two years ago by BigPoppa (since departed from the DataStage world) and me. Its publication on the then Tools4DataStage site was blocked by Ascential on "legal grounds".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
ecclesr
Premium Member
Posts: 260
Joined: Sat Apr 05, 2003 7:12 pm
Location: Australia

Post by ecclesr »

Hi Ray and Vincent

Thank you for your replies. So far I have been unable to find any documents showing even the simplest example of a sequential job converted to a parallel job, with a simple explanation of the steps and effort required.

I have worked my way around the Ascential site and devnet pages, found them to be the most frustrating I have ever come across, and come up with nothing. It is like trying to get blood out of a stone.

Ross Ecclesfield
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Yep, that's Ascential documentation. You'll be pleased to hear they have hired technical writers for how-to documents in upcoming releases. The Parallel Job Developer's Guide is probably the best piece of documentation, with a good description of parallel architecture. If you don't have a copy (it gets installed with the Enterprise Edition), I'm sure we can get one to you. That will answer most of your questions about how parallel processing works.

If you have a DataStage install CD you might find it in the documentation directory.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You will find it there. The best bits are Chapters 1 and 2 of parjdev.pdf.