Handling Huge files

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
nilotpalr
Participant
Posts: 29
Joined: Tue Dec 10, 2002 2:54 am

Handling Huge files

Post by nilotpalr »

Hi All,
I need to load files of 1 GB and upwards in Oracle 9i database. As of now I have one-to-one mapping. The datastage server job is taking around 40 minutes to do the same. When I use SQLLDR along with the control files (in a conventional path load) I take around 5 minutes.

Is possible to reach similar timings using datastage. Last I had workerd on version 5.2 in year 2002. Please advice if there are additional mechnism in version 7 (like parallel extender or parallel jobs - I am just guessing -:) ) that will help me to match the timings of SQLLDR.

Thanks in advance.

Nilotpal.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

No.

The sqlldr utility only does bulk loads. This is typically the final step in a DataStage job, which also performs extraction from source, cleansing, transformation, aggregation, and so on.

You're not comparing apples with apples here.

However, parallel extender (PX) does give you potentially huge throughput gains over server, provided that you can throw enough resources at it (multiple CPUs, whether in an SMP environment or an MPP environment, and lots and lots of memory).

You can do a lot of this with multi-instance server jobs, too, but PX handles data partitioning, pipelining and allocation of resources automatically, whereas you have to manage all these things explicitly if using server jobs.

Check out the Ascential website for a press release about the speed of PX.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
nilotpalr
Participant
Posts: 29
Joined: Tue Dec 10, 2002 2:54 am

Post by nilotpalr »

Hi Ray,
Thanks for yoyr prompt response. Please advice if the parrel extender is available as part of standard suite or we need to purchase it seperately.

I have installed Datastage but I am not getting any menu option like -Datastage Parallel Extender.

Please advice.

Thanks.
Nilotpal.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

You need more money. :cry:

It's a separate component for DataStage, not part of server edition. These days Ascential call it DataStage EE (Enterprise Edition). But we here seem to stick with Parallel Extender, especially since the design canvas still carries the word "Parallel".
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply