Hardware recommendations

Archive of postings to DataStageUsers@Oliver.com. This forum is intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Locked
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Hardware recommendations

Post by admin »

All

I have a customer who is running DataStage 4.02 on NT, connecting to a DB2 database through a 10 Mbps connection and loading to Oracle 8 through a 100 Mbps connection. They want to know what hardware is recommended for a production server to maximize performance. Any recommendations?

Double or quad processors (what speed)
At least a gig of memory (memory will probably have a higher impact than multiple or faster processors)
RAID array to speed read/write access

Any input would be appreciated.

Thanks
Stacy Scoggins
admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

We are running on a (now old) quad Pentium II Xeon server. Initially we had 1.2 GB of RAM, which we have since upgraded to 2.2 GB. We are using RAID arrays and some pretty flash disk controllers.

HOWEVER...

Overnight, the server is running DataStage, Oracle 8 and Cognos Transformer.

During the day, it is starting to cop a fair thrashing from the Cognos PowerPlay users as well.

From the stats I have observed, DataStage uses a very small proportion of the resources on our server. Most of the CPU and memory is being used by Oracle and Transformer.

My thoughts for your situation are...

* Your biggest bottlenecks are likely to be the network connections to DB2 and Oracle.
* Performance will be greatly impacted by job design. For example, if you do a lot of reference lookups to either DB2 or Oracle, this will slow things down. If any information is referenced repeatedly, you might want to consider caching it in hash files on the DataStage server.
* Super-fast disks on the DataStage server probably aren't going to do all that much for you unless you are using a lot of hash files or other local temporary storage.
* I have also found that DataStage does not tend to use a great deal of memory. If you want to preload very large hash files into memory, then obviously more memory helps. But I suspect that 1 GB of memory is likely to be overkill in your situation.
* I also suspect that you would be hard pressed to keep 4 processors busy. Again, however, it depends on the overall architecture of your job suites and how much local processing of data you do (i.e. local passive stages in jobs).
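To illustrate the caching point above, here is a minimal Python sketch, not DataStage code. It assumes a hypothetical `remote_lookup` function standing in for a reference query against DB2 or Oracle, and uses an in-memory dictionary in place of a local hash file. All names and data in it are invented for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


def enrich_rows(
    rows: Iterable[Tuple[Hashable, str]],
    remote_lookup: Callable[[Hashable], str],
) -> Tuple[List[Tuple[Hashable, str, str]], int]:
    """Attach a reference value to each row, querying the remote source once per distinct key."""
    cache: Dict[Hashable, str] = {}  # stands in for a local hash file
    remote_calls = 0
    enriched = []
    for key, payload in rows:
        if key not in cache:
            # Only the first occurrence of a key costs a network round trip.
            cache[key] = remote_lookup(key)
            remote_calls += 1
        enriched.append((key, payload, cache[key]))
    return enriched, remote_calls


# Hypothetical reference data standing in for a remote DB2 or Oracle table.
REFERENCE = {"AU": "Australia", "US": "United States"}


def fake_remote_lookup(key: Hashable) -> str:
    return REFERENCE[key]


rows = [("AU", "r1"), ("US", "r2"), ("AU", "r3"), ("AU", "r4")]
result, calls = enrich_rows(rows, fake_remote_lookup)
print(calls)  # 2 remote lookups for 4 rows
```

The saving grows with the ratio of rows to distinct keys; if nearly every key is unique, the cache buys very little.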

I'm not sure what your budget is like (a few thousand dollars or tens of thousands). If you are trying to minimise costs, you might want to start with a motherboard which will handle dual CPUs and a reasonable amount of memory, but only install one CPU, say a Pentium 800, and a moderate amount of memory. Monitor performance and, if processor or memory appears to be a problem, add some more.

The same goes for disk. Fast-wide SCSI RAID can get very expensive. You might even want to start out with some (potentially throwaway) EIDE disks first, as these will cost next to nothing compared with the faster options.

I think one of the risks you face is spending a lot of money on a powerful server that never uses even a fraction of its capacity.
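A rough back-of-envelope calculation shows why the network links can cap throughput regardless of server power. The Python sketch below assumes the "10mps" and "100 mps" links in the original question mean 10 and 100 megabits per second. The 1000 MB data volume and the `efficiency` parameter are illustrative assumptions, not figures from this thread.

```python
def transfer_seconds(
    data_megabytes: float,
    link_megabits_per_s: float,
    efficiency: float = 1.0,
) -> float:
    """Estimate the seconds needed to move data over a network link.

    efficiency is the assumed fraction of nominal bandwidth actually achieved
    (1.0 is the theoretical best case; real links deliver less).
    """
    if link_megabits_per_s <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = data_megabytes * 8  # 8 bits per byte
    return megabits / (link_megabits_per_s * efficiency)


# Illustrative volume: 1000 MB extracted from the source and loaded to the target.
extract_time = transfer_seconds(1000, 10)   # source link, assumed 10 Mbit/s
load_time = transfer_seconds(1000, 100)     # target link, assumed 100 Mbit/s
print(extract_time, load_time)  # 800.0 80.0
```

Under these assumptions the extract side is ten times slower than the load side, and no amount of extra CPU or memory on the DataStage server shortens either figure.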

Never having run DataStage on a server of its own, my comments are an extrapolation of our situation. I'm sure the folks from Informix could give a better idea of how much grunt is required for a server that only runs DataStage.

Cheers

David

-----Original Message-----
From: Stacy Scoggins [SMTP:sscoggins@performart.net]
Sent: Thursday, 18 January 2001 0:10
To: informix-datastage@oliver.com
Subject: Hardware recommendations


admin
Posts: 8720
Joined: Sun Jan 12, 2003 11:26 pm

Post by admin »

If we (Informix) did, it would be very like David's answer, with some added caveats: if you create memory-hungry jobs (using sort or aggregator stages, hashed files pre-loaded to memory, and write-cache-enabled hashed files), then demand for (virtual) memory will increase. As with any computer tuning, the secret is to do as little work as possible; extract only those rows and columns that you need, and reject and aggregate as early in the job as possible. I concur completely with David's comment on the job suite: if you've got more than one CPU, then run more than one job at a time, perhaps controlled by a job control routine.
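The idea of running more than one job at a time on a multi-CPU server can be sketched in Python as follows. This is not a DataStage job control routine. The `fake_run_job` function and the job names are hypothetical stand-ins; a real version would start the job and wait for it to finish.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_jobs(
    job_names: List[str],
    run_job: Callable[[str], str],
    max_parallel: int,
) -> Dict[str, str]:
    """Run every job through run_job, at most max_parallel at a time; return a status per job."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map yields results in the same order as job_names.
        statuses = list(pool.map(run_job, job_names))
    return dict(zip(job_names, statuses))


def fake_run_job(name: str) -> str:
    # Hypothetical stand-in for launching one job and waiting on it.
    return "finished"


jobs = ["LoadCustomers", "LoadOrders", "LoadProducts"]  # invented job names
statuses = run_jobs(jobs, fake_run_job, max_parallel=os.cpu_count() or 1)
print(statuses)
```

Capping the pool at the CPU count is only a starting assumption; jobs that spend most of their time waiting on the network may tolerate a higher limit.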

-----Original Message-----
From: David Barham [mailto:David.Barham@Anglocoal.com.au]
Sent: Thursday, 18 January 2001 03:55
To: informix-datastage@oliver.com
Subject: RE: Hardware recommendations

