configuration of nodes in default.apt

tine_bi
Participant
Posts: 18
Joined: Mon Nov 24, 2014 6:02 am

Post by tine_bi »

Hi

We have an environment where, in addition to the C: drive, we have access to D:, E: and F:.

How should we best configure our .apt configuration file?

Currently I am testing:

{
    node "node1"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "D:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "F:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "F:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
    node "node3"
    {
        fastname "dsprdsrv1"
        pools "" "sort"
        resource disk "E:/IBM_NODE_CONFIG/Datasets" {pools "" "sort"}
        resource scratchdisk "E:/IBM_NODE_CONFIG/Scratch" {pools "" "sort"}
    }
}

But I am not sure whether this is a good way to do it.
Also, what would be the ideal size of each drive?
The server runs in a Hyper-V environment and is allocated 12 CPUs and 24 GB of memory.

Please advise.

BR
Dan
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

My advice is to put your scratch disk on a local drive if possible. Network drives will work, but they slow things down.

As for size, it really depends on the quantity of data, the quantity and nature of the sorts in a job, and the number of concurrent jobs.

We can't answer those for you.


I also like my parallelism in even numbers.

==============

I would leave your default.apt as a two node configuration, and then create other apt files with various degrees of parallelism based upon your data/job needs.
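
A minimal sketch of what such a two-node default.apt might look like, reusing the host name and the D: and E: paths from the original post. Whether those drives are local disks is an assumption, so adjust the paths to your own layout:

```
/* Sketch only: two-node default.apt.                          */
/* Host name and paths are taken from the original post;       */
/* D: and E: being local drives is an assumption.              */
{
    node "node1"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "D:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "E:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "E:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
}
```

Both nodes share the same fastname because everything runs on one server; the node count only controls the degree of parallelism.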

I like to create project_dev.apt, project_tst.apt and project_prd.apt and set them as the default in your various projects. It helps in the future when projects have different default resource requirements. Project #2 might come along with a different scratch disk because its owner purchased it and, politically, can't interfere with project #1 running on the same box... etc...

project_tst_4Nodes.apt, project_tst_6Nodes.apt, etc...
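
As a rough sketch, a four-node variant such as project_tst_4Nodes.apt could spread four nodes over the three drives from the original post. Listing several scratchdisk entries per node and rotating their order is a common pattern rather than something stated in this thread, and the drive assignment below is purely illustrative:

```
/* Sketch only: four logical nodes on one host, three drives.  */
/* Paths come from the original post; the rotation of the      */
/* scratchdisk order per node is an illustrative assumption.   */
{
    node "node1"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "D:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "E:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "F:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "E:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "E:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "F:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
    node "node3"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "F:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "F:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "E:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
    node "node4"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "D:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "E:/IBM_NODE_CONFIG/Scratch" {pools ""}
        resource scratchdisk "F:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
}
```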

You always want a variety of config files to suit your needs. If you were on a grid, you would handle that variety via parameters. But you are not, so build them up ahead of time. Having a 1 node apt file is good for jobs that would simply submit a stored procedure on a database, for instance. No data travels through DataStage, so there is no need for more than 1 thread.
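
For the single-node case, a sketch might look like the following. The file name project_prd_1Node.apt is hypothetical and simply follows the naming pattern above; the host name and paths are reused from the original post. A job would typically pick it up through the APT_CONFIG_FILE environment variable:

```
/* Sketch only: one-node file, e.g. project_prd_1Node.apt      */
/* (hypothetical name). Host name and paths are taken from     */
/* the original post.                                          */
{
    node "node1"
    {
        fastname "dsprdsrv1"
        pools ""
        resource disk "D:/IBM_NODE_CONFIG/Datasets" {pools ""}
        resource scratchdisk "D:/IBM_NODE_CONFIG/Scratch" {pools ""}
    }
}
```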