Job datasets

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

PeteM
Premium Member
Posts: 19
Joined: Thu Dec 15, 2011 8:50 am
Location: uk

Job datasets

Post by PeteM »

My understanding is that if a job is run twice from the same project at the same time, DataStage ensures that the datasets created by both instances of the job are unique to each instance and that no overlap can occur. Is that correct?
Thanks
PeteM
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

You should apply the invocation ID to the dataset name. If the names and folders are exact matches, it will have problems, and having the invocation ID in the name also makes debugging easier.
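A minimal sketch of the idea, with hypothetical paths and invocation IDs (in a real job these would come from job parameters such as #DSJobInvocationId#, shown here as plain shell variables): building the dataset name from the invocation ID means two concurrent runs can never write to the same file.

```shell
# Hypothetical directory and invocation IDs for two concurrent test streams.
DATASET_DIR=/data/staging
INVOCATION_ID_A=TEST_DB1
INVOCATION_ID_B=TEST_DB2

# Each instance gets its own dataset name, so concurrent runs cannot overlap.
DATASET_A="${DATASET_DIR}/customers_${INVOCATION_ID_A}.ds"
DATASET_B="${DATASET_DIR}/customers_${INVOCATION_ID_B}.ds"
echo "$DATASET_A"
echo "$DATASET_B"
```

The same pattern works for any staging file or scratch directory the job creates, not just datasets.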
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

... in other words, no it does not. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
PeteM
Premium Member
Posts: 19
Joined: Thu Dec 15, 2011 8:50 am
Location: uk

Post by PeteM »

We have a requirement whereby a given job has to be run multiple times to support separate test environments (against different databases), with each instance of the job processing different test data depending on which database it is pointing at.

All instances of the job will run on the same DataStage engine.

Therefore, there is a chance that two instances of the job could be running at the same time.

Based on your responses am I correct to assume that this requirement cannot be met?
Thanks
PeteM
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

This is exactly what Multi-Job Instancing was designed for: one job, run concurrently multiple times, but with different job parameter values, each for its own independent needs/environment. If those jobs also need to create their own staging areas, temp files, etc., use the #DSJobInvocationId# job parameter within the name, as described by UCDI above, and provide a unique invocation ID when you start the job instance (in the Designer it is a property in the run dialog, but it can also be provided on the command line).
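As a sketch of what the command-line route looks like (project, job, and parameter names below are hypothetical, and the `dsjob` syntax is from memory, so check the docs for your version): the invocation ID is appended to the job name as `job.invocationid`, and the same ID can be passed into whatever parameter drives your dataset paths.

```shell
# Hypothetical project/job/parameter names; verify dsjob options for your release.
PROJECT=TESTPROJ
JOB=LoadCustomers
INVOCATION_ID=STREAM_A

# The invocation ID rides on the job name as JOB.INVOCATIONID, and the same ID
# is reused in a path parameter so each instance's datasets stay unique.
CMD="dsjob -run -param DatasetSuffix=${INVOCATION_ID} ${PROJECT} ${JOB}.${INVOCATION_ID}"
echo "$CMD"
```

Running the same command with a different `INVOCATION_ID` starts a second, fully independent instance of the same job.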

Ernie
Ernie Ostic

blogit! Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
PeteM
Premium Member
Posts: 19
Joined: Thu Dec 15, 2011 8:50 am
Location: uk

Post by PeteM »

In production only one instance of the job would ever run, which is why we have not made it a multiple-instance job. This is a requirement to support multiple test streams.

Therefore, is the consensus that multiple instances of a job cannot be run on one DataStage engine without making the job multiple-instance?
Thanks
PeteM
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

You can run a multi-instance job. Each instance must write to a UNIQUE dataset name.

Apply a parameter into the path or filename to make it UNIQUE.

The same job can be loaded into DEV, TEST or PROD later on, since it would still work with a unique supplied parameter for that job instance.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

PeteM wrote:Therefore, is the consensus that multiple instances of a job cannot be run on one DataStage engine without making the job multiple-instance?
Yes... that's the whole point of the Multi-Instance option. That doesn't mean you need to run it that way in Production; you can still run it without an InvocationID and it will behave just as if it weren't multi-instance enabled.
-craig

"You can never have too many knives" -- Logan Nine Fingers
R.K.Glover
Participant
Posts: 8
Joined: Mon Mar 11, 2013 2:51 pm
Location: RTP, North Carolina

Post by R.K.Glover »

PeteM wrote:In production only one instance of the job would ever run, which is why we have not made it a multiple-instance job. This is a requirement to support multiple test streams.

Therefore, is the consensus that multiple instances of a job cannot be run on one DataStage engine without making the job multiple-instance?
I suppose it really depends on whether you need to have multiple copies of the job running at the same time.

If you do, then you MUST make the job Allow Multiple Instances.

If you don't, then you don't have to - but for debugging purposes, gosh, wouldn't it be nice if each of your test streams had its own log, instead of one big log where you had to look at the passed parameters of each run to figure out which one it was?

As far as input/output files, datasets, etc. go... like others have said, take one of your unique parameters and tack it into your path or filename for input/output. As long as there is a value for that parameter that matches production, you should be fine.