
Job datasets

Posted: Wed Aug 09, 2017 3:16 am
by PeteM
My understanding is that if a job is run twice from the same project at the same time, DataStage ensures that the datasets created by both instances of the job are unique to each instance and no overlap can occur. Is that correct?

Posted: Wed Aug 09, 2017 10:36 am
by UCDI
You should apply the invocation ID to the dataset name. If the names and folders are exact matches between instances, it will have problems, and having the invocation ID in the name also makes debugging easier.
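As an illustration of that naming rule, a Data Set stage's file path can embed the invocation ID via the built-in #DSJobInvocationId# job parameter; the folder and file names below are purely illustrative:

```
/staging/#DSJobInvocationId#/customers.ds
```

or embedded in the file name itself:

```
/staging/customers_#DSJobInvocationId#.ds
```

Either way, two concurrent instances started with different invocation IDs resolve to different paths, so their datasets cannot collide.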

Posted: Wed Aug 09, 2017 11:12 am
by chulett
... in other words, no, it does not. :wink:

Posted: Thu Aug 10, 2017 6:43 am
by PeteM
We have a requirement whereby a given job has to be run multiple times to support separate test environments (against different databases), with each instance of the job processing different test data depending on the database it is pointing at.

All instances of the job will run on the same DataStage engine.

Therefore, there is a chance that two instances of the job could be running at the same time.

Based on your responses, am I correct to assume that this requirement cannot be met?

Posted: Thu Aug 10, 2017 6:47 am
by eostic
This is exactly what Multi-Job Instancing was designed for. One Job...run concurrently multiple times, but with different Job Parameter values...each for its independent needs/environment, etc. ....."if" those Jobs also need to create their own staging areas or temp files, etc., use the #DSJobInvocationId# Job Parameter within your name, as described by UCDI above, and provide a unique Invocation ID when you start the Job instance (in the Designer it is a property in the run dialog, but it can also be provided on the command line).

Ernie
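To sketch the command-line route Ernie mentions: the `dsjob` client accepts an invocation ID appended to the job name after a dot, and `-param` supplies per-instance parameter values. The project, job, and parameter names below are illustrative, not from the original thread:

```shell
# Start two concurrent instances of the same job, each with its own
# invocation ID (after the dot) and its own database parameter:
dsjob -run -param DBName=TESTDB1 MyProject MyJob.TEST1
dsjob -run -param DBName=TESTDB2 MyProject MyJob.TEST2
```

Each instance then sees its own DSJobInvocationId value (TEST1 or TEST2), which is what makes the parameterized dataset names unique.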

Posted: Thu Aug 10, 2017 9:33 am
by PeteM
In production only one instance of the job would ever run, which is why we have not made it a multiple instance job. This is a requirement to support multiple test streams.

Therefore, is the consensus of opinion that multiple instances of a job cannot be run on one DataStage engine without making the job multi-instance?

Posted: Thu Aug 10, 2017 10:43 am
by PaulVL
You can have a multi-instance job run concurrently. Each instance must write to a UNIQUE dataset name.

Apply a parameter to the path or filename to make it UNIQUE.

The same job can be promoted to DEV, TEST, or PROD later on, since it would still work with a unique parameter supplied for that job instance.
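The uniqueness rule above can be sketched outside DataStage as a small function that folds a per-instance parameter into the dataset path. This is an illustrative model of the convention, not a DataStage API; the names and layout are assumptions:

```python
def dataset_path(base_dir: str, dataset_name: str, invocation_id: str) -> str:
    """Build a dataset path made unique by the instance's invocation ID.

    Mirrors embedding #DSJobInvocationId# in the file name: two concurrent
    instances with different IDs can never resolve to the same file.
    """
    if not invocation_id:
        # An empty ID would make all instances share one path, which is
        # exactly the collision the thread warns about.
        raise ValueError("invocation_id must be non-empty to guarantee uniqueness")
    return f"{base_dir}/{dataset_name}_{invocation_id}.ds"

# Two concurrent instances of the same job get distinct files:
path_a = dataset_path("/staging", "customers", "TEST1")
path_b = dataset_path("/staging", "customers", "TEST2")
assert path_a != path_b
```

In production a single "default" invocation ID (or a fixed parameter value) keeps the same job working unchanged, as PaulVL notes.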

Posted: Thu Aug 10, 2017 11:58 am
by chulett
PeteM wrote:Therefore, is the consensus of opinion that multiple instances of a job cannot be run on one DataStage engine without making the job multi-instance?
Yes... that's the whole point of the Multi-Instance option. Now, that doesn't mean you need to run it that way in Production; you can still run it without an InvocationID and it would run just as if it weren't multi-instance enabled.

Posted: Thu Aug 31, 2017 9:31 am
by R.K.Glover
PeteM wrote:In production only one instance of the job would ever run, which is why we have not made it a multiple instance job. This is a requirement to support multiple test streams.

Therefore, is the consensus of opinion that multiple instances of a job cannot be run on one DataStage engine without making the job multi-instance?
I suppose it really depends on whether you need to have multiple copies of the job running at the same time.

If you do, then you MUST enable Allow Multiple Instances on the job.

If you don't, then you don't have to - but for debugging purposes, gosh, wouldn't it be nice if each of your test streams had its own log, instead of one big log where you had to look at the passed parameters of each run to figure out which one it was?

As far as input/output files, Data Sets, etc. go... like others have said, take one of your unique parameters and tack it onto your path or filename for input/output. As long as there is a value for that parameter in production as well, you should be fine.