Is there a way to restart a job from point of failure?

vbr_03
Participant
Posts: 2
Joined: Wed Apr 18, 2012 4:54 am

Post by vbr_03 »

Hi,

Is there any way to restart a parallel job so that it resumes loading data from the last point of failure?
VIJ
leandrohmvieira
Participant
Posts: 44
Joined: Wed Sep 02, 2015 7:19 am
Location: Brasilia, Brazil

Post by leandrohmvieira »

Sequence jobs do have some checkpoint functionality, which allows a sequence to restart from the last checkpoint.

Parallel jobs and Server jobs do not have any feature like this. Can you provide some details of your problem?
Leandro Vieira

Data Expert - Brasilia, Brazil
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Short answer, no.

You may be able to design jobs with a certain degree of restartability but, in general, the amount of effort required would make it not worthwhile.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Right, restartable jobs are certainly possible. I've always striven for atomic-level job designs ('single units of work') so that they can be restarted with little or no human intervention. I've posted high-level notes here in the past describing the 'framework' we're using now to support that.

Restarting from the point of failure? That's a whole 'nuther kettle of fish, especially if there's any kind of complexity in the job design, and would generally require some kind of... let's say "compromises"... with regard to job speed.
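
To make the speed trade-off concrete, here is a minimal sketch in plain Python, not DataStage. Everything in it (`load_rows`, `commit_every`, `fail_at`, the `state` dict standing in for a persisted restart marker) is hypothetical and chosen only for illustration. Rows are "committed" in batches, and a rerun resumes after the last committed row rather than at the exact row that failed.

```python
def load_rows(rows, target, state, commit_every=2, fail_at=None):
    """Append rows to target in batches, resuming from state['committed'].

    state is a plain dict standing in for a persisted restart marker.
    fail_at simulates a failure just before the row at that index.
    """
    start = state.get("committed", 0)
    pending = []
    for index in range(start, len(rows)):
        if fail_at is not None and index == fail_at:
            # Anything still in `pending` is lost, like an uncommitted batch.
            raise RuntimeError("simulated failure at row %d" % index)
        pending.append(rows[index])
        if len(pending) == commit_every:
            target.extend(pending)          # the "commit"
            state["committed"] = index + 1  # record the restart point
            pending = []
    target.extend(pending)                  # final partial batch
    state["committed"] = len(rows)


rows = ["r0", "r1", "r2", "r3", "r4"]
target, state = [], {}

try:
    load_rows(rows, target, state, commit_every=2, fail_at=3)
except RuntimeError:
    pass

# Snapshot of what survived the failed run.
after_failure = (list(target), dict(state))

# Rerun: picks up after the last committed row, not from the beginning.
load_rows(rows, target, state, commit_every=2)
```

A smaller `commit_every` means less rework after a failure but more commits per run, which is one form the "compromises" can take.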

(Technically, the tool I'm using now has a magical checkbox to enable that functionality, but I've yet to try/play with/trust any such feature.)
-craig

"You can never have too many knives" -- Logan Nine Fingers
Joel in KC
Participant
Posts: 3
Joined: Fri Aug 10, 2018 12:23 pm

Post by Joel in KC »

Please let me know where I can find your framework and "single unit of work" notes, as we are trying to move to this type of usage rather than huge, complex systems that need restarting. I appreciate your time. New to the board. Thanks again.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Both are mentioned here, with some high-level details for the framework. Hope it helps. As noted there, I would really be interested to see if anyone has done anything like that in DataStage; mine is an Informatica implementation, which makes it a tad easier.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Where I need this functionality I, like Craig, create small atomic units of work as DataStage jobs, and make use of the restart capability of sequence jobs to handle that. No point in re-inventing the wheel.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

High-level error handling design is where restartability is identified. Error handling is a part of the definition of the unit of work.

Example:

1. Download file. If that fails, fix problem and rerun.
2. Process file. If there are no intermediate points of failure (such as commits) and the process fails, fix and rerun.
3. Etc.

DataStage permits jobs that do both functions in one parallel job. If your design does that, your next step is to rewrite the job to create the separate units of work.

Job Sequence design covers the how and where.
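
The download/process example above can be sketched as a checkpointed sequence. This is a plain Python illustration of the idea, not DataStage's sequence-job implementation; `run_sequence`, the checkpoint file format, and the two step functions are all hypothetical stand-ins for separate jobs.

```python
import json
import os
import tempfile


def run_sequence(steps, checkpoint_path):
    """Run (name, callable) steps in order, skipping steps already completed."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)
    for name, unit in steps:
        if name in done:
            continue  # finished in an earlier run
        unit()  # an exception here leaves the checkpoint file untouched
        done.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)
    os.remove(checkpoint_path)  # clean finish: the next run starts from scratch


# Hypothetical units of work, each standing in for a separate job.
log = []
attempts = {"process": 0}


def download_file():
    log.append("download")


def process_file():
    attempts["process"] += 1
    if attempts["process"] == 1:
        raise RuntimeError("simulated failure")
    log.append("process")


steps = [("download", download_file), ("process", process_file)]
checkpoint = os.path.join(tempfile.mkdtemp(), "sequence.ckpt")

try:
    run_sequence(steps, checkpoint)
except RuntimeError:
    pass  # "fix problem and rerun"

# The rerun resumes at "process"; the download is not repeated.
run_sequence(steps, checkpoint)
```

Because each unit either completes or is rerun in full, the restart logic lives entirely in the sequence and the units themselves need no knowledge of it.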
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872