Custom built parallel stages

Posted: Thu Oct 25, 2018 12:01 am
by Novak
Hi all,

I have come across a custom-built parallel stage in our environment that does a better job of MD5 hashing than the built-in Checksum stage, because it does not append a trailing pipe character to the input. The trailing pipe is a problem for us because we have existing hashed values (produced by SQL) whose input did not include a trailing pipe character.
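To illustrate why the trailing pipe matters, here is a minimal sketch in Python (not the C++ stage itself). It assumes the fields are joined with `|` and UTF-8 encoded before hashing; the function name `md5_of_fields` and the sample row are made up for the example.

```python
import hashlib

def md5_of_fields(fields, delimiter="|", trailing_delimiter=False):
    """Join the fields with the delimiter and return the MD5 hex digest.

    trailing_delimiter=True mimics an input string that ends with an
    extra pipe; False mimics an input string without one.
    """
    joined = delimiter.join(fields)
    if trailing_delimiter:
        joined += delimiter
    # Assumption: UTF-8 encoding; the real systems may use another encoding.
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

row = ["1001", "Smith", "2018-10-25"]  # made-up sample values

with_pipe = md5_of_fields(row, trailing_delimiter=True)   # hashes "1001|Smith|2018-10-25|"
without_pipe = md5_of_fields(row)                         # hashes "1001|Smith|2018-10-25"

print(with_pipe == without_pipe)  # False
```

A single extra character in the input produces a completely different digest, so hashes built with a trailing pipe can never be matched against hashes built without one.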

After some discussions I found out that the custom-written MD5 stage has not been used from the beginning because it was written in C++ and as such is not an out-of-the-box product. Instead, the projects have used the hashing command within the DB. There was a fear that a custom-written stage may not work in future versions of DataStage.
Whilst it is a fair comment from the platform owners, my argument is that some of these custom-written stages will outlive (and may already have outlived) some of the built-in stages that have been deprecated over the years.
Does anyone know of any documentation, or is anyone happy to provide some sensible arguments, that would support the use of a custom-written hashing stage?
It has been thoroughly tested and is producing the correct results. We might need to adjust it to produce upper-cased hash results (again, to match the current outputs produced by SQL), but otherwise the stage works as expected.
It would greatly simplify design of future ETL jobs.
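As a rough sketch of the upper-casing adjustment mentioned above (again in Python rather than the stage's C++, with hypothetical helper names `md5_upper` and `same_hash`): the hex digest can be upper-cased on output, or the two sides can be compared case-insensitively.

```python
import hashlib

def md5_upper(text):
    """Return the MD5 hex digest of text in upper case."""
    # Assumption: UTF-8 encoding of the input string.
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()

def same_hash(a, b):
    """Compare two hex digests, ignoring letter case and surrounding spaces."""
    return a.strip().lower() == b.strip().lower()

print(md5_upper("abc"))  # 900150983CD24FB0D6963F7D28E17F72
print(same_hash(md5_upper("abc"), "900150983cd24fb0d6963f7d28e17f72"))  # True
```

Upper-casing only changes how the digest is displayed, not the hash value itself, so it is safe to apply on either side of the comparison.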

Cheers,

Novak

Re: Custom built parallel stages

Posted: Thu Oct 25, 2018 2:38 pm
by chulett
Novak wrote: There was a fear that custom written stage may not work in future versions of DataStage.
Huh, no clue where these kinds of fears come from. We've seen fears like this before and they generally are 99.99% unfounded. I'm no expert on IBM and their future plans, but I don't see a future where the underpinnings of the Parallel framework aren't still C/C++, or where custom stages built in that same language would ever stop working or otherwise become unsupported. But maybe that's just me. Curious what Ray / Ernie / Vincent / Steve or whoever else wanders by may think.

Posted: Fri Oct 26, 2018 4:39 am
by qt_ky
That sounds like an unfounded fear, alright! I bet if you asked the fearful people a few questions, you might find out the fear is more that a future developer may need to go through a learning curve in order to maintain and tweak the code for the custom stages.

You already have all the data you need to make the case for continuing the development and use of custom stages, especially since you're already getting correct (and better) results. I don't know of any warm and fuzzy documentation but if you have internal development standards then just add some wording to support custom stages, so that it is supported by your org's official standards documentation.

Posted: Mon Oct 29, 2018 4:51 pm
by ray.wurlod
The build stage executes your own C++ code. There are very few restrictions about what that code can do, other than the usual (expected) ones such as the code needs to be thread-safe. I will agree with the others that the fears are unfounded.

Posted: Thu Nov 08, 2018 12:08 am
by Novak
Hi gents,

Many thanks for your opinions on this one. IBM is not able to move away from its official stance of not providing typical support for custom-written code (understandable), so the explanations from senior experts helped in convincing the powers that be that custom-written C++ routines pose no end-of-life risks.

Cheers,

Novak