
Checksum Stage limitation

Posted: Thu Nov 15, 2018 10:59 pm
by harikhk
Hi,
I am using the concatenated result of 5 columns (of different datatypes) to calculate the checksum value.

The problem is that even though the concatenated strings of the 5 columns differ, the same checksum value is generated.

Is there any restriction on the length of the string (input value) used to generate the checksum key?

Any leads on this will be really helpful.

Posted: Fri Nov 16, 2018 6:02 am
by chulett
No limit that I am aware of. Post some examples of your concatenated strings, or at least let us know how long they are, since that seems to be a worry for you. I'd also be curious how you are building the string and calling the function: even with "long" strings, I don't see how it could generate the same checksum over and over unless you're not calling it properly.

Also, for the home audience, doesn't this stage automatically add pipes between the fields? If you are doing the concatenation manually rather than letting the stage do it, you should be adding the pipes yourself. The "why" of that has been posted here and in the wild a lot, so I'm not going to go into it again. Best to just let the stage do it, though. IMHO.
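
For readers who have not seen the "why" before, here is a minimal sketch in Python rather than DataStage. It assumes an MD5 digest over the joined field values; the helper `md5_of_fields` and the sample rows are hypothetical and do not come from the original poster's job. Without a delimiter, two different rows can collapse to the same string and therefore the same checksum.

```python
import hashlib


def md5_of_fields(fields, delimiter=""):
    """Join the string form of each field and return the MD5 hex digest."""
    joined = delimiter.join(str(f) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Two different rows (hypothetical values) whose fields share the same characters.
row_a = ["ab", "c", 12, 3]
row_b = ["a", "bc", 1, 23]

# Without a delimiter both rows become the string "abc123".
plain_a = md5_of_fields(row_a)
plain_b = md5_of_fields(row_b)

# With a pipe the field boundaries are kept: "ab|c|12|3" vs "a|bc|1|23".
piped_a = md5_of_fields(row_a, "|")
piped_b = md5_of_fields(row_b, "|")

print(plain_a == plain_b)  # True: the checksums collide
print(piped_a == piped_b)  # False: the checksums differ
```

A delimiter only helps if it cannot appear inside the data itself, so the choice of character still needs some thought.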

Posted: Tue Nov 20, 2018 3:45 am
by Novak
Definitely strange if a wrong hash result is produced for a bigger number of columns. We have used about 5 columns in our own testing. We had to revert to a custom-written MD5 hash stage because the built-in Checksum stage appends '|' to the last input column. I wish that was not there, but it is very unlikely to be changed.
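
A small Python sketch of why the trailing '|' matters when the hash has to match one computed elsewhere. The exact buffer the stage builds is not shown in this thread, so the layout below is an assumption based on the description above; the field values and the `md5_hex` helper are hypothetical.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 hex digest of a UTF-8 encoded string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical column values for one row.
fields = ["1001", "Smith", "2018-11-15"]

# Delimiter only between fields, as an external system would likely build it.
without_trailing = "|".join(fields)

# Delimiter after every field, including the last one (assumed stage behaviour).
with_trailing = "".join(f + "|" for f in fields)

print(without_trailing)  # 1001|Smith|2018-11-15
print(with_trailing)     # 1001|Smith|2018-11-15|
print(md5_hex(without_trailing) == md5_hex(with_trailing))  # False
```

The two digests differ even though the field values are the same, so any outside system would have to add the same trailing pipe to reproduce the stage's value.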

The stage has also been updated recently and now allows a space to be used as the delimiter. Not good for us, but just in case somebody wants to use it.
https://www-01.ibm.com/support/docview. ... wg1JR58823