DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
rohit_mca2003
Participant



Joined: 08 Oct 2008
Posts: 32

Points: 395

Post Posted: Mon Jan 15, 2018 4:37 am Reply with quote    Back to top    

DataStage® Release: 9x
Job Type: Parallel
OS: Unix
Additional info: CHECKSUM stage re-arranges the columns while computing hash
Hi Everyone,

I have issue while computing hash value using 'CHECKSUM' stage. It seems that CHECKSUM stage re-arranges the columns as per their names while computing the hash value.

Example:
----------
Source --> Col1, Col2, Test_Val

Case1: If I generate checksum keeping order as (Col1, Col2, Test_Val)
Case2: If I generate checksum keeping order as (Col1, Test_Val, Col2).

In both cases, I get same checksum value (but order of column I passed was different).

Is there any way I can restrict this stage to re-arrange the columns while computing the hash value.

Thanks.

_________________
Rohit
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42391
Location: Denver, CO
Points: 217726

Post Posted: Mon Jan 15, 2018 8:57 am Reply with quote    Back to top    

Can't answer your question but I suspect the answer is 'no'. But I have to ask - does it really matter? What's important (IMHO) are the columns selected and the fact that they are used / ordered in a consistent manner. As someone who has had to clean up a mess made by developers hashing the same columns in a different order, I would gladly take that aspect of it out of their hands. Wink

_________________
-craig

I googled "Rorschach Test" but all that came up were pictures of my parents fighting
Rate this response:  
Not yet rated
rohit_mca2003
Participant



Joined: 08 Oct 2008
Posts: 32

Points: 395

Post Posted: Mon Jan 15, 2018 10:10 pm Reply with quote    Back to top    

Thanks Craig.
I need to generate CHECKSUM based on the column order as there is requirement to have keys in specific order.
Existing application is running in production where hash was generated based on order of columns and to match with existing hash values we need to follow same order of keys/columns.

_________________
Rohit
Rate this response:  
Not yet rated
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42391
Location: Denver, CO
Points: 217726

Post Posted: Tue Jan 16, 2018 8:15 am Reply with quote    Back to top    

Best to open a support case then, the documentation doesn't seem to show that as an option. Out of curiosity, what 'existing application' was used to generate the checksum and what did it use to generate them? I've had 'issues' trying to match checksums from different systems, hence the question.

_________________
-craig

I googled "Rorschach Test" but all that came up were pictures of my parents fighting
Rate this response:  
Not yet rated
abhiramk
Participant



Joined: 22 Dec 2015
Posts: 1
Location: India
Points: 23

Post Posted: Thu Feb 08, 2018 5:11 am Reply with quote    Back to top    

Before computing the checksum, the Checksum stage arranges the columns in the alphabetical order of column names (irrespective of the order they appear in your input column tab).

I have tried and tested a couple of possibilities and there seem to be no option to rearrange the column order in which checksum is calculated. It always takes the alphabetical order.

To add more, the checksum stage adds a pipe symbol at the end of each column and calculates HASH using MD5 algorithm.

For e.g, if you are using two columns COL1 and COL2 to compute HASH/Checksum, COL1 holds ALICE and COL2 holds BOB, checksum is computed on ALICE|BOB|

This "ALICE|BOB|" string doesn't change even if COL2 is your first column and COL1 is your second column. The columns are first arranged as COL1,COL2 and a pipe operator is inserted at the end of each column and checksum is computed using MD5 hash algorithm.

So, even if there is an option to rearrange the columns (which obviously is a speculation), the hash computed by your application will not match with the Datastage computed hash unless your application too adds a pipe symbol at the end of each column.

_________________
Cheers
Abhiram
Rate this response:  
Not yet rated
UCDI



Group memberships:
Premium Members

Joined: 21 Mar 2016
Posts: 289

Points: 2862

Post Posted: Mon Feb 12, 2018 10:05 am Reply with quote    Back to top    

can you DIY?

if you explicitly concat your data into a single column the checksum will be in the order you specify with any delimiters you specify (or not).

You can also toss the internal checksum and use an external one that is shared across your platforms, instead of trying to replicate datastage's checksum externally.
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours