DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
This topic has been marked "Resolved."
Author Message
Gius
Participant



Joined: 09 Mar 2015
Posts: 23

Points: 321

Post Posted: Thu Nov 01, 2018 11:04 pm Reply with quote    Back to top    

DataStage® Release: 11x
Job Type: Parallel
OS: Windows
Additional info: Send duplicates Error (for a key), to Error
Hello,
I 'm reading a file, where the fields , Person Identifier and the Item Identifier, are keys.
I would like to send to Error (for example an output file), all the duplicates records and just keep one (max load date). But don't know really how to do that , could you help me please ?
Thank You -
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42834
Location: Denver, CO
Points: 220807

Post Posted: Fri Nov 02, 2018 5:32 am Reply with quote    Back to top    

Question Okay, what have you tried and what issues are you facing?

There is the Remove Duplicates stage which can retain one record per "key" checked for duplicates as long as they are sorted properly. Don't know if it supports any kind of a reject link or not, didn't see it during a quick check of the docs. Failing that... perhaps a mechanism to take the deduped result and join it back to the original source to get your "all the duplicates" for your error file?

Let's see what other folks think who may actually have had to do something like this.

_________________
-craig

Your brain is full of spiders, you've got garlic in your soul, Mr. Grinch!
I wouldn't touch you With a thirty-nine-and-a-half foot pole!
Rate this response:  
Not yet rated
rumu
Participant



Joined: 06 Jun 2005
Posts: 267

Points: 2694

Post Posted: Fri Nov 02, 2018 6:41 am Reply with quote    Back to top    

In a similar kind of requirement, I used Sort stage. Sort your data based on

Person Identifier, the Item Identifier and load date(descending). Use Create key change column, this will assign 1 for the first record in the group and for remaining 0. Use transformer stage to output 2 links one for Records with 1 and other for records with 0.The last link will be your error output.

_________________
Rumu
IT Consultant
Rate this response:  
Not yet rated
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42834
Location: Denver, CO
Points: 220807

Post Posted: Fri Nov 02, 2018 7:28 am Reply with quote    Back to top    

That's another approach but may not satisfy the "all duplicates" requirement... which I assume also includes the one to keep. So, if there is only one occurrence of a key, there's no error. However, if there are 12 occurrences, than I'm thinking they want one specific one (max load date) to be kept but all 12 should go to the error file rather than 11. Really depends on what "all the duplicate records" means in the OP. TBH, I may be over-thinking things. Wink

Gius, can you clarify please?

_________________
-craig

Your brain is full of spiders, you've got garlic in your soul, Mr. Grinch!
I wouldn't touch you With a thirty-nine-and-a-half foot pole!
Rate this response:  
Not yet rated
Gius
Participant



Joined: 09 Mar 2015
Posts: 23

Points: 321

Post Posted: Wed Nov 07, 2018 8:38 am Reply with quote    Back to top    

yes thank you, the sort with create key change column worked fine.
Another solution was to do a Loop Condition within the Transform.

The requirement was if 12 records dup, 11 to error and 1 (max Date) to output.
If I had to send the 12 to Error, I think I would do a join after , between the unique key of the 11 Error rec with all records.
Rate this response:  
Not yet rated
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours