selecting one record out of 2 records

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

vamsi.4a6
Participant
Posts: 334
Joined: Sun Jan 22, 2012 7:06 am
Contact:

selecting one record out of 2 records

Post by vamsi.4a6 »

I have data as mentioned below.source is sequential file

source,destination,distance
chennai,hyderabad,500
hyderabad,chennai,500
chennai,bangalore,600
bangalore,chennai,600
chennai,bangalore,600

I have to select 1 record out of 2 records.ay record is fine.
chennai,hyderabad,500
hyderabad,chennai,500


I need algorithim how to proceed and then i will think about logic
Thanks and Regards
Vamsi krishna.v
http://datastage-vamsi.blogspot.in/
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

YOU need to provide the algorithm about which record to preserve. Then use a Remove Duplicates stage on partitioned, sorted data.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vamsi.4a6
Participant
Posts: 334
Joined: Sun Jan 22, 2012 7:06 am
Contact:

Post by vamsi.4a6 »

I need to select one record out of 2 records.any record is fine.similary for rest of the records also.Nit sure how to proceed and anybody help on this

chennai,hyderabad,500
hyderabad,chennai,500
Thanks and Regards
Vamsi krishna.v
http://datastage-vamsi.blogspot.in/
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

:!: Simply repeating your post doesn't help any of us understand it better. Your example doesn't seem to match up to your words. "1 out of 2 records" means what exactly - every other record? Every even? Every odd? Something else entirely? All your example shows are the first two records being 'selected'.

So, until you can properly explain what it is you need, no-one will be able to help you without guessing. Never mind the fact that once you do, the answer may become clear to you as well! :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Create a VarChar column derived as Start:End and another derived as End:Start. Use these to effect your comparisons and duplicate removal.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

See here.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Yes, I thought I'd answered this question already.

Mirabile dictu, I came up with the same answer!
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
vamsi_4a6
Participant
Posts: 95
Joined: Wed Jun 04, 2014 12:06 am

Post by vamsi_4a6 »

I have concatenated as mentioned below

Start,End,Start_End,End_Start,Distance
Bangalore,Mumbai,BangaloreMumbai,MumbaiBangalore,1500
Hyderabad,Delhi,HyderabadDelhi,DelhiHyderabad,2000
Delhi,Hyderabad,DelhiHyderabad,HyderabadDelhi,2000
Mumbai,Bangalore,MumbaiBangalore,BangaloreMumbai,1500

How to use above Start_End,End_Start columns for comparisions?
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

Come on, mate. At least try to work it out on your own...
vamsi_4a6
Participant
Posts: 95
Joined: Wed Jun 04, 2014 12:06 am

Post by vamsi_4a6 »

I stuckup with how to compare two records within a group ans selecting one record out of it.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Are you the same user as vamsi.4a6, who has over 330 posts?

Anyway, and irrespective of that, think about the present problem this way: what is the purpose of constructing Start:End and End:Start columns?

The answer to that is that, if they are the same, then the records belong in the same group.

From that you can use whatever technique you like, for example the LastRowInGroup() function or "remembering" stage variables, to identify a single record from that group.

And I guess you'll need to get yourself a premium membership.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
stuartjvnorton
Participant
Posts: 527
Joined: Thu Apr 19, 2007 1:25 am
Location: Melbourne

Post by stuartjvnorton »

vamsi_4a6 wrote:I stuckup with how to compare two records within a group ans selecting one record out of it.
Come on, man. You've got a blog.
Seriously, order the names alphabetically in a transformer and then remove duplicates.

eg:

Here There 666
There Here 666

Becomes

Here There 666
Here There 666

Now pick one.


I want my 2 minutes back.
Sheesh.
vamsi_4a6
Participant
Posts: 95
Joined: Wed Jun 04, 2014 12:06 am

Post by vamsi_4a6 »

Anybody help on how to proceed?
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Honestly? Reread all of the previous replies. Try something. Then come back and ask specific questions based on the results, if you still have any. Questions, that is... not results. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
naveenkumar.ssn
Participant
Posts: 36
Joined: Thu Dec 03, 2009 9:11 pm
Location: Malaysia

Post by naveenkumar.ssn »

vamsi_4a6 wrote:Anybody help on how to proceed?
Hi,

Step1: First sort the rows where you wanted to check with
Example : a b 100
b a 200

After you sort the above records it would be like the below:
a b 100
a b 200

Use the remove duplicate stage to get either the first record or the second records.(Keep those 2 column as key column)

I hope you understood !!!

Reply back if not.

Regards
Naveen
Naveen Kumar
Datastage Consultant
Post Reply