Combining two files into a single file based on certain criteria

akash_nitj
Participant
Posts: 27
Joined: Fri Aug 13, 2004 3:36 am
Location: INDIA

Post by akash_nitj »

Hi
I have a case with two input files (A and B) that have the same number of columns and the same datatypes.
A contains 15 records.
B contains 15 records (10 of which are the same as in A).

I want to combine these two files (A and B) into a single file so that the output has 20 records (the 15 from A plus the 5 from B that are not in A).

Any solutions or suggestions?

TIA

Regards
akash
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL

Post by kcbland »

Code:

cat file_a file_b | sort -u > file_c
This compares rows byte for byte, so " x" and "x" are treated as different rows. The result is the set of unique rows across both files, in effect a full outer join on all columns.
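
As a rough check of the one-liner above against the numbers in the question, here is a minimal sketch. The sample rows, the comma delimiter, and the scratch directory are made up for illustration. `LC_ALL=C` is added as an assumption: it asks `sort` to compare raw bytes, which keeps the comparison independent of the session locale.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 15 rows per file, rows 06..15 present in both.
for i in $(seq 1 15); do printf 'row%02d,value\n' "$i"; done > file_a
for i in $(seq 6 20); do printf 'row%02d,value\n' "$i"; done > file_b

# Concatenate both files and keep one copy of each distinct row.
# LC_ALL=C forces plain byte comparison regardless of the locale.
cat file_a file_b | LC_ALL=C sort -u > file_c

# Count the rows in the result (20 for this sample data).
wc -l < file_c
```

If rows that differ only in leading whitespace should count as the same row, they would need to be normalized before the sort; the command above does not do that.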

In a job, you can use a Funnel or Link Collector to bring all rows together, then aggregate with a group-by and use min or max to collapse duplicate rows into one. This also gives you a full outer join.
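
The funnel-then-aggregate idea can be sketched outside DataStage with standard command-line tools. This is only an analogue of the job design, not DataStage itself. It assumes comma-delimited files in which column 1 is the grouping key and column 2 is a numeric value; the file contents are hypothetical.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs: keys k2 and k3 appear in both files.
printf 'k1,10\nk2,20\nk3,30\n' > file_a
printf 'k2,25\nk3,30\nk4,40\n' > file_b

# "Funnel": cat merges both inputs into one stream.
# "Aggregate": awk groups on column 1 and keeps the max of column 2.
cat file_a file_b |
  awk -F, '
    !($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0 }
    END { for (k in max) print k "," max[k] }
  ' | LC_ALL=C sort > file_c

# One row per key: k1,10  k2,25  k3,30  k4,40
cat file_c
```

Unlike `sort -u`, this collapses rows that share a key even when the other columns differ, so which columns go into the group-by decides what counts as a duplicate.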
Kenneth Bland
