Source file Structure in Columns not rows

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
orginal6
Participant
Posts: 2
Joined: Sat May 25, 2019 7:28 pm

Source file Structure in Columns not rows

Post by orginal6 »

Looking for guidance on how to process a file that is in the format below. I was having trouble using the pivot stage and wanted to see if there were any other ideas.
Account#1 Account #2 Account#3
Account Type Master Checking Savings
Account ID 123 987 654
Beginning Balance 1000 550 450
Indirect Activity 5 25 55
Expenses 12 12 12
Transfers Out 0 0 0
Ending Balance 1005 562 505

Target File
Account ID Ending Balance Expenses
123 1005 12
987 562 12
654 505 12

I know that my source file will always have 3 accounts, that is fixed and won't be dynamic.
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

the first line is inconsistent (it has the column label 3 times, the others only once). Is that correct?

what do you want from it... 3 records? One big record with _1 _2 _3 on the columns?
orginal6
Participant
Posts: 2
Joined: Sat May 25, 2019 7:28 pm

Post by orginal6 »

That's correct the data is stored in the columns.

I reformatted the source file for better reference. I never worked with data in this format. I'm looking for suggestions on the best way to process the data.

Code: Select all

                              Account#1   Account #2    Account#3 
Account Type                  Master      Checking      Savings 
Account ID                    123         987           654 
Beginning Balance             1000        550           450 
Indirect Activity             5           25            55 
Expenses                      12          12            12 
Transfers Out                 0           0             0 
Ending Balance                1005        562           505 
Target File
Account ID| Ending Balance |Expenses
123 |1005 |12
987 |562 |12
654 |505 |12
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

And I reformatted the source file for more better reference. :wink:

The

Code: Select all

 tags preserve the "extra" whitespace that the forum software automatically removes. You need to use them and then edit/preview/edit/preview/lather/rinse/repeat until things actually line up in the final version.
-craig

"You can never have too many knives" -- Logan Nine Fingers
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

it looks to me like you need to add a column to your data for account number, and once you have that, a pivot should work?

If you don't want to do a pivot or can't get it to work, you can parse it any number of ways. Datastage can do it with transformers and funnels and such, but it is clunky ... stage variables, stamping a key to each line until the keys change, and reassemble based off the keys. You would probably have to turn off parallel processing for that bit. Or a routine. Or an external program that you call.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

If there are always these six rows, you can derive the three output columns based on the input row count. You need the first, sixth and fourth rows from each set.

Be careful to partition your data using an artificial key, so that each group of six is preserved on its own single node.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply