Page 1 of 1
Source file Structure in Columns not rows
Posted: Mon May 27, 2019 9:16 pm
by orginal6
Looking for guidance on how to process a file that is in the format below. I was having trouble using the pivot stage and wanted to see if there were any other ideas.
Account#1 Account #2 Account#3
Account Type Master Checking Savings
Account ID 123 987 654
Beginning Balance 1000 550 450
Indirect Activity 5 25 55
Expenses 12 12 12
Transfers Out 0 0 0
Ending Balance 1005 562 505
Target File
Account ID Ending Balance Expenses
123 1005 12
987 562 12
654 505 12
I know that my source file will always have 3 accounts, that is fixed and won't be dynamic.
Posted: Tue May 28, 2019 10:37 pm
by UCDI
the first line is inconsistent (it has the column label 3 times, the others only once). Is that correct?
what do you want from it... 3 records? One big record with _1 _2 _3 on the columns?
Posted: Wed May 29, 2019 1:35 am
by orginal6
That's correct the data is stored in the columns.
I reformatted the source file for better reference. I never worked with data in this format. I'm looking for suggestions on the best way to process the data.
Code: Select all
Account#1 Account #2 Account#3
Account Type Master Checking Savings
Account ID 123 987 654
Beginning Balance 1000 550 450
Indirect Activity 5 25 55
Expenses 12 12 12
Transfers Out 0 0 0
Ending Balance 1005 562 505
Target File
Account ID| Ending Balance |Expenses
123 |1005 |12
987 |562 |12
654 |505 |12
Posted: Wed May 29, 2019 2:11 am
by chulett
And I reformatted the source file for more better reference.
The
Code: Select all
tags preserve the "extra" whitespace that the forum software automatically removes. You need to use them and then edit/preview/edit/preview/lather/rinse/repeat until things actually line up in the final version.
Posted: Wed May 29, 2019 9:18 pm
by UCDI
it looks to me like you need to add a column to your data for account number, and once you have that, a pivot should work?
If you don't want to do a pivot or can't get it to work, you can parse it any number of ways. Datastage can do it with transformers and funnels and such, but it is clunky ... stage variables, stamping a key to each line until the keys change, and reassemble based off the keys. You would probably have to turn off parallel processing for that bit. Or a routine. Or an external program that you call.
Posted: Thu Jun 13, 2019 1:47 am
by ray.wurlod
If there are always these six rows, you can derive the three output columns based on the input row count. You need the first, sixth and fourth rows from each set.
Be careful to partition your data using an artificial key, so that each group of six is preserved on its own single node.