Dynamically export files based on Config table
Preface:
I am new to Datastage. I have been doing ETL for 10ish years with SQL tools but I'm struggling a bit conforming to the ways of Datastage. I'm at the bottom of the learning curve. I've done the Datastage Essentials course and just submitted payment for premium status. I've authored roughly 20 jobs thus far - most are pretty simple ETL.
What I'm trying to do:
The job requirement is simple in concept: export one file for every customer in a list.
What I have done:
(PJob1)
I have the parallel job setup to accept the following parameters:
FileName
CustAccountCode
This job works as I expect it to.
(C_Tbl)
I've created a config table with 3 columns
RecId | FileName | CustAccountCode
I want to loop through C_Tbl and pass each record as two parameters into PJob1, which creates one file for every record. I can't figure out how to build the Sequence job to do this.
I've read quite a few posts with similar questions which are posted from more senior level persons who have a better concept of the intricacies of Datastage. I'm struggling to fill in the blanks, so to speak.
What's the best way to accomplish the above?
Welcome to DataStage, where you often have 2 or 3 ways to accomplish anything!
In this case you might want to try using a Server job to read records from the table, then a Transformer stage to call the DSUtilityRunJob() function for each record. Search on DSUtilityRunJob for more info.
Choose a job you love, and you will never have to work a day in your life. - Confucius
True dat!
Another approach would be a looping Sequence job. Use a User Variables Activity stage to read the config file and pass each entry in it to the Start Loop stage. There's an old post of mine that might help with that, I'll see if I can dig it up later.
The server job approach would be easier, I do believe.
-craig
"You can never have too many knives" -- Logan Nine Fingers
qt_ky wrote: you might want to try using a Server job to read records from the table, then a Transformer stage to call the DSUtilityRunJob() function for each record.
I've read about this technique - but I'm struggling to fill in the blanks.
So, right now I have a Server job. It reads my SQL Server config table through an ODBC stage, which is handed to the Transformer.
Questions and me thinking out loud:
- The Transformer is calling my other job through UtilityRunJob. It doesn't look like I can pass multiple parameters, so I will have to pass one parameter, delimited. I can handle that in my SQL call. OK. So how do I read the parameter within my job to split the one parameter back into two, FileName and CustAccountCode?
If the Transformer is calling my other job, what stage is needed after the Transformer stage to complete my Server Job?
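The splitting itself is simple string work. A minimal sketch of the logic in Python (illustrative only, not DataStage code - in a Server job the equivalent would be the BASIC Field() function, e.g. Field(Param, "|", 1) and Field(Param, "|", 2)):

```python
# Hypothetical sketch of splitting one pipe-delimited parameter value
# back into its two parts: file name and customer account code.
def split_param(param):
    # partition("|") splits on the first "|" only, so a path containing
    # no pipes stays intact on the left side.
    file_name, _, cust_account_code = param.partition("|")
    return file_name, cust_account_code

file_name, cust_code = split_param(r"D:\out\file.XLST|290623")
# file_name -> 'D:\\out\\file.XLST', cust_code -> '290623'
```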
Where will your two parameters be leveraged in your Parallel job?
As for the Server job, I'd just end it with a Sequential File stage where you write out pretty much anything, say like the parameter values used. You could even write it to "/dev/null" (or whatever the Windows equivalent is) so it doesn't clutter anything up.
-craig
"You can never have too many knives" -- Logan Nine Fingers
I have a loop that uses a list file with parameters, an index (increasing integer, controls the number of loops) and a file name. Each instance of the loop calls a parallel job to FTP get the named file.
First read of list gets maximum index (number of loops). During the loop, the instance of the loop reads the indexed file name, passes it to the FTP job as a parameter.
No server jobs or routines are involved. If your "table" was a list file, it would easily plug into my job design. You can read the list in other ways, ending the loop when you identify the last item in the list.
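A minimal Python sketch of that loop design, assuming the list is a plain text file with one entry per line (run_job stands in for the Job Activity that calls the parallel job - both names here are illustrative, not DataStage API):

```python
def read_entries(list_path):
    # First read of the list: the entry count is the number of loops.
    with open(list_path) as fh:
        return [line.strip() for line in fh if line.strip()]

def run_all(entries, run_job):
    max_index = len(entries)           # loop limit from the first read
    for i in range(1, max_index + 1):  # Start Loop .. End Loop counter
        run_job(entries[i - 1])        # indexed entry -> job parameter
```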
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson
Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
If it helps - below is the SQL I have in the Server job. One column, PRAM, is being passed to the Transformer.
Code: Select all
SELECT 'D:\External\Outbound\EXTRACT_002\' + CUSTOMER_NAME + '\' + CYCLE + '\DH_'+CUSTOMER_NAME+'_'+CYCLE +'_ORD_TRACKING_'+ CONVERT(VARCHAR(10), GETDATE()-1, 112) + '.XLST|'+CUST_CODES PRAM
FROM EXP_002_CONFIG
Result:
D:\External\Outbound\EXTRACT_002\AAFES_STORES\DAILY\DH_AAFES_STORES_DAILY_ORD_TRACKING_20171115.XLST|290623
D:\External\Outbound\EXTRACT_002\AAFES_CONUS\DAILY\DH_AAFES_CONUS_DAILY_ORD_TRACKING_20171115.XLST|290624
My delimiter is a bar (|). I don't know how or where to parse that into two variables. I don't think it can be done after it is sent to the job.
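The T-SQL expression can be sanity-checked outside the database. A Python sketch of the same path construction, assuming CONVERT(VARCHAR(10), GETDATE()-1, 112) yields yesterday's date as YYYYMMDD (the function and parameter names here are illustrative):

```python
from datetime import date, timedelta

def build_parm(customer_name, cycle, cust_codes, today):
    # Mirrors the T-SQL: GETDATE()-1 with style 112 is yesterday as YYYYMMDD.
    ymd = (today - timedelta(days=1)).strftime("%Y%m%d")
    return (rf"D:\External\Outbound\EXTRACT_002\{customer_name}\{cycle}"
            rf"\DH_{customer_name}_{cycle}_ORD_TRACKING_{ymd}.XLST|{cust_codes}")

print(build_parm("AAFES_STORES", "DAILY", "290623", date(2017, 11, 16)))
# D:\External\Outbound\EXTRACT_002\AAFES_STORES\DAILY\DH_AAFES_STORES_DAILY_ORD_TRACKING_20171115.XLST|290623
```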
Well, I thought it used to be called DSUtilityRunJob() but I did some poking around and only found one called UtilityRunJob() which is actually what my old test job has in it. I wonder if it got renamed at some point over the years.
Anyhow, if you are in Designer and expand Routines in your Repository view, under sdk, Utility, UtilityRunJob is there. Double click it and look at the Arguments tab. In the description it gives the format for passing multiple parameters.
It might be of interest as well that you can go to the Code tab and just walk through each line of code in the routine to see how it works.
Choose a job you love, and you will never have to work a day in your life. - Confucius
I don't think so.
Using the server method - I have:
The Transformer calls my Job using:
UtilityRunJob('Export_Job', CONFIG.PRAM, 0, 0)
In the job that the Transformer calls, how do I configure my parameters to receive the value sent from the Transformer? I suspect there is some expression that will read the parameter and split it on the delimiter - but I don't see where or how that should work.
Sweet, Got it.
I didn't read the requirements for UtilityRunJob() closely enough.
I was passing it the value of my variable. The Arguments clearly state that I needed to pass it as
ParamName1=Value1|ParamName2=Value2
My SQL has been modified to:
Code: Select all
SELECT 'FILENAME=D:\External\Outbound\EXTRACT_002\' + CUSTOMER_NAME + '\' + CYCLE + '\DH_'+CUSTOMER_NAME+'_'+CYCLE +'_ORD_TRACKING_'+ CONVERT(VARCHAR(10), GETDATE()-1, 112) + '.XLST|VAR_ACCOUNTS=' + CUST_CODES PARM
FROM EXP_002_CONFIG
Which gives the results of:
FILENAME=D:\External\Outbound\EXTRACT_002\AAFES_STORES\DAILY\DH_AAFES_STORES_DAILY_ORD_TRACKING_20171115.XLST|VAR_ACCOUNTS=290623
FILENAME=D:\External\Outbound\EXTRACT_002\AAFES_CONUS\DAILY\DH_AAFES_CONUS_DAILY_ORD_TRACKING_20171115.XLST|VAR_ACCOUNTS=290624
So two jobs - a Server job and a Parallel job. SWEET.
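For reference, the Name1=Value1|Name2=Value2 string breaks apart cleanly into a parameter mapping. A Python sketch of that parsing (illustrative only - presumably what UtilityRunJob does internally with its Params argument):

```python
def parse_params(params):
    # Break 'Name1=Value1|Name2=Value2' into a name -> value mapping,
    # the format described on the routine's Arguments tab.
    result = {}
    for item in params.split("|"):
        if item:
            name, _, value = item.partition("=")
            result[name] = value
    return result

parse_params("FILENAME=D:\\out\\f.XLST|VAR_ACCOUNTS=290623")
# {'FILENAME': 'D:\\out\\f.XLST', 'VAR_ACCOUNTS': '290623'}
```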