Page 1 of 1

algorith for loading a Dimension table in Data Stage

Posted: Tue Jan 26, 2016 10:46 pm
by sirdavis
ALGORITHM FOR ETL PROCESS ON DIMENSION TABLE.

I creating a standar algorithm for an ETL process on loading a dimension table. I expect you can suggest me or fix my process with your best comments.

STEPS:

1. Validation input file: header, detail and trailer.
2. Linux command for placing the file and grant permissions in the proper allocation and files. Variable environments and parameters.
3. Connect using the proper data source to the input file.ram
4. Sort the file before applying a transformation, if necessary apply a filter.
4.1 Apply the proper transformation process.
5. Load stage table -temporary table- previously it has been truncated and created index must be done.
6. Loading strategy.
6.1 Bulk mode
6.2 Insert or Update -DELTA process-.
7. Generate log file and update the control table processes and go to step nine, otherwise increase the complexity of the process.
7. Apply the join using the joiner component -hashed file- between input file and target table.
7.1 Tune the query if we have to join several input files and one target Dimension table.
7.2. divide and conquer: generate a join per every two input files, and generate a final join between the input files and target table.
8. Loading strategy.
8. Generate the log file and notification for process finished and failed.
9. Create a plan for failing process and rerun option.
10. Trigger the process -sequence- as per the dependency and priority and timely.
11. Your best suggestion

sirdavis :idea:

Posted: Wed Jan 27, 2016 7:43 am
by chulett
Welcome. That's... an interesting list. For grins, can you verify we're talking about a Type 2 dimension here?

ps. Not really an algorithm, more of a process or procedure.

Posted: Wed Jan 27, 2016 4:14 pm
by ray.wurlod
Welcome aboard.

Use a Slowly Changing Dimension stage, which does all of that and, further, handles Type 2 and Type 1 (or some combination thereof).