ALGORITHM FOR ETL PROCESS ON DIMENSION TABLE.
I creating a standar algorithm for an ETL process on loading a dimension table. I expect you can suggest me or fix my process with your best comments.
STEPS:
1. Validation input file: header, detail and trailer.
2. Linux command for placing the file and grant permissions in the proper allocation and files. Variable environments and parameters.
3. Connect using the proper data source to the input file.ram
4. Sort the file before applying a transformation, if necessary apply a filter.
4.1 Apply the proper transformation process.
5. Load stage table -temporary table- previously it has been truncated and created index must be done.
6. Loading strategy.
6.1 Bulk mode
6.2 Insert or Update -DELTA process-.
7. Generate log file and update the control table processes and go to step nine, otherwise increase the complexity of the process.
7. Apply the join using the joiner component -hashed file- between input file and target table.
7.1 Tune the query if we have to join several input files and one target Dimension table.
7.2. divide and conquer: generate a join per every two input files, and generate a final join between the input files and target table.
8. Loading strategy.
8. Generate the log file and notification for process finished and failed.
9. Create a plan for failing process and rerun option.
10. Trigger the process -sequence- as per the dependency and priority and timely.
11. Your best suggestion
sirdavis
algorith for loading a Dimension table in Data Stage
Moderators: chulett, rschirm, roy
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact: