Level of ELT DataStage can do..
Posted: Fri Feb 17, 2006 7:00 am
Hi Guys
In this discussion I dont think the Release, Platform and OS makes a difference. This is more of a methodology issue.
Sunopsis are going to market (you'll see Sunopsis as visionaries in the Gartner ETL quadrant) with an ELT message. In other words, (E)xtract from the sources and (L)oad to the target or staging database and then (T)ransform the data. The great part of the tool is that it will compile all the native code for the transformation to occur on the target database.
Then there's the news in the industry that ETL as a buzzword and methodolgoy is dying. Mainly because the RDBMS systems we extract and load to have improved their own transform abilities - i.e. pivoting data, grouping by cube data etc..
My questions for comment to the forum are;
* Is ETL dying?
* What are the pitfalls for an ELT solution?
* Can DataStage do ELT?
* ETL = row based processing, ELT = set/batch based processing?
I have given it some thought, and here is what I could come up with;
Is ETL dying?
No. My response is based mostly on the limitations/pitfalls I feel are in the ELT space.
What are the pitfalls for an ELT solution?
Firstly, how busy is the source/target system that you need to do the transforms on? If the system is overloaded as it is, it will require a purchase of new hardware.
Secondly, debugging of transforms on data is going to be an issue without extra data stored in a test environment. I guess they (ELT) know and trust each client has a test environment which they will piggyback on. In per row processing (ETL) the data is on the engine and can be reprocessed as often as the developer wants.
Thirdly, can the current RDBMS system handle the complexity of the transforms developed? If not, it may be necessary to upgrade or migrate to better technology.
Can DataStage do ELT?
Keeping in mind that ELT is just a methodology, I would say 'yes, but to an extent'. The only difficulty is manually coding the SQL to do the transformations which are executed by DataStage. In this case, DataStage turns into a glorified scheduler.
Is ETL = row based processing and ELT = set/batch based processing?
This is how I see it anyway.
Any comments welcome. Ideally I'd like to get to a point where we agree on the direction of ETL and its influence on the leaders in the ETL space.
Thanks
In this discussion I dont think the Release, Platform and OS makes a difference. This is more of a methodology issue.
Sunopsis are going to market (you'll see Sunopsis as visionaries in the Gartner ETL quadrant) with an ELT message. In other words, (E)xtract from the sources and (L)oad to the target or staging database and then (T)ransform the data. The great part of the tool is that it will compile all the native code for the transformation to occur on the target database.
Then there's the news in the industry that ETL as a buzzword and methodolgoy is dying. Mainly because the RDBMS systems we extract and load to have improved their own transform abilities - i.e. pivoting data, grouping by cube data etc..
My questions for comment to the forum are;
* Is ETL dying?
* What are the pitfalls for an ELT solution?
* Can DataStage do ELT?
* ETL = row based processing, ELT = set/batch based processing?
I have given it some thought, and here is what I could come up with;
Is ETL dying?
No. My response is based mostly on the limitations/pitfalls I feel are in the ELT space.
What are the pitfalls for an ELT solution?
Firstly, how busy is the source/target system that you need to do the transforms on? If the system is overloaded as it is, it will require a purchase of new hardware.
Secondly, debugging of transforms on data is going to be an issue without extra data stored in a test environment. I guess they (ELT) know and trust each client has a test environment which they will piggyback on. In per row processing (ETL) the data is on the engine and can be reprocessed as often as the developer wants.
Thirdly, can the current RDBMS system handle the complexity of the transforms developed? If not, it may be necessary to upgrade or migrate to better technology.
Can DataStage do ELT?
Keeping in mind that ELT is just a methodology, I would say 'yes, but to an extent'. The only difficulty is manually coding the SQL to do the transformations which are executed by DataStage. In this case, DataStage turns into a glorified scheduler.
Is ETL = row based processing and ELT = set/batch based processing?
This is how I see it anyway.
Any comments welcome. Ideally I'd like to get to a point where we agree on the direction of ETL and its influence on the leaders in the ETL space.
Thanks