DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
Author Message
iHijazi
Participant



Joined: 24 Oct 2010
Posts: 45

Points: 559

Post Posted: Fri Mar 01, 2013 6:23 am Reply with quote    Back to top    

DataStage® Release: 8x
Job Type: Parallel
OS: Unix
Hi folks,

I'm currently doing a research to understand the major differences between:
IBM Balanced Optimization and Oracle Data Integrator approach.

"Balanced Optimization approach is comparable to ODI ELT architecture." How true is that statement?

What are the limitations of DS approach compared to ODI?

Any information, samples and/or resources would be highly appreciated.

Thanks!

_________________
Not only thoughts, but a little bit of experience.
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42277
Location: Denver, CO
Points: 217088

Post Posted: Fri Mar 01, 2013 7:53 am Reply with quote    Back to top    

iHijazi wrote:
"Balanced Optimization approach is comparable to ODI ELT architecture." How true is that statement?

I would think "comparable" is a decent choice of words. Balanced Optimization is also known as Push Down Optimization and moves applicable processing into the target database, essentially swapping the "T" and "L" in ETL. Note that it can also "push up" as well.

As to "limitations" I'll defer that to others that have actually used it. If you already have DataStage then this is just a licensed add-on. From what I recall, ODI writes PL/SQL code so that would be a completely different skill set you'd need in house.

I'm assuming you've seen this but just in case there's the link an introduction to BO.

_________________
-craig

Watch out where the huskies go and don't you eat that yellow snow
Rate this response:  
iHijazi
Participant



Joined: 24 Oct 2010
Posts: 45

Points: 559

Post Posted: Fri Mar 01, 2013 8:27 am Reply with quote    Back to top    

Thanks chulett!

I do understand how ODI actually works. Having it writing PL/SQL only works with Oracle DB, other DBs have different scripting which ODI support accordingly. Plus, I think this is a good feature since it can have more control on transformation in some cases, without using the parallel advantage though. But then, if this is the case, then DS won't be able to perform certain transformation through pure SQL, and manage to do it on the Engine itself, no?

From your statement, I understand that DS BO only uses pure SQL, right?

I've seen the link already and read all I could on the BO. But I cannot find find a confirmation how ODI could be a better choice if DS can do ELT approach as well.

If you can add anything I'll highly appreciate it.

Thanks!

_________________
Not only thoughts, but a little bit of experience.
Rate this response:  
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42277
Location: Denver, CO
Points: 217088

Post Posted: Fri Mar 01, 2013 9:15 am Reply with quote    Back to top    

Right, BO moves what it can into the database and anything it can't gets processed in the engine like normal. And if pure SQL means SQL rather than something like PL/SQL then yes, that's my understanding.

_________________
-craig

Watch out where the huskies go and don't you eat that yellow snow
Rate this response:  
BI-RMA



Group memberships:
Premium Members

Joined: 01 Nov 2009
Posts: 463
Location: Hamburg
Points: 3697

Post Posted: Mon Apr 08, 2013 7:49 am Reply with quote    Back to top    

Hi iHijazi,

There is at least one major difference between the ODI and the DataStage BO approach. ODI is designed to work without a dedicated ETL-server. You do not necessarily need a Weblogic application server to run ODI, though you may still wnat to use one to use specific agents, or for BAM, or to address WebServices run within SOA-Suite, but that is another matter.

DataStage BO will make use of database-resources to rebuild some steps that can be run more efficiently outside of DataStage, but there is still going to be a DataStage job on a DataStage-Server processing input from one or more sources and writing data to one or more targets.

ODI with its knowledge-modules gives you a lot more control over the ELT-transformations that will be performed at runtime and if your source and target are within the same database your data is not going to be transfered out of the database for any transformations - which is significantly reducing network traffic.

ODI makes use of the parallel-engines provided by source- or target-databases. Which makes ODI a terribly good choice for high-sophisticated massive-parallel-processing databases like for example Teradata...

So - when just looking at the ELT-capabilities of ODI and DataStage BO - I'd definitely prefer ODI. But DataStage has got a number of other very useful features for which there is no replacement within ODI.
No space here to discuss all these...

_________________
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
Rate this response:  
iHijazi
Participant



Joined: 24 Oct 2010
Posts: 45

Points: 559

Post Posted: Mon Apr 08, 2013 8:03 am Reply with quote    Back to top    

Excellent points!

Thanks for sharing.

_________________
Not only thoughts, but a little bit of experience.
Rate this response:  
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54072
Location: Sydney, Australia
Points: 293282

Post Posted: Mon Apr 08, 2013 2:14 pm Reply with quote    Back to top    

I should probably point out that you have design-time control over how much Balanced Optimization pushes into the source and target databases. For example, you can specify a stage in the design at whi ...

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne
Rate this response:  
BI-RMA



Group memberships:
Premium Members

Joined: 01 Nov 2009
Posts: 463
Location: Hamburg
Points: 3697

Post Posted: Tue Apr 09, 2013 6:31 am Reply with quote    Back to top    

Hi Ray,

your comment still points to the problem: Balanced Optimization does "its thing".

ODI does not do that. It uses the strategies defined within the knowledge module being used. No black-box feeling about it, because you've got full control over it.

The more important thing when comparing ODI and DataStage is, however: do you believe that it is possible to perform all the transformations DataStage allows within SQL - and only SQL?

Now, you see that we have reached the realm of the speculative... Some say yes, I have got my doubts, and it seems that in ODI SQL is all there is. Even though in a broader sense I do believe that there is more to most things than the eye can see...

_________________
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
Rate this response:  
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54072
Location: Sydney, Australia
Points: 293282

Post Posted: Tue Apr 09, 2013 4:05 pm Reply with quote    Back to top    

In my opinion DataStage has substantially more functionality than ODI. As you point out (and I admit not being able to comment cogently) ODI seems to be limited to what can be done in SQL, so no pars ...

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne
Rate this response:  
BI-RMA



Group memberships:
Premium Members

Joined: 01 Nov 2009
Posts: 463
Location: Hamburg
Points: 3697

Post Posted: Wed Apr 10, 2013 12:44 am Reply with quote    Back to top    

Well, Ray, ODI is definitely not as bad as you describe it. It has got excellent XML-Parsing ability, it can be fully integrated with Oracle's SOA Suite 11g - so calling Web-Services is more than an option -, access to Java classes is fully available (most databases like Oracle, DB2 - you name it - offer extensive Java support, so you can just plug into that) and the built-in expression-editor actually offers quite a lot of functionality.

But almost all the transformations are realized using a database engine and are based on the functional elements of SQL. So there are limitations. I'd see major advantages of DataStage in relatively recent advances like Key-Break-detection and the availability of Loop-variables. And - of course - in DataStage's enormous abilities to support generic development approaches - which are eyed upon very sceptically by many customers...

So the margin is definitely much narrower than your comment suggests.

_________________
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
Rate this response:  
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54072
Location: Sydney, Australia
Points: 293282

Post Posted: Wed Apr 10, 2013 1:17 am Reply with quote    Back to top    

Perhaps I should have admitted to a personal bias, since I was there at the gestation and birth of DataStage! Wink

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne
Rate this response:  
BI-RMA



Group memberships:
Premium Members

Joined: 01 Nov 2009
Posts: 463
Location: Hamburg
Points: 3697

Post Posted: Wed Apr 10, 2013 1:27 am Reply with quote    Back to top    

Oh Ray!

I wouldn't have though your as old as THAT!!! Smile

_________________
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
Rate this response:  
vmcburney

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup

Joined: 23 Jan 2003
Posts: 3564
Location: Australia, Melbourne
Points: 27712

Post Posted: Sun Apr 14, 2013 10:15 pm Reply with quote    Back to top    

ODI does not have excellent XML parsing - ODI versions before 11.1.7 cannot handle XSD schema files where the same element exists in more than one part of the schema. It breaks on import. ODI version 11.x does not handle recursive XML within an XML hierarchy and just breaks. DataStage parses both easily, DataStage has much better import and filtering of nested XSD schema files and a large library of XML parse and construction stages. I would also be dubious of ODI on very large XML files, something IBM finally got right in version 8.7.

ODI is also a long way behind DataStage in terms of functionality and breadth of Transformations. I recently did some evaluation for a customer who had populated a Data Warehouse with ODI but found it too hard to make changes. I manually put together a Data Lineage diagram for a single table load into the Data Warehouse, I found a PL/SQL function created a table, a complex SQL statement created a view, a complex SQL select statement in an ODI job retrieved data and wrote to a warehouse table, a complex SQL statement in an ODI job retrieve data and wrote to a dimension table. In one instance the select statement for the population of a Fact table had 500 lines of SQL.

I asked the developers why they had so much complex SQL and PL/SQL in addition to ODI steps and they explained that ODI was limited in what it could do and they had to frequently revert back to manual coding. This entire process could have been replaced by two DataStage jobs with no manual coding.

DataStage Balanced Optimizer relies a lot of set based database operations, something Databases excel at, it does not generate PL/SQL routines because this is something databases do poorly. A lot of the advantages of parallel processing are lost if you generate PL/SQL and it can become a bottleneck. Balanced Optimizer gives you the load balancing between an ETL engine with massive parallel processing capabilities and a huge transformation library and the database engine for table based operations.

_________________
Certus Solutions
Blog: Tooling Around in the InfoSphere
Twitter: @vmcburney
LinkedIn: Vincent McBurney LinkedIn
Rate this response:  
Display posts from previous:       

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB © 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours