ODI Vs DS

Moderators: chulett, rschirm

Post Reply
iHijazi
Participant
Posts: 45
Joined: Sun Oct 24, 2010 12:05 am

ODI Vs DS

Post by iHijazi »

Hi folks,

I'm currently doing a research to understand the major differences between:
IBM Balanced Optimization and Oracle Data Integrator approach.

"Balanced Optimization approach is comparable to ODI ELT architecture." How true is that statement?

What are the limitations of DS approach compared to ODI?

Any information, samples and/or resources would be highly appreciated.

Thanks!
Not only thoughts, but a little bit of experience.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Re: ODI Vs DS

Post by chulett »

iHijazi wrote:"Balanced Optimization approach is comparable to ODI ELT architecture." How true is that statement?
I would think "comparable" is a decent choice of words. Balanced Optimization is also known as Push Down Optimization and moves applicable processing into the target database, essentially swapping the "T" and "L" in ETL. Note that it can also "push up" as well.

As to "limitations" I'll defer that to others that have actually used it. If you already have DataStage then this is just a licensed add-on. From what I recall, ODI writes PL/SQL code so that would be a completely different skill set you'd need in house.

I'm assuming you've seen this but just in case there's the link an introduction to BO.
-craig

"You can never have too many knives" -- Logan Nine Fingers
iHijazi
Participant
Posts: 45
Joined: Sun Oct 24, 2010 12:05 am

Post by iHijazi »

Thanks chulett!

I do understand how ODI actually works. Having it writing PL/SQL only works with Oracle DB, other DBs have different scripting which ODI support accordingly. Plus, I think this is a good feature since it can have more control on transformation in some cases, without using the parallel advantage though. But then, if this is the case, then DS won't be able to perform certain transformation through pure SQL, and manage to do it on the Engine itself, no?

From your statement, I understand that DS BO only uses pure SQL, right?

I've seen the link already and read all I could on the BO. But I cannot find find a confirmation how ODI could be a better choice if DS can do ELT approach as well.

If you can add anything I'll highly appreciate it.

Thanks!
Not only thoughts, but a little bit of experience.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Right, BO moves what it can into the database and anything it can't gets processed in the engine like normal. And if pure SQL means SQL rather than something like PL/SQL then yes, that's my understanding.
-craig

"You can never have too many knives" -- Logan Nine Fingers
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Hi iHijazi,

There is at least one major difference between the ODI and the DataStage BO approach. ODI is designed to work without a dedicated ETL-server. You do not necessarily need a Weblogic application server to run ODI, though you may still wnat to use one to use specific agents, or for BAM, or to address WebServices run within SOA-Suite, but that is another matter.

DataStage BO will make use of database-resources to rebuild some steps that can be run more efficiently outside of DataStage, but there is still going to be a DataStage job on a DataStage-Server processing input from one or more sources and writing data to one or more targets.

ODI with its knowledge-modules gives you a lot more control over the ELT-transformations that will be performed at runtime and if your source and target are within the same database your data is not going to be transfered out of the database for any transformations - which is significantly reducing network traffic.

ODI makes use of the parallel-engines provided by source- or target-databases. Which makes ODI a terribly good choice for high-sophisticated massive-parallel-processing databases like for example Teradata...

So - when just looking at the ELT-capabilities of ODI and DataStage BO - I'd definitely prefer ODI. But DataStage has got a number of other very useful features for which there is no replacement within ODI.
No space here to discuss all these...
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
iHijazi
Participant
Posts: 45
Joined: Sun Oct 24, 2010 12:05 am

Post by iHijazi »

Excellent points!

Thanks for sharing.
Not only thoughts, but a little bit of experience.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

I should probably point out that you have design-time control over how much Balanced Optimization pushes into the source and target databases. For example, you can specify a stage in the design at which to stop Business Optimization from doing its thing.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Hi Ray,

your comment still points to the problem: Balanced Optimization does "its thing".

ODI does not do that. It uses the strategies defined within the knowledge module being used. No black-box feeling about it, because you've got full control over it.

The more important thing when comparing ODI and DataStage is, however: do you believe that it is possible to perform all the transformations DataStage allows within SQL - and only SQL?

Now, you see that we have reached the realm of the speculative... Some say yes, I have got my doubts, and it seems that in ODI SQL is all there is. Even though in a broader sense I do believe that there is more to most things than the eye can see...
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

In my opinion DataStage has substantially more functionality than ODI. As you point out (and I admit not being able to comment cogently) ODI seems to be limited to what can be done in SQL, so no parsing XML, no invoking web services, no access to Java classes, no restructuring vectors/arrays, no delimited substring replacement, to name a few, all of which are easily accomplished using DataStage.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Well, Ray, ODI is definitely not as bad as you describe it. It has got excellent XML-Parsing ability, it can be fully integrated with Oracle's SOA Suite 11g - so calling Web-Services is more than an option -, access to Java classes is fully available (most databases like Oracle, DB2 - you name it - offer extensive Java support, so you can just plug into that) and the built-in expression-editor actually offers quite a lot of functionality.

But almost all the transformations are realized using a database engine and are based on the functional elements of SQL. So there are limitations. I'd see major advantages of DataStage in relatively recent advances like Key-Break-detection and the availability of Loop-variables. And - of course - in DataStage's enormous abilities to support generic development approaches - which are eyed upon very sceptically by many customers...

So the margin is definitely much narrower than your comment suggests.
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Perhaps I should have admitted to a personal bias, since I was there at the gestation and birth of DataStage! :wink:
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
BI-RMA
Premium Member
Premium Member
Posts: 463
Joined: Sun Nov 01, 2009 3:55 pm
Location: Hamburg

Post by BI-RMA »

Oh Ray!

I wouldn't have though your as old as THAT!!! :)
"It is not the lucky ones are grateful.
There are the grateful those are happy." Francis Bacon
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne
Contact:

Post by vmcburney »

ODI does not have excellent XML parsing - ODI versions before 11.1.7 cannot handle XSD schema files where the same element exists in more than one part of the schema. It breaks on import. ODI version 11.x does not handle recursive XML within an XML hierarchy and just breaks. DataStage parses both easily, DataStage has much better import and filtering of nested XSD schema files and a large library of XML parse and construction stages. I would also be dubious of ODI on very large XML files, something IBM finally got right in version 8.7.

ODI is also a long way behind DataStage in terms of functionality and breadth of Transformations. I recently did some evaluation for a customer who had populated a Data Warehouse with ODI but found it too hard to make changes. I manually put together a Data Lineage diagram for a single table load into the Data Warehouse, I found a PL/SQL function created a table, a complex SQL statement created a view, a complex SQL select statement in an ODI job retrieved data and wrote to a warehouse table, a complex SQL statement in an ODI job retrieve data and wrote to a dimension table. In one instance the select statement for the population of a Fact table had 500 lines of SQL.

I asked the developers why they had so much complex SQL and PL/SQL in addition to ODI steps and they explained that ODI was limited in what it could do and they had to frequently revert back to manual coding. This entire process could have been replaced by two DataStage jobs with no manual coding.

DataStage Balanced Optimizer relies a lot of set based database operations, something Databases excel at, it does not generate PL/SQL routines because this is something databases do poorly. A lot of the advantages of parallel processing are lost if you generate PL/SQL and it can become a bottleneck. Balanced Optimizer gives you the load balancing between an ETL engine with massive parallel processing capabilities and a huge transformation library and the database engine for table based operations.
Post Reply