data insert on Indexed oracle DB

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
dsedi
Participant
Posts: 220
Joined: Wed Jun 02, 2004 12:38 am

data insert on Indexed oracle DB

Post by dsedi »

Hi All,

We are using PX 7.0 with Oracle 9i
currently we don't have any indexs on our database.

As an enhancement we are palnning to add some indexes on most of the tables.

we would like to know where this will affect the DS performanace.
do we need to do any changes in our ETL jobs.


FYI ..our PX server is running in 2 node confiruation

Thanks in advance,Edi
kcbland
Participant
Posts: 5208
Joined: Wed Jan 15, 2003 8:56 am
Location: Lutz, FL
Contact:

Post by kcbland »

It will impact your ETL, depending on the type of indexes, tables, the loading method, and the nature of the data.

Partitioned tables with global indexes will have severe performance impact to ETL, because partitioned loads send their data directly to each partition using direct path sqlldr. Global indexes will defeat this type of loading. Conventional path is obscenely slower.

If your partitioned tables have local indexes, the index updates are isolated to the partition, which then allows direct path partition loading to fly.

If your data is being loaded using insert/update logic, then the type of indexes matter a lot, especially if the indexes are on columns that are being updated. Indexes degrade if they're sorted and you're updating the index.

Also, the indexes are disabled during direct path loads, which means queries against the table will be stalled while the data portion is loading, but once the data is loaded they will run without the benefit of indexes until the indexes update. If there's any issues with the index updating, the indexes will go into an invalid state and remain there until the data is fixed. If the index that's corrupted defined as unique, you won't get the benefit of using that index to find the errant data and remove it, resulting in full table/partition scans.

You really need to spend time with your ETL architect, DBA, and data architect to design the indexing strategy, because ETL, maintenance, as well as performance are all of the factors that go into the design strategy for indexes.
Kenneth Bland

Rank: Sempai
Belt: First degree black
Fight name: Captain Hook
Signature knockout: right upper cut followed by left hook
Signature submission: Crucifix combined with leg triangle
track_star
Participant
Posts: 60
Joined: Sat Jan 24, 2004 12:52 pm
Location: Mount Carmel, IL

Post by track_star »

When you do put indexes on your tables, there is an option in the Oracle EE stage to allow you to either rebuild or maintain your indexes. This option is ONLY available if you are loading (Load method), and then, only works with the append, create, and truncate method of loading. If you're using the upsert method, PX won't do anything with your indexes (as Ken said, it matters what kind of index you have and how it's implemented).
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Re: data insert on Indexed oracle DB

Post by ray.wurlod »

dsedi wrote:Hi All,

We are using PX 7.0 with Oracle 9i
currently we don't have any indexs on our database.

As an enhancement we are palnning to add some indexes on most of the tables.

we would like to know where this will affect the DS performanace.
do we need to do any changes in our ETL jobs.


FYI ..our PX server is running in 2 node confiruation

Thanks in advance,Edi
It will affect ANY form of insert/update, whether it's done from an ETL tool or not. There is a cost to updating the indices. End of story.

There's nothing you can do in the ETL stream to change this fact.

You could affect things around the edges, for example by disabling or dropping the indices before loading, then enabling or re-creating and building the indices subsequently. The same amount of work must be done - all you achieve here is a time shift (but your ETL will finish more quickly).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply