Processing Oracle CDC change tables

pauwel
Premium Member
Posts: 2
Joined: Mon Feb 04, 2008 4:38 am
Location: Brussels

Post by pauwel »

More or less the same topic as this post, but here we already have asynchronous Oracle CDC in place (as part of Oracle Enterprise Edition) and we want to process the Oracle change tables with DataStage. Any experience, best practices or reference solutions out there?
vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

Which Oracle CDC do you have in place? Streams, Data Guard or GoldenGate? Do you currently use that to replicate to another target? Does that replication include audit information such as type of action (insert, update or delete)?

You need to plug DataStage into that flow of replicated data, either by going against a replica database or by replicating to file (not a great option). As far as I know there is no agent between DataStage and GoldenGate, so there is no easy way to share data other than via some staging/replica tables.

I would design a Warehouse where GoldenGate/InfoSphere CDC creates a set of persistent raw data tables and the Warehouse uses those to build an enterprise layer and data marts.
pauwel
Premium Member
Posts: 2
Joined: Mon Feb 04, 2008 4:38 am
Location: Brussels

Post by pauwel »

We have Oracle CDC (a layer on top of Streams) and Data Guard in place; we don't have GoldenGate. So we want to implement Oracle CDC in asynchronous AutoLog mode, meaning we will ship archive logs to a remote staging database where the Oracle change tables will reside. I've browsed through chapter 16 of the Oracle Data Warehousing Guide a couple of times and understand that our DataStage ETL process will be a subscriber to the published change tables, but now I have to work out how to process the info in those tables. The latter is what I'm looking for.
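
For reference, the subscriber side described here is driven by Oracle's DBMS_CDC_SUBSCRIBE package. Below is a minimal sketch of the one-time subscription setup, assuming the python-oracledb driver; the connection details and the change set, table, view and subscription names (ORDERS_SET, APP.ORDERS, ETL_ORDERS_CV, ETL_SUB) are all illustrative, not anything from this thread.

import oracledb

# Hypothetical connection to the remote staging database that holds
# the published change tables.
conn = oracledb.connect(user="etl_user", password="secret", dsn="stagingdb/cdc")
cur = conn.cursor()

# Create a subscription on the published change set (names illustrative).
cur.callproc("DBMS_CDC_SUBSCRIBE.CREATE_SUBSCRIPTION",
             keyword_parameters={"change_set_name": "ORDERS_SET",
                                 "description": "DataStage ETL subscriber",
                                 "subscription_name": "ETL_SUB"})

# Subscribe to the published columns; Oracle exposes them through a
# subscriber view that the ETL process will read from.
cur.callproc("DBMS_CDC_SUBSCRIBE.SUBSCRIBE",
             keyword_parameters={"subscription_name": "ETL_SUB",
                                 "source_schema": "APP",
                                 "source_table": "ORDERS",
                                 "column_list": "ORDER_ID, STATUS, AMOUNT",
                                 "subscriber_view": "ETL_ORDERS_CV"})

# Activate once; each ETL run then extends and purges the window.
cur.callproc("DBMS_CDC_SUBSCRIBE.ACTIVATE_SUBSCRIPTION",
             keyword_parameters={"subscription_name": "ETL_SUB"})
conn.commit()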
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It's been years since I dealt with that so the details are a little fuzzy. However, we did have the archive logs shipped off to a standby server where we subscribed to the change tables. I believe there was a Control-M process that "extended the window" to include the latest set of changes, and then we sourced from them just like any other Oracle tables.
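
The "extend the window" step maps to DBMS_CDC_SUBSCRIBE.EXTEND_WINDOW, with PURGE_WINDOW marking the changes as consumed afterwards. A sketch of one ETL cycle, reusing the illustrative connection and names from the snippet above; process_change is a hypothetical handler.

# Extend the subscription window to take in the latest set of changes.
cur.callproc("DBMS_CDC_SUBSCRIBE.EXTEND_WINDOW",
             keyword_parameters={"subscription_name": "ETL_SUB"})

# Source from the subscriber view just like any other Oracle table.
cur.execute("SELECT operation$, cscn$, order_id, status, amount "
            "FROM etl_orders_cv ORDER BY cscn$")
for operation, cscn, order_id, status, amount in cur:
    process_change(operation, cscn, order_id, status, amount)  # hypothetical

# Tell Oracle the changes were consumed so they can be aged out.
cur.callproc("DBMS_CDC_SUBSCRIBE.PURGE_WINDOW",
             keyword_parameters={"subscription_name": "ETL_SUB"})
conn.commit()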

About the only thing different about them was the change reason, which you need to decide how you want to leverage. From what I recall, we only really cared whether it was a delete or not; for inserts/updates we decided which it was based on the current state of the target table.
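
The "change reason" lives in the OPERATION$ control column of the change table ('I ' insert, 'D ' delete, 'UN'/'UO'/'UU' for updates). A sketch of the delete-or-upsert split described above, with hypothetical target handlers:

def route_change(operation, row):
    # OPERATION$ codes: 'I ' insert, 'D ' delete, 'UN' update (new values),
    # 'UO' update (old values), 'UU' update (single-row form).
    op = operation.strip()
    if op == "D":
        delete_from_target(row)      # hypothetical handler
    elif op in ("I", "UN", "UU"):
        # Treat inserts and updates alike; the current state of the
        # target table decides whether it becomes an insert or an update.
        upsert_into_target(row)      # hypothetical handler
    # 'UO' rows carry the pre-update image and are ignored here.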

The other decision I recall was whether to process all of the changes or not. For Type 2 targets, you may want to capture every change made over the course of the day. We had other targets where we didn't want that, so each run we took only "the most recent" version of each record from the change table for processing.
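
One way to take only "the most recent" version per key is to rank the change rows by commit SCN and keep the top one. A sketch against the illustrative subscriber view from earlier, keyed on ORDER_ID:

cur.execute("""
    SELECT order_id, status, amount, operation$
    FROM (SELECT v.*,
                 ROW_NUMBER() OVER (PARTITION BY order_id
                                    ORDER BY cscn$ DESC, rsid$ DESC) AS rn
            FROM etl_orders_cv v
           WHERE operation$ <> 'UO')
    WHERE rn = 1
""")
latest_rows = cur.fetchall()  # one row per key: its latest change image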

Lastly, we had to be careful about "data fixes" made on the source system, as sometimes they would overwhelm the change replication process and flood us (or the queue, which could hang) with junk changes. For example, they seemed to love adding a new column and then running an update script to set every row to zero or spaces or some such "default" value. We had to insert ourselves into the notification process so we were aware of stuff like this and could request that replication be shut down during the fix. We then either didn't bother capturing those changes at all or replicated the process on our side, adding the column and running the update ourselves.

Hope some of that helps.
-craig

"You can never have too many knives" -- Logan Nine Fingers