Query regarding Surrogate Key Generator Stage

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

udayk_2007
Participant
Posts: 72
Joined: Wed Dec 12, 2007 2:29 am

Query regarding Surrogate Key Generator Stage

Post by udayk_2007 »

Hello Friends

I have a question regarding the Database sequence and Flat file (state file) source options of the Surrogate Key Generator stage.

Which option is better performance-wise?

Thanks in advance for your help

Regards
Ulhas
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Re: Query regarding Surrogate Key Generator Stage

Post by SURA »

Going with the table is my choice. In my previous experience the state file got corrupted, so I moved to the table.

In my view you may not find any performance difference.
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
udayk_2007
Participant
Posts: 72
Joined: Wed Dec 12, 2007 2:29 am

Post by udayk_2007 »

Thanks, Sura, for your reply.

I think my question was not clear in my previous post.

We are creating a key source in the first job (which contains only the Surrogate Key Generator stage). In this job we select Database sequence as the Source Type option of the SKG stage.

Does using this sequence in other jobs make DataStage hit the database for each record?

Would changing the Source Type to Flat file (in the first job) improve the performance of the jobs using this key source, since they would no longer need to hit the database for every record?

Thanks for your help

Regards
Ulhas
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

Yes, it hits the database based on the block size specified.

In my view a flat file gives better performance than a DB sequence, since the DB sequence involves I/O operations against the DB server; and if the database sits on another host, there is network traffic on top of that.

As Sura pointed out, though, the trade-off is that the file carries a risk of corruption.
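
For illustration, here is a rough Python sketch of how block-based caching cuts down the round trips (this is just an assumed model of the behaviour, not the actual SKG stage internals):

    # Sketch only: block-based sequence caching. fetch_block() stands in
    # for one real database call that reserves a block of sequence values.
    class BlockedSequence:
        def __init__(self, fetch_block, block_size=100):
            self.fetch_block = fetch_block   # callable doing one DB round trip
            self.block_size = block_size
            self.next_value = 0
            self.remaining = 0               # values left in the current block

        def next_key(self):
            if self.remaining == 0:
                # Only here do we pay for a database round trip.
                self.next_value = self.fetch_block(self.block_size)
                self.remaining = self.block_size
            key = self.next_value
            self.next_value += 1
            self.remaining -= 1
            return key

    # Fake "database" that pretends to reserve blocks of values.
    calls = 0
    def fake_db_fetch(block_size):
        global calls
        calls += 1
        return (calls - 1) * block_size + 1

    seq = BlockedSequence(fake_db_fetch, block_size=100)
    keys = [seq.next_key() for _ in range(1000)]
    print("round trips:", calls)                    # -> round trips: 10
    print("first/last key:", keys[0], keys[-1])     # -> 1 1000

With a block size of 100, a thousand rows cost only ten database calls; a block size of 1 degenerates to one call per row, which is where the network cost really hurts.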
Cheers
Ravi K
SURA
Premium Member
Posts: 1229
Joined: Sat Jul 14, 2007 5:16 am
Location: Sydney

Post by SURA »

I am still not clear. Are you trying to use that stage across any number of jobs?

If that is the case, don't do it. That sounds like a global surrogate key concept, and it is not a good approach.

It will still work (up to a point), but it will become a bottleneck and a very poor design.

Regarding file vs. database: at the table level you can create a new sequence starting from the last used number (if the surrogate key source gets corrupted), whereas with a file I don't know whether that is possible, which is why I preferred the table.
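
Something along these lines, for example (a rough Python sketch; the table, column and sequence names are placeholders and the DDL is Oracle/DB2-style, so adjust for your database):

    # Sketch only: rebuild a corrupted sequence from the last committed key.
    # Works with any DB-API connection (cx_Oracle, ibm_db_dbi, etc.).
    def reset_sequence(conn, table, key_col, seq_name):
        """Recreate seq_name so it restarts just above the current MAX(key_col)."""
        cur = conn.cursor()
        cur.execute(f"SELECT MAX({key_col}) FROM {table}")
        max_key = cur.fetchone()[0] or 0

        cur.execute(f"DROP SEQUENCE {seq_name}")
        cur.execute(f"CREATE SEQUENCE {seq_name} START WITH {max_key + 1}")
        conn.commit()

    # e.g. reset_sequence(conn, "DIM_CUSTOMER", "CUSTOMER_SK", "CUSTOMER_SK_SEQ")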
Thanks
Ram
----------------------------------
Revealing your ignorance is fine, because you get a chance to learn.
udayk_2007
Participant
Posts: 72
Joined: Wed Dec 12, 2007 2:29 am

Post by udayk_2007 »

Thanks, guys, for your responses.

@SURA - Currently we are creating a separate database sequence for each job. We are planning to switch to files so that round trips to the database can be saved.

We will create the file before the job run, taking the maximum surrogate key value from the target table and incrementing the value thereafter for each record.

I just wanted to know if anybody has tried this before and how much performance gain was obtained.
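
For illustration, a rough Python sketch of that seeding idea, using a simple one-number-per-line file (the real SKG state file format is managed by the stage itself, so treat this only as a model of the logic):

    # Sketch only: seed a plain-text key file from the target table's
    # current maximum, then hand out keys by incrementing in memory.
    from pathlib import Path

    def seed_key_file(path, max_key_from_target):
        """Write the last used key so the next run starts from max + 1."""
        Path(path).write_text(str(max_key_from_target))

    def key_generator(path):
        """Yield surrogate keys, persisting the high-water mark on close."""
        last = int(Path(path).read_text() or "0")
        try:
            while True:
                last += 1
                yield last
        finally:
            Path(path).write_text(str(last))

    # Usage:
    seed_key_file("customer_sk.state", 42)      # e.g. MAX(CUSTOMER_SK) = 42
    gen = key_generator("customer_sk.state")
    print(next(gen), next(gen), next(gen))      # -> 43 44 45
    gen.close()                                 # writes 45 back to the file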
Ravi.K
Participant
Posts: 209
Joined: Sat Nov 20, 2010 11:33 pm
Location: Bangalore

Post by Ravi.K »

Yes, you can seed the maximum value for a file as well; it goes in the Initial Value option.
Cheers
Ravi K
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Unless you are planning on managing this "SK file" manually, the state file the SKG stage leverages will do all of that max tracking automatically for you after you seed it.
-craig

"You can never have too many knives" -- Logan Nine Fingers