
Posted: Tue Jan 18, 2011 12:05 am
by mavrick21
Thanks Ray :)

Posted: Tue Jan 18, 2011 5:24 pm
by mavrick21
Ray / Craig,

Once I've optimized the hashed file and have all the OVER.30 data moved over to DATA.30, do I have to modify the existing DataStage jobs? By modify I mean change the modulus and other values in the hashed file stage.

Below are the various jobs which touch this hashed file:
1) Initial job - clears the hashed file by inserting one record with @NULL values in all fields
2) Lookup file creation job - inserts all the records into the hashed file
3) Normal job - the regular ETL job which looks up against the hashed file

Jobs (1) and (2) are run each weekend.
Job (3) runs from Monday through Friday.
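For reference, the weekend retune could be sketched at TCL roughly like this. This is only an illustration, not the poster's actual commands: the file name MYLOOKUP and the modulus 3001 are placeholders, and exact syntax can vary by engine release.

```
* Placeholders: MYLOOKUP and 3001 are illustrative values only.
* Raise the minimum modulus so current data fits in DATA.30
* rather than overflowing into OVER.30:
CONFIGURE.FILE MYLOOKUP MINIMUM.MODULUS 3001
* Report group/overflow statistics to confirm the result:
ANALYZE.FILE MYLOOKUP
```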

Thanks

Posted: Tue Jan 18, 2011 5:48 pm
by chulett
Not sure if you have a legitimate need to separate #1 and #2, but you certainly don't need to insert any kind of null record to clear a hashed file. Just setting that option is all that's needed; the clear happens whether you write a record to the file or not.
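For comparison, the stage's clear option is roughly equivalent to issuing this at TCL (file name illustrative):

```
* Removes all records but keeps the file and its sizing parameters:
CLEAR.FILE MYLOOKUP
```

This is also consistent with the point below that clearing does not disturb the modulus and separation settings, whereas deleting and recreating the file would.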

Posted: Tue Jan 18, 2011 5:55 pm
by mavrick21
Craig,

How about changing the values in Hashed file stage in the job after the Hashed file has been optimized? Isn't that necessary?

Thanks

Posted: Tue Jan 18, 2011 5:59 pm
by chulett
I don't believe so, they should remain intact even when the hashed file is cleared. If that's wrong, I'm sure Ray will be along shortly with a correction. :wink:

Now, if you dropped and recreated it each time that would be a different story... there you would have to ensure they were set properly in the Options section of the hashed file stage.

Posted: Tue Jan 18, 2011 6:32 pm
by mavrick21
We're not dropping and recreating them every time.

Thanks Craig :)

Posted: Sun Jan 23, 2011 1:06 pm
by mavrick21
Hello,

1) Why does HFC show a separation value for a dynamic hashed file? Please correct me if I'm wrong, but I thought separation values apply only to static hashed files, and that for dynamic hashed files we can only specify a group size.

If I'm wrong, how do I specify a separation value for a dynamic hashed file? I don't see an option in Designer. Should it be done via TCL?


2) Does optimizing a dynamic hashed file pay off only when a subset of records is selected from the hashed file?

I ask this question based on one of my posts - viewtopic.php?p=387741


Thanks.

Posted: Sun Jan 23, 2011 2:27 pm
by ray.wurlod
GROUP.SIZE 1 = separation 4
GROUP.SIZE 2 = separation 8
Any other value in HFC will cause a warning message to be displayed.
The CREATE.TABLE statement generated by HFC will include a GROUP.SIZE 2 clause if required. GROUP.SIZE 1 is the default, so does not have to be included.
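As a sketch, a dynamic file created at TCL with the larger group size might look like the following. The file name and minimum modulus are illustrative, not from this thread:

```
* GROUP.SIZE 1 = 2 KB groups (separation 4, the default);
* GROUP.SIZE 2 = 4 KB groups (separation 8).
CREATE.FILE MYLOOKUP DYNAMIC MINIMUM.MODULUS 3001 GROUP.SIZE 2
```

Since GROUP.SIZE 1 is the default, the clause only needs to appear when you want the 4 KB groups.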

Posted: Tue Jan 25, 2011 1:48 pm
by mavrick21
What's the purpose/advantage of having a hashed file stage between a transformer and a DRS stage for the lookup process? Why not a direct look up on DRS stage without the hashed file stage?

Does this have something to do with performance?

Thank you.

Posted: Tue Jan 25, 2011 2:41 pm
by ray.wurlod
No idea what the purpose might be - it seems to me to be a silly design.

Posted: Tue Jan 25, 2011 3:07 pm
by mavrick21
:D

Thanks Ray

Posted: Tue Jan 25, 2011 4:03 pm
by chulett
Define "between". If the DRS stage loads the Hashed File and then the job references the Hashed File there's nothing inherently silly about that.

Posted: Tue Jan 25, 2011 4:10 pm
by mavrick21
Craig,

Existing job design below

                DRS
                 |
                 |
           Hashed File
                 |
                 |
DRS-------->Transformer--------->DRS

It's all in a single job, NOT in two separate jobs.

Posted: Tue Jan 25, 2011 4:39 pm
by chulett
I've done that myself many times. And yes, can definitely have something to do with performance. :wink:

Posted: Tue Jan 25, 2011 4:47 pm
by mavrick21
Craig,
And yes, can definitely have something to do with performance
"something" ??? :twisted:

Waiting to hear from Ray.