DS job and the data sets will not be cleaned up

p4paulian · Post by **p4paulian** » Tue Feb 17, 2009 3:57 pm

Hi

I was told by IBM support that if we have Director open and are monitoring Column Analysis executions, and the job finishes while we are monitoring, then the DS job and the data sets will not be cleaned up.

Is this how it works for Information Analyzer column analysis?

I was assuming that any job, whether it is column analysis or a normal PX job, should work the same way as far as resource allocation and cleanup go.

thanks

ray.wurlod · Post by **ray.wurlod** » Tue Feb 17, 2009 4:03 pm

Welcome aboard.

It's not a DataStage thing. No operating system will allow you to delete a file that is open. So it is here - if it's open (being viewed by Director) it can't be cleaned up.

p4paulian · Post by **p4paulian** » Tue Feb 17, 2009 4:25 pm

Thanks for your response Ray.

Is it only when a job fails that it will not clean up, or is it that, regardless of the job's end status, it won't clean up if Director is open?

We do use director to monitor a running job, right?

chulett · Post by **chulett** » Tue Feb 17, 2009 4:37 pm

'Clean up'?

p4paulian · Post by **p4paulian** » Tue Feb 17, 2009 4:45 pm

By "clean up" I mean removal of the temporary datasets created during the job run.

Do we monitor a running job via Director or not?

chulett · Post by **chulett** » Tue Feb 17, 2009 5:17 pm

I should have been more specific. I was wondering what about the job itself, not the data sets, would be cleaned up... since you mentioned both. But that's not important if the question is strictly in regards to 'temporary' datasets and if they're removed after the job runs.

As to your eye-rolling Director question, in general the answer is yes - that's exactly what it is for. However, I can't speak to what quirks your 'Column Analysis executions' may bring to the table, you'll need to wait for Ray to come back and clarify things.

p4paulian · Post by **p4paulian** » Tue Feb 17, 2009 6:00 pm

Director is for monitoring jobs; that's where I am lost.

I have been using it to monitor DataStage jobs, and am wondering how it is different when I monitor Column Analysis jobs.

ray.wurlod · Post by **ray.wurlod** » Tue Feb 17, 2009 7:24 pm

All Information Analyzer tasks are run as DataStage jobs (the osh is generated directly by IA, there is no graphical job design generated) in the ANALYZERPROJECT project.

p4paulian · Post by **p4paulian** » Tue Feb 17, 2009 8:59 pm

Very true.

But Ray, how is viewing the logs in Director for a column analysis job different from viewing the logs for a parallel job?

We never came across issues like a full disk when we had Director open for a parallel job, but with column analysis it happens.

Because of this issue, they suggested that we not monitor while an IA job is running.

ray.wurlod · Post by **ray.wurlod** » Tue Feb 17, 2009 9:39 pm

IA analyses require huge amounts of scratch space. My guess - and I don't know for sure - is that viewing logs or monitoring jobs exacerbates the total demand for disk space on the server to a point where it fills the disk. This would especially be true if you used the default configuration file supplied with the product, which places scratch disk in the same file system as the engine and the projects.
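As a rough illustration of the alternative, here is a minimal sketch of a two-node parallel configuration file that keeps scratch space off the engine and project file system. The host name and all directory paths are hypothetical placeholders, not values from this thread; substitute the server's real name and mount points that actually live on separate file systems.

```
/* Hypothetical example: "etl-server" and every path below are placeholders. */
{
	node "node1"
	{
		fastname "etl-server"
		pools ""
		resource disk "/data/ds/datasets" {pools ""}
		resource scratchdisk "/scratch/ds/node1" {pools ""}
	}
	node "node2"
	{
		fastname "etl-server"
		pools ""
		resource disk "/data/ds/datasets" {pools ""}
		resource scratchdisk "/scratch/ds/node2" {pools ""}
	}
}
```

The point of the sketch is only that each `resource scratchdisk` entry points at a mount separate from the engine and projects, so heavy scratch usage during an analysis cannot fill the file system those depend on.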

p4paulian · Post by **p4paulian** » Tue Feb 17, 2009 10:47 pm

Thanks for your response Ray.

We are using the default configuration file, which is a single node.
So suppose we change the configuration file to 2 or 4 nodes and put the scratch disk in a file system other than the one where the engine and projects reside.

Doing all this may also help, instead of not monitoring in Director.

Can we change the configuration file in IA, or can only Administrator do it?

syrup75 · Post by **syrup75** » Wed Feb 18, 2009 12:17 am

Hello,

I guess you checked the 'retain script' and 'retain datasets' options.

If you don't check these options, I think the jobs and datasets will be cleaned up after column analysis runs.

Also, you can change the configuration file in the engine tab. (You can find the retain script and retain datasets options in the engine tab as well.)

However, it's not like DataStage: you have to type the specific directory and .apt file name.

When you execute 'run column analysis', you will find the scheduler, sample, option and engine tabs on the right side of the window.

Thanks,
