Abnormal Termination-add_to_heap() Unable to allocate memory
Moderators: chulett, rschirm, roy
hi,
I get these two errors in my Director log when I run my jobs.
1. Abnormal termination of stage. (Error)
This job has a lot of hash file lookups, around 10 of them.
The job reads from a sequential file, looks up these 10 hash files, and loads into Oracle as well as a couple of hash files.
The same job runs fine on our DEV and TEST boxes, but in PROD I get this abnormal termination error and the job aborts.
I split the job into two, with 5 hash file lookups in each job. After splitting, the jobs complete without aborting.
Any idea why this error occurs on one box alone?
2. add_to_heap() - Unable to allocate memory (Warning)
This job reads from Oracle and loads into two hash files. Each hash file has around 15 million records.
Once it reaches around 2 million records it gives this warning, and the job continues.
Both hash files are dynamic and write caching is enabled.
This warning appears on all three boxes: DEV, TEST, and PROD.
Any idea what settings I need to change to avoid this warning? Are there any memory, kernel, or shared memory settings I can adjust to avoid it?
Thanks,
Srikanth
Srikanth
What is the size of the file? Usually this error means a hash file is corrupted, which typically happens after a system crash or when you run out of disk space. If your hash file is created in the account, then at TCL do:
DELETE.FILE MyHashFile
or
CLEAR.FILE MyHashFile
If it is a repository file like DS*, RT*, or something important like VOC, then you have problems. If a file is corrupt, you can usually tell by counting records:
COUNT MyHashFile
Mamu Kim
Re: Abnormal Termination-add_to_heap() Unable to allocate memory
rsrikant wrote:
2. add_to_heap() - Unable to allocate memory (Warning)
This job reads from Oracle and loads into two hash files. Each hash file has around 15 million records.
Once it reaches around 2 million records it gives this warning, and the job continues.
Both hash files are dynamic and write caching is enabled.
This warning appears on all three DEV / TEST / PROD boxes.
Any idea what settings I need to change to avoid this warning? Are there any memory / kernel / shared memory settings I can adjust to avoid it?

Turn OFF write caching. As best we can tell, this annoying "warning" shows up once the cache fills. The other option would be to bump your default write cache size for the Project up high enough to stop this message from appearing, but that change will affect all jobs.
-craig
"You can never have too many knives" -- Logan Nine Fingers
If your hash file is over 2GB then you need to add the 64-bit option. Look at the OS level and add together the sizes of DATA.30 and OVER.30. I would say make it 64-bit no matter what; this is a large hash file, so why worry about it? If it happens on all 3 boxes, then the file has to be too big for a 32-bit hash file.
Mamu Kim
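The size check Kim describes can be sketched in shell. Everything here is illustrative: a stand-in directory with tiny dummy DATA.30 and OVER.30 files substitutes for a real hashed file directory, and the resize command in the comment is the usual TCL form, which you should verify against your engine's documentation.

```shell
#!/bin/sh
# Sketch: a 32-bit hashed file tops out at 2GB across its DATA.30 and
# OVER.30 parts. Stand-in files are created here for illustration; on a
# real system, point HF at the hashed file's own directory instead.
HF=$(mktemp -d)                     # stand-in for the hashed file directory
printf 'dddd' > "$HF/DATA.30"       # data part (4 bytes, stand-in)
printf 'oo'   > "$HF/OVER.30"       # overflow part (2 bytes, stand-in)

data_sz=$(wc -c < "$HF/DATA.30")
over_sz=$(wc -c < "$HF/OVER.30")
total=$((data_sz + over_sz))
limit=$((2 * 1024 * 1024 * 1024))   # the 2GB 32-bit ceiling

echo "total=$total"
if [ "$total" -ge "$limit" ]; then
    # At TCL, a dynamic file can typically be converted with:
    #   RESIZE MyHashFile * * * 64BIT
    echo "over the 32-bit limit: convert to 64-bit"
fi
```

With the stand-in files the total is a few bytes, so the script only prints the total; against a real hashed file approaching 2GB it would flag the conversion.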
Hi,
Thanks for the replies.
The hash file is not that big; it has very few columns and is below 1 GB. The warnings start once the hash file reaches 100 MB and keep coming every once in a while until the job completes.
Craig -- How do I increase the write cache at the project level? If I turn off write caching, the performance is very slow.
Kim -- Are you talking about the abnormal termination error? If so, I deleted the hash files from the command prompt and tried running the job, but I still get the abnormal termination on the PROD box alone.
For the abnormal termination error, the job runs fine on the DEV and TEST boxes; only on the PROD box do I get the error. Once I split the job into two to distribute the hash lookups between them, the problem went away.
I want to know why it occurs on the PROD box alone. Is there some memory setting on the PROD box that does not allow 10 hash files to be open at a time, or something similar?
Thanks,
Srikanth
rsrikant wrote:
Craig -- How do I increase the write cache at the project level? If I turn off write caching, the performance is very slow.

I hardly ever have write caching turned on, and I can get very high-speed performance from hashed file writes, provided the hashed file is properly precreated.
That being said, if you want to try bumping the write cache size, it is done via the Administrator from the Tunables tab of each Project, from what I recall. The effect of a change there is immediate; nothing 'extra' needs to be done.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Kim - I believe the problem is with the number of hash files and not with the Oracle client, because once I split the job into two and reduced the hash lookups, the jobs run in PROD as well. Is my understanding wrong?
Thanks, Craig. I found where to change the write cache limit in the Administrator.
Thanks,
Srikanth
Maybe there are enough rows in production to exceed the hashed file cache size limit (default 128MB), but not in the development or test environments?
Try increasing the read and write cache sizes in production, using Administrator client, Tunables tab.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Srikanth
The cache size can only be set to what the shmtest command will allow; beyond that you get these kinds of errors. That command will tell you what to set the parameters to in uvconfig. You must then do a uvregen, and then stop and restart DataStage.
You should open a ticket with IBM support on this; it should not be happening. Your DEV and TEST boxes are different machines than PROD, so uvconfig will be different, as will the results of shmtest.
You are correct in splitting the file, because if the file exceeds the cache limit it gives you a warning and reads from disk instead of memory. That is a very clever solution. If you have a natural way to split your keys, then why not have 2 files in memory? It will run a lot faster with 2 files in memory and 2 lookups than with one lookup from disk.
I will look for my notes on shmtest. I am doing all this from memory, and my memory is not as good as it used to be. Maybe we can get a full Wurlod.
No matter. You are on the right track. Let us know how you solve it.
If you reinstalled or upgraded DataStage, then your uvconfig file got overwritten and went back to the defaults, which are not optimal.
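For reference, the sequence Kim describes looks roughly like this on a DataStage Server engine. Treat it as a sketch, not a recipe: the $DSHOME path is an assumption, the exact uvconfig tunables to change depend on the shmtest report, and the stop/regen/start steps must be done during an outage window.

```shell
# Sketch of the uvconfig tuning procedure (run as the DataStage admin user).
# $DSHOME is assumed to point at the engine directory, e.g. .../DSEngine.
cd $DSHOME

bin/shmtest            # reports the shared memory limits the OS will allow
vi uvconfig            # adjust the relevant tunables within those limits
bin/uv -admin -stop    # stop the DataStage engine
bin/uvregen            # regenerate the binary config from the edited uvconfig
bin/uv -admin -start   # restart the engine so the new settings take effect
```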
Mamu Kim