Hi experts,
We are running on a Windows system with 32 GB RAM, 8 CPUs and a 4-node configuration file.
We have a flat file with 96 columns needing to be analyzed.
When run on a sample of 20 records it is incredibly slow, with only a few columns analyzed before it eventually fails with a "java.lang.OutOfMemoryError: Java heap space" error message.
We have also noticed that Director's log shows 10 jobs being run for one column analysis. We are guessing that this is what makes the process so heavy.
Does anyone know how we can fix this?
Regards,
Novak
Column Analysis failing due to java heap space
You can increase the size of Java heap space. Search here and/or IBM Information Center for "Xmx".
If you run your column analyses with "preserve scripts" enabled, you will be able to look at the jobs in DataStage Director, including the logs, to ascertain what some of these processes do.
Note, too, that Information Analyzer will break up a column analysis request into multiple requests each of which doesn't process too many columns. So, to process your 96 columns, it's no real surprise that the workload was split into ten units each processing 10 (or 9) columns.
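To make the arithmetic concrete, here is a small Python sketch. It only assumes the columns are spread as evenly as possible across the ten requests; how Information Analyzer actually chooses its batches is not stated in this thread, and `split_columns` is a hypothetical helper for illustration, not part of any IBM tool.

```python
def split_columns(total_columns, batches):
    """Return batch sizes that differ by at most one and sum to total_columns.

    Hypothetical helper: an even split is an assumption, not the
    documented behaviour of Information Analyzer.
    """
    base, extra = divmod(total_columns, batches)
    # The first `extra` batches take one additional column each.
    return [base + 1 if i < extra else base for i in range(batches)]


sizes = split_columns(96, 10)
print(sizes)       # six batches of 10 columns, four batches of 9
print(sum(sizes))  # 96
```

Under that assumption, 96 columns over ten requests gives six units of 10 columns and four units of 9, which matches the ten jobs seen in the Director log.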
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
I only ask because you have more ability to increase the heap size on a 64-bit system than you do on a 32-bit one, with the memory limits it brings. There are several Technotes out there on that subject, here is one example.
-craig
"You can never have too many knives" -- Logan Nine Fingers
Thanks a lot guys.
We will almost definitely upgrade to 64-bit on Linux, hopefully within a couple of months. This is the second time I am running IA on Windows and it is painful, to say the least. Not just because of this failure, but because of the overall end-user response times.
Until then, and on advice from IBM's support, we have continued our data profiling on 2 nodes, rather than 8. Hardly any failures since then.
The run times between them are not that different so we can live with it.
Cheers,
Novak