issue while reading 2gb xml file using XML Stage in 8.5FP1


gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

Hi DS Gurus,

I'm trying to read a 2 GB XML file using the XML stage in 8.5. I have set the heap size to 2000 and the stack size to 500.

The problem is that the job reads only 1 row and then hangs. Can anyone help me solve this issue?

Thanks...
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

How long did you wait before you decided it was "hung"? Unless the XML has been "pretty printed" or formatted for peoples, don't forget that each file is essentially only one long record.
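One rough way to see whether a document is effectively a single long record is to count its newline characters. This is only a sketch on a small throwaway file; the helper name line_count and the sample content are made up for illustration, and the real file path would replace the temporary file.

```shell
# Count newline characters in a file (padding printed by wc is stripped).
line_count() {
    wc -l < "$1" | tr -d ' '
}

# Demo: an XML document written without any line breaks.
tmp=$(mktemp)
printf '<root><row>1</row><row>2</row></root>' > "$tmp"
echo "newlines: $(line_count "$tmp")"
rm -f "$tmp"
```

A count of 0 or 1 on a large file suggests the whole document sits on one line.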
-craig

"You can never have too many knives" -- Logan Nine Fingers
gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

Hi Craig,

I waited for 6 hours and the job kept running. No log entries were created after the first 3 minutes following launch.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Let's wait and see what Ernie's thoughts on this are; he's the most familiar around here with the 8.5 changes to the XML handling.
-craig
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

First thing is the most critical --- does the job read the file perfectly with a test document that is 1k?

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>
gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

eostic wrote: First thing is the most critical --- does the job read the file perfectly with a test document that is 1k?

Yes, Ernie. The job reads the 1k file without any issues.

Please advise what has to be done.

I increased the heap size to 4096 and tried again, but it didn't work.
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

Any time I see "2 GB" I get suspicious because a lot of operating systems, by default, impose 2 GB limitations on file size and those can cause errors or hangs. Check your OS (ulimit -a) or check with your DataStage and/or Unix administrator to make sure that the right ulimit file size setting is set to unlimited.
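A minimal sketch of that check, assuming a POSIX-style shell: "ulimit -f" prints only the file-size limit, which is easier to scan than the full "ulimit -a" listing. The function name check_file_limit is made up for illustration.

```shell
# Report whether the current shell's file-size limit is unlimited.
check_file_limit() {
    limit=$(ulimit -f)
    if [ "$limit" = "unlimited" ]; then
        echo "file size limit: unlimited"
    else
        # The block size behind this number depends on the shell and OS.
        echo "file size limit: $limit blocks"
    fi
}

check_file_limit
```

The result only describes the shell that runs it, so the same check may give a different answer under another user ID.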
Choose a job you love, and you will never have to work a day in your life. - Confucius
gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

qt_ky wrote: Any time I see "2 GB" I get suspicious because a lot of operating systems, by default, impose 2 GB limitations on file size and those can cause errors or hangs. Check your OS (ulimit -a) or check with your DataStage and/or Unix administrator to make sure that the right ulimit file size setting is set to unlimited.
Hi Ernie,

I ran the ulimit -a command on our server and here is the result.
ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
memory(kbytes) unlimited
coredump(blocks) 2097151
nofiles(descriptors) unlimited
threads(per process) unlimited
processes(per user) unlimited

I see the file size is set to unlimited.

Please help!
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

I am not Ernie, but thanks for the compliment.

Just to be sure because ulimit is such a common problem, have you tried running the "ulimit -a" command as part of the before-job subroutine (ExecSH) and is the output you sent obtained from the job log? I'm just highlighting that now because there can sometimes be differences between the ID you telnet with and the ID that executes the job.

If you do the "ls -l" command on the 2 GB file, what is the exact size of the file in bytes?
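Besides reading the size column of "ls -l", the byte count can be printed on its own with "wc -c". The sketch below runs on a small throwaway file; the helper name file_size_bytes is hypothetical, and the real 2 GB file path would take the place of the temporary file.

```shell
# Print the size of a file in bytes (padding printed by wc is stripped).
file_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Demo on a 5-byte file.
tmp=$(mktemp)
printf 'hello' > "$tmp"
echo "size in bytes: $(file_size_bytes "$tmp")"
rm -f "$tmp"
```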
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly, you must run the ulimit command from inside the job's environment, not just simply at the command line. If that's what you did.
-craig
gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

Here is the file size: 2295161299

I am not sure how to run the command in a subroutine. Can you please tell me exactly where I need to use the command?

Thanks
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

2 GB is 2147483648 bytes, so your file is a bit over 2 GB. This may not be the issue, but it's worth trying to rule it out.
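The arithmetic can be checked in the shell. This sketch assumes a shell with 64-bit arithmetic (typical on current systems); the variable names are only for illustration.

```shell
# File size reported in the thread, in bytes.
size=2295161299

# The 2 GB boundary: 2 * 1024 * 1024 * 1024 bytes.
limit=$((2 * 1024 * 1024 * 1024))

# How far the file is past the boundary.
over=$((size - limit))

echo "limit=$limit over=$over"
```

This prints limit=2147483648 over=147677651, so the file is about 148 million bytes past the 2 GB mark.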

Go into your test job, the one that does not hang, and go into Job Properties. On the General tab under Before-job subroutine choose ExecSH. For the Input Value, enter an OS command: ulimit -a

When you run the test job, you should find the output from the command in the job log.
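As a sketch of what could be run this way, the commands below also print the user ID, since the ID that executes the job can differ from the login ID. Whether ExecSH accepts a multi-line function like this directly is an assumption; saving it as a script and entering the script path as the command is an alternative. The name report_env is hypothetical.

```shell
# Print the identity and limits of whatever process runs this, so the
# output can be compared between a login shell and the job log.
report_env() {
    echo "user: $(id -un)"
    echo "file size limit: $(ulimit -f)"
    ulimit -a
}

report_env
```

Running the same commands once from a terminal and once from the job makes any difference in limits easy to spot.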
gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

I have tried this, and it also shows the file size as unlimited.

Please let me know what I can try next.

Thanks..
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

The next thing is to understand exactly how you are implementing the 8.5 XML stage.

What method, in your 1k working example, are you using to pass the document or the name of the document to the XML stage? There are many ways to do it.

Please describe the structure of your Job (stages and their order) and how you have configured your Input and xmlParser Steps within the Assembly.

Ernie
Ernie Ostic

gsingh
Participant
Posts: 21
Joined: Fri Jun 10, 2011 11:29 am

Post by gsingh »

eostic wrote: The next thing is to understand exactly how you are implementing the 8.5 xml stage...What method, in your 1k working example, are you using to pass the document or name of the document to the xml ...
Hi Ernie,
We have used an External Source stage to pass the XML document to the XML stage. Here is the command: ls tagetfilepath/filename
The job has the following stages:

External Source stage ---> XML stage ---> Data Set

In the XML stage, under usage, we have set the heap size to 4096, the stack size to 3000, and threads to 4.
In the Input step: one column of type varchar, length 9999, is passed from the External Source stage.
XML_Parser step: we have used the file set option, and for validation we have used Minimal validation.
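
To illustrate what an ls command of that shape hands to the next stage, here is a small sketch. It assumes the External Source stage turns each line of command output into one row; the temporary directory and the name sample.xml are hypothetical stand-ins for the real path.

```shell
# Hypothetical stand-in for the real directory and file name.
dir=$(mktemp -d)
touch "$dir/sample.xml"

# Same shape as the command in the post: ls <path>/<file>.
# ls prints one line per path, so a single file yields a single line.
rows=$(ls "$dir/sample.xml" | wc -l | tr -d ' ')
echo "rows emitted: $rows"

rm -r "$dir"
```

Under that assumption the XML stage receives one row holding the file name rather than the document content, which may be why only 1 row shows up on the input link regardless of file size.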

Please advise......

Thanks