DSXchange: DataStage and IBM Websphere Data Integration Forum
This topic has been marked "Resolved."
jackson.eyton



Group memberships:
Premium Members

Joined: 26 Oct 2017
Posts: 110

Points: 1839

Posted: Wed Nov 29, 2017 10:11 am

DataStage® Release: 11x
Job Type: Parallel
OS: Windows
Hi everyone,
I have been wondering how to present this question to IBM and figured I would ask here first. The quick summary is that DataStage Designer and all the other DataStage client applications (Administrator, Director, etc.) periodically become incredibly slow. There are two scenarios in which this happens.

1. If my fellow ETL developer and I are both in DataStage Designer, working our separate jobs, and one of us decides to run a job, DataStage will then crunch for both of us until that job has completed. Meaning if he runs his job, I then cannot open any stages in my job without DataStage going "Not Responding" until his job has completed. This seems very odd to me.

2. Various scenarios can occur where DataStage Designer simply starts hanging for one of us for no obvious reason. Notably, yesterday I highlighted about 8 jobs that were backup copies and attempted to delete them. My Designer stopped responding, and as soon as that happened my coworker's Designer became incredibly slow. I waited an hour and ended up killing the process and disconnecting all of my sessions via the admin console. The slowness, however, did not resolve and persists even today. We have had this happen before with no obvious rhyme or reason, and we've had to reboot the server to correct it.

We have a development environment on one server, which we'll call Dev1, and our web services for InfoSphere on another server, which we'll call Svc1. Both servers run fully patched Windows Server 2012. We have monitored the system performance and resources of both servers during both of the above scenarios, and there are no persistent spikes in any resource (CPU, memory, disk, or even network). We are using fat clients, not terminal services, so Designer bogging down when someone else is merely running a job confuses me.

Does anyone have any suggestions on where to start with this?

*EDIT*
We have multiple projects on this development server and I have confirmed that the performance issue is persistent among the projects, pointing to the server as the common denominator.

*UPDATE*
We were able to get some improvement by closing the log pane in Designer. We then cleared some of the logs (retention is set to the last 3 runs of a job), and this also helped some. Opening a stage in a job and looking at its properties, for example, now seems normal. However, saving a new job and compiling still take far longer than on a freshly booted server. A job as simple as SRC---TRXFM---DEST would normally take 30 seconds or so to compile, now takes nearly 10 minutes (6 minutes as of last clocking).

_________________
-Me

Last edited by jackson.eyton on Wed Nov 29, 2017 12:11 pm; edited 2 times in total
boxtoby



Group memberships:
Premium Members

Joined: 13 Mar 2006
Posts: 132
Location: UK
Points: 1375

Posted: Wed Nov 29, 2017 10:38 am

Hi
I have had scenario 1 when working from home, where the connection is not as good as in the office. I found that turning off "show performance statistics" in Designer helped a lot.

Hope that helps,
Bob

_________________
Bob Oxtoby
jackson.eyton

Posted: Wed Nov 29, 2017 10:44 am

I may give that a shot and see if it makes any improvement, though obviously we won't want to leave it off permanently. What strikes me as really odd is that it happens at all. I can understand MY Designer going slow when I run a job MYSELF. However, when my coworker runs a job, that should not slow down my Designer for things as simple as opening a transform stage and getting to the stage properties. It's almost as if Designer runs as an instance on the server itself and isn't REALLY a fat client.

_________________
-Me
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42753
Location: Denver, CO
Points: 220316

Posted: Wed Nov 29, 2017 5:02 pm

jackson.eyton wrote:
A job as simple as SRC---TRXFM---DEST would normally take 30 seconds or so to compile, now takes nearly 10 minutes (6 minutes as of last clocking).

Is the same thing true if a transformer is not involved? Don't forget they compile down to C++ code and we've seen sites with a single concurrent user compiler license...

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy
jackson.eyton

Posted: Wed Nov 29, 2017 5:31 pm

THAT is an interesting question.... At this stage of our warehouse development, most of our jobs contain at least one transform stage. Could you point me in the right direction where to check our compiler license status?

_________________
-Me
qt_ky



Group memberships:
Premium Members

Joined: 03 Aug 2011
Posts: 2813
Location: USA
Points: 21315

Posted: Thu Nov 30, 2017 8:24 am

We don't generally run into that problem but we don't run our servers on Windows either.

I would be mostly suspicious of any kind of security software, especially anti-virus software running scans on your clients, or even worse, on your server.

My next suspicion would be gremlins on your network. Years ago we did have some extreme network problems that would crop up and bring everything to a grinding halt.

_________________
Choose a job you love, and you will never have to work a day in your life. - Confucius
jackson.eyton

Posted: Thu Nov 30, 2017 9:56 am

Yea, we have done the standard AV disabling, as well as network sniffing and monitoring, and so far that all looks fine. I was able to verify the CPU is pegged when I run a transformation job.

This does not occur when compiling a job. I did a little digging into the compiler license, but I am unsure exactly how to confirm how many concurrent compiler licenses we have available. Additionally, there are only two of us, so we rarely compile anything at the same time, which from the following quote seems to be the real benefit of being multi-licensed:
"For some compilers, each developer must have a license at the time that the developer compiles the job with the Designer client. The maximum number of simultaneous processes that compile jobs determines the number of licenses."
-SRC https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.5.0/com.ibm.swg.im.iis.productization.iisinfsv.install.doc/topics/wsisinst_set_envars_cpp.html

As of now the server is still acting up, as we are refusing to reboot it just to temporarily resolve the pain. As such, even just running a job takes 2+ minutes between the job log saying the job is starting and data actually starting to process. :(

_________________
-Me
chulett

Posted: Thu Nov 30, 2017 11:08 am

Only two of you? Not so much an issue then. ;)

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy
jackson.eyton

Posted: Thu Nov 30, 2017 11:38 am

Correct, as far as the compiler licensing goes. However, the issues we see when we run jobs, combined with the fact that there are only two of us, really raise a red flag for me. I imagine there are companies with dozens of developers all building and testing jobs. It's rather odd that me running a job causes my partner's Designer to slow to a crawl and vice versa...

Does anyone have the actual InfoEng server recommended specifications? What I am finding from the IBM knowledge center doesn't make a whole lot of sense.

_________________
-Me
PaulVL



Group memberships:
Premium Members

Joined: 17 Dec 2010
Posts: 1254

Points: 8241

Posted: Thu Nov 30, 2017 1:05 pm

It's not odd if someone went and changed your APT file and went nuts with the quantity of nodes.
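
One quick way to check for that is to count the node entries in the APT configuration file and compare the count with the cores on the engine host. The sketch below is only an illustration: the sample configuration text, host name and paths are made up, and the regular expression assumes the usual `node "name" { ... }` layout. In practice you would read the file your project actually points to and run this on the engine host itself.

```python
import os
import re
from typing import List

# Hypothetical two-node configuration; host name and paths are placeholders.
SAMPLE_CONFIG = '''
{
    node "node1"
    {
        fastname "dev1"
        pools ""
        resource disk "C:/example/Datasets" {pools ""}
        resource scratchdisk "C:/example/Scratch" {pools ""}
    }
    node "node2"
    {
        fastname "dev1"
        pools ""
        resource disk "C:/example/Datasets" {pools ""}
        resource scratchdisk "C:/example/Scratch" {pools ""}
    }
}
'''

# Matches entries such as: node "node1"
NODE_PATTERN = re.compile(r'\bnode\s+"([^"]+)"')


def node_names(config_text: str) -> List[str]:
    """Return the logical node names declared in the configuration text."""
    return NODE_PATTERN.findall(config_text)


names = node_names(SAMPLE_CONFIG)
cores = os.cpu_count() or 1  # cpu_count() can return None
print(f"{len(names)} nodes defined, {cores} logical cores on this machine")
if len(names) > cores:
    print("More nodes than cores: jobs may end up competing for CPU")
```

A node count above the core count is not automatically wrong, but on a small shared box it is worth a look.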
jackson.eyton

Posted: Thu Nov 30, 2017 1:25 pm

I have done some playing around with those myself in the past; however, our APT settings are configured as a two-node system. Our Dev server only has two cores, though. I am of the mind that the number of server cores should be the result of the following equation: (N * E) + 1
where N is the number of nodes in the APT file and E is the number of employees who could be running jobs simultaneously.
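
As a quick sketch of that rule of thumb (a personal heuristic, not an IBM sizing formula; the function name is made up for illustration), here it is applied to the two nodes and two developers described in this thread:

```python
def cores_by_rule_of_thumb(nodes: int, developers: int) -> int:
    """Heuristic from this post: (N * E) + 1. Not an official sizing formula."""
    if nodes < 1 or developers < 1:
        raise ValueError("nodes and developers must both be at least 1")
    return nodes * developers + 1


# Two-node APT configuration, two developers who may run jobs at the same time.
suggested = cores_by_rule_of_thumb(nodes=2, developers=2)
print(suggested)  # 5, versus the 2 cores the Dev server has today
```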

Additionally, this is not the whole of our issue, as the performance hit can occur and persist even when no jobs are running, such that simply opening jobs or stages in a job, opening wizards, and even creating build packages are all extremely sluggish.

_________________
-Me
qt_ky

Posted: Thu Nov 30, 2017 3:45 pm

Perhaps you are on a virtual server and other virtual machines on the same hardware are hammering away. It could be thin-provisioned and overloaded.

_________________
Choose a job you love, and you will never have to work a day in your life. - Confucius
chulett

Posted: Thu Nov 30, 2017 6:26 pm

On the subject of cores versus nodes, don't forget that "cores" are a physical concept while "nodes" are a logical concept... basically worker threads in the O/S where you typically have no control over which ones run where. There's WAY more to it than that, of course, and I'm sure Ray has given a master class or twelve on the subject, but I wanted to put out there that there is no simple equation for the number of nodes any given system could support.

<tip-toes silently away from the keyboard>

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy
jackson.eyton

Posted: Fri Dec 01, 2017 8:44 am

qt_ky,
Yes, our server is a VM on a cluster, and our IT department is resistant to adding more cores to our VM, as that could decrease performance because the VM may have to wait for a higher number of cores to become available.

chulett,
..... O_o ....well hmm....

_________________
-Me
PaulVL

Posted: Fri Dec 01, 2017 9:36 am

Here's my two cents:

1) I dislike VMs for any ETL work. Too much politics with the back room server folks. You've paid a lot of money to license this software, it's used to push the data that your company feeds off of... get some hardware under it, not a virtual layer.

2) Ditch Windows and go to Unix for this ETL tool.

3) My phone has more cores than your company ETL tool host.

4) Do you have your xmeta running on the same host as your engine? You could farm that off onto a separate host. Thus freeing up CPU/Mem. I would keep the domain tier (WAS) with the engine.


Once you said you had two cores running on windows... there is little surprise that you are getting laggy performance once someone actually runs a job and person number 2 tries to do anything else.


Given that you have 2 cores... I am guessing that money is an issue for your dept. I would recommend getting a second host (since hosts are cheaper than software licensing for DataStage) and farming off all non-essential "Data Manipulation" activities to that. Like a zip, an SFTP, etc... have the data on a shared mount. That zip and FTP, placed upon a different host, will free up CPU and memory on your primary ETL host, so you can run more concurrent jobs. You should write a standard set of scripts to consistently farm off that work to the other box.
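
A minimal sketch of what one such script could look like, assuming the helper box is reachable over SSH, has `gzip` installed, and sees the same shared mount. The host name, path and function names below are made up for illustration.

```python
import shlex
import subprocess
from typing import List


def build_offload_command(helper_host: str, shared_file: str) -> List[str]:
    """Build an ssh command that compresses a file on the helper host."""
    # Quote the path so spaces or shell metacharacters survive the remote shell.
    remote_command = "gzip -f " + shlex.quote(shared_file)
    return ["ssh", helper_host, remote_command]


def offload_compress(helper_host: str, shared_file: str) -> None:
    """Run the compression remotely; raises CalledProcessError if it fails."""
    subprocess.run(build_offload_command(helper_host, shared_file), check=True)


# Show the command that would be run, without actually connecting anywhere.
print(build_offload_command("etl-helper", "/shared/exports/daily extract.csv"))
```

Keeping the command construction separate from the call that runs it makes the script easy to dry-run and to reuse for other offloaded steps such as an SFTP push.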
All times are GMT - 6 Hours