DS Grid Tool Kit

Posted: Fri Jun 26, 2015 11:07 am
by oacvb
We ran a test job and it fails when run in grid-enabled mode. The same job runs when grid is disabled. The job generates data from a Row Generator and writes it directly to a Peek stage. The error says the dynamically generated config file is not present, although the file is generated and available. We verified the mount points; they are correct and have the correct permissions. The same config file is used when running in grid-disabled mode, and it worked correctly. Any thoughts on this issue?

Posted: Sun Jun 28, 2015 6:42 am
by PaulVL
After installing the Grid Enablement Toolkit, you must always run the test.sh script located in $GRIDHOME.

Hopefully you bounced your engine as well so that it can add the LSF info.

Double hopefully you added the entries into your dsenv that you were supposed to.

Posted: Sun Jun 28, 2015 7:29 am
by oacvb
We have run test.sh but it failed.

Posted: Sun Jun 28, 2015 1:13 pm
by chulett
If you want people to be able to help you, you'll need to be more forthcoming with details of the issues you have. Saying "it failed" doesn't really help anyone help you... so failed how? What was the error message?

It would also be helpful to respond to all points brought up, for example by Paul here - did you in fact 'bounce your engine' and add 'the entries into your dsenv that you were supposed to'?

Posted: Mon Jun 29, 2015 8:37 am
by lstsaur
If you can't even get your test.sh script to run successfully, most likely the installation of the grid_enabled components is not configured correctly.

Posted: Mon Jun 29, 2015 9:31 am
by PaulVL
Or the gridjob dir is not actually shared to the other compute nodes.

Validate that you can even submit a Platform LSF job.


bsub -q your_queue "touch /tmp/hi_mom"

Then go see if that file shows up on the server it was supposed to run on.
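The check above could be scripted roughly as follows. This is only a sketch: QUEUE and MARKER are placeholder values (the queue name and file path are assumptions, not values from this thread), and it relies on `bsub -K`, which makes bsub wait for the job to finish before returning.

```shell
#!/bin/sh
# Sketch: submit a trivial LSF job that writes a marker file, then look for it.
# QUEUE and MARKER are placeholders (assumptions); set them for your own site.
QUEUE="${QUEUE:-your_queue}"
MARKER="${MARKER:-/tmp/hi_mom}"

submit_marker_job() {
    # Bail out early if the LSF client command is not on PATH.
    if ! command -v bsub >/dev/null 2>&1; then
        echo "bsub not found on PATH" >&2
        return 127
    fi
    # -K makes bsub wait until the job has finished before returning.
    bsub -K -q "$QUEUE" "touch $MARKER"
}

marker_present() {
    # Succeeds if the marker file exists on the host where this runs.
    [ -e "$MARKER" ]
}
```

Call `submit_marker_job` from the head node, then run `marker_present` on the host where the job executed. A marker under /tmp is local to that host, so checking from the head node only makes sense if MARKER points at a shared mount.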

Posted: Thu Jul 02, 2015 3:49 am
by oacvb
It is a single-node cluster, so we mounted all the required file systems using the virtual IP. The job creates the config file but cannot read it after creation. When we ran the failing cat command from the command line, we were able to read the file. So we have remounted the file systems using the server name. The intention is to have a two-node cluster, so these file systems will be moved to another node during failover. Please let me know if you need more details.

Posted: Thu Jul 02, 2015 4:51 am
by priyadarshikunal
Sounds like a High Availability solution in clustered mode, not a grid. You are basically trying to test grid on a single SMP box. Please correct me if I am wrong.

Please confirm whether you want an active/passive cluster or a grid.

Posted: Thu Jul 02, 2015 7:51 am
by oacvb
It is a grid environment with active - passive cluster.

Posted: Thu Jul 02, 2015 8:06 am
by PaulVL
Please engage your IBM Support. You've been down for far too long already.

Active / Passive is for head node only. Not grid compute nodes.

If you are trying to farm off a job onto the passive head node, then you may end up with missing binaries, depending on the way your active/passive failover mechanism works with mounts.


Go back to basics. Is Platform LSF working? Can you submit a bsub command and have it work?

If yes, then execute the test.sh script from within $GRIDHOME. If that doesn't work, then look at the grid job dir mount and ensure that the user ID you are using can write to it from the head node and the compute nodes. It must be a common mount point. Ensure that the DataStage engine is also exposed to the compute nodes.
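A minimal sketch of the write check on the grid job dir, assuming nothing about where it is mounted: the directory is passed in as an argument, and the function name `can_write_dir` is made up for illustration.

```shell
#!/bin/sh
# Sketch: confirm the current user can create and remove a file in a directory.
# The directory is an argument; no path here is taken from the thread.
can_write_dir() {
    dir="$1"
    # The mount point must exist and be a directory on this host.
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    probe="$dir/.write_probe_$$"
    # Try to create a probe file, then clean it up again.
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    echo "cannot write to $dir as $(id -un)" >&2
    return 1
}
```

Run it as the DataStage user on the head node and on each compute node, against the same mount point. If it succeeds on one host and fails on another, the mount is probably not truly common, or its permissions differ between hosts.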

But basically... go engage IBM Services to assist in your grid setup.