Performance issue

arsheshadri
Participant
Posts: 78
Joined: Wed Oct 26, 2005 6:12 am

Post by arsheshadri »

Hi,

We are trying to process 1 million records. The job structure is as follows.

Source1 - Flat file (1 million records) (left)
Source2 - Data set (2 million records) (used as ref)

Before the left outer join, we sort on the key column.

The 1 million output rows are passed to a Transformer for some mappings, and the Transformer output is written to another flat file.

This job has been running for almost 12 hours and has still not completed.

Now we cannot open Director, Designer, Administrator, or Manager; each gives the error "Project UV : The Connection timeout (81015)".

But when we checked the output file, DataStage still seems to be writing to the target file, as the file size and number of rows keep increasing.

What can we do to improve the performance?
Should we use merge instead of join?
Should we use Auto or Hash partitioning?

Do we have to restart DataStage to resolve the 81015 issue above?

Please suggest

Thanks & Regards
Sheshadri
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

It sounds like you have two problems - one is a long run for a job, and the other is no longer being able to login to DataStage from any client.

Is there still a process called "dsrpcd" running? If you do a "ps -ef | grep ds" could you post that output?
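
The dsrpcd check can be scripted. This is a minimal sketch, assuming `ps -ef` output is piped in on stdin; `count_dsrpcd` is a hypothetical helper name, not a DataStage tool. It skips the line for the grep command itself, which would otherwise look like a match.

```shell
#!/bin/sh
# Count dsrpcd daemon lines in `ps -ef` output read from stdin.
# Lines that belong to a grep command are ignored so the check does not
# count its own search process.
count_dsrpcd() {
    awk '/dsrpcd/ && !/grep/ { n++ } END { print n + 0 }'
}

# Usage (not executed here):
#   ps -ef | count_dsrpcd
```

A result of 0 would mean the daemon is not running; 1 is what a healthy single-engine host would normally show.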

Post by arsheshadri »

[quote="ArndW"]It sounds like you have two problems - one is a long run for a job, and the other is no longer being able to login to DataStage from any client.

Is there still a process called "dsrpcd" running? If ...[/quote]

Thanks ArndW,

I cannot see your entire message, but the answer regarding dsrpcd is yes; there is a process running:

ps -ef|grep -e dsrpcd
root 253966 1 0 Feb 15 - 0:00 /DataStage/product/Ascential/DataStage/DSEngine/bin/dsrpcd

Thanks & Regards
Sheshadri

Post by ArndW »

What is the output of "ps -ef | grep ds"?

Post by arsheshadri »

[quote="ArndW"]What is the output of "ps -ef | grep ds"? ...[/quote]

There are around 195 entries; some of them look like this:

dsadm 495660 253966 0 Mar 20 - 0:00 dscs 4 0 0
dsadm 634960 733188 0 Mar 20 - 0:01 dsapi_slave 7 6 0
dsadm 733188 253966 0 Mar 20 - 0:00 dscs 4 0 0
dsadm 1147030 253966 0 Mar 20 - 0:00 dscs 4 0 0
dsadm 1208396 1147030 0 Mar 20 - 0:05 dsapi_slave 7 6 0
dsadm 1384512 495660 0 Mar 20 - 0:14 dsapi_slave 7 6 0
dsadm 1966120 1716394 0 11:33:10 - 0:00 /DataStage/product/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag gb02qas53tefxx7 10004 0 30 node0 gb02qas53tefxx7 1206012772.257362.ee012
dsadm 2007134 1695892 1 0:00 <defunct>
dsadm 2023488 1716394 0 11:33:10 - 0:00 /DataStage/product/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag gb02qas53tefxx7 10004 0 30 node0 gb02qas53tefxx7 1206012772.257362.ee012
dsadm 2035886 1573114 1 0:00 <defunct>
dsadm 2052254 1941522 0 11:33:00 - 0:00 /DataStage/product/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag gb02qas53tefxx7 10004 0 30 node0 gb02qas53tefxx7 1206012772.257362.ee012
dsadm 2080826 1695892 0 0:00 <defunct>

Thanks & Regards
Sheshadri

Post by ArndW »

Those "defunct" processes shouldn't be there anymore. What does "ps -ef | grep 1695892" show? The PX jobs aren't using any system time, either.

The processes show that you still have a couple of client tool sessions open on the system, though.

If you still cannot login to any client tool it might be best to restart datastage. Do you have administrator privileges and is this an installation which can be restarted as the DataStage administrator?
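
A defunct (zombie) entry cannot be killed directly; it disappears only once its parent reaps it or exits, which is why the parent PID matters here. A minimal sketch for listing those parents, assuming the `ps -ef` column order seen in this thread (UID, PID, PPID, ...); `defunct_parents` is a hypothetical helper name:

```shell
#!/bin/sh
# Read `ps -ef` output from stdin and print "PPID count" for every parent
# that owns at least one <defunct> child, sorted by parent PID.
defunct_parents() {
    awk '/<defunct>/ { c[$3]++ } END { for (p in c) print p, c[p] }' | sort -n
}

# Usage (not executed here):
#   ps -ef | defunct_parents
```

Each PID it prints is a candidate for the follow-up "ps -ef | grep {pid}" check.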

Post by arsheshadri »

[quote="ArndW"]Those "defunct" processes shouldn't be there anymore. What does "ps -ef | grep 1695892" show? The PX jobs aren't using any system time, either.

The processes show that you still have a couple of client tool sessions open on the system, though.

If you still cannot login to any client tool it might be best to restart datastage. Do you have administrator privileges and is this an installation which can be restarted as the DataStage administrator?[/quote]

Hi,

The grep command above shows the following entries:
dsadm 1695892 1941522 0 11:32:59 - 0:01 /DataStage/product/Ascential/DataStage/PXEngine/bin/osh -APT_PMsectionLeaderFlag gb02qas53tefxx7 10004 0 30 node0 gb02qas53tefxx7 1206012772.257362.ee012
dsadm 2007134 1695892 1 0:00 <defunct>
dsadm 2080826 1695892 0 0:00 <defunct>

Yes, I have admin privileges. Can you please tell me the procedure to terminate active sessions and restart the server? I think nobody is using DataStage currently.

By the way, we have found out why it was taking so long: some statements we were using in the Transformer. Earlier it was processing 8 rows per second. After changing the statements, that has increased to 6000 rows per second. :D
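
For scale, here is a rough calculation of what those two rates mean for the 1 million rows from the first post. It assumes a constant rate, which a real job will not hold exactly; `eta_seconds` is just an illustrative helper name.

```shell
#!/bin/sh
# Estimated run time in whole seconds (rounded up) for a row count and a
# constant rows-per-second rate.
eta_seconds() {
    rows=$1
    rate=$2
    echo $(( (rows + rate - 1) / rate ))
}

eta_seconds 1000000 8      # prints 125000 (about 34.7 hours)
eta_seconds 1000000 6000   # prints 167 (under 3 minutes)
```

At 8 rows per second the job could not have finished within 12 hours, which matches what we saw.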

Thanks & Regards
Sheshadri

Post by ArndW »

It looks like process 1695892 might be hung. Do a "kill 1695892" and see if the defunct processes go away and if you can login to the client tools.

Post by arsheshadri »

[quote="ArndW"]It looks like process 1695892 might be hung. Do a "kill 1695892" and see if the defunct processes go away and if you can login to the client tools. ...[/quote]

I killed the process, but I am still getting the same connection timeout error. Can I restart the server using the commands below?

$DSHOME/bin/uv -admin -stop

$DSHOME/bin/uv -admin -start

Thanks & Regards
Sheshadri

Post by ArndW »

Yes, but you need to wait a bit of time between the stop and start and make sure that all processes have terminated.
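
The wait between stop and start can be a polling loop rather than a fixed pause. This is only a sketch: `wait_until_gone`, the `POLL_SECONDS` variable, and the optional third argument (a command that prints the process list) are all hypothetical names, and the pattern uses process names seen earlier in this thread.

```shell
#!/bin/sh
# Poll the process list until nothing matches the extended regex in $1.
# $2 is the maximum number of polls; $3 (optional) is the command that
# prints the process list, defaulting to "ps -ef".
# Returns 0 once no match remains, 1 if matches are still present at the end.
wait_until_gone() {
    pattern=$1
    max_tries=$2
    list_cmd=${3:-"ps -ef"}
    i=0
    while [ "$i" -lt "$max_tries" ]; do
        if ! $list_cmd | grep -v grep | grep -Eq "$pattern"; then
            return 0
        fi
        sleep "${POLL_SECONDS:-5}"
        i=$((i + 1))
    done
    return 1
}

# Usage (not executed here):
#   $DSHOME/bin/uv -admin -stop
#   wait_until_gone 'dsrpcd|dsapi_slave|dscs' 12 && $DSHOME/bin/uv -admin -start
```

If the function returns 1, something is still running and the start should be held back until that is sorted out.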

Post by arsheshadri »

[quote="ArndW"]Yes, but you need to wait a bit of time between the stop and start and make sure that all processes have terminated. ...[/quote]

I am getting the error below while stopping the server:

./bin/uv -admin -stop
Unable to remove the following shared memory segment(s) during shutdown:
m 3145730 0xadec7512 --rw-rw-rw- root system 1364054 635124
m 84934708 0xadee7512 --rw-rw-rw- root system 1765440 635124
Stopping JobMonApp
JobMonApp has not been started from: /DataStage/product/Ascential/DataStage/PXEngine
2 error(s) encountered during shutdown procedure.
DataStage Engine 7.5.1.2 instance "ade" may be in an inconsistent state.
gb02qas53tefxx7[/DataStage/product/Ascential/DataStage/DSEngine]$

Thanks & Regards
Sheshadri

Post by ArndW »

Perhaps someone used "kill -9"? Are there any DataStage processes left? If you start DataStage you should see a "dsrpcd" started as well.

Post by arsheshadri »

[quote="ArndW"]Perhaps someone used "kill -9"? Are there any DataStage processes left? If you start DataStage you should see a "dsrpcd" started as well. ...[/quote]

I think it did not stop the server properly. Even after giving the stop command, I could still see dsrpcd.

ps -ef|grep -e dsrpcd
root 253966 1 0 Feb 15 - 0:00 /DataStage/product/Ascential/DataStage/DSEngine/bin/dsrpcd

While starting, I answered "y" at the prompt:

./bin/uv -admin -start
Starting JobMonApp
override mode 664 on /DataStage/product/Ascential/DataStage/PXEngine/java/JobMonApp.log.DUMMYVALUE? y
JobMonApp has been started.

But when I ran dsjob -lprojects, it returned status 81015.

Thanks & Regards
Sheshadri

Post by ArndW »

Stop it again. Is the dsrpcd still running? (If yes, you can't start again anyway.) Use "kill {pid}" to stop it. Enter the command "ipcs -a | grep 0xade" and there should no longer be any segments shown.
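
To see exactly which segments are left, the ipcs output can be filtered by key prefix. A minimal sketch, assuming the line layout from the shutdown message earlier in the thread (type, ID, key, ...); column order differs between platforms, so check it against your own ipcs output first. `segments_with_key` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `ipcs -a` style lines from stdin and print the ID of every shared
# memory segment (type "m") whose key starts with the prefix given in $1.
segments_with_key() {
    awk -v prefix="$1" '$1 == "m" && index($3, prefix) == 1 { print $2 }'
}

# Usage (not executed here):
#   ipcs -a | segments_with_key 0xade
```

No output means no segments with that key prefix remain.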

Post by arsheshadri »

[quote="ArndW"]stop it again. Is the dsrpcd still running? (If yes, you can't start again anyway). Use "kill {pid}" to stop it. Enter the command "ipcs -a | grep 0xade" and there should no longer be any segments sho ...[/quote]

I don't have root privileges, only DataStage admin privileges, so I am not able to kill dsrpcd. Is there a way to stop or kill it from $DSHOME, or through any other method?
Thanks & Regards
Shesha