Issue with Dataset performance in RedHat Linux
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 12
- Joined: Fri Jun 13, 2008 2:51 pm
We have a very similar issue in our shop... writing to or reading from datasets seems to take forever.
As an extreme example, I had a job that read 145 rows from a sequential file and wrote them to a dataset.
We use an 8-node config file and run DS 8.5.
seqFile------------------>Transformer-------------------->Dataset
2 columns on seq file
COL1 VARCHAR(4)
COL2 VARCHAR(4)
Transformer
Does nothing. I did not write this job, or it would not be there.
2 columns on Dataset
COL1 VARCHAR(4)
COL2 VARCHAR(4)
Here are the job start and end times from Director. Not kidding, either...
Starts at 2015-12-19 11:26:19 AM
Ends at 2015-12-19 04:01:25 PM
No warnings.
Nothing captured/ignored in a message handler at the job or project level.
On our older Solaris server, the job ran fine in seconds.
I also tried this variant:
RowGenerator------------------>Transformer-------------------->Dataset
It took seconds to run for 145 rows.
Some other info...
In July 2015, we migrated from a Solaris server to a server running Red Hat Linux 2.6.32-431.5.1.el6.x86_64.
When I ran the RowGenerator job, it was during a quiet time on our production server, so this may explain why the job ran fast.
When the original job ran, it was during the busy time on the production server.
Could Linux be trying to implement some sort of resource management when the box is busy?
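For scale, the Director timestamps quoted in this post can be turned into an elapsed time and a rough per-row cost. This is only a small arithmetic sketch; the timestamps and the row count are copied from the post, and the variable names are made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the Director start/end times quoted in the post.
FMT = "%Y-%m-%d %I:%M:%S %p"
start = datetime.strptime("2015-12-19 11:26:19 AM", FMT)
end = datetime.strptime("2015-12-19 04:01:25 PM", FMT)

elapsed = end - start
rows = 145  # row count stated in the post

print(elapsed)                                   # 4:35:06
print(round(elapsed.total_seconds() / rows, 1))  # 113.8 seconds per row
```

Roughly 114 seconds per row for two VARCHAR(4) columns points away from data volume as the cause.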
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
-
- Premium Member
- Posts: 12
- Joined: Fri Jun 13, 2008 2:51 pm
Ray,
It has happened every week since we moved to Red Hat Linux. Other jobs are experiencing the same performance issue; the one I wrote about is an extreme example.
I am not sure how to tell how big each segment is...
Is this info from Dataset Management useful?
##I IIS-DSEE-TFCN-00001 08:47:18(000) <main_program>
IBM WebSphere DataStage Enterprise Edition 8.5.0.6152
Copyright (c) 2001, 2005-2008 IBM Corporation. All rights reserved
##I IIS-DSEE-TUTL-00031 08:47:18(001) <main_program> The open files limit is 16384; raising to 32768.
##I IIS-DSEE-TFCN-00006 08:47:18(002) <main_program> conductor uname: -s=Linux; -r=2.6.32-431.3.1.el6.x86_64; -v=#1 SMP Fri Dec 13 06:58:20 EST 2013; -n=EC24LP4060; -m=x86_64
##I IIS-DSEE-TFSC-00001 08:47:19(000) <main_program> APT configuration file: /disk/temp/datastage/ADW_GST_AUDIT/TMPDIR/aptoa6660cc0f71fc
##I IIS-DSEE-TOIX-00059 08:47:19(000) <APT_RealFileExportOperator in APT_FileExportOperator,0> Export complete; 101 records exported successfully, 0 rejected.
Name: /disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds
Version: ORCHESTRATE V8.5.0 DM Block Format 6.
Time of Creation: 12/19/2015 14:30:39
Number of Partitions: 8
Number of Segments: 1
Valid Segments: 1
Preserve Partitioning: false
Segment Creation Time:
0: 12/19/2015 14:30:39
Partition 0
node : node1
records: 19
blocks : 1
bytes : 168
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0000.0000.aea.d83efb5f.0000.0527aae4 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0000.0001.aea.d83efb5f.0001.ef31cece 0 bytes
total : 131072 bytes
Partition 1
node : node2
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0001.0000.aea.d83efb5f.0002.6e1bf330 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0001.0001.aea.d83efb5f.0003.0f1ca6f1 0 bytes
total : 131072 bytes
Partition 2
node : node3
records: 18
blocks : 1
bytes : 158
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0002.0000.aea.d83efb5f.0004.0b748e6b 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0002.0001.aea.d83efb5f.0005.287499f0 0 bytes
total : 131072 bytes
Partition 3
node : node4
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0003.0000.aea.d83efb5f.0006.d8c1bf43 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0003.0001.aea.d83efb5f.0007.b0f249f6 0 bytes
total : 131072 bytes
Partition 4
node : node5
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0004.0000.aea.d83efb5f.0008.fbbfcb0d 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0004.0001.aea.d83efb5f.0009.432e22ca 0 bytes
total : 131072 bytes
Partition 5
node : node6
records: 18
blocks : 1
bytes : 158
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0005.0000.aea.d83efb5f.000a.8edb13b2 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0005.0001.aea.d83efb5f.000b.7505a734 0 bytes
total : 131072 bytes
Partition 6
node : node7
records: 18
blocks : 1
bytes : 162
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0006.0000.aea.d83efb5f.000c.3e74c98b 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0006.0001.aea.d83efb5f.000d.be714afc 0 bytes
total : 131072 bytes
Partition 7
node : node8
records: 18
blocks : 1
bytes : 160
files :
Segment 0 :
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0007.0000.aea.d83efb5f.000e.48465cc3 131072 bytes
/disk/data/datastage/ADW_DWMA/GST_AUDIT/Datasets/LkupFoundInADWorg.ds.iwadwp.ec24lp4060.0000.0007.0001.aea.d83efb5f.000f.0a10ea2d 0 bytes
total : 131072 bytes
Totals:
records : 145
blocks : 8
bytes : 1286
filesize: 1048576
min part: 131072
max part: 131072
Schema:
record
( ADW_Office_Value: string;
ORG_office: string;
)
##I IIS-DSEE-TFSC-00010 08:47:19(001) <main_program> Step execution finished with status = OK.
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
Yes, your segment files are minimally sized (128KB each). Therefore that's not the problem. Data are moved to/from data sets in units of not less than 32KB, so you should be seeing very few I/O operations.
Time to involve your official support provider, methinks.
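The sizing argument in this reply can be checked against the Dataset Management listing. The sketch below only redoes the arithmetic; every figure is copied from the listing or from the reply (the 32KB minimum transfer unit), and the variable names are illustrative.

```python
# Per-partition figures copied from the Dataset Management listing.
records = [19, 18, 18, 18, 18, 18, 18, 18]
payload_bytes = [168, 160, 158, 160, 160, 158, 162, 160]
segment_file_bytes = 131072          # size of each non-empty segment file
min_transfer_bytes = 32 * 1024       # minimum I/O unit quoted in the reply

total_file_bytes = len(records) * segment_file_bytes

print(sum(records))                              # 145, matches "records"
print(sum(payload_bytes))                        # 1286, matches "bytes"
print(segment_file_bytes // 1024)                # 128 (KB per segment file)
print(total_file_bytes)                          # 1048576, matches "filesize"
print(segment_file_bytes // min_transfer_bytes)  # 4 units of 32KB fit in one file
```

With about 1.3KB of actual data spread over eight 128KB files, the number of I/O operations needed is tiny, which is why the reply rules out segment size as the cause.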
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
-
- Premium Member
- Posts: 12
- Joined: Fri Jun 13, 2008 2:51 pm
Do you know what filesystem is used on "/disk/data/datastage/" and does that reside on a SAN or mounted/remote disk?
-
- Premium Member
- Posts: 12
- Joined: Fri Jun 13, 2008 2:51 pm
Update...
We have just upgraded to 11.5. We were on 8.5 without support, as we were a year past our license expiry date.
We contacted IBM as soon as we were on 11.5. Here is what we were told...
Datasets make use of a Linux system call named fsync. As a test, IBM told our support area how to disable calls to fsync. Jobs then ran in seconds without fail.
Sadly, we cannot disable fsync permanently, but this proved the issue is not with DataStage itself but rather with our setup...
I did not know it at the time, but we are also using VMware over top of Linux... VMware is interfering with calls to fsync.
Upshot... we will soon move to hardware running Linux but without VMware.
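One way to see whether fsync is slow on a given filesystem, independent of DataStage, is to time it directly. The following is a minimal sketch, not the test IBM suggested: the function name `time_fsync` and its parameters are made up for illustration, and the directory to probe (for example, the one holding the dataset segment files) is an assumption you would supply yourself.

```python
import os
import tempfile
import time

def time_fsync(directory, writes=20, size=4096):
    """Return per-call fsync latencies (seconds) for small writes in `directory`."""
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            os.write(fd, b"\0" * size)
            t0 = time.perf_counter()
            os.fsync(fd)  # ask the OS to flush this file to stable storage
            latencies.append(time.perf_counter() - t0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies

if __name__ == "__main__":
    # The system temp directory is used here only as a placeholder target.
    lat = time_fsync(tempfile.gettempdir())
    print(f"max {max(lat) * 1000:.2f} ms, mean {sum(lat) / len(lat) * 1000:.2f} ms")
```

Running it during both quiet and busy periods would show whether fsync latency tracks the load on the box, as the symptoms in this thread suggest.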
-
- Premium Member
- Posts: 12
- Joined: Fri Jun 13, 2008 2:51 pm
Apologies for reviving this thread...
Sorry, no case number from IBM, only because I am not in the support group that would deal with them.
I asked my support area how they fixed it, and this is what I was told:
The group that maintains our DataStage added this line to the dsenv file:
APT_DATASET_FLUSH_NOSYNC=1; export APT_DATASET_FLUSH_NOSYNC
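To confirm that a dsenv file actually carries this setting, a small check along these lines may help. It is a rough sketch that handles only simple `NAME=value` and `export NAME=value` lines; the helper name `dsenv_value` is hypothetical, and you would read the text from wherever your dsenv file lives.

```python
import re

def dsenv_value(text, name):
    """Return the last value assigned to `name` in dsenv-style shell text, or None."""
    pattern = re.compile(r"^\s*(?:export\s+)?" + re.escape(name) + r"=([^;\s#]*)")
    value = None
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out settings
        match = pattern.match(line)
        if match:
            value = match.group(1).strip("'\"")
    return value

# The line quoted in the post, used here as sample input.
sample = "APT_DATASET_FLUSH_NOSYNC=1; export APT_DATASET_FLUSH_NOSYNC\n"
print(dsenv_value(sample, "APT_DATASET_FLUSH_NOSYNC"))  # 1
```

Changes to dsenv are normally picked up only after the engine is restarted, so a check of the file alone does not prove that running jobs see the variable.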