Grasping at straws: Unix-mainframe FTP performance

A forum for discussing DataStage® basics. If you're not sure where your question goes, start here.

FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

I'm hoping someone has run into something similar and can share details here. The situation:

We have jobs that loop through FTP sessions, one at a time, to get mainframe datasets and land them on our DataStage server. Tracing indicates that ACK is delayed, with only 3 or 4 data packets being transferred at a time. FTP settings are generic: one multiple-instance parallel job is reused in each loop, consisting of an FTP Enterprise input to a Sequential File output. Transfer mode is binary, with EBCDIC maintained at both ends. Logon and data port connections are confirmed at the start of each session, and the only error message we see is

    error in readCommFifo or ftp command function.

Volume of data doesn't seem to be a factor. Datasets in the hundreds or low thousands of records take 2 to 5 minutes to complete. The job that made us notice the situation averages around 20,000 records and runs for 45 minutes or more. Oddly, and irritatingly, one of the largest files, at over 100,000 records, runs in under 10 seconds. In case it's relevant, average record length is about 250 bytes.
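As a rough sanity check on those numbers, the effective throughput of the two cases can be worked out from the record counts, the run times, and the 250-byte average record length quoted above. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the figures are the approximate ones from the post (10 seconds is an upper bound on the fast run, so its throughput is a lower bound).

```python
RECORD_BYTES = 250  # average record length quoted in the post


def throughput_bytes_per_sec(records: int, seconds: float,
                             record_bytes: int = RECORD_BYTES) -> float:
    """Approximate effective throughput for one transfer."""
    return records * record_bytes / seconds


# The job that runs 45 minutes for about 20,000 records.
slow = throughput_bytes_per_sec(20_000, 45 * 60)

# The large file: over 100,000 records in under 10 seconds.
fast = throughput_bytes_per_sec(100_000, 10)

print(f"slow job: {slow:,.0f} B/s")
print(f"fast job: {fast:,.0f} B/s")
print(f"ratio:    {fast / slow:,.0f}x")
```

On these figures the slow job moves under 2 KB per second while the fast one moves at least 2.5 MB per second, which supports the observation that volume is not the limiting factor.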

I've disclosed as much detail as I'm allowed to here. We will rerun our trace from both sides concurrently next week to see if the Unix process is at fault. For now we just have messaging from the mainframe server side.
Last edited by FranklinE on Fri Sep 15, 2017 2:40 pm, edited 1 time in total.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596
Using CFF FAQ: viewtopic.php?t=157872
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Is the mainframe host target consistent? Meaning that you do not target a different sysplex for a given FTP.

Is there a load balancer on the MF side that would farm you off to a given ftp box on that side?

Is there a gateway between you and the MF?

That traceroute will be handy.

How does the speed look when you manually get the file with those same credentials?
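One way to take DataStage out of the picture for such a manual test is a small script that does a plain binary GET and times it. The sketch below uses Python's standard `ftplib`; the host, credentials, and dataset name are placeholders, and `timed_fetch` and `fetch_dataset` are illustrative names rather than anything from the thread.

```python
import time
from ftplib import FTP


def timed_fetch(retrieve):
    """Run retrieve(callback) and return (bytes_received, elapsed_seconds)."""
    total = 0

    def on_chunk(chunk: bytes) -> None:
        nonlocal total
        total += len(chunk)

    start = time.perf_counter()
    retrieve(on_chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def fetch_dataset(host: str, user: str, password: str, dataset: str):
    """Binary-mode GET of one dataset; every argument is a placeholder."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        # retrbinary switches to binary (TYPE I) before issuing RETR.
        return timed_fetch(lambda cb: ftp.retrbinary(f"RETR {dataset}", cb))


# Hypothetical usage (not run here, since it needs a reachable FTP server):
# size, secs = fetch_dataset("mf.example.com", "user", "secret", "'HLQ.SOME.DATASET'")
# print(f"{size} bytes in {secs:.1f} s")
```

Comparing the elapsed time here with the job's run time for the same dataset would show whether the delay follows the FTP session itself or the job.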
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Paul,

Single server/gateway, with dynamic load balancing across three LPARs.

EDIT: server configuration limits number of data ports in use, and has disabled the active/passive command attribute. I've confirmed that this is not a data port issue.

There's no gateway (or firewall) between the Unix server and the MF.

Due to security constraints, I can't do manual testing with the same credentials, though if other diagnostics fail to provide a root cause, I can engage resources for that.

I failed to mention, and will update the main post, that we are using top-line IBM hosts running z/OS.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Were the datasets that you tried to retrieve migrated? I know the MF sometimes does that and you have to wait for the file to be retrieved before it can be used. That would account for a speed issue.
PaulVL
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

I dislike posting about my limited MF knowledge... makes me feel dirty. I wanted to depart that life and never look back...

(I used to be a MF support person as a contractor at IBM in Rochester MN.)

Debugging PL1 programs that had GOTO logic for their conditional branching... uggg...
qt_ky
Premium Member
Posts: 2895
Joined: Wed Aug 03, 2011 6:16 am
Location: USA

Post by qt_ky »

You might be able to prove in which system the problem exists by ftp'ing the same files from one DataStage server to another, then back again. Based on those timings and comparisons it may help everyone be sure where to focus.

I'm no mainframe expert but I have seen some cases where a large file we are ftp'ing comes off of mainframe tape instead of disk. You might also check if that is a factor coming into play.
Choose a job you love, and you will never have to work a day in your life. - Confucius
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Updates on things learned:

Paul, the datasets are all newly created in the current cycle, so migration is not an issue. In my experience it would not cause this sort of performance loss anyway.

qt_ky: Interesting suggestion, but it can't fit, because MF-Unix connectivity is very different from local network connectivity. No tape drives are involved.

The ACK slow response is a symptom. They're investigating further, but the real comparison information will come when we have concurrent traces running on both sides. FYI this FTP session is initiated from the DataStage server to the host FTP server. The host handles all aspects of the session, from id auth to completion of the transfer. The host server is not part of the network.

Thanks for the responses so far, the concurrent trace will hopefully run tomorrow (Tuesday) night.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

Are you using an external FTP command or the FTP Stage? The FTP Stage processes one record at a time, with an acknowledgement for each record, which could be making the process worse if you have slow ACK.
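To see what a per-record acknowledgement would imply, here is a toy model in which every record waits for one fixed delay, so total time is simply records times delay. It is only an illustration using the figures from the first post, not a description of how either stage actually paces its transfers, and the function names are made up.

```python
def implied_delay_per_record(records: int, seconds: float) -> float:
    """Average time spent per record, given a total run time."""
    return seconds / records


def stop_and_wait_seconds(records: int, ack_delay: float) -> float:
    """Total time if every record waits one ack_delay (toy model)."""
    return records * ack_delay


# About 20,000 records in 45 minutes.
slow_delay = implied_delay_per_record(20_000, 45 * 60)

# Over 100,000 records in under 10 seconds.
fast_delay = implied_delay_per_record(100_000, 10)

print(f"slow job: {slow_delay * 1000:.1f} ms per record")
print(f"fast job: {fast_delay * 1000:.1f} ms per record")
```

The slow job works out to roughly 135 ms per record, which is in the range of a network-level delay rather than disk or CPU time, while the fast file averages about 0.1 ms per record.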
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Andy,

We use FTP Enterprise exclusively. One clarifying item not posted previously: all transfers are binary, with a single undefined-length Binary column as the table definition on input and output. Downloaded files are read by their main processing jobs using CFF, with the specific CFD loaded for the dataset being used. All downloads preserve the physical storage format of the MF datasets, which have cataloged definitions for lrecl and block size. RCP is disabled for every job.

Your question prompts me to "think out loud" here. My configuration implies a single continuous stream of data, yet the Director logs clearly show a distinct number of rows on the Sequential File output. Every transfer shows an lrecl that conforms to the lrecl in the dataset catalog.

In short, despite my desire to download data in a continuous stream, it is being read using the CFD.
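A quick way to check a landed file against the cataloged lrecl outside DataStage is to confirm that its size is an exact multiple of the record length and to slice it into records. The sketch below assumes fixed-length records and uses Python's `cp037` codec as a stand-in for whatever EBCDIC code page the datasets really use; the function name and sample data are made up, and packed or binary fields would not decode as text.

```python
from typing import Iterator


def split_fixed_records(data: bytes, lrecl: int) -> Iterator[bytes]:
    """Yield consecutive lrecl-byte records from a binary download."""
    if lrecl <= 0:
        raise ValueError("lrecl must be positive")
    if len(data) % lrecl:
        raise ValueError(
            f"{len(data)} bytes is not a multiple of lrecl {lrecl}"
        )
    for offset in range(0, len(data), lrecl):
        yield data[offset:offset + lrecl]


# Two made-up 10-byte records, encoded as EBCDIC (code page 037).
sample = ("HELLO".ljust(10) + "WORLD".ljust(10)).encode("cp037")

records = list(split_fixed_records(sample, 10))
print(len(records))
print(records[0].decode("cp037").rstrip())
```

For a real file, `data` would come from reading the landed file in binary mode, and `lrecl` from the dataset catalog.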

I'll do more research to determine if the performance hit coincides with our migration from v8.7 to v11.5. I didn't consider that before.
FranklinE
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Final Update:

The performance changed 4 months after our migration from 8.7 to 11.5. This points to an internal "change" not associated with DataStage.

Sniffer trace on the Linux side of the FTP session did not reveal anything specific. My support team is proceeding with low-level investigation of all of our DS servers. We have issues with Oracle connection performance, not on the server for my apps, and no determination has yet been made if there is any correlation with the FTP issue.

I'm grateful for the comments so far. I'm marking this thread resolved. I don't expect any new information for several days or possibly a few weeks. We don't have much yet to establish any sort of pattern.