Tsort vs. Syncsort


bigpoppa
Participant
Posts: 190
Joined: Fri Feb 28, 2003 11:39 am


Post by bigpoppa »

All-
I used to believe that tsort was faster than syncsort most of the time, but after the recent posts I'm wondering which is truly faster.

Maybe this community can come up with a heuristic for determining when to use each sort. To do this, I propose that PX users post sort statistics, and then we can analyze the stats and come up with some generalizations.

Here are the stats to post:

1. Type of Sort (sync or t):
2. Type of file in (dataset, flat file, etc):
3. Type of file out (dataset, flat file, etc):
4. Version of PX:
5. Version, flavor of UNIX:
6. GB of scratch disk:
7. GB of swap space:
8. # CPUs:
9. # Partitions:
10. Amount of time it took for the sort to run:

And any other stat you think would be useful in this analysis.
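
To make the reports easy to compare, the ten stats above could be captured in one record per sort run. This is only a rough sketch: the field names, the `SortReport` class, and the `mean_elapsed_by_sort` helper are my own invention for illustration, and the sample values are placeholders, not real measurements.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class SortReport:
    """One posted sort run; fields follow the numbered list above."""
    sort_type: str          # 1. "tsort" or "syncsort"
    input_type: str         # 2. dataset, flat file, etc.
    output_type: str        # 3. dataset, flat file, etc.
    px_version: str         # 4. version of PX
    unix_flavor: str        # 5. version and flavor of UNIX
    scratch_gb: float       # 6. GB of scratch disk
    swap_gb: float          # 7. GB of swap space
    cpus: int               # 8. number of CPUs
    partitions: int         # 9. number of partitions
    elapsed_seconds: float  # 10. time the sort took to run


def mean_elapsed_by_sort(reports):
    """Return {sort_type: mean elapsed seconds} over the given reports."""
    grouped = defaultdict(list)
    for report in reports:
        grouped[report.sort_type].append(report.elapsed_seconds)
    return {sort_type: mean(times) for sort_type, times in grouped.items()}


# Placeholder values only, NOT real measurements from any system.
example_reports = [
    SortReport("tsort", "dataset", "dataset", "x.y", "unix-a", 10.0, 4.0, 4, 4, 100.0),
    SortReport("tsort", "flat file", "dataset", "x.y", "unix-a", 10.0, 4.0, 4, 4, 200.0),
    SortReport("syncsort", "flat file", "flat file", "x.y", "unix-a", 10.0, 4.0, 4, 1, 120.0),
]

print(mean_elapsed_by_sort(example_reports))
```

A plain average per sort type is only a starting point. Any real heuristic would need to group by input type, partition count, and hardware as well, since those vary between posters.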

Thanks,
BP
clshore
Charter Member
Posts: 115
Joined: Tue Oct 21, 2003 11:45 am

Post by clshore »

Pardon my provincialism. I'm sure that the tsort you refer to is not the ancient UNIX utility tsort (run 'man tsort' from the command line), which performs a topological sort on partial orderings.

So I assume that it's the Orch tsort you refer to?

Carter
Teej
Participant
Posts: 677
Joined: Fri Aug 08, 2003 9:26 am
Location: USA

Post by Teej »

Yes. If you look at the OSH code generated by any job that has a Sort in it, or that uses a stage that automatically invokes a sort, you will see the command starting with TSORT(...

However, I do not have any performance metrics with SyncSort or CoSort.

-T.J.
Developer of DataStage Parallel Engine (Orchestrate).