Hi all,
A sequential file can be read in parallel by using options such as the number of readers. Are there similar options for WRITING too? I am referring to a single file, not a file pattern. I'd appreciate your thoughts.
possible ways to write a Sequential File stage in PARALLEL?
You can split the file into X chunks, named with the file name plus a chunk number, and open each one in its own reader. Then write X chunks again and use cat or something similar to reassemble them. For writing, you can also write a dataset and use orchadmin to convert it to a flat file at the end; for reading, I don't see a way around splitting the file externally. Is reading the file the actual bottleneck? How long does it take a dumb job to read the file and write it to another file under another name, with no processing and no implicit data conversions, just a pass-through RCP job?
That would be the longer answer... write the file out in chunks and then cat the chunks back together once they are all complete. However, I've never personally seen a situation where doing that made sense, never mind the (albeit temporary) need for twice the amount of disk space.
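For illustration only, the chunk-and-cat idea can be sketched outside DataStage in Python. This is a minimal sketch, not DataStage code: the function names, the `.000`-style chunk suffix, and the round-robin split are all assumptions made for the example.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def write_chunk(path, records):
    """Write one partition's records to its own chunk file."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(record + "\n")
    return path


def write_in_chunks(records, target, n_chunks=4):
    """Write a list of records as n_chunks files in parallel, then join them."""
    # Round-robin partitioning (an assumption): record i goes to chunk i % n_chunks.
    partitions = [records[i::n_chunks] for i in range(n_chunks)]
    # Hypothetical naming scheme: target file name plus a chunk number.
    chunk_paths = [f"{target}.{i:03d}" for i in range(n_chunks)]

    # Each chunk file has exactly one writer, so no file is shared.
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        list(pool.map(write_chunk, chunk_paths, partitions))

    # Same effect as `cat target.000 target.001 ... > target`.
    with open(target, "wb") as out:
        for path in chunk_paths:
            with open(path, "rb") as chunk:
                shutil.copyfileobj(chunk, out)
            os.remove(path)  # chunks are only needed until the join is done
    return target
```

Until the chunks are removed, the data exists on disk twice, which is the temporary space cost mentioned above. The final file holds the records chunk by chunk, so the original record order is not preserved.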
FWIW, the question is strictly about writing in parallel.
-craig
"You can never have too many knives" -- Logan Nine Fingers
You're fighting the nature of the beast... it's called a sequential file for a reason. They can support multiple readers but always support only a single writer, which is why you're not going to find any such option built into DataStage.
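One way to picture the single-writer constraint, as a generic Python sketch rather than anything DataStage does internally: several workers produce records in parallel, but they all hand them to one writer that owns the file. Every name here (`producer`, `single_writer`, `run`, the record format) is hypothetical.

```python
import queue
import threading

SENTINEL = None  # each producer sends this once to signal it is done


def producer(q, worker_id, n_records):
    """Generate records in parallel; never touches the output file."""
    for i in range(n_records):
        q.put(f"worker{worker_id},record{i}")
    q.put(SENTINEL)


def single_writer(q, path, n_producers):
    """The only code that writes the file; runs until every producer is done."""
    finished = 0
    with open(path, "w", encoding="utf-8") as f:
        while finished < n_producers:
            item = q.get()
            if item is SENTINEL:
                finished += 1
            else:
                f.write(item + "\n")


def run(path, n_producers=3, n_records=5):
    q = queue.Queue()
    writer = threading.Thread(target=single_writer, args=(q, path, n_producers))
    workers = [
        threading.Thread(target=producer, args=(q, w, n_records))
        for w in range(n_producers)
    ]
    writer.start()
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    writer.join()
```

The upstream work runs in parallel, but the file itself still sees one writer, so the write step remains the serial part. The interleaving of records from different workers can differ from run to run.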
-craig
"You can never have too many knives" -- Logan Nine Fingers
Although your question has already been answered, think of using DataSets, which are exactly what you are looking for. You can have parallel processes writing to datasets (which are nothing but glorified parallel sequential files) in "Append" mode in each process.
The result is effectively a sequential file, although the order of records is non-deterministic.