Single, variable-length seq file to be read in parallel

Posted: Wed Jun 27, 2012 1:18 am
by sharmabhavesh
Hi,
I have a single sequential file that I want to read in parallel. How can I achieve that? I went through several previous posts on the same topic but could not find a solution; many of them give contradictory answers.

Posted: Wed Jun 27, 2012 3:45 am
by pandeesh
Do you mean reading it in parallel from different jobs?

Posted: Wed Jun 27, 2012 4:19 am
by sharmabhavesh
I am reading a single sequential file from a single parallel job.

Posted: Wed Jun 27, 2012 6:13 am
by ntr
By setting the read method to File Pattern, or by increasing the number of readers per node.

Posted: Wed Jun 27, 2012 9:02 am
by sharmabhavesh
Hi,
Can you please elaborate a bit?
I have heard that the number of readers per node can be set only for fixed-length files.
Also, please elaborate on the first option you have specified.

Posted: Wed Jun 27, 2012 10:01 am
by zulfi123786
sharmabhavesh wrote: Also, please elaborate on the first option you have specified.
The first option is not applicable in your case. A file pattern is meant for reading multiple files whose names share a specific pattern, which is not your situation.

Posted: Wed Jun 27, 2012 1:04 pm
by chulett
There's nothing stopping someone from using a file pattern to read a single file. And I believe you may find that the 'only for fixed-width files' multiple readers advice is for older versions and that restriction was lifted in newer versions. Hopefully someone can confirm / deny...

Posted: Wed Jun 27, 2012 5:35 pm
by ray.wurlod
What version? Recent versions allow multiple readers per node to work with delimited formats.

Posted: Wed Jun 27, 2012 5:49 pm
by Kryt0n
chulett wrote: And I believe you may find that the 'only for fixed-width files' multiple readers advice is for older versions and that restriction was lifted in newer versions. Hopefully someone can confirm / deny...
That's good to know... didn't realise it had changed. Although your source file mustn't have a header row as it doesn't like them with multiple readers...

Posted: Wed Jun 27, 2012 10:09 pm
by chulett
Headers are still an issue for multiple files read using a file pattern, but I don't believe they would have an adverse effect on multiple readers on a single file.

Posted: Wed Jun 27, 2012 11:36 pm
by Kryt0n
You wouldn't have thought so... but the options are mutually exclusive...

Posted: Thu Jun 28, 2012 2:32 am
by chandra.shekhar@tcs.com
In version 8.1 and newer, it works like this:

For delimited files ---> Number of readers per node.
For fixed-width files -> Read from multiple nodes.

These two options are also mutually exclusive.
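
To illustrate why several readers can work on a single delimited, variable-length file at all, here is a rough conceptual sketch in Python. It is not how DataStage implements the option internally; it only assumes newline-terminated records and shows one common approach: split the file into byte ranges, and let each reader skip ahead to the first record that starts inside its range. The names read_slice and parallel_read are made up for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_slice(path, start, end):
    """Return the records whose first byte lies in the range [start, end)."""
    records = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard everything through the next
            # newline. This lands on the first record that begins at or
            # after `start`; the partial record belongs to the previous reader.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break  # end of file
            records.append(line.rstrip(b"\r\n").decode("utf-8"))
    return records


def parallel_read(path, readers):
    """Read a newline-delimited file with `readers` concurrent readers."""
    size = os.path.getsize(path)
    # Cut the file into `readers` contiguous byte ranges of near-equal size.
    bounds = [size * i // readers for i in range(readers + 1)]
    with ThreadPoolExecutor(max_workers=readers) as pool:
        parts = pool.map(read_slice, [path] * readers, bounds[:-1], bounds[1:])
        # pool.map yields results in slice order, so file order is preserved.
        return [rec for part in parts for rec in part]
```

With a fixed-width file the record boundaries can be computed directly from the record length, so no skipping is needed. In this sketch a header row would only ever be seen by the reader that owns byte 0.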

Posted: Thu Jun 28, 2012 1:40 pm
by sharmabhavesh
Thanks for your quick responses. Things are clearer now.