File Pattern -- Reading sequentially


Shruthi
Participant
Posts: 74
Joined: Sun Oct 05, 2008 10:59 pm
Location: Bangalore

Post by Shruthi »

Hi,

I have 3 files which match a pattern.

JobName_Date1.txt
JobName_Date2.txt
JobName_Date3.txt

I want JobName_Date1.txt to be read first, then JobName_Date2.txt, and then JobName_Date3.txt.

Is there anywhere we can specify that the files must be read in sorted order?

Thanks,
Anitha
Sreenivasulu
Premium Member
Posts: 892
Joined: Thu Oct 16, 2003 5:18 am

Post by Sreenivasulu »

Use 'ls' with the appropriate options in an Execute Command activity.
Check in UNIX whether the output comes back sorted the way you want; the same command can then be plugged into DataStage.

Regards
Sreeni
Shruthi
Participant
Posts: 74
Joined: Sun Oct 05, 2008 10:59 pm
Location: Bangalore

Post by Shruthi »

Our OS is Windows. How can it be done here?
samyamkrishna
Premium Member
Posts: 258
Joined: Tue Jul 04, 2006 10:35 pm
Location: Toronto

Post by samyamkrishna »

Use the DIR command.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

DataStage installs the MKS Toolkit on Windows, so you should in fact have UNIX-like capabilities, and commands like "ls" can be used. Or just stick with "dir /b"; either will display files alphabetically by default.

That "/b" option gives you a "bare" listing, one with just the filenames, which is what would be appropriate in the Sequential File stage's File Pattern option.
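To check the default ordering described above, a minimal sketch along these lines may help. It assumes a POSIX-style shell (on Windows, for example the one provided by the MKS Toolkit), and the file names are hypothetical. Note that alphabetical order only matches chronological order when the date in the name uses a sortable format such as YYYYMMDD.

```shell
#!/bin/sh
# Sketch only: the file names below are hypothetical stand-ins for JobName_Date*.txt.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the files out of order on purpose.
touch JobName_20101203.txt JobName_20101201.txt JobName_20101202.txt

# -1 prints one name per line; with no sort option, ls orders by name.
# LC_ALL=C makes the collation order predictable for this demonstration.
by_name=$(LC_ALL=C ls -1 JobName_*.txt)

echo "$by_name"
```

If the listing comes back in the order you want, a plain wildcard pattern is likely to be enough.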
Last edited by chulett on Thu Dec 30, 2010 9:41 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
Abhijeet1980
Participant
Posts: 81
Joined: Tue Aug 15, 2006 8:31 am
Location: Zürich

Post by Abhijeet1980 »

Shruthi,

Kindly explain to us the logic for sorting the files.

Files may be sorted on the following attributes:
Modified Date
File Name
...
Many other attributes

I hope that helps.
daignault
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm

Post by daignault »

Set up your Seq stage to read the 3 files using a wildcard. Create a new column as part of the take-on that holds the name of the source file (use a parameter, maybe?).

Use a hash partition method on the source file key you created above, and the data will be grouped in the same way as the take-on.

Cheers,

Ray D
DSguru2B
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Read the files in whatever order they arrive, then sort the data downstream on the file name. That way you ensure the proper sort order regardless of the order in which the files were read.
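As a plain shell analogy of this idea (not DataStage configuration), the sketch below tags every row with its source file name and then sorts on that tag. The file names and row contents are hypothetical, and a POSIX-style shell with awk and a sort that supports -s (stable sort, as in GNU and BSD sort) is assumed.

```shell
#!/bin/sh
# Sketch only: hypothetical files and rows, used to illustrate "read in any order, sort downstream".
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'r1\nr2\n' > JobName_20101201.txt
printf 'r3\n'     > JobName_20101202.txt
printf 'r4\nr5\n' > JobName_20101203.txt

# Read the files in a deliberately wrong order, prefixing each row with its source file name.
tagged=$(awk '{ print FILENAME "," $0 }' \
    JobName_20101203.txt JobName_20101201.txt JobName_20101202.txt)

# Sort on the file name column only; -s keeps the original row order within each file.
sorted=$(printf '%s\n' "$tagged" | LC_ALL=C sort -s -t, -k1,1)

echo "$sorted"
```

The stable sort matters here: sorting on the file name alone must not reshuffle the rows that came from the same file.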
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Shruthi
Participant
Posts: 74
Joined: Sun Oct 05, 2008 10:59 pm
Location: Bangalore

Post by Shruthi »

Thanks for all your replies.
If I want to specify "dir /b" or "ls -a", where should I specify it?

I tried the following in the File Pattern option:

dir /b D:/Employee*.txt

But it fails with the following error message:

Sequential_File_1,0: Couldn't find any files on host nibc1521 with pattern dir\ /b\ D:/Employee*.txt.
samyamkrishna
Premium Member
Premium Member
Posts: 258
Joined: Tue Jul 04, 2006 10:35 pm
Location: Toronto

Post by samyamkrishna »

It's in the file property, something like "command".
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You've got things in the right place as it is trying to do what you asked. And I wasn't specific enough in my previous reply - that "dir /b" output is an example of what the stage needs, not the actual command itself to be used there. Sorry.

Try simply putting "D:/Employee*.txt" there and let us know how it goes.
-craig
Shruthi
Participant
Posts: 74
Joined: Sun Oct 05, 2008 10:59 pm
Location: Bangalore

Post by Shruthi »

If I simply put "D:/Employee*.txt", the files are read in alphabetical order. Now I want them to be read in order of system date and time.

Is there such a thing as indirect file reading in DataStage? That is, the names of all the files to be read would be put in one file, and DataStage would read this file to get the file names and then read those files in turn.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Capture the output of the ls -1rt Employee*.txt command, convert the line terminators to, say, commas, and use that string as the "list of things" processed by a Start Loop activity.
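A minimal sketch of that capture-and-convert step, assuming a POSIX-style shell (for example via the MKS Toolkit on Windows). The directory, file names, and timestamps are hypothetical; the timestamps are set explicitly only so the result is reproducible.

```shell
#!/bin/sh
# Sketch only: hypothetical files whose modification times differ from their name order.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# touch -t takes CCYYMMDDhhmm
touch -t 201012010900 Employee_B.txt   # oldest
touch -t 201012020900 Employee_A.txt
touch -t 201012030900 Employee_C.txt   # newest

# -1: one name per line, -t: sort by modification time, -r: reverse, so oldest comes first.
# paste -s joins the lines into one, using a comma as the delimiter.
file_list=$(ls -1rt Employee*.txt | paste -sd, -)

echo "$file_list"   # prints Employee_B.txt,Employee_A.txt,Employee_C.txt
```

Without -r, ls -1t lists the newest file first. File names containing commas or newlines would break this simple delimiter scheme.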
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
jwiles
Premium Member
Posts: 1274
Joined: Sun Nov 14, 2004 8:50 pm

Post by jwiles »

Shruthi wrote: Is there such a thing as indirect file reading in DataStage? That is, the names of all the files to be read would be put in one file, and DataStage would read this file to get the file names and then read those files in turn.
You could use a FileSet stage. Create a fileset file (name.fs) containing rows which look like this:

nodename:full_filename

and pass the fileset name to the FileSet stage. You'll probably need to set the "Use Schema defined in File Set" option to False.

An example would be:

compute1:/home/dsadm/my_input_file_1.txt
compute1:/home/dsadm/my_input_file_2.txt
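Rows in that shape could be generated in oldest-first order with something like the sketch below. It assumes a POSIX-style shell; the directory and timestamps are hypothetical, the node name compute1 is taken from the example above, and whether the FileSet stage accepts a file containing only such rows has not been verified here.

```shell
#!/bin/sh
# Sketch only: builds "nodename:full_filename" rows, ordered by modification time.
workdir=$(mktemp -d)

# Hypothetical input files; file 2 is deliberately older than file 1.
touch -t 201012010900 "$workdir/my_input_file_2.txt"
touch -t 201012020900 "$workdir/my_input_file_1.txt"

node=compute1   # node name from the example above; adjust to your configuration

# ls prints the full paths it was given; -rt puts the oldest file first.
# sed prefixes every line with the node name and a colon.
ls -1rt "$workdir"/my_input_file_*.txt | sed "s|^|$node:|" > "$workdir/name.fs"

cat "$workdir/name.fs"
```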

However, building on Craig's and Ray's suggestions, you should be able to do the following with the SeqFile stage:

ls -1t D:/Employee*.txt

Either method will work and I expect the second one will be easier for you.

Regards,

- james wiles


All generalizations are false, including this one - Mark Twain.