Remove Carriage Return in Sequential Stage Filter

sarathchandrakt
Participant
Posts: 50
Joined: Fri Aug 29, 2014 1:32 pm
Location: Mumbai

Post by sarathchandrakt »

Hi,

I am already using sed '1d;$d' to remove the first and last lines of the file. Now I have to remove carriage returns from the file too. I have commands that can do both tasks separately, but I'm trying to find a command that can do all of them in one filter.

Any help is appreciated.

Thanks.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Can I ask the "why" of this? Is it because it's a Windows/DOS file? If so, there are other ways of handling it.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

You can probably adapt the sed script, but I'd use tr -d '\n' to remove newlines (or tr -d '\r' to remove carriage returns).
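
A small sketch of why the quoting around the escape matters, assuming the command ends up being run by a POSIX shell. The sample record is made up for illustration. Without quotes the shell strips the backslash before tr sees it, so tr deletes the letter r instead of the carriage return:

```shell
# Hypothetical sample record; printf below appends a DOS-style CR/LF terminator.
record='carrier,1'

# Unquoted: the shell turns \r into a plain r, so tr deletes every letter "r"
# and the carriage return is left in place.
unquoted=$(printf '%s\r\n' "$record" | tr -d \r)

# Quoted: tr receives the two characters \r and interprets them as a carriage return.
quoted=$(printf '%s\r\n' "$record" | tr -d '\r')

# od makes the leftover carriage return visible in the unquoted result.
printf '%s\n' "$unquoted" | od -c
printf '%s\n' "$quoted" | od -c
```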

Craig's question is relevant, though, because if it's only DOS-style line terminators, you can simply use the Record Terminator String property, setting its value to DOS-style.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Re: Remove Carriage Return in Sequential Stage Filter

Post by mouthou »

Based on Craig's and Ray's responses focusing on carriage returns, I am slightly confused about your need. I think you are looking for a command pattern which does both header/trailer removal and CR removal in one shot. If so, this seems to be a UNIX command question, with the command going into the Seq File stage filter option.

Did you try UNIX command piping, something like "sed '1d;$d' <file> | tr -d '\r'"? Chaining the commands with &&, like "(sed '1d;$d') && (tr -d '\r')", is not equivalent: && only runs the second command after the first one succeeds and does not pass sed's output to tr. Explore the exact syntax, try running it from the command prompt first, and it should meet your need.
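
A minimal sketch of the one-shot idea, to be tried at the command prompt first. The sample file and its contents are invented for illustration, and it assumes the stage hands the file to the filter on standard input. The second form folds everything into a single sed call, but it only strips a carriage return at the end of each line, whereas tr removes every carriage return in the stream:

```shell
# Hypothetical sample: header, two data rows, trailer, all with CR/LF endings.
sample=$(mktemp)
printf 'HEADER\r\nrow1,a\r\nrow2,b\r\nTRAILER\r\n' > "$sample"

# Pipe form: sed drops the first and last lines, then tr deletes all carriage returns.
piped=$(sed '1d;$d' < "$sample" | tr -d '\r')

# Single-sed form: a literal carriage return is captured in a variable so the
# script does not rely on sed understanding the \r escape.
cr=$(printf '\r')
single_sed=$(sed "1d;\$d;s/${cr}\$//" < "$sample")

printf '%s\n' "$piped"
printf '%s\n' "$single_sed"
rm -f "$sample"
```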
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

It does seem that they are looking for such a "one shot" command. I was just trying to see if this CR was part of a DOS CR/LF record terminator string and if the command was (in essence) doing a DOS2UNIX terminator change to the UNIX LF. If that's the case, it's an unnecessary change as noted.
-craig

sarathchandrakt
Participant
Posts: 50
Joined: Fri Aug 29, 2014 1:32 pm
Location: Mumbai

Post by sarathchandrakt »

Thank you everyone for the responses. I am getting extra CRs in the file that we get from a third-party source. We figured it would be easy to fix it on our end.

I used tr in a before-job subroutine to remove CRs and then used sed in the Seq File stage filter to remove the header and footer. I was trying to accomplish both in one statement.
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Post by mouthou »

Any particular reason to go for a routine for CR removal? What is the issue with using UNIX commands to remove CRs, in the same place where the header and trailer are handled?

Wondering what a routine makes possible when UNIX can easily do the same.
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Probably using ExecSH as the before-job subroutine as a quick, easy interface to the UNIX command pipeline, thus avoiding the need for a sequence job.
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

perhaps less well known, you can call several unix commands back to back with a ; between them in the shell execute stage. here, a piped stream is perhaps as good or better, but it's there if you need it...
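
a rough sketch of the difference, with made-up file names and sample data. with ; each command runs to completion on its own, so the data has to travel through an intermediate file, while the pipe streams it straight from one command to the next:

```shell
# Hypothetical working directory and input file with CR/LF endings.
work=$(mktemp -d)
printf 'HEADER\r\nrow1\r\nrow2\r\nTRAILER\r\n' > "$work/in.txt"

# Back to back with ";": tr finishes and writes a file, then sed reads that file.
tr -d '\r' < "$work/in.txt" > "$work/nocr.txt" ; sed '1d;$d' "$work/nocr.txt" > "$work/out.txt"
sequential=$(cat "$work/out.txt")

# Piped: same cleanup, no intermediate file.
piped=$(tr -d '\r' < "$work/in.txt" | sed '1d;$d')

printf '%s\n' "$sequential"
printf '%s\n' "$piped"
rm -rf "$work"
```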

you can invoke the unix commands in a parallel job too, via a routine, if the before/after is insufficient.
sarathchandrakt
Participant
Posts: 50
Joined: Fri Aug 29, 2014 1:32 pm
Location: Mumbai

Post by sarathchandrakt »

The reason we didn't run the UNIX commands in a sequence job is that we are sometimes asked to process just a single file, and running the sequence would trigger multiple jobs. So we decided to keep the whole logic in the parallel job itself.

Honestly, I didn't think of using routines in a parallel job. That is something I will definitely consider in the future. We were in a rush to get a quick fix into production.

Thanks again for all the input.
mouthou
Participant
Posts: 208
Joined: Sun Jul 04, 2004 11:57 pm

Post by mouthou »

Just so you know, I don't think any of the responses above made even a slight reference to using a Sequence as such. The options were direct ones that you could put either in the ExecSH section, as Ray mentioned, or in the Seq File stage directly.
Last edited by mouthou on Wed May 22, 2019 1:33 pm, edited 1 time in total.