
Remove Carriage Return in Sequential Stage Filter

Posted: Wed May 08, 2019 10:46 pm
by sarathchandrakt
Hi,

I am already using sed '1d;$d' to remove the first and last lines of the file. Now I have to remove carriage returns from the file too. I have commands that can do both tasks separately, but I'm trying to find a command that can do both in one filter.

Any help is appreciated.

Thanks.

Posted: Thu May 09, 2019 6:26 pm
by chulett
Can I ask the "why" of this? Is it because it's a Windows/DOS file? If so, there are other ways of handling it.

Posted: Fri May 10, 2019 5:01 pm
by ray.wurlod
You can probably adapt the sed script, but I'd use tr -d '\n' to remove newlines (or tr -d '\r' to remove carriage returns).

Craig's question is relevant, though, because if it's only DOS-style line terminators, you can simply use the Record Terminator String property, setting its value to DOS-style.
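A small sketch of why the escape passed to tr is best quoted, assuming a POSIX shell; the sample word and the variable names are made up for illustration. Without quotes the shell removes the backslash before tr sees it, so tr is asked to delete a plain letter instead of a control character.

```shell
# Unquoted: the shell strips the backslash, so tr receives the letter n
# and deletes every "n" while leaving the carriage return in place.
unquoted=$(printf 'banana\r\n' | tr -d \n)

# Quoted: tr receives the two characters \r and treats them as a carriage return.
quoted=$(printf 'banana\r\n' | tr -d '\r')

# od -c makes any remaining \r visible.
printf '%s\n' "$unquoted" | od -c
printf '%s\n' "$quoted" | od -c
```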


Posted: Sun May 12, 2019 4:15 pm
by mouthou
Based on Craig's and Ray's responses focusing on carriage returns, I am slightly confused about your need. I think you are looking for a command pattern which does both header/trailer removal and CR removal in one shot. If so, this seems to be a UNIX command question, with the command then going into the SeqFile stage option.

Did you try piping the UNIX commands, something like "sed '1d;$d' <file> | tr -d '\r'"? Another option, chaining the commands with &&, might also be worth a try, like "(sed '1d;$d') && (tr -d '\r')". Explore the exact syntax for both patterns and try running them from the command prompt first; that should cover your need.
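A minimal sketch of the piped form, assuming the filter command receives the file on standard input and that a POSIX shell with sed and tr is available. The function name strip_edges_and_cr and the sample data are made up for illustration. It is the pipe that connects the two commands here; joining them with && would run tr afterwards on its own standard input rather than on the output of sed.

```shell
# Hypothetical helper: reads stdin, drops the first and last lines,
# then deletes every carriage return character.
strip_edges_and_cr() {
    sed '1d;$d' | tr -d '\r'
}

# Made-up sample: header, two data rows, trailer, all with DOS CR/LF endings.
result=$(printf 'HDR\r\nrow1\r\nrow2\r\nTRL\r\n' | strip_edges_and_cr)
printf '%s\n' "$result"
```

In the filter itself only the body of the function would be needed, that is sed '1d;$d' | tr -d '\r'.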

Posted: Sun May 12, 2019 6:51 pm
by chulett
It does seem that they are looking for such a "one shot" command. I was just trying to see if this CR was part of a DOS CR/LF record terminator string and if the command was (in essence) doing a DOS2UNIX terminator change to the UNIX LF. If that's the case, it's an unnecessary change as noted.

Posted: Mon May 13, 2019 1:05 am
by sarathchandrakt
Thank you everyone for the responses. I am getting extra CRs in a file that we get from a third-party source. We figured it would be easy to fix it on our end.

I used tr in a before-job subroutine to remove the CRs and then used sed in the seq stage filter to remove the header and footer. I was trying to accomplish both in one statement.

Posted: Wed May 15, 2019 12:12 pm
by mouthou
Any particular reason to go for a routine for CR removal? What is the issue with using UNIX commands to remove the CRs, in the same place where the header and trailer are handled?

Wondering what is made possible by a routine when unix can easily do the same.

Posted: Wed May 15, 2019 5:19 pm
by ray.wurlod
Probably using ExecSH as the before-job subroutine as a quick, easy interface to the UNIX command pipeline, thus avoiding the need for a sequence job.

Posted: Wed May 15, 2019 11:24 pm
by UCDI
Perhaps less well known: you can call several UNIX commands back to back, with a ; between them, in the shell execute stage. Here, a piped stream is perhaps as good or better, but if you need it...

You can invoke the UNIX commands in a parallel job too, via a routine, if the before/after is insufficient.
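A sketch of the back-to-back form with ; between the commands, assuming a shell that provides mktemp; the file names and sample data are made up, and a temporary directory stands in for the real landing area. Unlike a pipe, each command finishes before the next starts, so an intermediate file carries the data between them.

```shell
# Hypothetical working area and file names, for illustration only.
workdir=$(mktemp -d)
printf 'HDR\r\nrow1\r\nrow2\r\nTRL\r\n' > "$workdir/input.txt"

# Two commands separated by ';': remove CRs first, then drop header and trailer.
tr -d '\r' < "$workdir/input.txt" > "$workdir/nocr.txt" ; sed '1d;$d' "$workdir/nocr.txt" > "$workdir/output.txt"

cat "$workdir/output.txt"
```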

Posted: Tue May 21, 2019 9:31 pm
by sarathchandrakt
The reason we didn't run the UNIX commands in a sequence job is that sometimes we are asked to process just a single file, and running the sequence would trigger multiple jobs. So we decided to keep the whole logic in the parallel job itself.

Honestly, I didn't think of using routines in a parallel job. That is something I will definitely consider in future. We were in a rush to do a quick fix to move to production.

Thanks again for all the input.

Posted: Wed May 22, 2019 1:18 pm
by mouthou
Just so you know, I don't think any of the responses above made even a slight reference to using a Sequence as such. The direct options were ones you could put either in the ExecSH section, as Ray mentioned, or in the Seq File stage directly.