how to count the number of delimiters in a reord

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Moderators: chulett, rschirm, roy

Post Reply
kollurianu
Premium Member
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

how to count the number of delimiters in a reord

Post by kollurianu »

Hi All,

iam getting a pipe delimited file.

iam getting warnings saying that too many columns in a record , looks like the some of the field values are having pipes , so inorder to eliminate those records , i want pass the record only when it has right number of delimiters , so inorder to check the number of delimiters in the record , what is better way to do this?

i was guessing DCOUNT function might be helpful , but not as it would count two consequtive delimiters as one delimiter.

any inputs , most welcome.

thank you all.
kollurianu
Premium Member
Premium Member
Posts: 614
Joined: Fri Feb 06, 2004 3:59 pm

Post by kollurianu »

i just re read the fundtionality of DCOUNT , i guess that is going to help.

Thank you all.
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Yes. Count and DCOUNT both will work. You need to read your row as a single column and something like DCOUNT(in.Link,"|") = x in contraint where x is the number of pipes your looking for.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
chalasaniamith
Participant
Posts: 36
Joined: Wed Feb 16, 2005 5:20 pm
Location: IL

Post by chalasaniamith »

first question back to you
is it a fixed length file or a variable length file?

if its a fixed width file you can get the length of each column right.Then check the length of the column and check for | and replace by space or some thing default value you want.
then you wont have any problem

let me know if it works
Amith Chalasani
Infosphere Solution Architect & Admin
Northern Trust
I_Server_Whale
Premium Member
Premium Member
Posts: 1255
Joined: Wed Feb 02, 2005 11:54 am
Location: United States of America

Re: how to count the number of delimiters in a reord

Post by I_Server_Whale »

kollurianu wrote:Hi All,

iam getting a pipe delimited file.
Anything that won't sell, I don't want to invent. Its sale is proof of utility, and utility is success.
Author: Thomas A. Edison 1847-1931, American Inventor, Entrepreneur, Founder of GE
chalasaniamith
Participant
Posts: 36
Joined: Wed Feb 16, 2005 5:20 pm
Location: IL

Post by chalasaniamith »

sorry about that
Last edited by chalasaniamith on Wed Dec 13, 2006 9:52 am, edited 1 time in total.
Amith Chalasani
Infosphere Solution Architect & Admin
Northern Trust
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

A pipe delimited would automatically mean that its not fixed. A fixed width file has no delimiter, just that every column's length is defined and the values are picked based upon that column length.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Krishna Nair
Participant
Posts: 19
Joined: Mon Jul 18, 2005 7:43 am

Use AWK

Post by Krishna Nair »

This is the ideal scenario to go for Unix tools like AWK scripting.
This can be easily done in awk.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

... and easily done in DataStage as well.
-craig

"You can never have too many knives" -- Logan Nine Fingers
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

More easily done in DataStage for those who don't know or don't have awk.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly. One tool to rule them all... :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
aakashahuja
Premium Member
Premium Member
Posts: 210
Joined: Wed Feb 16, 2005 7:17 am

Post by aakashahuja »

Since this is a troubleshooting activity, so u can use

awk < YOUR_FILE_PATH_AND_NAME ' BEGIN {FS="<YOUR_DELIMITER_CHAR>"} {print NR " : " NF+1}'

The above will give you the line number followed by its lengths. This can be cumbersome if your file is too big. IN that case you can simply use

awk < YOUR_FILE_PATH_AND_NAME ' BEGIN {FS="<YOUR_DELIMITER_CHAR>"} {print NF+1}'
# The above will give you the culprit length. Lets says it is BadLen

then use

awk < YOUR_FILE_PATH_AND_NAME ' BEGIN {FS="<YOUR_DELIMITER_CHAR>"} {print NF+1}' | grep -n BadLen

This will give you the line number which has less or more number of
columns.

Hope this helps.

Cheers
DSguru2B
Charter Member
Charter Member
Posts: 6854
Joined: Wed Feb 09, 2005 3:44 pm
Location: Houston, TX

Post by DSguru2B »

Once again. Stick to DataStage. You will have much lesses pieces to bind and maintain.
Creativity is allowing yourself to make mistakes. Art is knowing which ones to keep.
Raghavendra
Participant
Posts: 147
Joined: Sat Apr 30, 2005 1:23 am
Location: Bangalore,India

Post by Raghavendra »

As your file is a piped delimited file you will be knowing the number of pipes (delimiters) per record. That is; you might know the number of columns present in the record.You can use an Index function to check the number of delimiters here.

Check whether you can use for pipe symbol.

Index function syntax is Index(string,substring,instance)
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Index() is the wrong function. It reports the character position in the string of the nth occurrence of the substring.

Count() or DCount() is the correct function for counting instances, or fields delimited by instances of a delimiter, respectively.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
Post Reply