How to validate a string of code

Posted: Wed Sep 27, 2017 8:13 am
by ScottDun
Hi,

I am trying to validate a string that comes in from a file. I am taking a set of characters and I need to make sure that they meet certain criteria. The character string will be in a fixed position in the file, 3500-3510 to be exact.

First off, I need to make sure that there are no spaces and zeros. Then I need to make sure that the positions are meeting certain requirements, for example:

11 characters
char 1 has to be number 1-9
char 2 has to be A-Z
char 3 has to be 0-9 and A-Z
char 4 has to be 0-9
char 5 has to be A-Z
char 6 has to be 0-9 and A-Z
char 7 has to be 0-9
char 8 has to be A-Z
char 9 has to be A-Z
char 10 has to be 0-9
char 11 has to be 0-9

Thanks

Posted: Wed Sep 27, 2017 9:52 am
by chulett
Sounds like a job for a 'regular expression', something the Filter stage supports as one example.
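
As a rough illustration of the regular-expression idea, here is a minimal sketch in Python rather than in a DataStage stage. The names `CODE_PATTERN`, `extract_code` and `is_valid_code` are made up for this example, and it assumes the record is available as a single string with positions 3500-3510 counted from 1. Spaces are rejected everywhere because no position allows them; zeros are rejected only where the position rules exclude them.

```python
import re

# One character class per position, following the rules in the first post.
CODE_PATTERN = re.compile(
    r"[1-9]"      # char 1: digit 1-9 (no zero)
    r"[A-Z]"      # char 2: letter
    r"[0-9A-Z]"   # char 3: digit or letter
    r"[0-9]"      # char 4: digit
    r"[A-Z]"      # char 5: letter
    r"[0-9A-Z]"   # char 6: digit or letter
    r"[0-9]"      # char 7: digit
    r"[A-Z]"      # char 8: letter
    r"[A-Z]"      # char 9: letter
    r"[0-9]"      # char 10: digit
    r"[0-9]"      # char 11: digit
)


def extract_code(record: str) -> str:
    """Return positions 3500-3510 (1-based, inclusive) of a fixed-width record."""
    return record[3499:3510]


def is_valid_code(code: str) -> bool:
    """True only if the whole string is exactly 11 characters matching the rules."""
    return CODE_PATTERN.fullmatch(code) is not None
```

For example, `is_valid_code("1AB2C34DE56")` passes, while a leading zero, an embedded space or a short string fails.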

Posted: Wed Sep 27, 2017 10:41 am
by ScottDun
I'm using this in the transformer, using stage variables to create an error code. I figured it would be an If statement, but I didn't know what that would actually look like.

Posted: Wed Sep 27, 2017 11:14 am
by FranklinE
Much depends on what you do next when one of your tests is failed by the data being examined.

The simplest approach is to do each test separately. For example (pseudo-code expression):

If inChar[3500,1] is numeric, set inChar01NumericFlag to "Y", else "N".
This allows you to prioritize the tests. For example, if there are spaces or zeroes in the full string, you might want to skip the other tests because this is a "fatal" error for the data.
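
The test-by-test approach with prioritized checks could be sketched as follows. This is Python, not transformer syntax, and the function name `first_error` and the error codes are hypothetical. It also assumes that "fatal" means a wrong length, any space, or a field made up entirely of zeros; the original requirement does not spell that out.

```python
DIGITS = "0123456789"
UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Allowed characters for positions 1 through 11, per the rules in the first post.
ALLOWED = [
    "123456789", UPPER, DIGITS + UPPER, DIGITS, UPPER, DIGITS + UPPER,
    DIGITS, UPPER, UPPER, DIGITS, DIGITS,
]


def first_error(code: str) -> str:
    """Return an error code for the first failed test, or "" if every test passes."""
    # Fatal checks come first so the per-position tests can be skipped.
    if len(code) != 11:
        return "E_LENGTH"
    if " " in code:
        return "E_SPACE"
    if set(code) == {"0"}:  # assumption: an all-zero field is a fatal error
        return "E_ALL_ZERO"
    # One separate test per position, reported by position number.
    for pos, (ch, allowed) in enumerate(zip(code, ALLOWED), start=1):
        if ch not in allowed:
            return f"E_CHAR{pos:02d}"
    return ""
```

Returning on the first failure is one design choice; collecting one flag per test, as in the pseudo-code above, works just as well if every failure needs to be reported.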

Appendix B in the Parallel Job Developer's Guide is excellent reading, especially the String section. :wink:

Posted: Wed Sep 27, 2017 11:04 pm
by jhmckeever
You might find this post helpful:
viewtopic.php?t=107882

Posted: Wed Oct 04, 2017 12:46 pm
by ScottDun
FranklinE,

How would I use the Num function to get rid of the 0? For ex: Num(Link[3500,1]) will see if the 1st char is 0-9. Is there a manipulation that will be able to make sure that 0 is not included?

Thanks

Posted: Wed Oct 04, 2017 2:13 pm
by chulett
Not via the NUM function, you would need to explicitly check your valid values or valid range instead.
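
An explicit range check of that kind might look like the sketch below. It is written in Python to show the logic only; `is_digit_1_to_9` is a hypothetical helper, not a DataStage function.

```python
def is_digit_1_to_9(ch: str) -> bool:
    """True only for a single character in the range '1'..'9', so '0' is excluded."""
    # The length check guards against empty or multi-character input.
    return len(ch) == 1 and "1" <= ch <= "9"
```

A plain numeric test would accept "0" here, which is why the valid range is compared directly.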

Posted: Thu Oct 05, 2017 8:24 am
by ScottDun
So for somebody with no DB2 knowledge, I could use ISNUMERIC for the 1st character. Are there ISALPHA and ISALPHANUM functions as well? I am referring to your 'regular expression' comment from earlier.

Posted: Thu Oct 05, 2017 9:15 am
by UCDI
If it becomes too complicated to do in a transformer with string functions and you don't want to fight regular expressions, a third option is to write a BASIC routine (or, if you want to go external, one in any language) to do it.

You may also be able to use the IA/rules stage (?) for it. I have not used that much.

Posted: Mon Oct 09, 2017 8:26 am
by FranklinE
Ditto to what UCDI posted, but there's an implied issue here that seems odd to me.

I get that transformer stages have large footprints. I get that there may be more efficient ways of doing things if one has extended knowledge of those other methods (BASIC, etc.). What I don't get is that the entire point of a rapid application development tool is that one uses out-of-the-box methods as much as possible.

I have limited ability with BASIC (last used it extensively in high school, in the dark ages of phone connection to a time share), so grain of salt: if the simplest and most direct approach is using a transformer stage and String transform functions, why would another approach be considered here?

Scott, don't get me wrong. If you see value in one of the alternatives, that's your best choice. I suggest at least researching the String functions, perhaps with an eye to finding their equivalences in the other approach(es).

Posted: Mon Oct 09, 2017 11:28 am
by UCDI
I think what it comes down to is that 'rapid' is subjective. I can write a C++ or BASIC routine to do advanced string processing in a couple of minutes and have it debugged and delivered inside an hour (for problems on par with the question posed here). Trying to build that in DataStage is the round hole and square peg problem... you can do it, but it is overly convoluted, because the DS string processing routines, while they cover 99% of what one needs to do, are very simplistic and crude. So when you hit a complex problem in the 1%, it becomes a valid choice *for those who know the low level languages* to do it another way. If you are uncomfortable with the low level coding, that approach may not be 'rapid' or viable for you. I happen to have spent 15+ years writing C++. I may as well make use of that from time to time :)

Posted: Tue Oct 10, 2017 6:55 am
by qt_ky
DataStage 11.5 built-in parallel string functions are documented here:

https://www.ibm.com/support/knowledgece ... tions.html

Substring Operator is documented here (one of several places):

https://www.ibm.com/support/knowledgece ... rator.html