char argument in parallel routines. The point?

IASAQ
Premium Member
Posts: 31
Joined: Wed May 04, 2016 11:07 am
Location: Montréal

Post by IASAQ »

I found this old post while searching for the same error I got:

viewtopic.php?p=361780#361780

My question is: why offer char as an argument type in the list if it produces the runtime error described in the post above?

Also, why does the DataStage compiler turn a char argument and return value into the int8 type?

char testme(char c)
{
    return c;
}

// this becomes the following in the DS project and gives an error at runtime:

extern int8 testme(int8 c);

I found out that if I change the char argument into an unsigned char argument, the job does not give an error at runtime, but it doesn't give me what I expect either.

char testme(unsigned char c)
{
    return c;
}

// this becomes the following in the DS project. Doesn't crash at runtime

extern int8 testme(uint8 c);

With the unsigned char version, if I call testme('c'), I expect to see 'c' in my peek stage. I get 0. It works properly when I call the function in my routines testbed on AIX.

Can anybody shed some light on this please? My C++ is very rusty but I can get by with some effort.
UCDI
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

In C, chars ARE integers. Specifically, signed char has a range of -128 to +127. Whether plain char is signed or unsigned depends on the compiler and platform, so I always work with unsigned (you can't do some operations safely on signed values). Unsigned is (for sure) 0-255.

Later the language added new names that mean the same thing: the C99 standard introduced fixed-width integer types (int8_t, int16_t, etc. in <stdint.h>) to be more exact about sizes. An 8-bit integer is 8 bits, a 16-bit integer is 16 bits, etc. The 8-bit types are the same size as char, so you can usually interchange them, and many programmers do (a little sloppy, but they do it). wchar_t is less safe to swap for a 16-bit type, since its width varies by platform.

I do not know what the DataStage compiler might have done, but at the end of the day all character types in C are really integers. This isn't your problem.
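
To make the "chars are integers" point concrete, here is a minimal sketch that checks the sizes and ranges mentioned above. It uses the standard std::int8_t / std::uint8_t as stand-ins for the int8 / uint8 names in the DataStage declaration; that mapping is an assumption, not something the thread confirms. The helper names are made up for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// char and the 8-bit fixed-width types each occupy exactly one byte.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(std::int8_t) == 1 && sizeof(std::uint8_t) == 1,
              "int8_t/uint8_t are one byte where they exist");

// Hypothetical helper: true when plain char behaves like signed char
// on this compiler/platform (it differs between platforms).
bool plain_char_is_signed() { return CHAR_MIN < 0; }

// Hypothetical helper: sanity-checks the ranges quoted in the post.
void check_char_ranges() {
    assert(SCHAR_MIN == -128 && SCHAR_MAX == 127);  // signed 8-bit range
    assert(UCHAR_MAX == 255);                       // unsigned 8-bit range
    assert(INT8_MIN == -128 && INT8_MAX == 127 && UINT8_MAX == 255);
}
```

Calling plain_char_is_signed() in a test harness on the target machine is a quick way to see which flavour of char that compiler uses by default.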

How you compile the library for your datastage routine matters.
I don't know what you need on your system to make it work, but for us, to compile a little test program like this one, we use:

g++ -O -fPIC -Wno-deprecated -m64 -mtune=generic -mcmodel=small -shared -m64 -c filename

where filename is, obviously, your C++ source file. I don't generally fool with .h files for these tiny functions, and if I do, the source includes them directly, which avoids having to write a makefile. This may or may not be what your system needs.

Your C function is fine. It should return the forced signed conversion of the input. So for 'c' you get 'c'. If you put in 200 (whatever symbol that is) you will get a negative number back out, but it is still the same byte, so it maps to the same character in the end and you won't see the change when viewing the values as text. But that is all it does: convert unsigned to signed.
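
A small sketch of the conversion described above, assuming an ASCII, two's-complement target (true of mainstream compilers, and guaranteed for the conversion itself only from C++20 on). The return type is spelled signed char here rather than char so the result does not depend on the platform's default; to_signed and check_to_signed are illustrative names, not part of the original routine.

```cpp
#include <cassert>

// Same shape as the routine in the thread, with an explicitly signed result.
signed char to_signed(unsigned char c) {
    return static_cast<signed char>(c);
}

// Hypothetical self-check showing what goes in and what comes out.
void check_to_signed() {
    assert(to_signed('c') == 99);   // values up to 127 pass through unchanged
    assert(to_signed(200) == -56);  // 200 - 256 on two's-complement targets
    // Viewed as a raw byte again, the bit pattern is still the original 200.
    assert(static_cast<unsigned char>(to_signed(200)) == 200);
}
```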

I can try to help more but I suspect the issue here is simply how the routine is being compiled, not the source code itself.

I can also add that I have several routines in C that work fine and use the char type (mostly char *, but I certainly did not bother to use the uint8 name). The keyword char is fine ... DataStage doesn't even KNOW what you put in the code. DataStage just looks at the compiled library object code, which is machine language or nearly so, and everything is effectively an integer type there anyway. This has to be a unix / compiler problem.

Post by UCDI »

hmm I am actually having the same woes.
I was able to get a value (integer, not character) with

return 'R';

and various other constants. But I have been unable so far to return a value from a variable. I am still looking at this.

Post by UCDI »

This has to be a bug.

I tried a variety of things and nothing worked.
It worked fine with c-strings:

char *testme(char *c)
{
    return (char *) c;
}

but you cannot send an unsigned char * to it from datastage, there is no option for that.

I tried taking an unsigned char in and putting it into a string, and it did not work. I tried using char pointers to single characters, and it did not work. I tried all kinds of pointer tricks and they did not work.

The only thing that worked was returning a constant character expression, and the above string approach.

What exactly do you NEED to do in your C? Is this a real issue for you?

Post by IASAQ »

UCDI wrote: hmm I am actually having the same woes.
I was able to get a value (integer, not character) with

return 'R';

and various other constants. But I have been unable so far to return a value from a variable. I am still looking at this.
I'm not at work, but the only way I was able to get the result I wanted was to write something like Char(testme(Seq('c'))) in the derivation expression cell, which is ridiculous.

As for how I compiled the code, I go from memory, but it was something like:

xlC_r -c -q64 -w testme.cpp -o testme.o

As for the situation being a bug, hopefully a DS dev will see this thread and be able to confirm the situation.

Post by UCDI »

Is that working for you?
I am getting zero back for anything that does not return an actual constant; I have not had time to figure it out but I suspect the input character is being clobbered somehow.

Casting it looks funny but that does not bother me. The zeros coming out are what I am calling a bug. The original code:

char foo(unsigned char bar)
{
    return (char)bar; // always returns 0, or '\0' if you prefer that notation
}

I think your compilation is fine. It's possible to compile an object that will run but is broken (generally it does nothing or returns total garbage), but that does not seem to be the issue here.

Post by IASAQ »

UCDI wrote: Is that working for you?
Nope. I tried casting like you did and it also returned 0. The only method I tried that works is the Char() Seq() combo I tested earlier. If you only use the Seq() call, it returns the proper ASCII value.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm guessing part of that is due to you using two different UNIX distros and two different compilers? I know "C is C" and all that but still... wondering.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by UCDI »

I'm getting the same results and I only have 1 unix system under me. I compiled on the same system DataStage runs on, and I already have 5 or so working C routines (all string manipulators, nothing for 1 char though).

It seems to be some sort of bug with how DataStage sends a single char to C. C does not really care much; all it needs is 1 byte passed in and it will work. At the end of the day C just wants the bits pushed on the parameter stack for the subroutine.

I'm going to try some C uglies to see if I can figure out exactly what is going wrong, as soon as I get a chance. I wonder if trying to pass 'x' is really trying to pass a string, or a pointer, or something like that, since there really isn't a way to designate 1 character in DataStage because '' is used for strings same as "".

Post by chulett »

I didn't mean you as in you but more as "you all". Meaning differences between UCDI and IASAQ.

Post by UCDI »

Ok, I think I understand.

I have it working by passing the original function a column of type tinyint. This appears to be C's unsigned char / unsigned byte type. It returns a signed byte.

That is where it breaks down. It "works" but it's confusing and clunky. Feeding the result to a single column of type "char" ... still displays the numeric value instead of the character for that ASCII code (or whatever the local setting would use).

It was giving back zeros because I was passing it strings inadvertently ... 'x' in the transformer is, as I noticed earlier today, a string, not the letter x. So that makes sense now; a string parameter isn't compatible with "unsigned char".

You can obviously force it to display the character value and work with it as a character to build a string, so it's not broken, just weird.
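
To illustrate why a string and a single character are not interchangeable on the C++ side, here is a rough sketch. It says nothing about what DataStage actually pushes for a string argument (the thread does not establish that); it only shows that a one-character string is a pointer to two bytes, while an unsigned char is one byte passed by value. bytes_in_literal and first_byte are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// A one-character string literal is two bytes: the character plus the
// terminating '\0'. It is handed around as a pointer, not as a value.
std::size_t bytes_in_literal() { return sizeof("x"); }

// Hypothetical helper: pulls the first character of a C string out as a
// number, returning 0 for an empty or null string.
unsigned char first_byte(const char *s) {
    return (s != nullptr) ? static_cast<unsigned char>(s[0]) : 0;
}
```

With ASCII, first_byte("x") is 120, which is the same number Seq('x') produces in the transformer.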

Post by IASAQ »

UCDI wrote: It was giving back zeros because I was passing it strings inadvertently ... 'x' in the transformer is, as I noticed earlier today, a string, not the letter x. So that makes sense now; a string parameter isn't compatible with "unsigned char".

You can obviously force it to display the character value and work with it as a character to build a string, so it's not broken, just weird.
How did you pass 'x' to the function? If I don't put at least single quotes, the function turns red in the derivation cell.

Post by UCDI »

The single quotes make it into a string in DataStage. This does not work because it isn't the correct parameter type, and the function returns a zero in that case.

I put the ascii value for x into a tinyint column, to finally get it working.
This is identical to what you did with seq.
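
For anyone testing outside DataStage, the Char(testme(Seq(...))) round trip can be mimicked roughly like this. seq_code and char_from_code are hypothetical stand-ins for the transformer's Seq() and Char() functions, written only for this sketch, and ASCII is assumed.

```cpp
#include <cassert>

// The routine from the thread, unchanged in behaviour.
char testme(unsigned char c) { return static_cast<char>(c); }

// Stand-in for Seq(): character to numeric code.
int seq_code(char ch) { return static_cast<unsigned char>(ch); }

// Stand-in for Char(): numeric code back to character.
char char_from_code(int code) { return static_cast<char>(code); }

// Hypothetical self-check: the numeric code goes in, and the character
// is only rebuilt on the way out.
void check_round_trip() {
    unsigned char code = static_cast<unsigned char>(seq_code('x'));
    assert(code == 120);  // ASCII code for 'x'
    assert(char_from_code(testme(code)) == 'x');
}
```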

All that aside, is there a practical application you need that uses this? If not, I am going to withdraw from messing with it (I can still chat about it, but I am done poking at it in C/datastage). I think you will be stuck using the number to ascii conversion calls, ugly as it is, but it will work.

Post by IASAQ »

To answer your question about the practical applications, yeah. I was testing the functions from my custom routines library and one of the functions has a char as an argument.

Thanks for checking it out. Your results and mine seem to imply that there is indeed a problem with the way char arguments are handled in DS. At least, there is a workaround in the meantime. I'll close the topic.

Post by chulett »

Have either of you involved any kind of official support on this topic?