TRIM() vs LTRIM(RTRIM())

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

TRIM() vs LTRIM(RTRIM())

Post by chandra.shekhar@tcs.com »

Hi,

I have been asked to remove the spaces from all the columns of a table. So I simply added the TRIM function in the SELECT SQL which worked fine.

But my Client asked me to change it and use LTRIM(RTRIM)) function.
I said that they both will work fine as long as we are removing the spaces. But he's almost forcing me to use LTRIM RTRIM.

Does LTRIM(RTRIM)) works better than TRIM given the scenario of removing spaces only? :roll:
Thanx and Regards,
ETL User
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

So, this is a database question? If so, it would help to know which one. :wink:
Last edited by chulett on Tue Dec 12, 2017 8:53 am, edited 1 time in total.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Yes, you can say that.
These functions would be applied at the Teradata Database using the SQL.
Thanx and Regards,
ETL User
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

I'm sorry but why not simply read the documentation? TRIM defaults to BOTH left and right trim, a.k.a. leading and trailing as well as space as the character to trim and I'm assuming you let it default. Suggesting that you need to nest those two functions to accomplish what the original already does is a bit... silly. And makes LTRIM and RTRIM a bit redundant. IMHO.

ps. And I'm not aware where the combo would "work better" than TRIM in any situation. But perhaps I just need enlightenment after all these years. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

The only other thing that pops into my head is I have seen specialized implementations of TRIM that not only trim leading/trailing but also internal 'extra' spaces as well. If memory serves, the DataStage Server job trim function does that. But that's not SQL and not ANSI compliant, which Teradata's trim function is.
-craig

"You can never have too many knives" -- Logan Nine Fingers
chandra.shekhar@tcs.com
Premium Member
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

I couldn't agree with you more Chulett.
I know that it is better and efficient to use TRIM instead of nested RTRIM and LTRIM functions if need to remove the spaces from a column.

I just asked this because my client says that nested one would work better.
I just wanted to know who's right, I or my client ? :D
The only other thing that pops into my head is I have seen specialized implementations of TRIM that not only trim leading/trailing but also internal 'extra' spaces as well.
That is the TRIM function available in Datastage Transformer stage.
Thanx and Regards,
ETL User
UCDI
Premium Member
Premium Member
Posts: 383
Joined: Mon Mar 21, 2016 2:00 pm

Post by UCDI »

Easy. Your client is right ! Do it their way, make them happy, get paid lol.

------------
I don't know the implementation under the hood but it could be argued either way if you can't get your hands on this.

consider -- all those need to do is check the extreme end of the data and remove whitespace. For trailing, that means assigning the byte after the last non whitespace byte to zero. For leading, it means adjusting the pointer to the data to point to the first non whitespace byte as starting point. In both cases, the work done is trivial. To combine the operations into 1 function.. there is no way I see to reduce the work, so all you save is the function call and function delay. But here is the kicker... in parallel, if you optimally called trailing before leading, you could do both at once, as by the time you started to adjust the leading pointer you already found the trailing point, so each cpu can doctor the same string without hurting the other.

so if it can optimally call the 2 functions in parallel, its identical.
if it can't, calling one function is a tiny handful of sub nanosecond delay per record. After a few billion, you have a couple of seconds wasted.
PaulVL
Premium Member
Premium Member
Posts: 1315
Joined: Fri Dec 17, 2010 4:36 pm

Post by PaulVL »

Differ that question to a Teradata DBA.
Post Reply