Re[2]: A question about NLS

Archive of postings to DataStageUsers@Oliver.com. This forum is intended only as a reference and cannot be posted to.

Moderators: chulett, rschirm

Post by admin »

Hi David,

Thank you for the answer. Is DS (UV) able to operate like IDS with its GLS? That would be great! But I think it is not. BTW, are there any plans to use DataStage with its repository on IDS?

Anyway, if I have not turned on NLS and my data sources contain NLS data (e.g. CP866, PC1251, etc.), what features am I losing? Is it only sorting?

Why these questions, you may ask? For a big job, an overhead of one unnecessary byte per character is significant.

Wednesday, January 31, 2001, 4:55:53 PM, you wrote:

DTM> At 04:36 PM 1/31/01 +0300, Marat S. Salimov wrote:
>>Hi all,
>>
>>It's a simple question, but I want to be sure of the exact answer. When I
>>start a project with NLS, is it true that all interaction with the DS
>>Server and the workflow of jobs are carried out in Unicode? As I understand
>>it, the overhead is one extra byte per byte, i.e. a performance
>>impact. Am I right? Is it true that there is no need to
>>turn it on?



DTM> Sort of. The DS Server (UniVerse) employs a full range of NLS
DTM> capabilities, including both locale support as well as character
DTM> mapping.

DTM> For character mapping, all data is held internally in what is known
DTM> as UTF-8 format, which is an encoding of Unicode. The fixed-width
DTM> 16-bit form of Unicode imposes a double-byte standard on all character
DTM> data, meaning your data set sizes double for everything you do. It also
DTM> makes it very hard to have reserved characters that are available in any
DTM> language. This poses a problem for the DS Server because it uses a
DTM> series of special characters to help delimit data (what we call the
DTM> mark characters) as well as to represent the SQL NULL value.
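
The size trade-off described above can be checked with a short Python sketch. It is only an illustration, not DataStage code: the sample strings and the `encoded_sizes` helper are made up for this example, and Python's `cp866` codec stands in for a single-byte Cyrillic source.

```python
# Illustration only: compare how many bytes the same text needs in a
# single-byte code page, in UTF-8, and in fixed-width 16-bit Unicode.
# The helper name and sample strings are hypothetical.

def encoded_sizes(text):
    """Return the encoded length in bytes of `text` for three encodings."""
    return {
        "cp866": len(text.encode("cp866")),          # single-byte DOS Cyrillic
        "utf-8": len(text.encode("utf-8")),          # variable width
        "utf-16-le": len(text.encode("utf-16-le")),  # 2 bytes per BMP character
    }

for sample in ("DataStage", "Привет"):
    print(sample, len(sample), encoded_sizes(sample))
```

For pure ASCII text, UTF-8 costs the same as the single-byte code page, while the 16-bit form doubles it. For Cyrillic text, UTF-8 and the 16-bit form both need two bytes per character, which is the overhead the original question is concerned about.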

DTM> What UTF-8 does is provide a mapping whereby the byte sequence is
DTM> self-defining. It provides two important features:
DTM> a) All ASCII data, which is traditionally represented with a single
DTM>    byte, is still represented in that fashion. For shops that deal
DTM>    predominantly with the ISO8859 character set, this greatly reduces
DTM>    your storage space as well as your mapping requirements.
DTM> b) It allows us to create a private reserved set of characters in
DTM>    the mapping that are uniquely ours and available.
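
What "self-defining" means can be sketched in Python: the first byte of each sequence says how many bytes the character occupies. This sketch follows standard UTF-8 only; it does not model the private reserved characters mentioned in the message, and the function names are hypothetical.

```python
# Sketch of the "self-defining" property of standard UTF-8: the lead byte
# alone determines the length of the sequence. Function names are
# hypothetical; the DS Server's private reserved characters are not modelled.

def utf8_sequence_length(lead_byte):
    """Return how many bytes the sequence starting with `lead_byte` occupies."""
    if lead_byte < 0x80:
        return 1  # 0xxxxxxx: plain ASCII, still a single byte
    if 0xC2 <= lead_byte <= 0xDF:
        return 2  # 110xxxxx: e.g. Cyrillic, accented Latin letters
    if 0xE0 <= lead_byte <= 0xEF:
        return 3  # 1110xxxx: rest of the Basic Multilingual Plane
    if 0xF0 <= lead_byte <= 0xF4:
        return 4  # 11110xxx: supplementary planes
    raise ValueError("not a valid UTF-8 lead byte: 0x%02X" % lead_byte)

def split_characters(data):
    """Split UTF-8 bytes into one bytes object per character."""
    chunks = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chunks.append(data[i:i + n])
        i += n
    return chunks

print(split_characters("Aя€".encode("utf-8")))
```

Because no lookup table or external state is needed to find character boundaries, a reader of the byte stream can always tell an ASCII byte from part of a multi-byte character.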

DTM> Now, that's how data is handled internally. At every external boundary,
DTM> you can establish the appropriate mapping, so that the data is mapped
DTM> into the appropriate external representation. This means, if need be,
DTM> you can define different maps for different data sources/targets. For
DTM> example, you may have some data sources/targets that want ISO8859-5 (for
DTM> Cyrillic), or perhaps PC866 if the data comes from an NT data source.
DTM> Additionally, perhaps one of your sources uses KOI8-R, another version of
DTM> the Russian/Cyrillic character set, and still another uses ISO8859-1.
DTM> All could be correctly handled using NLS.
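
The idea of mapping at each external boundary can be sketched in Python. This is an analogy and not the DataStage NLS interface: Python codec names (`koi8_r`, `cp866`, `iso8859_5`) stand in for NLS map names, and the three helper functions are hypothetical.

```python
# Analogy only: one internal representation, with a separate map applied at
# each external boundary. Codec names are Python's, not DataStage NLS map
# names, and the helper functions are hypothetical.

def to_internal(raw, source_map):
    """Map bytes from an external source into the internal representation."""
    return raw.decode(source_map)

def to_external(text, target_map):
    """Map internal text into the representation a target expects."""
    return text.encode(target_map)

def transcode(raw, source_map, target_map):
    """Move data from one external representation to another."""
    return to_external(to_internal(raw, source_map), target_map)

word = "Москва"
from_koi8 = word.encode("koi8_r")  # bytes as a KOI8-R source would supply them

for target in ("cp866", "iso8859_5", "utf-8"):
    print(target, transcode(from_koi8, "koi8_r", target).hex())
```

The same word has different bytes in each code page, so reading a source with the wrong map silently produces the wrong characters. That is the case where a per-source map matters.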

DTM> Because of all this, the answer to your question of whether you need it
DTM> is "it depends". If all your data exchange is done using ISO8859-1, then
DTM> you don't need it. Otherwise, it's very possible you do.

DTM> If you have any other questions, let me know...

DTM> Dave

DTM> ========================================================================
DTM> David T. Meeks || "All my life I'm taken by surprise
DTM> Development Engineer, DataStage || I'm someone's waste of time
DTM> Ascential Software || Now I walk a balanced line
DTM> dave.meeks@ascentialsoftware.com || and step into tomorrow" - IQ
DTM> ========================================================================



Best regards,
Marat mailto:maratkotik@mail.ru