NLS Map for Special Characters

Infosphere's Quality Product

Moderators: chulett, rschirm

ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Welcome aboard.

Does MS-1252 not recognise these Windows special characters?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Can you not ask whoever created the file for you what encoding it is using?
-craig

"You can never have too many knives" -- Logan Nine Fingers
QualityStageatGM
Premium Member
Posts: 10
Joined: Thu Jul 09, 2015 7:10 am

Post by QualityStageatGM »

It came in with Oracle's version of UTF-8.

I'll try creating a rule and see if the rule will correctly identify each special character as different from the others.

Thanks.
weiyi_will
Participant
Posts: 10
Joined: Sun Aug 11, 2013 10:46 pm
Location: Dalian

Re: NLS Map for Special Characters

Post by weiyi_will »

Did you try manually saving the text files as UTF-8, setting the NLS map to UTF-8 in the job properties, and setting the data type to NVarChar in the column definition?
QualityStageatGM
Premium Member
Posts: 10
Joined: Thu Jul 09, 2015 7:10 am

Post by QualityStageatGM »

Sorry for the late reply, I was out of the office for a couple of days. I've checked the job parameters and the stage properties, and both are set to UTF-8. It seems that the box-like characters are still not being transferred correctly.
QualityStageatGM
Premium Member
Posts: 10
Joined: Thu Jul 09, 2015 7:10 am

Post by QualityStageatGM »

Would you guys know which NLS map covers the widest range of characters?

Thanks.
priyadarshikunal
Premium Member
Posts: 1735
Joined: Thu Mar 01, 2007 5:44 am
Location: Troy, MI

Post by priyadarshikunal »

Can you get a hex dump of those characters?
Priyadarshi Kunal

Genius may have its limitations, but stupidity is not thus handicapped. :wink:
QualityStageatGM
Premium Member
Posts: 10
Joined: Thu Jul 09, 2015 7:10 am

Post by QualityStageatGM »

I ran xxd on a file I created containing characters such as ▒, ╦ and ┼.
The output is below.
Please let me know if this is what you're looking for or if you need a different file.

Code:

0000000: efbb bf22 496e 666f 7370 6865 7265 496e  ..."InfosphereIn
0000010: 666f 726d 6174 696f 6e22 0d0a 2249 6e66  formation".."Inf
0000020: 6f73 7068 6572 e296 9249 6e66 6f72 6d61  ospher...Informa
0000030: 7469 6f6e 220d 0a22 496e 666f 7370 6865  tion".."Infosphe
0000040: 72c3 ac49 6e66 6f72 6d61 7469 6f6e 220d  r..Information".
0000050: 0a22 496e 666f 7370 6865 72c5 9549 6e66  ."Infospher..Inf
0000060: 6f72 6d61 7469 6f6e 220d 0a22 496e 666f  ormation".."Info
0000070: 7370 6865 72c7 9d49 6e66 6f72 6d61 7469  spher..Informati
0000080: 6f6e 220d 0a22 496e 666f 7370 6865 72cf  on".."Infospher.
0000090: 81c7 9c49 6e66 6f72 6d61 7469 6f6e 220d  ...Information".
00000a0: 0a22 496e 666f 7370 6865 72e2 94bc 496e  ."Infospher...In
00000b0: 666f 726d 6174 696f 6e22 0d0a 2249 6e66  formation".."Inf
00000c0: 6f73 7068 6572 e295 a649 6e66 6f72 6d61  ospher...Informa
00000d0: 7469 6f6e 220d 0a22 496e 666f 7370 6865  tion".."Infosphe
00000e0: 7265 2049 6e66 6f72 6d61 7469 6f6e 220d  re Information".
00000f0: 0a22 496e 666f 7370 6865 7249 6e66 6f72  ."InfospherInfor
0000100: 6d61 7469 6f6e e296 9222 0d0a 2249 6e66  mation...".."Inf
0000110: 6f73 7068 6572 496e 666f 726d 6174 696f  ospherInformatio
0000120: 6ec3 ac22 0d0a 2249 6e66 6f73 7068 6572  n..".."Infospher
0000130: 496e 666f 726d 6174 696f 6ec5 9522 0d0a  Information.."..
0000140: 2249 6e66 6f73 7068 6572 496e 666f 726d  "InfospherInform
0000150: 6174 696f 6ec7 9d22 0d0a 2249 6e66 6f73  ation..".."Infos
0000160: 7068 6572 496e 666f 726d 6174 696f 6ecf  pherInformation.
0000170: 81c7 9c22 0d0a 2249 6e66 6f73 7068 6572  ...".."Infospher
0000180: 496e 666f 726d 6174 696f 6ee2 94bc 220d  Information...".
0000190: 0a22 496e 666f 7370 6865 7249 6e66 6f72  ."InfospherInfor
00001a0: 6d61 7469 6f6e e295 a622 0d0a 2249 6e66  mation...".."Inf
00001b0: 6f73 7068 6572 6520 496e 666f 726d 6174  osphere Informat
00001c0: 696f 6ee2 95a6 220d 0a                   ion..."..
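
For anyone reading the dump, the special characters can be picked out by decoding the bytes as UTF-8 and listing every character that takes more than one byte. The following is a minimal Python sketch, not a DataStage feature: `find_non_ascii` is a hypothetical helper name, and `sample` holds only the first three rows of the xxd output (offsets 0x00 to 0x2f).

```python
import unicodedata


def find_non_ascii(data: bytes):
    """Return (byte_offset, hex_bytes, char, unicode_name) for each non-ASCII character."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    hits = []
    offset = 0
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:  # more than one byte in UTF-8 means non-ASCII
            hits.append((offset, encoded.hex(), ch, unicodedata.name(ch, "<unnamed>")))
        offset += len(encoded)
    return hits


# First three rows of the xxd output (assumption: copied by hand from the dump).
sample = bytes.fromhex(
    "efbbbf22496e666f737068657265496e"
    "666f726d6174696f6e220d0a22496e66"
    "6f7370686572e29692496e666f726d61"
)

for offset, hex_bytes, ch, name in find_non_ascii(sample):
    print(f"0x{offset:04x}  {hex_bytes}  U+{ord(ch):04X}  {name}")
```

On this sample it reports the byte-order mark (ef bb bf) at offset 0x00 and U+2592 MEDIUM SHADE (e2 96 92) at offset 0x26. Running it over the whole file would list the remaining characters the same way.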
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

We can't see your "special" characters in this dump. Can you highlight the hex codes that correspond to them?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

So, just looking at the first one, the hex bytes between "Infospher" and "Information" are e2 96 92 (offsets 0x26 through 0x28 in your dump; the 49 that follows is the "I" of "Information"). You will need to find an appropriate Unicode map that tells you what this three-byte sequence is supposed to be. You will also need to find out how these "rogue" byte sequences got into the data in the first place, and take steps to remediate that.
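
One way to check what those bytes are supposed to be, and which single-byte code pages can hold the result, is sketched below in Python. The codec names `cp1252` and `cp437` are Python's own and are only assumed to correspond to similarly named NLS maps; `can_encode` is a hypothetical helper.

```python
def can_encode(ch: str, codec: str) -> bool:
    """True if the character has a mapping in the given Python codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# The four bytes at offsets 0x26 to 0x29 of the dump.
raw = bytes.fromhex("e2969249")
decoded = raw.decode("utf-8")

# The first three bytes form one character; the trailing 0x49 is the ASCII letter "I".
print([f"U+{ord(c):04X}" for c in decoded])

shade = decoded[0]
for codec in ("cp1252", "cp437", "utf-8"):
    print(codec, can_encode(shade, codec))
```

The bytes decode cleanly as UTF-8 to U+2592 followed by "I". Python's `cp1252` codec has no mapping for U+2592, while `cp437` and `utf-8` do, so a Windows-1252 style map would not be expected to carry this character.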