Loading 1252 fixed length file into UTF-8 DB2 - DB2CODEPAGE

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
franek
Participant
Posts: 4
Joined: Wed Jul 05, 2017 3:28 am
Location: Warsaw

Loading 1252 fixed length file into UTF-8 DB2 - DB2CODEPAGE

Post by franek »

I have set of jobs, that weren't supporting national characters, and now I have to load files with 1252 codepage. My target database is DB2 with UTF-8 - I checked it and I am able to insert e.g. characters with accents. I'm thinking how to adjust my DS jobs with minimum effort. To simplify, job design is:

Sequential file -> (set of processing stages) -> DB2 connector

Parallel job has UTF-8 NLS set, file is fixed length, all varchars are without unicode extended property. When I loaded 1252 file without changes in job, DS wasn't able to read 1252 characters - in debugger they were showed as '?' sign, all data was loaded into database, but when I ran select statement it turned out that fields containing accents were corrupted and were showed as <Error>. Here is what I did to adjust my job:
- changed NLS to 1252 - it didn't help, I was still seeing <Error> in selects
- set extended property to unicode in varchars - data was loaded, selects were ok, but I have warning about writing WVARCHAR into target VARCHAR column.

I tried to get rid of this warning, but only solution that worked was to change DB2 column from varchar to vargraphic.

So now it looks like I have to change all varchars in all my jobs to unicode and all varchars to vargraphic in DB2. This is not effortless solution.

viewtopic.php?p=456055 - this topic describes something similar to my problem, but setting DB2CODEPAGE environmental variable is not clear to me. If I set it to 1208, does it mean that now DataStage will know that my DB2 supports unicode characters? But what about DataStage itself, as it wasn't able to show accents in debugger without unicode parameter in varchars? Does DB2CODEPAGE variable have impact on whole job or if some varchar is not set to unicode there is no possibility to pass unicode characters wtih it?
intelcom
Premium Member
Premium Member
Posts: 25
Joined: Thu Feb 28, 2008 2:05 am

Re: Loading 1252 fixed length file into UTF-8 DB2 - DB2CODEP

Post by intelcom »

Are you sure that the sequential file is in Windows 1252 codepage.
You can tried to set NLS on Sequential file as Windows 1252 and write it to a dataset. Is is correct?
franek
Participant
Posts: 4
Joined: Wed Jul 05, 2017 3:28 am
Location: Warsaw

Re: Loading 1252 fixed length file into UTF-8 DB2 - DB2CODEP

Post by franek »

I did test with Data Set - NLS in job and sequential file is 1252 - When I use View Data... in Data Set, accents are visible and all columns are loaded properly. But in debug, these characters are still as '?'.

When I replaced Data Set with DB2 Connector, and left table as it was (with varchar), there was <Error> placeholder in field with accents when I run select.

Than I changed varchar column to vargraphic, data was loaded, there is no <Error>, but all accent characters are showed as unknown characters, e.g. "fran�ais"

Finally I changed this column to unicode varchar in DataStage job, and all characters are visible in selects.

I set DB2CODEPAGE environmenta variable (in Administrator, as User Defined for project) but it changed nothing - maybe I did it wrong?

I am still at the point where the only solution is to set all varchars to unicode or nvarchars in DataStage job and change DB2 varchar to vargraphic.
Post Reply