DSXchange: DataStage and IBM Websphere Data Integration Forum
myerdsdupp
Posted: Mon Jan 14, 2019 6:47 pm

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Hi All,

We are fetching data from a DB2 database, and the data contains HTML numeric character references.

As part of the requirement, we need to convert these to the corresponding Unicode characters.

Example input data: 705 Bluetooth&#174; Speaker Black
Expected data: 705 Bluetooth® Speaker Black

The field contains several different values like this, and each one needs to be changed to its respective Unicode character.

Could you please guide me toward a solution?
ray.wurlod
Posted: Tue Jan 15, 2019 12:23 am

Welcome aboard.

The UniSeq() function can generate the Unicode code point for any particular character. Use substringing functions to isolate the characters that you want to convert, and concatenation to re-construct the string.
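
For illustration outside DataStage, the substring-and-concatenate idea can be sketched in Python. This is only a sketch under assumptions, not DataStage code: ord() stands in for a code-point function such as UniSeq(), chr() is its inverse, and the helper name replace_first_numeric_reference is hypothetical. It handles a single decimal reference.

```python
def replace_first_numeric_reference(text: str) -> str:
    """Replace the first decimal reference such as '&#174;' with its character.

    Hypothetical helper for illustration; returns the text unchanged when no
    well-formed decimal reference is found.
    """
    start = text.find("&#")
    if start == -1:
        return text
    end = text.find(";", start)
    if end == -1:
        return text
    # Isolate the digits between "&#" and ";".
    digits = text[start + 2:end]
    if not (digits.isascii() and digits.isdigit()):
        return text
    # Concatenate: text before + decoded character + text after.
    return text[:start] + chr(int(digits)) + text[end + 1:]


print(ord("®"))  # 174
print(replace_first_numeric_reference("705 Bluetooth&#174; Speaker Black"))
# 705 Bluetooth® Speaker Black
```

The point of the sketch is that getting the code point of a character and turning a code point back into a character are inverse operations; the decoding direction is the one needed to rebuild the string.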

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
myerdsdupp
Posted: Wed Jan 16, 2019 8:58 pm

Thanks for the reply, Ray.

But UniSeq() only gives me the decimal number for that character.

The data from the source looks like "&#174;", which is the equivalent of the registered ® character. To convert it to a proper Unicode registered character, some forums suggest changing the ISO, which I am a bit hesitant to do.

The only option I am left with is to create a config file with all the characters, isolate the substring from the source string, perform a lookup to get the Unicode character, and replace it in the whole string.

It gets trickier when there are several different HTML-encoded characters in the input string.
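
To show what handling several different references in one string could look like without a lookup file, here is a minimal Python sketch. It is not DataStage code, and the function name decode_numeric_references and the sample strings are assumptions made for illustration. The number inside each reference is already the Unicode code point, so it can be converted directly.

```python
import html
import re

# Matches decimal (&#174;) and hexadecimal (&#xAE;) numeric references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_references(text: str) -> str:
    """Replace every numeric character reference with its Unicode character."""

    def _replace(match: "re.Match[str]") -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid code point: leave the reference as it was.
            return match.group(0)

    return _NUMERIC_REF.sub(_replace, text)


sample = "705 Bluetooth&#174; Speaker Black, Caf&#233; &#x2122;"
print(decode_numeric_references(sample))
# 705 Bluetooth® Speaker Black, Café ™

# The standard library does the same for these inputs and also
# understands named references such as &reg;.
print(html.unescape(sample))
```

Because the code point is carried in the reference itself, no per-character config file is needed for numeric references; a lookup would only matter for named references, which html.unescape already covers in Python.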
All times are GMT - 6 Hours