DSXchange: DataStage and IBM Websphere Data Integration Forum
View next topic
View previous topic
Add To Favorites
This topic has been marked "Resolved."
Author Message
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Tue Jan 19, 2010 8:15 am Reply with quote Back to top

DataStage® Release: 7x
Job Type: Server
OS: Unix
Additional info: DataStage Multiload UTF8-> UTF8 Teradata 6706 error
Hi everybody,

I have multiload error.

I have UTF8 input -> Teradata multiload UTF8 default charset is ok BUT it fails with 6706 or 6705 error which says character error . As I know it is unicode error character such as U-FFFD or in ASCII 0x1a. How can I clear them from the txt?

I haven't get used to datastage well. Unfortunately I can not investigate from Teradata side with the translate_chk function. I don't have rights on the client side.

In this case I need Basic language solution.
Can somebody give me a good idea?
\
*Note - Subect Title Changed to be more descriptive - Andy*


Last edited by kurics40 on Thu Jan 21, 2010 6:40 pm; edited 1 time in total
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42950
Location: Denver, CO
Points: 221500

Post Posted: Tue Jan 19, 2010 9:12 am Reply with quote Back to top

If you really think that's your solution, use Convert() along with the decimal value of the character to strip it from the string:

Code:
Convert(CHAR(26),"",YourField)

Idea Next time please do a better job of choosing a thread subject. We know your question would be about "DataStage" so make it pertinent to your actual issue. This one could have been "Multiload character error" or something along those lines.

_________________
-craig

Now I've got that feeling once again, I can't explain; you would not understand
this is now who I am. I have become comfortably numb.
Rate this response:  
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Wed Jan 20, 2010 8:06 am Reply with quote Back to top

Thank you so much.

I guess I found out the source of the problem. U - 0000 FFFD = ef bf bd

This is the error character of the unicode and Teradata doesn't accept it.
How can I eliminate them with datatstage?
Rate this response:  
Not yet rated
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42950
Location: Denver, CO
Points: 221500

Post Posted: Wed Jan 20, 2010 8:30 am Reply with quote Back to top

As noted above.

_________________
-craig

Now I've got that feeling once again, I can't explain; you would not understand
this is now who I am. I have become comfortably numb.
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54514
Location: Sydney, Australia
Points: 295617

Post Posted: Wed Jan 20, 2010 2:53 pm Reply with quote Back to top

You don't eliminate them - they are your customer's data!

You figure out how to handle them, typically by specifying the correct character mappings in both DataStage and Teradata.

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
Rate this response:  
Not yet rated
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Wed Jan 20, 2010 5:24 pm Reply with quote Back to top

Yes that is true. Anyhow it is a data quality issue. with Convert(Unichar(65553),"", input_field) is perfectly solved the 6706 teradata error.
Somehow teradata doesn't want to accept the unicode error character. They are not printable. It appears in ASCII in this way: � Sometimes it is cube or ? , etc. ..

I tried with translate_chk customise the multiload stage, but it doesn't work. I wrote a multiload script and I run it from unix to see it can work from there or not. No success.

The data of the client please don't worry. I work on client side and unfortunatley I know what my boss said. It is the order to solve this issue. I receive it in appearantly injured utf8 format with a wrong extraction.

The 6705 error is more difficult.. It means some wrong multibyte stuff.
Can I do it with oConv( string, MCM ) or something similar?
I guess an ETL tool should has a good solution to handle the data.

Please who know the way to mend the bad unicode characters tell me.
If i get to know some usefull i ll post it.

Bye
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54514
Location: Sydney, Australia
Points: 295617

Post Posted: Wed Jan 20, 2010 6:35 pm Reply with quote Back to top

You can only "mend" what you call "bad" characters if you know what they are and what they should be. You can, of course, eliminate them (with permission). Oconv() supports NLSmapname as its sec ...

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
Rate this response:  
Not yet rated
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Thu Jan 21, 2010 2:43 am Reply with quote Back to top

Allright. Sorry for the comment. I see I did a mistake.

Okay lets investigate the issue:

There is one typical problematic row.

"357 243 267" =  makes the 99% the problem

cat 203row
|Suspect_GSFAISLPLFI2|FIC1303835|FIA1176256|SYTO INTERNATIONAL BV|SYTO INTERNATIONAL BV

od -c 203row - ASCII way
0000000 357 243 267 | S u s p e c t _ G S F A
0000020 I S L P L F I 2 | F I C 1 3 0 3
0000040 8 3 5 | F I A 1 1 7 6 2 5 6 | S
0000060 Y T O I N T E R N A T I O N A
0000100 L B V | S Y T O I N T E R N
0000120 A T I O N A L B V \n
0000133


od -x 203row
0000000 efa3 b77c 5375 7370 6563 745f 4753 4641
0000020 4953 4c50 4c46 4932 7c46 4943 3133 3033
0000040 3833 357c 4649 4131 3137 3632 3536 7c53
0000060 5954 4f20 494e 5445 524e 4154 494f 4e41
0000100 4c20 4256 7c53 5954 4f20 494e 5445 524e
0000120 4154 494f 4e41 4c20 4256 0a00
0000133




Teradata says:
6705 An illegally formed character string was
encountered during translation.
..
Unicode Any single-byte character (SBC).

It would be good if I can at least transform the unicode character till the | sign ASCII or LATIN if it is single-byte (damaged) Unicode.


Last edited by kurics40 on Thu Jan 21, 2010 5:07 pm; edited 1 time in total
Rate this response:  
Not yet rated
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54514
Location: Sydney, Australia
Points: 295617

Post Posted: Thu Jan 21, 2010 3:44 pm Reply with quote Back to top

I don't know the answer, because it will depend on how your data are actually encoded. And there's no way I can know this, so I suggest a more general approach.

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
Rate this response:  
Not yet rated
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Thu Jan 21, 2010 6:04 pm Reply with quote Back to top

Might be the answer is the recode UTF8 to UTF16 and after to UTF8 ...

- I know it can be a temporaly solution to replace those characters, but what if I will receive in the future different type of strange characters? -
Rate this response:  
Not yet rated
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Thu Jan 28, 2010 6:28 am Reply with quote Back to top

Hi all,

Might be I found a solution:

http://www.teradata.com/DownloadCenter/Topic9214-123-1.aspx

The other can be to write Inmod Multiload.
If somebody knows a DataStage solution for this it would be great.
Rate this response:  
Not yet rated
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42950
Location: Denver, CO
Points: 221500

Post Posted: Thu Jan 28, 2010 6:29 am Reply with quote Back to top

Did you ask your official support provider?

_________________
-craig

Now I've got that feeling once again, I can't explain; you would not understand
this is now who I am. I have become comfortably numb.
Rate this response:  
Not yet rated
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Thu Jan 28, 2010 7:11 am Reply with quote Back to top

chulett wrote:
Did you ask your official support provider?


I don't understand what you mean. This is the offical Teradata state in this topic.

Btw
I am looking for DataStage solution.
Rate this response:  
Not yet rated
chulett

Premium Poster


since January 2006

Group memberships:
Premium Members, Inner Circle, Server to Parallel Transition Group

Joined: 12 Nov 2002
Posts: 42950
Location: Denver, CO
Points: 221500

Post Posted: Thu Jan 28, 2010 8:37 am Reply with quote Back to top

Confused Ask whomever is officially providing you with DataStage support.

_________________
-craig

Now I've got that feeling once again, I can't explain; you would not understand
this is now who I am. I have become comfortably numb.
Rate this response:  
Not yet rated
kurics40



Group memberships:
Premium Members

Joined: 18 Nov 2009
Posts: 61

Points: 968

Post Posted: Thu Jan 28, 2010 8:49 am Reply with quote Back to top

Aha! I see. No I haven't asked the offical Datastage support. Good idea! Smile
That will be the next step.
Rate this response:  
Not yet rated
Display posts from previous:

Add To Favorites
View next topic
View previous topic
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum



Powered by phpBB 2001, 2002 phpBB Group
Theme & Graphics by Daz :: Portal by Smartor
All times are GMT - 6 Hours