ODBC taking longtime to unload the data

roydanlobo · Post by **roydanlobo** » Fri Oct 15, 2010 8:38 am

Hi,

I have an issue unloading the data from SQL DB to sequential file.

I have a sequence job in which there are 3 steps

My design:

1. ODBC----->Transformer------->Sequential file(S1)
2. S1------>Transformer------->S2
3. S2------->Transformer------->Teradata

In Step1, I am trying to unload data from SQL DB(connecting through ODBC stage) to a Sequential file. I have like 10 columns in SQL DB and in which 1 column has length of LONGVARCHAR(2183647) that has to be converted to varchar in transformer (because the same field is defined as VARCHAR in Teradata). It is taking like 30min for the 1st Step and rest of the steps run in sec.

Can anyone help me to increase the performance in Step1 for that particular column?

Thanks in advance

chulett · Post by **chulett** » Fri Oct 15, 2010 9:03 am

How are you 'converting it to varchar' in the transformer?

jcthornton · Post by **jcthornton** » Fri Oct 15, 2010 10:36 am

How fast does the ODBC stage run if you connect it to a copy (and nothing else)?

Are you sure it is the transformer that is the problem?

roydanlobo · Post by **roydanlobo** » Fri Oct 15, 2010 12:36 pm

Hey Criag,

Yes, I am doing it in transformer

roydanlobo · Post by **roydanlobo** » Fri Oct 15, 2010 12:59 pm

jcthornton wrote:How fast does the ODBC stage run if you connect it to a copy (and nothing else)?

Are you sure it is the transformer that is the problem?

I don't think transformer is a problem. I mentioned about the LONGVARCHAR column it is taking time to unload.

Say LONGVARCHAR column as TEXT

If I use the user-defined query in ODBC stage as below
select col1,col2,col3,col4,col5,col6....., cast(TEXT as varchar(9000)) as TEXT From source_table

It is taking like 20min.
I tried using the copy stage the it gave an error as below:

Error:The array size must be set to 1 when reading LOBs .

I changed the array size to 1 then I got lot warnings like

Warning: Invalid character(s) ([x2013]) found converting Unicode string (code point(s): The following contract/Asset (0744956400001/0697100104) was terminated on 09-Jul...) to codepage ISO-8859-1, substituting.

There is an option to enable the LOB's. I opted that and have given the name as TEXT there. then it is giving me an error say needs a key column and I don't have any key in that table.

Thanks

roydanlobo · Post by **roydanlobo** » Mon Oct 18, 2010 8:12 am

When I only copy stage, I am getting the below errors

Error:The array size must be set to 1 when reading LOBs .

I changed the array size to 1 then I got lot warnings like

Warning: Invalid character(s) ([x2013]) found converting Unicode string (code point(s): The following contract/Asset (0744956400001/0697100104) was terminated on 09-Jul...) to codepage ISO-8859-1, substituting.

There is an option to enable the LOB's. I opted that and have given the name as TEXT there. then it is giving me an error say needs a key column and I don't have any key in that table.

Thanks

jcthornton · Post by **jcthornton** » Mon Oct 18, 2010 2:05 pm

I'd check the version with just the copy stage against the original version. Replacing a transformer with a copy (when the copy has no output links) won't cause any new errors.

If you are not getting that error in the original, then something has to be different that is causing the error in the test job.

As for the specific error described, I would suggest that it is something particular to the Unicode settings on the job/stage/database that need to be worked out. It sounds generally like it is trying to convert a character that does not exist in the codepage selected by the DataStage job/stage exists in the input data. Perhaps this issue is avoided by how you were converting the string in the transformer.

Regardless, ensure that there are no output links on the copy stage. That simple 2-stage setup will show you just how fast the data is being grabbed from the database. If it is significantly faster than 20 minutes, the bottleneck is your transformer. If it is 20 minutes, the bottleneck is the database or the ODBC stage.