How to handle field data type varchar2(4000)

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

mallikharjuna
Participant
Posts: 81
Joined: Thu Nov 30, 2006 7:46 am
Location: india

How to handle field data type varchar2(4000)

Post by mallikharjuna »

Hi,

I am reading data from an Oracle database and creating a file (the same process for around 100 tables). These are 1-to-1 straight mappings, so I am using schema files. One table has 3 columns, and the last column's data type is VARCHAR2(4000). When I try to write the records to a file, a record splits into more than one record. I think we are facing this issue because of the VARCHAR2(4000) column; that field has spaces between words and some spaces at the end as well. The data itself is not split across different records, but a new record containing only spaces is created. Please let me know how to write the records to the file without this issue.

Thanks in Advance..
MALLI
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Typically, a record "splits" like that when a large text field has record delimiters embedded in it. And typically that's because it has either been hand-typed by a user who is trying to make it "all pretty" or it has been pasted from a document into the database. Any chance of that, a stray carriage return perhaps? The EReplace() or Convert() function could be used to remove them before writing to the flat file, if that's the case. Or leverage the Oracle REPLACE function in the source SQL.
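For illustration, a minimal sketch of the source-SQL approach, using a hypothetical table SOURCE_TABLE and wide column BIG_TEXT_COL (both names are placeholders): nested REPLACE calls with CHR(13) and CHR(10) strip embedded carriage returns and line feeds before the rows ever leave Oracle.

    SELECT col1,
           col2,
           REPLACE(REPLACE(big_text_col, CHR(13), ''), CHR(10), '') AS big_text_col
      FROM source_table;

Wrapping the outer REPLACE in RTRIM would also take care of the trailing spaces mentioned above.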

All of which invalidates the idea of a straight-thru, schema file based RCP job for this one table, I do believe.
-craig

"You can never have too many knives" -- Logan Nine Fingers
mallikharjuna
Participant
Posts: 81
Joined: Thu Nov 30, 2006 7:46 am
Location: india

Post by mallikharjuna »

Hi Chulett,

I am using a multi-instance job and schema files to process the data for 100 tables. Is it possible to use these functions in a multi-instance job or in a schema file? Each table has a different set of columns.

Thanks
MALLI
mallikharjuna
Participant
Posts: 81
Joined: Thu Nov 30, 2006 7:46 am
Location: india

Post by mallikharjuna »

Sorry, not a multi-instance job, RCP only. I have this issue in more than one table, but I gave one table as an example.
MALLI
Mike
Premium Member
Posts: 1021
Joined: Sun Mar 03, 2002 6:01 pm
Location: Tampa, FL

Post by Mike »

A transformer stage must know its metadata at compile time, so to use a transform function such as Convert() you would need to operate on a named column. You could surround the transformer stage with a pair of parameterized modify stages to rename the column of interest (the modify stage does not need to know its metadata until run time). Set up a parameter set with a values file per table to assist with the process.
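As a rough sketch of that rename trick, assuming a job parameter pWideColumn that carries the table-specific column name and a fixed working name FIXED_TEXT (both illustrative), the Modify stage rename specifications and transformer derivation might look like:

    Upstream Modify stage specification:    FIXED_TEXT = #pWideColumn#
    Transformer derivation (input link In): Convert(Char(13) : Char(10), "", In.FIXED_TEXT)
    Downstream Modify stage specification:  #pWideColumn# = FIXED_TEXT

The Convert() call here deletes carriage returns and line feeds by mapping them to an empty replacement list.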

Another alternative is to clean the data in the database before you send it through your generic DataStage job. It would be fairly easy to write some SQL that utilizes the database catalog to generate SQL to replace newline characters embedded in the VARCHAR2(4000) fields.
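A hedged sketch of that generator against the standard Oracle catalog view ALL_TAB_COLUMNS (the schema name MYSCHEMA is illustrative); it emits one UPDATE per VARCHAR2(4000) column, each stripping embedded CR/LF characters:

    SELECT 'UPDATE ' || owner || '.' || table_name ||
           ' SET '   || column_name ||
           ' = REPLACE(REPLACE(' || column_name ||
           ', CHR(13), ''''), CHR(10), '''');'
      FROM all_tab_columns
     WHERE owner = 'MYSCHEMA'
       AND data_type = 'VARCHAR2'
       AND data_length = 4000;

The generated statements rewrite data in place, so they should be reviewed before being run and committed.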

Mike