Best Way to Load flat file with no delimiters

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

Post Reply
whenry6000
Premium Member
Premium Member
Posts: 129
Joined: Thu Mar 02, 2006 8:28 am

Best Way to Load flat file with no delimiters

Post by whenry6000 »

All,
I am attempting to load a flat file into an Oracle table where there are no delimiters for the data. THe job is being ported over from Server edition, which was easy as the transformer had the substring function, which allowed me to break the one long string into columns based on position. However, it seems to be SLOW when moved over. I'd like to try it using a parallel job, just to see if performance is any better (yes, it is a sequential file, but still wondering if the overall performance of Parallel edition plus the Oracle Enterprise stage might be better).

However, the tranformer in Enterprise edition no longer has the substring function, and none of the other statges seems to be able to accomplish what I want. How would you go about doing this in Parallel edition?

Thanks!
ArndW
Participant
Posts: 16318
Joined: Tue Nov 16, 2004 9:08 am
Location: Germany
Contact:

Post by ArndW »

the transformer in PX has the same set of string functions as Server. But if you have fixed width columns with no separators it would make more sense to declare the columns appropriately in the sequential file stage and save reconverting them in a transform stage.
whenry6000
Premium Member
Premium Member
Posts: 129
Joined: Thu Mar 02, 2006 8:28 am

Post by whenry6000 »

ArndW wrote:the transformer in PX has the same set of string functions as Server. But if you have fixed width columns with no separators it would make more sense to declare the columns appropriately in the sequential file stage and save reconverting them in a transform stage.
That's the thing, I tried to look, but the transformer in Server seems to have a substring function, which allows you to give the column name, then the start position and the length. When I tried that same syntax in the Parallel edition transformer, it gave me an error.

I would declared the coluns, but some of the transofrms overlap. so for example, position 1-10 might b job number, but 1-5 might be client number. I'd have to break it up in the read, then run the transforms again, and I still can't do a substring (there's a "Left" and "Right' functions), so I'd be stiuck with the same issue again.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Don't use the substring function, use the substring operator - the square brackets: YourField[3,5] takes 5 characters starting from the 3rd. That syntax is universal across the two products.
-craig

"You can never have too many knives" -- Logan Nine Fingers
karrisuresh
Participant
Posts: 57
Joined: Sat Jun 09, 2007 1:14 am
Location: chicago

Post by karrisuresh »

Hi,
Using the left and right string functions in combination,we can reach the exact string position/extract that bit of required string

and I have a question here,

declare the columns appropriately in the sequential file stage and save reconverting them in a transform stage.
where in seq file is possible to do???

please let me know?
Hi I have experience in parallel extender datastage I am ready to give/take help from other
hope we all help each other hand in hand
whenry6000
Premium Member
Premium Member
Posts: 129
Joined: Thu Mar 02, 2006 8:28 am

Post by whenry6000 »

chulett wrote:Don't use the substring function, use the substring operator - the square brackets: YourField[3,5] takes 5 characters starting from the 3rd. That syntax is universal across the two products.
Ah, you learn something new every day! Thanks!
daignault
Premium Member
Premium Member
Posts: 165
Joined: Tue Mar 30, 2004 2:44 pm
Contact:

Post by daignault »

Define the columns as you would a COBOL flat file. Use CHAR(10) which represents the full length of the string (ie INPUTVAR[20,10]

Make sure you define the sequential input by it's full block length. This will be processed sequentially, but the decode of the flat file will be done in parallel if you use the Promote subrecord stage within Datastage.

Promote subrecord - promotes input subrecord columns to top-level columns

It should actually be a much faster process than in server.

Regards,

Ray D
roydanlobo
Participant
Posts: 21
Joined: Mon Mar 01, 2010 4:55 pm

Post by roydanlobo »

Hope this helps you.

Once I faced the same situation, Have to load the data from sequential file(not a delimited file) to Oracle database -Server Job.

I used routines, in which I defined the total length of the record and defining all the column and its length.Then used the routine in the transformer.
It worked.
Regards,
Lobo
Post Reply