
Loading a full file into a field?

Posted: Fri Oct 20, 2017 10:34 am
by simc
So I have a requirement to load a full file into the database.

I've tried a couple of different approaches. First, I catted the file in a command stage and used the result as a variable to insert into the database, but there seems to be a size limitation on that variable.

Second, I tried reading the file with a Sequential File stage, but I can't get it to read the whole file into one record, one field... it reads each line into a separate record. So I tried updating one row and appending each line of the file to it... that works, but takes about 4 minutes per file.

So my question is: does anyone have an elegant solution to read in a full file and insert it into one record?

Thanks from a relative newbie!
Charles

Posted: Fri Oct 20, 2017 10:53 am
by chulett
Curious what kind of field we're talking about here, a CLOB?

Posted: Fri Oct 20, 2017 10:56 am
by asorrell
Could you add the following please?

Database type and the definition of the "field" that is being used to store the data.

Maximum expected file size (if you have it) - is it megabytes, terabytes, etc.

Format for the original file - does it contain fixed length records, variable length, Unix or DOS format?

More detail is better!

Posted: Fri Oct 20, 2017 11:38 am
by simc
It's an Oracle database, 12c, and the field type is CLOB.

The file format is simple text, and the samples I have are mostly under 200 KB, with a couple in the 1.5 MB range. I can't see them being more than 3-4 MB.

The format is fixed length records, DOS format I believe.

Here's a quick sample, with the formatting a little off:

Code:

220160101CLO CLO 1ADR 1 1       24700        3705        3705           0           0           0           0       45280
220160101CLO CLO 1ADR 1 2        2800         420         420           0           0           0           0        1915
220160101CLO CLO 1ADR 1 3        2500         375         375           0           0           0           0        1915
220160101CLO CLO 1ADR 1 5       21900        5212        5212           0           0           0           0        4175
220160101CLO CLO 1ADR 1 6       13800        3558        3558           0           0           0           0        7695
220160101CLO CLO 1ADR 1 7           0           0           0           0           0           0           0           0
220160101CLO CLO 1ADR 1 8           0           0           0           0           0           0           0           0

Posted: Fri Oct 20, 2017 12:49 pm
by chulett
The forum software removes all 'extra' whitespace, hence the formatting issue. Wrap anything you want to preserve in [code][/code] tags, something I did for your post.

FYI, some quick searches out in the wild turn up several "Ask Tom" questions similar to this, all of which were directed to treat the file as an external table and use PL/SQL to load it into the CLOB. Of course.  :wink: Not quite sure how you would turn that into a DataStage job off the top of my head.
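Outside of DataStage, the client-side version of this is short: read the whole file into one string (newlines and all) and bind it as a single value. A minimal Python sketch; the FILE_ARCHIVE table, FILE_BODY column, and the python-oracledb calls shown in the comment are illustrative assumptions, not anything confirmed in this thread:

```python
import tempfile
from pathlib import Path

def read_whole_file(path):
    """Read the entire file as one string, newlines included."""
    return Path(path).read_text()

# Demo with a small throwaway sample file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("line 1\nline 2\nline 3\n")
    sample = f.name

content = read_whole_file(sample)
print(repr(content))        # a single value containing all three lines
print(content.count("\n"))  # 3 line terminators, but still one field

# Binding that into a CLOB column would then look roughly like this
# (hypothetical FILE_ARCHIVE table with a CLOB column FILE_BODY, using
# the python-oracledb driver; not executed here):
#
#   cur.setinputsizes(body=oracledb.DB_TYPE_CLOB)
#   cur.execute("INSERT INTO file_archive (file_name, file_body) "
#               "VALUES (:name, :body)", name=sample, body=content)
```

At the sizes mentioned later in the thread (a few MB at most), a plain string bind like this is well within what a CLOB bind variable handles.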

Posted: Thu Oct 26, 2017 12:12 pm
by UCDI
This is not as easy as I thought it would be.

Can you pipe the output of cat (unix) or type (windows) into it?

Posted: Thu Oct 26, 2017 12:47 pm
by PaulVL
This may be a silly question, but your file seems very much like structured data to me... why do you want to load the FILE into the database when loading the DATA into a TABLE would be best?

You could add a few columns for filename and timestamps if needed... but I think a regular table with 13 (+?) fields would suit you best.
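For what it's worth, the sample rows posted earlier do split cleanly into 13 whitespace-delimited tokens. A small Python sketch; the column names here are guesses on my part, not from any spec in the thread:

```python
# One row from the sample posted above
line = ("220160101CLO CLO 1ADR 1 1       24700        3705"
        "        3705           0           0           0"
        "           0       45280")

tokens = line.split()
print(len(tokens))  # 13

# Hypothetical column names; the real layout would come from the file's
# spec (the first token itself looks like a composite of a record type
# '2', a date '20160101', and a code 'CLO').
columns = ["key", "code", "unit", "seq", "item",
           "amt1", "amt2", "amt3", "amt4",
           "amt5", "amt6", "amt7", "amt8"]
row = dict(zip(columns, tokens))
print(row["key"], row["amt1"])  # 220160101CLO 24700
```

If the files really are fixed-width rather than space-delimited, the same idea applies with string slicing at the known column offsets instead of split().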

Posted: Sun Oct 29, 2017 12:50 am
by ray.wurlod
You could try setting the line terminator to None (or to something that will never occur in the data) in the Sequential File stage.

Posted: Mon Oct 30, 2017 5:21 am
by eostic
Or use a Server Job. The Folder Stage will easily pull the entire contents into a single column "as is". For files of a few hundred meg or less, it works perfectly every time.

Ernie

Posted: Mon Oct 30, 2017 2:37 pm
by ray.wurlod
Good catch, Ernie.
One can encapsulate the Folder stage in a Server Shared Container and use that in a parallel job, if that's what you need.

Posted: Wed Nov 01, 2017 9:12 am
by UCDI
PaulVL wrote:This may be a silly question, but your file seems very much like structured data to me... why do you want to load the FILE into the database when loading the DATA into a TABLE would be best?

You could add a few columns for filename and timestamps if needed... but I think a regular table with 13 (+?) fields would suit you best.
Version/source control for code is nothing but a database of files. There are times when that sort of thing can be useful. We archive a few generated files to a DB for various reasons. If nothing else, it beats having a massive archive folder, makes pulling by date/time easier, makes deleting outdated 'files' easier, and backup is taken care of, etc. I guess all of that can be summed up as "convenience".