Selecting fields using Schema files

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

samratisking
Participant
Posts: 37
Joined: Tue Jan 29, 2008 6:03 am
Location: Guntur

Post by samratisking »

Hi All,

I am trying to build a job with SequentialFile-->Transformer-->Join--->Target(SQL Server Table).

For this I am trying to use a schema file to supply the table definition for the Sequential File stage (a .csv file), since my source metadata might change frequently and I'm using RCP in my job.

My sequential file has around 600 fields, of which only 60 are relevant, and these might change later (fields may be added or deleted). I am trying to find out whether the schema file maps the metadata by field name or by column position using the specified delimiter.

As the fields I need are scattered throughout the file, I need to select them dynamically (my reason for using a schema file).

Please let me know if you have any ideas for implementing this.

Best Regards,
Sam
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Don't have any direct experience with this but it seems to me it would be easy enough to create a small test job with a handful of columns and try this out for yourself. Then let us know. :wink:

What purpose is the "Join" serving? That seems to me to be a deal breaker with RCP...
-craig

"You can never have too many knives" -- Logan Nine Fingers
samratisking
Participant
Posts: 37
Joined: Tue Jan 29, 2008 6:03 am
Location: Guntur

Post by samratisking »

I tried it and could not get the result I was looking for.

The join is being used with a Key that I am bringing out of the Source explicitly and I do not have any concerns about the join.

My only issue is with the Schema file and the way it works.

Any ideas please?

Regards,
Sam
Samratisking
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Why not detail what exactly was the "it" that you tried and what the result was that you actually received? People should be able to give more targeted help from there.
-craig

"You can never have too many knives" -- Logan Nine Fingers
samratisking
Participant
Posts: 37
Joined: Tue Jan 29, 2008 6:03 am
Location: Guntur

Post by samratisking »

Sure.

Here you go.

To test the schema file, I took a simple job SeqFile--->SeqFile.

I defined a simple schema file as below.

record {final_delim=end,delim=',',record_delim='\n',quote=none,padchar='#'}
(
        Col3:string[max=255];
        Col1:string[max=255]
)
My input file is as follows.

Col1,Col2,Col3,Col4
Hi,Hello,First,Last
How,are,you,?
How,is,your,Day?

I have selected "First line is columns" property in my job.

The output that I expected is:

Col3, Col1
First,Hi
you,How
your,How

But the output I got is:

Col3
Col1,Col2,Col3,Col4
Hi,Hello,First,Last
How,are,you,?
How,is,your,Day?

When I set the "First line is columns" property in the Source file to "False", this is the output

Col3
Hi,Hello,First,Last
How,are,you,?
How,is,your,Day?


Regards,
Sam
Samratisking
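
One possible reading of the output in the post above, offered only as an assumption and not confirmed anywhere in this thread: the schema fields are matched by position, so Col3 takes the first delimited value and Col1, being the last field with final_delim=end, takes the rest of the record. The Python sketch below is not DataStage code; parse_by_position is a hypothetical helper that mimics that assumed behaviour on the sample data and shows why the rows would come out looking unchanged.

```python
def parse_by_position(line, field_names, delim=","):
    """Assign delimited values to schema fields purely by position.

    Assumption: every field except the last stops at the next delimiter,
    and the last field takes everything up to the end of the record.
    """
    values = line.split(delim, len(field_names) - 1)
    return dict(zip(field_names, values))


# Field order as declared in the schema file from the post.
schema_fields = ["Col3", "Col1"]

# Data rows from the sample input file.
data_lines = ["Hi,Hello,First,Last", "How,are,you,?", "How,is,your,Day?"]

parsed = [parse_by_position(line, schema_fields) for line in data_lines]

# Writing the two fields back out with the same delimiter reproduces the input.
rewritten = [",".join(row[name] for name in schema_fields) for row in parsed]

for row in parsed:
    print(row)
```

Under this assumption the field names in the schema act only as labels, which would be consistent with the rows passing through unchanged in both runs.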
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

It is the very nature of sequential files that you must read every column. You must, indeed, read past every single byte to get to the next. Using a schema file does not get around this requirement, though you may be able to leverage the "Drop on Import" property - something I've not seen done with schema files, but it may nonetheless be possible. Over to you to investigate.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
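
If every column has to be read anyway, one workaround is to pick the wanted columns by header name before the file reaches the job. The sketch below does this in plain Python with the standard csv module; it is a pre-processing idea outside DataStage, not something proposed in this thread, and select_columns is a hypothetical helper name.

```python
import csv
import io


def select_columns(src, wanted):
    """Return CSV text holding only the wanted columns, in the wanted order.

    Columns are looked up by header name, so their position in the
    source file does not matter.
    """
    reader = csv.DictReader(src)
    missing = [name for name in wanted if name not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found in header: {missing}")

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(wanted)
    for row in reader:
        writer.writerow([row[name] for name in wanted])
    return out.getvalue()


# Sample input from the earlier post, held in memory for the demonstration.
sample = (
    "Col1,Col2,Col3,Col4\n"
    "Hi,Hello,First,Last\n"
    "How,are,you,?\n"
    "How,is,your,Day?\n"
)

result = select_columns(io.StringIO(sample), ["Col3", "Col1"])
print(result)
```

For a real file, an open file handle would replace the io.StringIO wrapper, and the list of wanted names could be read from a small control file so that it can change without editing the job.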
chulett
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Dang, should have caught that aspect of this right away... the nature of sequential media, indeed.
-craig

"You can never have too many knives" -- Logan Nine Fingers
moalik
Participant
Posts: 39
Joined: Thu Sep 15, 2011 8:15 am
Location: Melbourne

Post by moalik »

Hi Samrat,

I think the concept of partial schemas might come in handy, though I haven't tried it myself. You can have a look at the link below from IBM.

https://www-01.ibm.com/support/knowledg ... hemas.html

Please update us if it resolves your problem :D

Thanks
Mohsin Khan
Datastage Consultant