DSXchange: DataStage and IBM Websphere Data Integration Forum
gaelynalmeida
Posted: Thu Sep 14, 2017 10:37 am

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Additional info: RCP
Hi,

We have a job that reads a flat file, and loads it to a target using RCP. The sequential file stage reads a schema file to determine the layout.

Some files may have headers, some may not. I do not see an option to set this at run time in the Sequential File stage. The "First line is column names" setting is a drop-down with a True or False value, and I cannot override it with a parameter.

Question: Can I specify, somehow, the number of rows to skip? Can this be done in the schema file? I have not found this in any documentation so far.

Thank you
G. Almeida
chulett
Posted: Thu Sep 14, 2017 11:28 am

While Informatica has a "number of records to skip" for flat files, I don't recall DataStage having anything other than the "First line is column names" true/false option. Perhaps a question for your official support provider?

_________________
-craig

Watch out where the huskies go and don't you eat that yellow snow
Thomas.B
Posted: Tue Sep 19, 2017 5:53 am

You could use the "Filter" option of the Sequential File stage to skip the first rows of a file, for example:
Code:
sed -e '1,3d'

will skip the first 3 rows.
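
Since some files have a header and some do not, the number of rows to skip could also be passed in as a value rather than hard-coded. Below is a minimal sketch, assuming the Filter command receives the file on standard input. The helper name `skip_rows` is made up for illustration and is not a DataStage feature. `tail -n +K` starts output at line K, so a skip count of 0 passes the file through unchanged:

```shell
# Hypothetical helper: drop the first N lines of standard input.
# tail -n +K prints from line K onward, so K = N + 1.
skip_rows() {
  tail -n +"$(( $1 + 1 ))"
}

# Example: three header rows followed by two data rows.
printf 'h1\nh2\nh3\nrow1\nrow2\n' | skip_rows 3
```

In the Filter property itself the equivalent might look like `tail -n +#StartLine#`, where StartLine is a hypothetical job parameter holding the skip count plus one. This assumes job parameter substitution is available in that property, which is worth confirming in your environment.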

_________________
BI Consultant
Business & Decision
gaelynalmeida
Posted: Tue Sep 19, 2017 4:21 pm

Thank you, Craig. Yes, we will reach out to IBM to see what they say. In the meantime, we are using a filter.
gaelynalmeida
Posted: Tue Sep 19, 2017 4:23 pm

Thank you for your response, Thomas. Yes, we are using an awk filter at this time as a workaround. sed would work just as well.

The trouble is that awk drops some of our records because of non-printable characters, which we would much rather handle further upstream. This is why I was looking for some native functionality.

Perhaps sed will not drop these records - we'll test this out.
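
One way to run that test outside DataStage is to push a small sample containing non-printable bytes through the candidate filter and count what comes out. Below is a rough sketch, assuming a POSIX shell. The helper name `rows_kept` and the sample bytes are made up for illustration. `LC_ALL=C` is set on the assumption that the dropped records relate to how the tool interprets bytes under the current locale, which may not be the actual cause:

```shell
# Hypothetical check: count the rows that survive a sed-based skip.
# $1 = number of leading rows to drop (at least 1); data arrives on standard input.
rows_kept() {
  LC_ALL=C sed -e "1,${1}d" | wc -l | tr -d ' '
}

# One header row plus two data rows that contain the bytes 0x01 and 0xFF.
printf 'header\na\001b\nc\377d\n' | rows_kept 1
```

If the count equals the number of data rows in the sample, the filter is not dropping those records. The same comparison can be repeated with the awk command in place of sed.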
Thomas.B
Posted: Thu Sep 21, 2017 5:51 am

If the "sed" solution doesn't solve your problem, you can also drop the first rows of a file with a Transformer running in sequential mode:
Code:
Sequential File --> Transformer --> Column Generator --> Output


The Sequential File stage reads the file line by line and puts each line in a single varchar string.
I don't think lines containing non-printable characters will be dropped if each line is stored as a single string.

The Transformer is executed in sequential mode and filters the output rows with this constraint (to remove the first 3 rows):
Code:
@INROWNUM > 3


The Column Generator is used with the "Schema file" column method. That way it will split your string into the multiple fields defined by your schema file.

chulett
Posted: Thu Sep 21, 2017 7:34 am

Still applicable in an RCP scenario, though?

Thomas.B
Posted: Fri Sep 22, 2017 1:46 am

Yes, you just need to disable RCP in the Sequential File and Transformer stages and enable it in the Column Generator; the output will then follow the layout in the schema file.

gaelynalmeida
Posted: Thu Oct 05, 2017 9:28 am

Thank you for all the excellent answers. We are pretty far down our development path, so it is hard to turn back and add another job to the flow.

For now, I think the filter is our best option.

But the other options are good to know for future reference.
ray.wurlod
Posted: Thu Oct 05, 2017 11:13 pm

I would have thought that it is possible, since the data browser has that feature, as does the Sample stage. Why not create a job with a Sample stage that skips some rows and inspect the generated os ...

_________________
RXP Services Ltd
Melbourne | Canberra | Sydney | Hong Kong | Hobart | Brisbane
currently hiring: Canberra, Sydney and Melbourne
All times are GMT - 6 Hours