Datastage Parallel schema file - indicate # of rows to skip
Moderators: chulett, rschirm, roy
-
- Premium Member
- Posts: 12
- Joined: Fri Jul 28, 2017 1:01 pm
Hi,
We have a job that reads a flat file, and loads it to a target using RCP. The sequential file stage reads a schema file to determine the layout.
Some files may have headers, some may not. I do not see an option to set this at run time in the sequential file stage. The setting for "First line is column names" is a drop-down with only True or False values, and I cannot override it with a parameter.
Question: Can I specify, somehow, the number of rows to skip? Can this be done in the schema file? I have not found this in any documentation so far.
Thank you
G. Almeida
You could use the "Filter" option of the Sequential File stage to skip the first rows of a file. For example, this filter will skip the first 3 rows:
Code:
sed -e '1,3d'
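Since some files have headers and some do not, the number of rows to skip could be made a variable instead of a fixed 3. Below is a minimal sketch, assuming a POSIX shell is available to the filter command. The helper name skip_rows is hypothetical, and whether your DataStage version lets you pass a job parameter into the Filter command is an assumption to verify. It uses tail rather than sed because a count of 0 does not work with the sed form: with the address range 1,0 sed still selects line 1, so sed -e '1,0d' would delete the first record.

```shell
#!/bin/sh
# skip_rows N: copy stdin to stdout, dropping the first N lines.
# (Hypothetical helper name, for illustration only.)
skip_rows() {
    n="${1:-0}"
    # tail -n +K starts output at line K, so K = N + 1.
    # With N = 0 this becomes "tail -n +1", which passes every line through.
    tail -n "+$((n + 1))"
}

# File with three header lines: only the data rows are printed.
printf 'h1\nh2\nh3\nrow1\nrow2\n' | skip_rows 3

# File with no header: nothing is dropped.
printf 'row1\nrow2\n' | skip_rows 0
```

Used directly as a filter, the equivalent of the sed example would be tail -n +4 to skip 3 rows.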
BI Consultant
DSXConsult
-
- Premium Member
- Posts: 12
- Joined: Fri Jul 28, 2017 1:01 pm
Thank you for your response, Thomas. Yes, we are using an awk filter at this time as a work-around. sed would work just as well.
The trouble is that awk drops some of our records because of non-printable characters, which we would much rather handle further upstream. This is why I was looking for some native functionality.
Perhaps sed will not drop these records - we'll test this out.
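One way to run that test is to compare line counts before and after the filter. This is only a sketch: the helper name count_after_skip and the sample data are made up for illustration, and setting LC_ALL=C (so that sed treats the input as plain bytes rather than locale-encoded text) is an assumption about what may be causing the dropped records, not a confirmed diagnosis. A matching count on a small sample does not prove anything about the real files, so it would need to be run against one of them.

```shell
#!/bin/sh
# count_after_skip FILE N: print how many lines survive sed -e '1,Nd' (N >= 1).
# (Hypothetical helper name, for illustration only.)
count_after_skip() {
    # LC_ALL=C asks sed to work on raw bytes instead of locale-encoded characters.
    LC_ALL=C sed -e "1,${2}d" "$1" | wc -l | tr -d ' '
}

# Sample file: one header line plus three records,
# one of which contains a control byte (octal 001).
sample=$(mktemp)
printf 'header\nrow1\nro\001w2\nrow3\n' > "$sample"

total=$(wc -l < "$sample" | tr -d ' ')
kept=$(count_after_skip "$sample" 1)

# If kept is lower than expected, the filter is losing records.
echo "total=$total kept=$kept expected=$((total - 1))"
rm -f "$sample"
```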
Yes, you just need to disable it in the Sequential File and the Transformer stages, activate it in the Column Generator and the output will represent the schema file.
BI Consultant
DSXConsult
-
- Premium Member
- Posts: 12
- Joined: Fri Jul 28, 2017 1:01 pm
-
- Participant
- Posts: 54607
- Joined: Wed Oct 23, 2002 10:52 pm
- Location: Sydney, Australia
- Contact:
I would have thought that it is possible, since the data browser has that feature, as does the Sample stage. Why not create a job with a Sample stage that skips some rows and inspect the generated osh?
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.