Reading an Excel (xlsx) file

Post questions here relative to DataStage Enterprise/PX Edition for such areas as Parallel job design, Parallel datasets, BuildOps, Wrappers, etc.

Moderators: chulett, rschirm, roy

splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Reading an Excel (xlsx) file

Post by splayer »

I set up the "Configuration Window for Unstructured Data stage" window. I am trying to read sheet 2 of 4 sheets. Under Range Option, I selected "Specify the entire data range". Under "Range expression", I specified "A2:G8". Under "Sheet names to skip", I specified "Qwe Rte;Fre Grd;Ogh Gre" - the three sheets I want to skip. Under "Column header", I selected None.

Issue1: When I click on Load, I get the error "com.ascential.asb.cas.shared.exceptions.CASDMDIException". Any ideas?
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia
Contact:

Post by ray.wurlod »

Perhaps the excluded sheet names need to be separately quoted? Particularly as each appears to include a space character.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

Tried using double quotes around the skipped sheet names. Same error.
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

Try replacing the spaces with underscores in the tab names in the spreadsheet. Sometimes Unix is the culprit.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

FranklinE, thank you for your response. I am trying to automate this process and having to rename each sheet with an underscore for a space would defeat the purpose.

I am trying to hear from someone who has actually loaded an Excel file directly into Datastage on an Unix platform.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

You could always check with support, betting someone there has. :wink:
-craig

"You can never have too many knives" -- Logan Nine Fingers
FranklinE
Premium Member
Premium Member
Posts: 739
Joined: Tue Nov 25, 2008 2:19 pm
Location: Malvern, PA

Post by FranklinE »

splayer wrote:FranklinE, thank you for your response. I am trying to automate this process and having to rename each sheet with an underscore for a space would defeat the purpose.
I understand your situation. My suggestion is to have the creator of the file make the change, not have you make the change after receiving the file.

I have data sources who use Windows exclusively. I impose upon them compliance with Unix requirements, or I reject their data. Seems harsh, but it's much simpler to correct a problem at the source.
Franklin Evans
"Shared pain is lessened, shared joy increased. Thus do we refute entropy." -- Spider Robinson

Using mainframe data FAQ: viewtopic.php?t=143596 Using CFF FAQ: viewtopic.php?t=157872
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Exactly. Whatever it takes to make it work, fully automated, force that back on the source as a requirement once you figure it out. Been there, done that, got the t-shirt. :wink:

Only fall back to handling the quirks yourself if you absolutely have no other choice.
-craig

"You can never have too many knives" -- Logan Nine Fingers
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

Try naming the sheet to extract in 'Range expression' and not populate 'Sheet names to skip'.
e.g. Assuming 'sheet 2 of 4 sheets' is named Tab2, code would be

Code: Select all

Range expression: Tab2!A2:G8
If sheet 2's name has spaces (say Tab 2), then it can be enclosed in single quotes

Code: Select all

Range expression: 'Tab 2'!A2:G8
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

To skip multiple sheets (with space in name) use single quotes e.g.

Code: Select all

Sheet names to skip: 'Qwe Rte;Fre Grd;Ogh Gre'
I feel that 'skipping sheets' is suited for extracting data from multiple sheets (of a spreadsheet) with same layout; for extracting data from a single sheet, naming the sheet in 'range expression' is a better option.
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

rkashyap, thank you for your response. I have a spreadsheet with two sheets. I would like to skip the first and load the data from the second.

In "Range option", I tried "Specify the entire data range" and "Specify the start row". Under "Range expression", I have Sheet2!A1:G10. Under "Sheet names to skip", I have Sheet1. Under "Column header", I have "None". When I click on Load, I get the error:

org.apache.poi.xssf.usermodel.XSSFChartSheet (initiallization failure)

It doesn't matter if I put single quotes or not.
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

Sample code that I have seen has sheet-name in either "Sheet names to skip" or in "Range option" but not both.
i.e Try

Code: Select all

Data ranges: Specify the entire data range
Sheet names to skip: 
Range expression: Sheet2!A2:G8
Error messages that we have seen from Unstructured Data Stage are handled/meaningful, but the ones you are getting are raw.This may imply a deeper underlying issue.

On which flavor of Unix is DataStage server installed? Is Unstructured Data Stage working OK in other cases?

You may also validate by building and executing sample job as described in this link.
chulett
Charter Member
Charter Member
Posts: 43085
Joined: Tue Nov 12, 2002 4:34 pm
Location: Denver, CO

Post by chulett »

Nothing in the documentation shows a need for quotes. And the suggestion was to leave the Skip option unset as you are not skipping sheets within the range you are specifying.
-craig

"You can never have too many knives" -- Logan Nine Fingers
splayer
Charter Member
Charter Member
Posts: 502
Joined: Mon Apr 12, 2004 5:01 pm

Post by splayer »

If I don't have anything in the "Sheet names to skip" and I want to read from the 2nd sheet and I specify its name in "Range expression", I get the error:

com.ascential.asb.cas.shared.exception.CASSMDIException
rkashyap
Premium Member
Premium Member
Posts: 532
Joined: Fri Dec 02, 2011 12:02 pm
Location: Richmond VA

Post by rkashyap »

This sounds like a support issue.

Again ... On which flavor of Unix is DataStage server installed? Is Unstructured Data Stage working OK for a single tab spreadsheet?

You may build and execute sample job as described in this link to confirm that Unstructured Data stage works OK in your shop.
Post Reply