Datastage XML input stage question

vik1979
Participant
Posts: 9
Joined: Thu Oct 21, 2010 3:22 am

Post by vik1979 »

Hello All,

I am using the XML Input stage to parse XML files in a parallel job, and I would like to know how to let rows pass through the XML parsing when an optional repeating element is not found in the message.

<Store>
     <ID>
           <code>
                <Item extension="11"/>
           </code>
     </ID>
     <ID>
           <code>
                <Item extension="21"/>
           </code>
     </ID>
     <Container>
           <Stock code="abc"/>
     </Container>
     <Container>
           <Stock code="bcd"/>
     </Container>
</Store>
In the above example, the Store and ID elements are mandatory in the message and the Container element is optional. I parse the XML with the ID element as the key, then chunk out the Container info and parse it in the next XML stage.
I am using the design below. In XML stage 2, when the Container info is missing from the message (as it is optional), no records come out of the stage. I have 'repetitive element required' unchecked in Transformation Settings and am using Container ID as the key.

[External Source] -> [XML Stage 1: Parsing ID info] -> [XML Stage 2: Parsing Container info] -> Peek

The other option is to parse out both elements separately and combine them with a Join stage, but we are trying to minimize the number of joins in the job. I would like to know if there is a way to make my first option work.

Thanks,
Vik.
vik1979
Participant
Posts: 9
Joined: Thu Oct 21, 2010 3:22 am

Re: Datastage XML input stage question

Post by vik1979 »

The first option is working fine now. I changed the path I gave for the 2nd XML stage.
asorrell
Posts: 1707
Joined: Fri Apr 04, 2003 2:00 pm
Location: Colleyville, Texas

Post by asorrell »

I know you are on release 9, but I wanted to mention that the Hierarchical Stage in 11.3 does a better job of handling these situations. It has some different limitations but is usually more flexible than the XML stage. As long as your XSD (schema) defines it as a repeating element, the Hierarchical Stage should handle it whether it occurs zero or more times.

If your site has 11.3 available (now or coming soon) then you might want to look at the Hierarchical stage.
Last edited by asorrell on Thu Oct 19, 2017 10:08 pm, edited 1 time in total.
Andy Sorrell
Certified DataStage Consultant
IBM Analytics Champion 2009 - 2020
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

As Andy notes, the Hierarchical stage can do some nice things... but beware! It cannot handle parents with no children when both are on the same link.
xmlInput handles this fine: with 'repeating element required' unchecked, nulls will result for the children. The Hierarchical stage will lose the parent entirely, forcing separate links.
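
To make the difference concrete, here is a minimal sketch in plain Python (not DataStage) of the two behaviours described above. The sample document, the wrapping Stores element, and the flatten() helper with its repeating_required flag are all assumptions made up for illustration; the real stages are configured in the job designer, not in code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the second Store has an ID but no Container children.
DOC = """
<Stores>
  <Store>
    <ID><code><Item extension="11"/></code></ID>
    <Container><Stock code="abc"/></Container>
    <Container><Stock code="bcd"/></Container>
  </Store>
  <Store>
    <ID><code><Item extension="21"/></code></ID>
  </Store>
</Stores>
"""


def flatten(xml_text, repeating_required):
    """Return (item_extension, stock_code) rows, one per Container.

    repeating_required is a hypothetical flag that mimics the
    'repeating element required' setting: when False, a parent with
    no Container still yields one row with a null child value.
    """
    rows = []
    for store in ET.fromstring(xml_text).iter("Store"):
        ext = store.find("ID/code/Item").get("extension")
        containers = store.findall("Container")
        if not containers and not repeating_required:
            # Optional repeating element is absent: keep the parent, null the child.
            rows.append((ext, None))
        for container in containers:
            rows.append((ext, container.find("Stock").get("code")))
    return rows


outer_rows = flatten(DOC, repeating_required=False)  # parent kept, child is None
inner_rows = flatten(DOC, repeating_required=True)   # childless parent is dropped
print(outer_rows)
print(inner_rows)
```

With the flag off, the childless parent survives as a row with a null child, much like a left outer join. With it on, that parent disappears from the output, which is the case that forces separate links and a join afterwards.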

The Hierarchical Stage is more performant and can handle huge XML docs, but when the XML docs are small, use the xmlInput Stage instead.

Ernie
Ernie Ostic

blogit!
Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
vik1979
Participant
Posts: 9
Joined: Thu Oct 21, 2010 3:22 am

Post by vik1979 »

Thanks for the reply. I am unable to see your complete post; I need to get premium access :)
I will definitely look into the IBM notes for the Hierarchical stage in 11.3.
I am working on complex HL7 standard schemas, which have a lot of parent-child relationships.
I tried the XML stage before using the XML Input stage in 9.1, but it made things complicated, since the elements under each list are parsed into a separate output, which ends up needing too many joins to combine the data.
So I am trying to understand how the Hierarchical stage differs from the XML stage. Thanks.
venkata9
Premium Member
Posts: 7
Joined: Mon Sep 18, 2017 6:24 pm

XML PARSER ERROR

Post by venkata9 »

Hi,

I'm not able to post a new topic even though I'm a premium member,
so I'm posting my question here.

I'm getting the XML parser error below at random.

Environment: DS 9.1.2, Red Hat Linux 6

XMLS_MESSAGE_PARSER,0: Message bundle error Can't find resource for bundle com.ibm.e2.Bundle_E2_engine_msgs_en_US, key E2IllegalStateException.parentCursorInvalid.

Could someone help?
Venkata