Recursive XML

dstager
Participant
Posts: 47
Joined: Tue Jan 08, 2008 8:43 am

Post by dstager »

Hi

I am trying to parse an XML document that has a recursive list embedded at a lower level. However, the Edit Assembly does not show the recursive elements (aliased with <> plus an infinity sign). The recursive list STARTS 7 levels deep within the XML, but I only need to parse the 1st level of the recursive list.

1. How do I configure the Edit Assembly to expand more levels? I already upped the tree size to 10, but it still does not show the lower levels.
2. Is the recursive list limited to the tree level limit?

Your valued expertise is greatly appreciated!
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Most of the time, for situations like this, I prefer to use the xmlInput Stage. It handles incomplete paths more easily.

I think you said you only need the first level, but imagine if you wanted the first "four".

For certain incoming documents and parent nodes you might have only two levels, for another all four, and for yet another none, or just one.

You have to declare all the levels you potentially want, but with the xmlInput Stage you have the option to get nulls for children that don't exist for any given parent node. This is the check box labeled "Repetition Element Required." It is a powerful notion: if you uncheck it, you are telling the Stage "if this path is not complete and has no node at this depth, that's OK; retrieve the parent and provide nulls for any elements or attributes on the output link that are not in this instance." If you leave it checked, it acts like a pure, simple relational JOIN: parents without children are NOT included in the answer set or, in this context, not sent down the output link.

Declare your potentially nested recursive nodes directly:

(The rest of the XPath in the description is assumed. Let's say that you have an "actualElement" that can be nested many times inside a "Child" element node.)

child1column ...Parent/Child/actualElement/text()
child2column ...Parent/Child/Child/actualElement/text()
child3column ...Parent/Child/Child/Child/actualElement/text()
child4column ...Parent/Child/Child/Child/Child/actualElement/text()
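
To make the idea of one declared path per depth concrete, here is a minimal Python sketch. It is not DataStage and does not reproduce the xmlInput Stage; it only imitates the behaviour described above using the standard library. The sample document, the `id` attribute, the function name `extract_rows`, and the `required` flag (a loose stand-in for "Repetition Element Required") are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names follow the paths above.
DOC = """
<Root>
  <Parent id="p1">
    <Child>
      <actualElement>a1</actualElement>
      <Child>
        <actualElement>a2</actualElement>
      </Child>
    </Child>
  </Parent>
  <Parent id="p2"/>
</Root>
"""

# One relative path per declared depth, mirroring child1column..child4column.
MAX_DEPTH = 4
PATHS = ["/".join(["Child"] * depth + ["actualElement"])
         for depth in range(1, MAX_DEPTH + 1)]


def extract_rows(xml_text, required=False):
    """Return one tuple per Parent: (id, level1, level2, level3, level4).

    Levels that do not exist in a given Parent come back as None.
    With required=True (assumed analogue of leaving the box checked),
    parents that have no children at all are dropped, like an inner join.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for parent in root.iter("Parent"):
        # findtext returns None when the path has no match under this parent.
        values = [parent.findtext(path) for path in PATHS]
        if required and all(value is None for value in values):
            continue
        rows.append((parent.get("id"), *values))
    return rows


if __name__ == "__main__":
    for row in extract_rows(DOC):
        print(row)
```

With `required=False` every parent produces a row and the missing depths are simply null; with `required=True` the childless parent disappears from the output, which is the JOIN-like behaviour described above.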

With the Hierarchical Stage, you will probably have to use the "chunk" feature. When you pick your document root in the xmlParser step, find the list that has your recursive node. Right click. Choose Chunk. That lets you address that "piece" later in the Assembly, possibly passing it to yet another parser step with an xmlSchema View (from the Library Manager) that teases out the parts that you need. Or else, what I prefer to do: send that "chunk" downstream in a longvarchar and pass just "that" into a separate xmlInput Stage for further smart parsing of its levels.
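
The "chunk, then parse separately" pattern can be sketched outside DataStage as well. The Python below is only an analogy under stated assumptions: the element names `Level1`, `Level2` and `RecursiveList`, the sample document, and both function names are hypothetical, and the real Chunk feature is configured in the Assembly Editor rather than in code.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a recursive list nested below other levels.
CHUNK_DOC = """
<Root>
  <Level1>
    <Level2>
      <RecursiveList>
        <Child>
          <actualElement>top1</actualElement>
          <Child>
            <actualElement>nested</actualElement>
          </Child>
        </Child>
        <Child>
          <actualElement>top2</actualElement>
        </Child>
      </RecursiveList>
    </Level2>
  </Level1>
</Root>
"""


def extract_chunk(xml_text, list_path):
    """Step 1: cut out the recursive list as a standalone XML string.

    Returns None when list_path matches nothing.
    """
    root = ET.fromstring(xml_text)
    node = root.find(list_path)
    if node is None:
        return None
    return ET.tostring(node, encoding="unicode")


def first_level_values(chunk_text, item_tag="Child", value_tag="actualElement"):
    """Step 2: parse the chunk on its own and read only its first level."""
    chunk_root = ET.fromstring(chunk_text)
    # findall with a bare tag matches direct children only, so deeper
    # recursive levels are ignored.
    return [item.findtext(value_tag) for item in chunk_root.findall(item_tag)]


if __name__ == "__main__":
    chunk = extract_chunk(CHUNK_DOC, "Level1/Level2/RecursiveList")
    print(first_level_values(chunk))
```

The point of the split is that the second parse never needs to know how deep the list sits in the original document; it only sees the chunk as its own small document.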

Hopefully these suggestions get you closer to your solution.

Ernie
Ernie Ostic
