
Recursive XML

Posted: Thu Apr 26, 2018 12:41 pm
by dstager
Hi

I am trying to parse an XML document that has a recursive list embedded at a lower level. However, the Edit Assembly does not show the recursive elements (aliased with <> plus an infinity sign). The recursive list starts 7 levels deep within the XML, but I only need to parse the first level of the recursive list.

1. How do I configure the Edit Assembly to expand levels? I already raised the tree size to 10, but it still does not show the lower levels.
2. Is the recursive list limited to the tree level limit?

Your valued expertise is greatly appreciated!

Posted: Fri Apr 27, 2018 11:11 am
by eostic
Most times, for situations like this, I prefer to use the xmlInput Stage. It more easily handles incomplete paths.

I think you said you only need the first level, but imagine if you wanted the first four.

For certain incoming documents and parent nodes you might have only two levels, for another all four, and for yet another none or just one.

You have to declare all the levels you potentially want, but with the xmlInput Stage you have the option to just get nulls for children that don't exist for any given parent node. This is the check box that says "Repetition Element Required." This is a powerful notion. If you uncheck it, it tells the Stage: "if this path is not complete, and doesn't have a node at this depth, that's ok; retrieve the parent and provide nulls for any elements or attributes on the output link that are not in this instance". If you leave it checked, it acts like a pure, simple relational JOIN: parents without children are NOT included in the answer set, or in this context, not sent down the output link.
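To make the join comparison concrete, here is a rough Python analogy. It is not how the Stage is implemented; the data and the `emit_rows` helper and its `repetition_required` flag are made up purely for illustration. With the flag on, childless parents are dropped (inner join); with it off, they come through with a null (outer join).

```python
# Hypothetical data: parent ids and the child values found under each parent.
parents = ["p1", "p2", "p3"]
children = {"p1": ["a", "b"], "p3": ["c"]}  # p2 has no children


def emit_rows(parents, children, repetition_required):
    """Mimic the checked/unchecked behaviour described above (illustration only)."""
    rows = []
    for parent in parents:
        kids = children.get(parent, [])
        if kids:
            # One output row per (parent, child) pair.
            rows.extend((parent, kid) for kid in kids)
        elif not repetition_required:
            # Unchecked: keep the parent and supply a null for the missing child.
            rows.append((parent, None))
        # Checked: a parent without children produces no row at all.
    return rows


checked = emit_rows(parents, children, repetition_required=True)
unchecked = emit_rows(parents, children, repetition_required=False)
```

In this sketch `checked` omits `p2` entirely, while `unchecked` keeps it paired with `None`.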

Declare your potentially nested recursive nodes directly:

(The rest of the xpath in the description is assumed. Let's say that you have an "actualElement" that is capable of being nested many times in a "child" element node.)

child1column ...Parent/Child/actualElement/text()
child2column ...Parent/Child/Child/actualElement/text()
child3column ...Parent/Child/Child/Child/actualElement/text()
child4column ...Parent/Child/Child/Child/Child/actualElement/text()
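Outside DataStage, the same idea of declaring one fixed path per level can be sketched with Python's standard `xml.etree.ElementTree`. The sample document, the `MAX_DEPTH` constant and the `extract_levels` helper are assumptions for illustration; only the column names and path shape come from the list above. Levels that are missing simply come back as `None`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only two of the four declared levels are present.
DOC = """
<Parent>
  <Child>
    <actualElement>level1</actualElement>
    <Child>
      <actualElement>level2</actualElement>
    </Child>
  </Child>
</Parent>
"""

MAX_DEPTH = 4  # number of nesting levels we choose to declare up front


def extract_levels(xml_text, max_depth=MAX_DEPTH):
    """Return one column per declared depth, None where the path is incomplete."""
    root = ET.fromstring(xml_text)  # root is the <Parent> element
    row = {}
    for depth in range(1, max_depth + 1):
        # Builds Child/actualElement, Child/Child/actualElement, and so on.
        path = "/".join(["Child"] * depth + ["actualElement"])
        node = root.find(path)
        row[f"child{depth}column"] = node.text if node is not None else None
    return row


row = extract_levels(DOC)
```

Here `row` holds values for the first two columns and `None` for the third and fourth, which mirrors the unchecked "Repetition Element Required" behaviour.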

With the Hierarchical Stage, you will probably have to use the "chunk" feature. When you pick your document root in the xmlParser step, find the list that has your recursive node, right-click it, and choose Chunk. That lets you address that "piece" later in the Assembly, possibly passing it to yet another parser step with an xmlSchema View (from the Library Manager) that teases out the parts that you need. Alternatively, and this is what I prefer to do, send that "chunk" downstream in a longvarchar and pass just that into a separate xmlInput Stage for further smart parsing of its levels.
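As a loose analogy for the chunk approach (not the Hierarchical Stage itself), the sketch below pulls one subtree out as a plain string and hands it to a second, independent parse. The document, the element names `Root` and `List`, and the helpers `chunk` and `parse_chunk` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the recursive list lives under <List>.
DOC = (
    "<Root><Header><Id>42</Id></Header>"
    "<List><Child><actualElement>x</actualElement>"
    "<Child><actualElement>y</actualElement></Child></Child></List></Root>"
)


def chunk(xml_text, path):
    """First pass: return the subtree at `path` as unparsed XML text, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    return None if node is None else ET.tostring(node, encoding="unicode")


def parse_chunk(chunk_text):
    """Second pass: parse only the chunk and read the first recursive level."""
    node = ET.fromstring(chunk_text)
    first = node.find("Child/actualElement")
    return first.text if first is not None else None


chunk_text = chunk(DOC, "List")        # plays the role of the longvarchar column
first_level = parse_chunk(chunk_text)  # plays the role of the downstream parse
```

The first pass never needs to know how deep the recursion goes; only the second pass, working on the much smaller chunk, has to declare the levels it cares about.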

Hopefully these suggestions get you closer to your solution.

Ernie