Design XML (multi-occurrence) using Hierarchical Data stage



sensiva
Premium Member
Posts: 21
Joined: Tue Aug 22, 2017 10:39 am

Design XML (multi-occurrence) using Hierarchical Data stage

Post by sensiva »

Hello,

I would like your views on designing a multi-occurrence XML job. I am stuck with the Hierarchical Data stage.

My requirement is that I have a table of orders to read, which looks like this:

Order1, Product1, price, quantity, etc...
Order1, Product2, price, quantity, etc...
Order2, Product1, price, quantity, etc...
Order3, Product1, price, quantity, etc...
Order3, Product2, price, quantity, etc...
Order3, Product3, price, quantity, etc...

Then I have to create an XML message for each order, containing all the products it has, and send the messages to MQ. In this case that would be 3 messages:

First message
<Order1>
<Product>
<Product1>details</Product1>
<Product2>details</Product2>
</Product>
</Order1>

Second message
<Order2>
<Product>
<Product1>details</Product1>
</Product>
</Order2>

Third message
<Order3>
<Product>
<Product1>details</Product1>
<Product2>details</Product2>
<Product3>details</Product3>
</Product>
</Order3>


With respect to the design, I thought of using the Hierarchical Data stage with an H-Join and an XML Composer step to do this. But it would need multiple inputs, and these inputs are sequential, hence grouping needs to be done. Personally I am wary of grouping because we work with millions of rows and it could hurt performance.

Any pointers would be of great help. I have only 16 months of experience with DataStage, so excuse me if my assumptions are not correct.

I was also thinking of the XML Output stage, but I would land in the same grouping problem.
sen
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Based on this kind of straight-path hierarchy, I would use the XML Output stage and skip the Hierarchical stage. You don't need it. It only makes sense when you have parallel siblings to produce for a given order, such as maybe multiple purchasers. There may be other variables in your requirement, but so far XML Output would be simpler.
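
To make the grouping concern concrete outside DataStage, here is a minimal Python sketch (not DataStage code) of the same idea: when the rows arrive already sorted by order, one message per order can be produced in a single streaming pass, holding only the current order in memory. The function name `order_messages` and the three-field row layout are assumptions for illustration; the element names follow the sample messages in the question.

```python
from itertools import groupby
from operator import itemgetter
import xml.etree.ElementTree as ET


def order_messages(rows):
    """Yield one XML string per order from rows sorted by order ID.

    Each row is (order_id, product_id, details). groupby only merges
    adjacent rows, so the input must already be sorted by order_id.
    """
    for order_id, group in groupby(rows, key=itemgetter(0)):
        order = ET.Element(order_id)
        products = ET.SubElement(order, "Product")
        for _, product_id, details in group:
            ET.SubElement(products, product_id).text = details
        yield ET.tostring(order, encoding="unicode")


# Sample rows mirroring the table in the question (hypothetical values).
rows = [
    ("Order1", "Product1", "details"),
    ("Order1", "Product2", "details"),
    ("Order2", "Product1", "details"),
    ("Order3", "Product1", "details"),
    ("Order3", "Product2", "details"),
    ("Order3", "Product3", "details"),
]
messages = list(order_messages(rows))
```

If the source query can return the rows ordered by order ID, the grouping itself adds no sort of its own in this sketch; the cost is one pass over the data.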

Ernie
Ernie Ostic

blogit!
Open IGC is Here! (https://dsrealtime.wordpress.com/2015/0 ... ere/)
sensiva
Premium Member
Posts: 21
Joined: Tue Aug 22, 2017 10:39 am

Design XML (multi-occurrence) using Hierarchical Data stage + MQ

Post by sensiva »

Thanks for your proposal. I changed to the XML Output stage and it was a lot simpler.

I have another question concerning the above design, and I am not sure whether it is fine to continue in the same post; I can move it if required. As said above, I have to send these messages to an MQ queue, where our middleware would process them in real time. I am worried that if on any given day the job aborted after sending millions of rows, support would try to rerun the job, which would result in unwanted heavy message traffic, unwanted logs, etc.

I was checking whether I could get the order ID for the orders that were sent successfully to the queue, so that I can mark the order as sent in the DB.
I tried dragging an output link from the MQ Connector, but then it works like a typical MQ input stage.

Another idea was to send one link to the DB and a separate one to MQ, but as per my understanding the DB could process and finish earlier than MQ, as it always depends on the activity of the final stage, and hence an MQ error at the end would probably not be captured in the DB.

Any pointers would be of help. The above is my understanding of DataStage: as it is a bulk-load process, two different links could have different processing times. I am not sure if that is correct.
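
To illustrate the behaviour being asked for, here is a minimal Python sketch (not DataStage or MQ code) in which an order is marked as sent only after its message was put successfully, so that a rerun skips what was already delivered. Everything in it is an assumption for illustration: `put_message` is a hypothetical callable standing in for the MQ put, and the `sent_ids` set stands in for the status column in the DB.

```python
def send_pending(orders, sent_ids, put_message):
    """Send each unsent order and record its ID only after the put succeeds.

    orders: iterable of (order_id, xml_message) pairs
    sent_ids: set of order IDs already delivered (stand-in for a DB column)
    put_message: hypothetical callable that raises on failure
    """
    for order_id, message in orders:
        if order_id in sent_ids:
            continue  # delivered in an earlier run, do not resend
        put_message(message)  # raises if the put fails
        sent_ids.add(order_id)  # reached only when the put succeeded


# Simulate a run that aborts on the third message, followed by a rerun.
orders = [("Order1", "<Order1/>"), ("Order2", "<Order2/>"), ("Order3", "<Order3/>")]
sent_ids = set()
delivered = []
failed_once = []


def flaky_put(message):
    # Fails the first time Order3 is sent, succeeds afterwards.
    if message == "<Order3/>" and not failed_once:
        failed_once.append(True)
        raise RuntimeError("queue unavailable")
    delivered.append(message)


try:
    send_pending(orders, sent_ids, flaky_put)
except RuntimeError:
    pass
first_run = list(delivered)

send_pending(orders, sent_ids, flaky_put)  # rerun sends only Order3
```

A failure between the put and the status update could still cause one message to be resent on rerun, so the consumer would need to tolerate an occasional duplicate.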
sen
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Rather than a database, why not direct your data to a work queue, from which they could be read as part of completion of processing? Then, after any failure, you could check whether the work queue is empty.
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sensiva
Premium Member
Posts: 21
Joined: Tue Aug 22, 2017 10:39 am

Post by sensiva »

I believe the work queue you mention here is an intermediate queue and not the "workqueue" option available in the MQ Connector, because I do not see that option when the MQ Connector is used in output mode.

Hence I thought of publishing the message to a topic with 2 subscribers: one that sends it to the actual application, and another that is a DataStage job which reads the message and updates the status of the order. Here I more or less duplicate the messages, but I do not see another option. I believe I am expecting more transactional behaviour, which could add further complexity. I believe publish-subscribe should work. I will keep you posted.

Concerning the XML Output stage, I have another question. I have started implementing it and am stuck on a repeating tag that is expected to be built from 2 different columns of the same record.
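
As a rough picture of the publish-subscribe idea, here is a toy in-memory Python sketch (not MQ code). The `Topic` class and both subscribers are hypothetical stand-ins: one subscriber plays the application, the other plays the job that updates the order status.

```python
class Topic:
    """Toy in-memory topic: every subscriber receives every published message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, order_id, message):
        for callback in self._subscribers:
            callback(order_id, message)


application_inbox = []  # stands in for the real application
order_status = {}       # stands in for the status column in the DB

topic = Topic()
topic.subscribe(lambda order_id, message: application_inbox.append(message))
topic.subscribe(lambda order_id, message: order_status.update({order_id: "SENT"}))

topic.publish("Order1", "<Order1/>")
topic.publish("Order2", "<Order2/>")
```

Each message is delivered twice here, once per subscriber, which is the duplication mentioned above.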

The input record is like this:
Order1, Product, etc..., StartDate, EndDate

But in the output I need this:
<Order1>
<other tags>
<Date type="StartDate">2019-01-09</Date>
<Date type="EndDate">2019-12-31</Date>
</Order1>

Here I cannot set StartDate or EndDate as the key element, as they are in the same record. I tried giving hardcoded XPaths like
/Order/Date[1]/
/Order/Date[2]/
It just takes the first occurrence of the date, "StartDate". I don't see "EndDate" at all.

Any pointers would be of great help.
sen
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Looks like a classic poorly designed XML. Better would have been to have an actual element called <StartDate> and another called <EndDate> and populate them appropriately. For something like this, just build the little XML chunk in a Transformer and include it in the Description property on the appropriate column (from the Transformer), adopting the XPath that you need, usually ending with just a "/" instead of putting on the text() or @attributeName.
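
For illustration only, here is a small Python sketch of what "building the little XML chunk" from the two columns amounts to. In the job itself this would be a string expression in the Transformer; the function name `date_chunk` is an assumption, and the values come from the sample in the question.

```python
import xml.etree.ElementTree as ET


def date_chunk(start_date, end_date):
    """Return both <Date> elements as one XML fragment string."""
    parts = []
    for date_type, value in (("StartDate", start_date), ("EndDate", end_date)):
        element = ET.Element("Date", {"type": date_type})
        element.text = value
        parts.append(ET.tostring(element, encoding="unicode"))
    return "".join(parts)


chunk = date_chunk("2019-01-09", "2019-12-31")
```

The resulting string holds both repeating elements, so a single column can carry them into the output under one path.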

Ernie
Ernie Ostic
