DSXchange: DataStage and IBM Websphere Data Integration Forum
Author Message
sensiva



Group memberships:
Premium Members

Joined: 22 Aug 2017
Posts: 12

Points: 235

Posted: Thu Jan 03, 2019 7:15 am

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Hello,

I would like your views on designing a multi-occurrence XML job. I am stuck with the Hierarchical Data stage.

My requirement is to read orders from a table that looks like this:

Order1, Product1, price, quantity, etc...
Order1, Product2, price, quantity, etc...
Order2, Product1, price, quantity, etc...
Order3, Product1, price, quantity, etc...
Order3, Product2, price, quantity, etc...
Order3, Product3, price, quantity, etc...

Then I have to create one XML message per order, containing all the products that order has, and send each message to MQ. In this example that would be 3 messages:

First message
<Order1>
<Product>
<Product1>details</Product1>
<Product2>details</Product2>
</Product>
</Order1>

Second message
<Order2>
<Product>
<Product1>details</Product1>
</Product>
</Order2>

Third message
<Order3>
<Product>
<Product1>details</Product1>
<Product2>details</Product2>
<Product3>details</Product3>
</Product>
</Order3>


For the design, I thought of using the Hierarchical Data stage with an H-Join and an XML Composer step. But that needs multiple inputs, whereas my rows arrive as one sequential stream, so grouping would have to be done. I am wary of grouping because we work with millions of rows and it could hurt performance.

Any pointers would be a great help. I have only 16 months of experience with DataStage, so excuse me if my assumptions are not correct.

I was also thinking of the XML Output stage, but I would run into the same grouping problem.
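
How costly the grouping is depends on whether the rows already arrive ordered by order ID. The following is a minimal Python sketch, outside DataStage, of the one-pass idea: if the input is sorted on the order key, each message can be built and released as soon as the key changes, so only one order is held in memory at a time. The sort on order ID is an assumption. The element and attribute names (`Order`, `Product`, `id`, `name`, `price`, `quantity`) and the helper `orders_to_xml` are hypothetical, and nothing here describes how the XML Output stage works internally.

```python
from itertools import groupby
from operator import itemgetter
import xml.etree.ElementTree as ET

# Hypothetical rows: (order_id, product, price, quantity).
# Assumption: the rows are already sorted by order_id.
rows = [
    ("Order1", "Product1", "10.00", 1),
    ("Order1", "Product2", "5.50", 2),
    ("Order2", "Product1", "10.00", 4),
    ("Order3", "Product1", "10.00", 1),
    ("Order3", "Product2", "5.50", 1),
    ("Order3", "Product3", "7.25", 3),
]

def orders_to_xml(sorted_rows):
    """Yield one XML string per order, building one order at a time."""
    for order_id, group in groupby(sorted_rows, key=itemgetter(0)):
        order = ET.Element("Order", id=order_id)
        for _, product, price, quantity in group:
            item = ET.SubElement(order, "Product", name=product)
            ET.SubElement(item, "price").text = str(price)
            ET.SubElement(item, "quantity").text = str(quantity)
        yield ET.tostring(order, encoding="unicode")

messages = list(orders_to_xml(rows))
for message in messages:
    print(message)
```

Note that `groupby` only merges adjacent rows with the same key, so unsorted input would split an order into several messages. That is the same trade-off as sorting or partitioning on the order key before the XML is composed.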

_________________
sen
eostic

Premium Poster



Group memberships:
Premium Members

Joined: 17 Oct 2005
Posts: 3821

Points: 30806

Posted: Fri Jan 04, 2019 7:21 pm

Based on this kind of straight path hierarchy, I would use the XML Output stage and skip the Hierarchical stage. You don't need it. It only makes sense when you have parallel siblings to produce for a given o ...

_________________
Ernie Ostic

blogit!
Open IGC is Here!
sensiva




Posted: Mon Jan 07, 2019 6:45 am

Thanks for your proposal. I changed to the XML Output stage and it was a lot simpler.

I have another question about the design above. I am not sure whether it is fine to continue in the same post, and I can move it if required. As mentioned, I have to send these messages to an MQ queue, where our middleware processes them in real time. I am worried that if the job aborts one day after sending millions of messages, support would rerun it, which would cause heavy unwanted message traffic, unwanted logs, and so on.

I was checking whether I could get the order ID for each order that was sent successfully to the queue, so that I can mark the order as sent in the DB.
I tried dragging an output link from the MQ Connector, but then it behaves like a typical MQ input stage.

Another idea was to send one link to the DB and a separate link to MQ. But as I understand it, the DB link could process and finish earlier than the MQ link, since that depends on the activity in each link's final stage, so an MQ error near the end would probably not be reflected in the DB.

Any pointers would help. The above is my understanding of DataStage: since it is a bulk load process, two different links could have different processing times. I am not sure that is correct.
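
The rerun concern can be illustrated independently of DataStage. Below is a minimal Python sketch, under stated assumptions, of flagging an order as sent only after its put to the queue has succeeded, so that a rerun picks up only the unsent orders. `FlakyQueue` is a hypothetical in-memory stand-in and not an MQ client; the `orders` table, its `sent` column, and the helper `send_pending` are likewise invented for illustration.

```python
import sqlite3

class FlakyQueue:
    """Stand-in for a message queue; raises on the Nth put to simulate an abort."""
    def __init__(self, fail_on=None):
        self.messages = []
        self.fail_on = fail_on

    def put(self, body):
        if self.fail_on is not None and len(self.messages) + 1 == self.fail_on:
            raise ConnectionError("simulated queue failure")
        self.messages.append(body)

def send_pending(db, queue):
    """Send every unsent order, flagging each row only after its put succeeds."""
    pending = db.execute(
        "SELECT order_id, payload FROM orders WHERE sent = 0 ORDER BY order_id"
    ).fetchall()
    for order_id, payload in pending:
        queue.put(payload)  # raises on failure, so the row stays unsent
        db.execute("UPDATE orders SET sent = 1 WHERE order_id = ?", (order_id,))
        db.commit()

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE orders (order_id TEXT PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)"
)
db.executemany(
    "INSERT INTO orders (order_id, payload) VALUES (?, ?)",
    [("Order1", "<Order1/>"), ("Order2", "<Order2/>"), ("Order3", "<Order3/>")],
)
db.commit()

queue = FlakyQueue(fail_on=3)
try:
    send_pending(db, queue)   # first run aborts on the third message
except ConnectionError:
    pass

queue.fail_on = None
send_pending(db, queue)       # rerun sends only what is still unsent
unsent = db.execute("SELECT COUNT(*) FROM orders WHERE sent = 0").fetchone()[0]
```

This gives at-least-once delivery rather than exactly-once: a failure between the put and the commit would still resend that one message on rerun, so the consumer would need to tolerate an occasional duplicate.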

_________________
sen
ray.wurlod

Premium Poster
Participant

Group memberships:
Premium Members, Inner Circle, Australia Usergroup, Server to Parallel Transition Group

Joined: 23 Oct 2002
Posts: 54519
Location: Sydney, Australia
Points: 295643

Posted: Mon Jan 07, 2019 5:48 pm

Rather than a database, why not direct your data to a work queue, from which they could be read as part of completion of processing? Then, after any failure, you could check whether the work queue is ...

_________________
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.
sensiva




Posted: Wed Jan 09, 2019 1:09 am

I believe the work queue you mention is an intermediate queue and not the "workqueue" option available in the MQ Connector, because I do not see that option when the MQ Connector is used in output mode.

So I thought of publishing the message to a topic with 2 subscribers: one that sends it to the actual application, and a DataStage job that reads it and updates the status of the order. This duplicates the messages, but I do not see another option. I believe I am expecting something more transactional, which could add further complexity. I believe publish-subscribe should work; I will keep you posted.
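
The two-subscriber idea can be sketched as follows. This is a minimal in-memory illustration in Python, not an MQ publish/subscribe client: the `Broker` class, the topic name `orders`, the message fields, and the `status` dictionary standing in for the order table are all hypothetical.

```python
from collections import defaultdict

class Broker:
    """Tiny in-memory publish/subscribe stand-in; not an MQ client."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Every subscriber on the topic receives the message.
        for handler in self.subscribers[topic]:
            handler(message)

delivered = []                                 # what the application received
status = {"Order1": "NEW", "Order2": "NEW"}    # hypothetical order status table

def application(message):
    """First subscriber: forwards the XML to the real consumer."""
    delivered.append(message["xml"])

def status_updater(message):
    """Second subscriber: marks the order as sent."""
    status[message["order_id"]] = "SENT"

broker = Broker()
broker.subscribe("orders", application)
broker.subscribe("orders", status_updater)

broker.publish("orders", {"order_id": "Order1", "xml": "<Order1/>"})
```

In this arrangement the status flag records that the message reached the topic, not that the application finished processing it, which is the transactional gap mentioned above.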

Concerning the XML Output stage, I have another question. I have started implementing it and am stuck on a repeating tag that has to be produced from 2 different columns of the same record.

The input record is like this:
Order1, Product, etc..., StartDate, EndDate

But in the output, I need this:
<Order1>
<other tags>
<Date type="StartDate">2019-01-09</Date>
<Date type="EndDate">2019-12-31</Date>
</Order1>

Here I cannot set StartDate or EndDate as the key element, as they are in the same record. I tried hardcoded XPaths like
/Order/Date[1]/
/Order/Date[2]/
but only the first occurrence, the "StartDate" one, is produced. I don't see "EndDate" at all.

Any pointers would be of great help.
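
To make the target shape concrete, here is a minimal Python sketch, outside DataStage, that turns the two date columns of one record into two repeating `<Date>` elements distinguished by a `type` attribute. It treats each column as its own occurrence of the repeating element, which is one possible way to think about the problem and not a confirmed fix for the XML Output stage. The record layout and the helper `build_order` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input record with both dates in the same row.
record = {"order_id": "Order1", "StartDate": "2019-01-09", "EndDate": "2019-12-31"}

# Columns that must each become one <Date> occurrence.
DATE_COLUMNS = ("StartDate", "EndDate")

def build_order(rec):
    """Emit one <Date type="..."> element per date column of the record."""
    order = ET.Element(rec["order_id"])
    for column in DATE_COLUMNS:
        date = ET.SubElement(order, "Date", type=column)
        date.text = rec[column]
    return order

xml_text = ET.tostring(build_order(record), encoding="unicode")
print(xml_text)
```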

_________________
sen
eostic


Posted: Fri Jan 11, 2019 4:48 am

Looks like a classic poorly designed XML. Better would have been to have an actual element called <StartDate> and another called <EndDate> and populate them appropriately. For someth ...

_________________
Ernie Ostic

All times are GMT - 6 Hours