Resolving Special characters in XML


Post by parag.s.27 »

We are facing an issue with special characters in XML messages, e.g. "&", "<", ">". DataStage handles these special characters automatically for all elements except the "repeating elements".

I figured out that the issue is caused by the way the repeating elements are created. We did use a transformer to hardcode the XML namespace as a text and later convert it in XML stage by specifying the DATA ELEMENT as "XML". An example is given below:

We had to provide an XML message on MQ in which one particular Address section should look like the following:

Code: Select all

<ns5:Address>
  <ns4:Line>1500 SCENIC DR</ns4:Line> 
  <ns4:Line>CROSSROADS & AVE</ns4:Line> 
  <ns4:Line>NEAR OLD CHURCH</ns4:Line> 
  <ns4:Line /> 
</ns5:Address>

The point here is the repeating element "Line", used for Address Line 1, 2, 3 and 4. I could produce this result by using a Transformer prior to the XML stage and concatenating all address elements along with the namespace-prefixed tags, like this:

Code: Select all

'<ns4:Line>':Addre Line 1:'</ns4:Line>':'<ns4:Line>':Addre Line 2:'</ns4:Line>':'<ns4:Line>':Addre Line 3:'</ns4:Line>':'<ns4:Line>':Addre Line 4:'</ns4:Line>'

To summarize, the XML Output stage does not handle special characters if an XML element is constructed as text and later converted to XML using DATA ELEMENT as XML.
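
For readers outside DataStage, the failure can be reproduced with any XML parser. The following is a minimal Python sketch, not DataStage code: the namespace URIs and the helper names `address_xml` and `is_well_formed` are assumptions made up for the illustration. It concatenates the tags as plain text, the same way the Transformer derivation does, and checks whether the result parses.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the thread only shows the ns4/ns5 prefixes.
NS = 'xmlns:ns4="urn:example:ns4" xmlns:ns5="urn:example:ns5"'


def address_xml(lines):
    """Concatenate <ns4:Line> elements as plain text, with no escaping."""
    body = "".join("<ns4:Line>" + value + "</ns4:Line>" for value in lines)
    return "<ns5:Address " + NS + ">" + body + "</ns5:Address>"


def is_well_formed(text):
    """Return True if a standard XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A raw ampersand in the value makes the hand-built text ill-formed.
raw_ok = is_well_formed(address_xml(["1500 SCENIC DR", "CROSSROADS & AVE"]))

# The same text with the ampersand written as an entity parses fine.
escaped_ok = is_well_formed(address_xml(["1500 SCENIC DR", "CROSSROADS &amp; AVE"]))

print(raw_ok, escaped_ok)
```

Whoever builds the tags as text is therefore also the one who has to escape the values.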

This can be resolved by sending all address lines as they are through one XML stage and then, just before a second XML stage, using a Transformer with the Field function to extract the already-escaped value between the tags (<addrline1>........) and construct the repeating element again.

But this is only practical if the number of repeating elements is small; we have a job with 32 different repeating elements. I wanted to know if there is any robust method, using the XML stage or something else in DataStage, to handle all such scenarios.
Thanks & Regards
Parag Saundattikar
Certified for Infosphere DataStage v8.0

Post by chulett »

I don't have the time to fully absorb what you need here, but this caught my eye:
We did use a transformer to hardcode the XML namespace as a text and later convert it in XML stage by specifying the DATA ELEMENT as "XML".
Just want to point out that using that data element does not convert anything. You use it to tell the parser that something is already XML and that it should not touch it. FYI.
-craig

"You can never have too many knives" -- Logan Nine Fingers

Post by parag.s.27 »

What I wanted to say is that if the Data Element property is set then, as you said, DataStage will not handle special characters such as &, <, > etc. in that element.

Also, if I have repeating elements, I need to perform some transformations in a Transformer prior to the XML stage to create the repeating node. This transformation is a kind of hardcoding of XML tags, which DataStage takes as text and hence does not handle for special characters.

We needed help on how to tell DataStage, especially the XML Output stage, to handle special characters in the above two scenarios.

Post by parag.s.27 »

Thanks Chulett for your time.

I resolved it in a cleaner way. The idea is similar to what you said.

Post by chulett »

Can you explain your cleaner way, please?

Post by parag.s.27 »

I got only partial success so far, but I think I can resolve it further. By partial success I mean that Internet Explorer shows the entire element as text (in black), while the Tibco side is able to parse it, though not completely verified because I did not send all elements for testing.

I used something like this: -

Code: Select all

'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN1_AD,'')):'<![CDATA[</ns4:Line>]]>':'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN2_AD,'')):'<![CDATA[</ns4:Line>]]>':'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN3_AD,'')):'<![CDATA[</ns4:Line>]]>':'<![CDATA[<ns4:Line>]]>':UpCase(NullToValue(Lnk_ExtExpRole.KDAE_APX_PARS_LN4_AD,'')):'<![CDATA[</ns4:Line>]]>'
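
As a rough illustration of what a standard XML parser does with CDATA (not a statement about how DataStage or Tibco treat it), here is a small Python sketch. The namespace URIs and the `wrap` helper are assumptions for the example. It compares wrapping the tags in CDATA, as in the derivation above, with wrapping only the value.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the thread only shows the ns4/ns5 prefixes.
NS = 'xmlns:ns4="urn:example:ns4" xmlns:ns5="urn:example:ns5"'


def wrap(body):
    """Put a hand-built fragment inside an <ns5:Address> parent element."""
    return "<ns5:Address " + NS + ">" + body + "</ns5:Address>"


# Tags wrapped in CDATA: the parser sees the tags as character data.
tags_in_cdata = wrap(
    "<![CDATA[<ns4:Line>]]>" + "1500 SCENIC DR" + "<![CDATA[</ns4:Line>]]>"
)
root = ET.fromstring(tags_in_cdata)
tag_children = len(root)  # number of child elements under Address
tag_text = root.text      # everything ends up as text of Address

# Value wrapped in CDATA: the tags stay markup, the value may contain & and <.
value_in_cdata = wrap("<ns4:Line><![CDATA[" + "CROSSROADS & AVE" + "]]></ns4:Line>")
line = ET.fromstring(value_in_cdata)[0]

print(tag_children, repr(tag_text))
print(line.tag, repr(line.text))
```

With a parser like this one, the first form yields no `Line` child elements at all, which may be why Internet Explorer displayed the element as plain text. The second form keeps the element structure, but it breaks if a value ever contains the sequence `]]>`.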

Post by eostic »

That was the perfect solution, and "by design". The XML data element is intended for those situations when YOU are taking sole responsibility for building the XML, which has to include escaping any characters that require it.
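
To make "escaping any characters that require it" concrete, here is a minimal Python sketch of the idea, not DataStage code. The helper name `repeating_elements` is hypothetical; `xml.sax.saxutils.escape` is the standard-library function that replaces &, < and > with entities. The same helper works for any number of repeating elements, which is the case raised earlier in the thread.

```python
from xml.sax.saxutils import escape


def repeating_elements(tag, values):
    """Return one <tag>value</tag> per value, escaping &, < and > in each value.

    None is treated as an empty string, similar in spirit to NullToValue(col, '').
    """
    return "".join(
        "<%s>%s</%s>" % (tag, escape(value or ""), tag) for value in values
    )


fragment = repeating_elements("ns4:Line", ["CROSSROADS & AVE", "A < B", None])
print(fragment)
```

In a Transformer the equivalent would be to escape each column value before concatenating it with the hardcoded tags; how that escaping is written in a derivation is not shown in this thread.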

Ernie
Ernie Ostic

blogit!
<a href="https://dsrealtime.wordpress.com/2015/0 ... ere/">Open IGC is Here!</a>

Post by parag.s.27 »

That is so true, Ernie. Though I've been working with XML and WSDL in IIS for quite some time now, every day brings something new to learn. I just love how you can implement many things in DataStage that are not part of the IIS architecture.

Extra scenario

Post by pnpmarques »

I had a similar situation (special chars) using the XML Output stage. I was trying to put together several elements from varchar columns that already contained XML code.
I thought it would be enough to set Data Element=XML on the OUTPUT link, but I found that it was also necessary to set Data Element=XML on all columns of the INPUT link.

Post by eostic »

As you discovered, the "input" link column grid of the xmlOutput Stage is the real driver of the logic to construct the xml content. It's a bit counterintuitive, but the output link column list is really just the "receiver" of the final created xml.

Ernie