XML Parsing & validation against the schema .XSD

dinakaran_s
Participant
Posts: 22
Joined: Wed Jul 02, 2008 7:01 am
Location: London


Post by dinakaran_s »

Hi All,

My requirement is to validate an XML file against an XSD schema. Is it possible to implement this in a DataStage job?

A sample XSD schema is below. The referenced element "Activity" has the user-defined type "ActivityDataType", which restricts it to the values listed in the enumeration. Only one of the listed values may appear as the element's value in the XML file; otherwise the validation must fail.

<?xml version ="1.0" encoding = "UTF-8"?>
<xsd:schema targetNamespace="SDR_Body_1-0"
xmlns="SDR_Body_1-0"
xmlns:h="SDR_Header_1-0"
xmlns:fpml="http://www.fpml.org/FpML-5/transparency"
xmlns:fpmlreport="http://www.fpml.org/FpML-5/reporting"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

version="1-0">
<xsd:import namespace="http://www.fpml.org/FpML-5/transparency" schemaLocation="/xmls/SDR/transparency/fpml-main-5-2.xsd" />
<xsd:import namespace="http://www.fpml.org/FpML-5/reporting" schemaLocation="/xmls/SDR/reporting/fpml-main-5-2.xsd" />
<xsd:import namespace="SDR_Header_1-0" schemaLocation="/xmls/SDR/transparency/SDR_Trans_Header_1-0.xsd" />

<!-- Begin Level 1 =============================================-->
<xsd:element name="SDR_Body">
<xsd:complexType mixed="false">
<xsd:sequence>
<xsd:element ref="Activity"/>
<xsd:element ref="Status"/>
</xsd:sequence>
</xsd:complexType>
</xsd:element>
<!-- End Level 1 ===============================================-->
<!-- Begin simple element name type declarations ======================= -->
<xsd:element name="Activity" type="ActivityDataType"/>

<!-- End simple element name type declarations ===================== -->
<!-- Begin custom element datatype declarations ==================== -->
<xsd:simpleType name="ActivityDataType">
<xsd:restriction base="xsd:string">
<xsd:enumeration value="Cancel"/>
<xsd:enumeration value="Correct"/>
<xsd:enumeration value="LifeCycle"/>
<xsd:enumeration value="New"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
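
To make the requirement concrete, here is a minimal sketch of an instance document that should pass validation against the schema above. The "Status" content is a placeholder (an assumption), because the declaration of "Status" is not included in the sample schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Expected to pass: "New" is one of the four enumerated ActivityDataType values. -->
<!-- The Status value is a placeholder; its declaration is not shown in the sample schema. -->
<SDR_Body xmlns="SDR_Body_1-0">
  <Activity>New</Activity>
  <Status>PLACEHOLDER</Status>
</SDR_Body>
```

Enumeration matching is case-sensitive, so a value such as "new" should be rejected just like any value that is not in the list.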

My question is whether this kind of schema validation can be done in DataStage. If it is possible, how can we implement it?

Thanks,
Dina
eostic
Premium Member
Posts: 3838
Joined: Mon Oct 17, 2005 9:34 am

Post by eostic »

Absolutely. It can sometimes get a bit tricky depending on how the schema is specified, but yes, it can be done, either with the new XML stage in 8.5 or with the older XMLInput stage in 8.5 or earlier releases.

Ernie
Ernie Ostic

vmcburney
Participant
Posts: 3593
Joined: Thu Jan 23, 2003 5:25 pm
Location: Australia, Melbourne

Post by vmcburney »

If you install DataStage 8.5 and Fix Pack 1 you will receive the new XML Assembly stages - this comes with a vastly improved method for importing the XML schema files and transforming that information into whatever you want.

For validation it depends on what you are trying to catch. You can read the data and then detect missing elements in a transformer. It's hard to use DataStage to validate invalid XML since it will not be able to process it. It should be able to handle missing or invalid strings inside the XML elements as long as the XML structure matches the XSD schema file.
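
As a rough illustration of that distinction (one reading of it, not a statement of how any particular stage behaves), the sketch below shows two separate documents side by side. The "Status" content is a placeholder, since its declaration is not part of the schema posted above.

```xml
<!-- Case 1: well-formed, and the structure matches the XSD, but "Amend" is not
     an enumerated ActivityDataType value. The document can be parsed, so the
     bad value can be caught by schema validation or by a downstream check. -->
<SDR_Body xmlns="SDR_Body_1-0">
  <Activity>Amend</Activity>
  <Status>PLACEHOLDER</Status>
</SDR_Body>

<!-- Case 2: not well-formed (the Activity element is never closed). An XML
     parser will stop with an error, so there are no rows to inspect later. -->
<SDR_Body xmlns="SDR_Body_1-0">
  <Activity>New
  <Status>PLACEHOLDER</Status>
</SDR_Body>
```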
dinakaran_s
Participant
Posts: 22
Joined: Wed Jul 02, 2008 7:01 am
Location: London

Post by dinakaran_s »

Can you please tell me how to implement this with XML Pack 2? Does it have to be hard-coded as an expression in a Transformer stage, or does the XML stage support this as a property?
prakashdasika
Premium Member
Posts: 72
Joined: Mon Jul 06, 2009 9:34 pm
Location: Sydney

Post by prakashdasika »

In v8.1, XML validation is done in the XML Input stage. We can set it to strict validation, which means it checks the enumeration values for fields as well as any unknown elements. The XSD's location should be specified in the XML header as well as in the 'Namespace declaration' tab of the XML Input stage.
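
A minimal sketch of what "XSD location in the XML header" usually looks like, using the standard xsi:schemaLocation attribute (pairs of namespace and file location). The .xsd file path and the "Status" content below are hypothetical, since the thread does not name the body schema file or show the "Status" declaration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The .xsd path is hypothetical; the thread does not name the body schema file. -->
<SDR_Body xmlns="SDR_Body_1-0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="SDR_Body_1-0 /xmls/SDR/SDR_Body_1-0.xsd">
  <Activity>Correct</Activity>
  <Status>PLACEHOLDER</Status>
</SDR_Body>
```

The first token in xsi:schemaLocation has to match the schema's targetNamespace ("SDR_Body_1-0" here); the second is where the validator should look for the schema file.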
Prakash Dasika
ETL Consultant
Sydney
Australia