DSXchange: DataStage and IBM Websphere Data Integration Forum
qt_ky



Posted: Sat Sep 16, 2017 8:08 am

DataStage® Release: 11x
Job Type: Parallel
OS: Unix
Additional info: 11.5.0.2
Any ideas on how to build an IA data rule to check if a column contains valid XML (properly constructed)? I did not see anything related to XML in the pre-built rule definitions.

_________________
Choose a job you love, and you will never have to work a day in your life. - Confucius
eostic

Posted: Sat Sep 16, 2017 11:03 am

Theoretically... in Java? Assuming you can invoke either a well-formedness checker or a schema validator... or else use a DS job itself with an Exceptions stage downstream from a hierarchical ...

_________________
Ernie Ostic

blogit!
Open IGC is Here!
qt_ky



Posted: Mon Sep 18, 2017 7:39 am

I'm not really a big XML fan or a Java developer; I'm just looking for the least-effort path to validate XML that is stored in a source database column (without any XSD). The data source is SQL Server, which we have read-only access to via ODBC on Information Server.

What are the options on AIX?

1. I did some searching and found an Apache Xerces-C++ validating XML parser. http://xerces.apache.org/xerces-c/index.html

2. We do have DataStage. I have not done a lot with XML in DataStage. Does it require an XSD or can it validate XML by itself?

3. I assume most relational database systems have XML data type support. Could we attempt to load the source data into a temporary table in the IADB database (DB2 10.5)? Is that a legitimate use of the bundled DB2?

4. Other options?

chulett

Posted: Mon Sep 18, 2017 7:59 am

It's been a long time, but from what I recall it needs the XSD in order to have a clue how to "validate" the XML. It might depend on what that means to the OP: simply checking whether it is well-formed doesn't need an XSD, but that check has to happen before something tries to process the XML or the parser will fail. Once you are sure it is well-formed, the gory details of the elements can be validated against the XSD by an ETL tool.
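
To illustrate the well-formedness half of that, here is a minimal sketch in Python. Python is not something used in this thread; the sketch assumes a Python 3 interpreter is available somewhere that can reach the data. It relies only on the standard library's expat-based `xml.etree.ElementTree` parser, and the function name `is_well_formed` is made up for the example. No XSD is involved.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a well-formed XML document.

    Hypothetical helper for illustration; it checks syntax only and
    says nothing about whether the elements match any schema.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


print(is_well_formed("<order><id>1</id></order>"))  # True
print(is_well_formed("<order><id>1</order>"))       # False (mismatched tag)
```

A document can pass this check and still be wrong for the business; deciding that is where an XSD would come in.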

I have vague recollections of using Xerces (no, not the king of Persia!) when we had to handle large (as in hundreds of megabytes) XML files for Google, or maybe it was something from the Java Beans collection? It was way too long ago, but I remember it had to be a stream-based tool, as the files we were processing were "too large" to load entirely into memory. I much preferred some of our other sources, who gave us a crap-ton of teeny little files.
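
For the "too large to load into memory" case, a stream-based check could be sketched as below. This is only an illustration using Python's standard `xml.parsers.expat` module rather than Xerces, and `stream_is_well_formed` is a hypothetical name. With no handlers registered, the parser reads the input chunk by chunk and builds nothing in memory.

```python
import io
import xml.parsers.expat


def stream_is_well_formed(stream) -> bool:
    """Check a binary file-like object for well-formedness.

    Hypothetical helper: `stream` must provide read(nbytes) returning bytes.
    No tree is built, so memory use does not grow with document size.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.ParseFile(stream)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# In-memory stand-ins for large files, just to show the call shape.
print(stream_is_well_formed(io.BytesIO(b"<feed><item/></feed>")))
print(stream_is_well_formed(io.BytesIO(b"<feed><item></feed>")))
```

For a real file the call would be something like `with open(path, "rb") as f: stream_is_well_formed(f)`, where `path` is whatever file you are checking.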

Not an XML fan either, but I work with it when forced to. ;)

_________________
-craig

Research shows that 6 out of 7 dwarves aren't happy
eostic

Posted: Mon Sep 18, 2017 3:26 pm

I would also prefer not to use Java if I can avoid it. You may be successful with the XML Input stage. It doesn't need an XSD, and it performs a Xerces-based well-formedness check when getting started. ...

eostic

Posted: Tue Sep 19, 2017 6:20 am

One very important key here is to know in advance what you are looking for in terms of validation. You mentioned that there is no XSD... Formal XML "validation" means determining if a document ...

qt_ky



Posted: Tue Sep 19, 2017 7:12 am

Thank you so much for that outline--big time saver! I will follow that and see how far I can get.

I am sticking with just the surface level initially (what you called well-formed-ness) then will see where it goes with the customer.
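
A surface-level check of that kind, applied to column values, might look like the sketch below. It is in Python purely for illustration and makes some assumptions: rows arrive as (key, xml_text) pairs from whatever reads the column, and NULL or blank values count as failures, which is a policy choice the customer would need to confirm. `find_malformed` is a hypothetical helper, not part of any IBM tooling.

```python
import xml.etree.ElementTree as ET


def find_malformed(rows):
    """Return a list of (key, reason) for rows whose XML is not well-formed.

    rows: iterable of (key, xml_text) pairs, e.g. fetched from the source table.
    Assumption: None and whitespace-only values are reported as failures.
    """
    failures = []
    for key, xml_text in rows:
        if xml_text is None or not xml_text.strip():
            failures.append((key, "empty value"))
            continue
        try:
            ET.fromstring(xml_text)
        except ET.ParseError as exc:
            # The parser's message includes the line and column of the first error.
            failures.append((key, str(exc)))
    return failures


sample_rows = [
    (101, "<note><to>A</to></note>"),
    (102, "<note><to>A</note>"),
    (103, None),
]
for key, reason in find_malformed(sample_rows):
    print(key, reason)
```

Only the first error per value is reported, because the parser stops at the first well-formedness violation.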

All times are GMT - 6 Hours