
XML Files in Information Analyzer

Posted: Mon May 22, 2017 3:55 am
by Nagac
Hi

We have XML files as a source, which we want to profile in IA. Is that possible? If not, we will end up staging the data first and then profiling it (which is a little complex, as the XML files are massive).

I know we can profile RDBMS tables and flat files, but I couldn't find a way to handle XML files. Can someone advise?

Thanks

Posted: Mon May 22, 2017 7:57 am
by eostic
You need to parse them first, into the various chunks of text that you want to examine. Usually this means sending the columns to the xmlInput or, alternatively, the Hierarchical Stage, and parsing them there into their appropriate rows and columns, based on the hierarchical structure of your particular XML document and the various tags (XML elements and attributes) that it contains. One "column" in one "row" of XML in an RDBMS could potentially contain thousands of rows and columns of actual "data" to be examined.
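
To make the "one column holds many rows" point concrete, here is a minimal sketch in plain Python (not DataStage, and not how the xmlInput or Hierarchical Stage works internally). The sample document, the `order` tag, and the `xml_to_rows` helper are all made up for illustration; it only assumes that each repeating element should become one row, with its attributes and leaf child elements as columns.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML value, as it might sit in a single column of a single row.
XML_VALUE = """
<orders>
  <order id="1001" status="shipped">
    <customer>Acme</customer>
    <total>250.00</total>
  </order>
  <order id="1002" status="open">
    <customer>Globex</customer>
    <total>75.50</total>
  </order>
</orders>
"""


def xml_to_rows(xml_text, row_tag):
    """Turn each <row_tag> element into a dict of column name -> value."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.iter(row_tag):
        row = dict(node.attrib)  # XML attributes become columns
        for child in node:
            # Child elements become columns (only their direct text is kept)
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


rows = xml_to_rows(XML_VALUE, "order")
for row in rows:
    print(row)
```

Each printed dict is one "row" that a profiling tool could examine once it has been loaded into a table or written to a flat file.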

Ernie

Posted: Mon May 22, 2017 10:03 am
by Nagac
Thanks Ernie,

Do you mean we need to load these chunks into tables and then do the profiling?
One "column" in one "row" of xml in an rdbms could potentially contain thousands of rows and columns of actual "data" to be examined.

Posted: Mon May 22, 2017 7:36 pm
by eostic
Yes, and for several key reasons. The most important is that any fairly standard profiling tool needs to look at "rows" of data. This means parsing your XML structure into its potentially many different table relationships. Some XML documents are simple, single repeating nodes, but not many. Each parent/child/grandchild (etc.) path is a potential set of rows for analysis. Parse them out and then analyze the resulting tables or sequential files.
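
As a rough illustration of "each path is a potential set of rows", the sketch below splits a small nested document into one table per parent/child path and renders a table as CSV text. It is plain Python rather than a DataStage job, and the sample XML, the `row_id` / `parent_row_id` key columns, and the `split_into_tables` and `to_csv` helpers are assumptions made up for this example. It treats elements with no attributes and no children as columns of their parent, and everything else as a row in its own table.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document with a customer -> order -> item hierarchy.
XML_DOC = """<customers>
  <customer id="C1">
    <name>Acme</name>
    <order id="O1">
      <date>2017-05-01</date>
      <item sku="A-1"><qty>2</qty></item>
      <item sku="B-7"><qty>1</qty></item>
    </order>
  </customer>
  <customer id="C2">
    <name>Globex</name>
  </customer>
</customers>"""


def split_into_tables(xml_text):
    """Return {element path: [row, ...]}, one table per nested path."""
    tables = defaultdict(list)

    def walk(node, path, parent_row_id):
        row_id = len(tables[path]) + 1  # surrogate key within this table
        row = {"row_id": row_id, "parent_row_id": parent_row_id}
        row.update(node.attrib)
        tables[path].append(row)
        for child in node:
            if len(child) == 0 and not child.attrib:
                # Simple leaf element: a column on the current row
                row[child.tag] = (child.text or "").strip()
            else:
                # Nested element: a row in a child table, linked by parent_row_id
                walk(child, path + "/" + child.tag, row_id)

    root = ET.fromstring(xml_text)
    for child in root:
        walk(child, root.tag + "/" + child.tag, None)
    return dict(tables)


def to_csv(rows):
    """Render one table as CSV text, using the union of all column names."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


tables = split_into_tables(XML_DOC)
for path, table_rows in tables.items():
    print(path)
    print(to_csv(table_rows))
```

Here the one document yields three tables (customer, order, item), each of which could be written out as a sequential file or loaded into a staging table and profiled separately.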

Ernie