
XML Hierarchical Stage Produces No Rows

Posted: Fri Jun 22, 2018 10:02 am
by irvinew
I have successfully read in 4 XSD files.

I have 1 xml source test file to read in.

The source XML file really has 2 parts. The first, called AcademicRecordBatch, has basic Sender/Destination info that does not change; I can read that in, and it produces rows in a separate job.

The second part has high school transcript data, called HighSchoolTranscript. I have problems with this part. This is in a separate job.

I am using 2 jobs because I couldn't get the union parser stage to work without hanging the system.

My job is simple to start: it goes from a Hierarchical stage straight to a Peek stage just to get things going. The problem is that the high school job compiles but produces no rows. To start, it only maps firstname and lastname.

Sometimes I run into a scalar error like this:

com.ibm.e2.core.exceptions.E2IllegalStateException: CDIER0835E: In step XML_Parser, the Hierarchical Data stage tried to assign the value true to the {http://www.ibm.com/e2/reserved}@@isPresent scalar element, but the element already has the value true. Possible mapping error involving ListToGroup. The parent list element for the scalar element is AcademicSummary.
Test completed

I'm not sure whether warnings prevent the rows from being read, but the job compiles successfully and produces no rows, whether the mapping is simple or complex. I know this is rather vague, but any thoughts on where to start debugging? I have tried chunking, which also compiles successfully, but still produces no rows.

Thanks for the help in advance.




This is the source xml file

<?xml version="1.0" encoding="UTF-8"?>
<AcRecBat:AcademicRecordBatch xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:org:pesc:message:AcademicRecordBatch:v2.1.0 AcademicRecordBatch_v2.1.xsd">
	<BatchEnvelope>
		<BatchID>00000001</BatchID>
		<BatchDateTime>2018-01-31T11:06:35-07:00</BatchDateTime>
		<BatchDeliveryMethod>DeliverWhole</BatchDeliveryMethod>
		<SourceAgency>
			<Organization>
				<APAS>SK00000000</APAS>
				<LocalOrganizationID>
					<LocalOrganizationIDCode>SK00000000</LocalOrganizationIDCode>
					<LocalOrganizationIDQualifier>SK</LocalOrganizationIDQualifier>
				</LocalOrganizationID>
				<OrganizationName>Saskatchewan Ministry of Education</OrganizationName>
				<Contacts>
					<Phone>
						<AreaCityCode>306</AreaCityCode>
						<PhoneNumber>7876012</PhoneNumber>
					</Phone>
					<Email>
						<EmailAddress>student.records@gov.sk.ca</EmailAddress>
					</Email>
				</Contacts>
			</Organization>
		</SourceAgency>
		<DestinationAgency>
			<Organization>
				<PSIS>47004000</PSIS>
				<LocalOrganizationID>
					<LocalOrganizationIDCode>47004000</LocalOrganizationIDCode>
					<LocalOrganizationIDQualifier>SK</LocalOrganizationIDQualifier>
				</LocalOrganizationID>
				<OrganizationName>University of Regina</OrganizationName>
			</Organization>
		</DestinationAgency>
	</BatchEnvelope>
	<BatchContent>
		<HSTrn:HighSchoolTranscript xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:n1="http://www.altova.com/samplexml/other-namespace" xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0" xsi:schemaLocation="urn:org:pesc:message:HighSchoolTranscript:v1.5.0 HighSchoolTranscript_v1.5.0.xsd">
			<TransmissionData>
				<DocumentID>2018-01-3111061</DocumentID>
				<CreatedDateTime>2018-01-31T11:06:35-07:00</CreatedDateTime>
				<DocumentTypeCode>StudentRequest</DocumentTypeCode>
				<TransmissionType>Original</TransmissionType>
				<Source>
					<Organization>
						<APAS>SK00000000</APAS>
						<LocalOrganizationID>
							<LocalOrganizationIDCode>SK00000000</LocalOrganizationIDCode>
							<LocalOrganizationIDQualifier>SK</LocalOrganizationIDQualifier>
						</LocalOrganizationID>
						<OrganizationName>Saskatchewan Ministry of Education</OrganizationName>
						<Contacts>
							<Phone>
								<AreaCityCode>306</AreaCityCode>
								<PhoneNumber>7876012</PhoneNumber>
							</Phone>
							<Email

Posted: Fri Jun 22, 2018 10:46 am
by wpkalsow
The XML_Parser error will only occur during parsing of data to match the xsd during testing or execution of the stage.

Sounds like it is a mapping issue to me.

Can you share the xsd for this data?

Posted: Fri Jun 22, 2018 1:40 pm
by irvinew
Hope this helps; it's the HighSchool XSD. The imported CoreMain is far too large to paste here, and the AcademicRecord is quite big as well. Those 2 files don't appear in the Library as namespaces; only the AcademicRecordBatch and HighSchoolTranscript can be found to be used as XSD documents in the parser stages.

Thank you in advance :)

	<xs:import namespace="urn:org:pesc:core:CoreMain:v1.14.0" schemaLocation="CoreMain_v1.14.0.xsd"/>
	<xs:import namespace="urn:org:pesc:sector:AcademicRecord:v1.9.0" schemaLocation="AcademicRecord_v1.9.0.xsd"/>
	<!--============================================================================-->
	<!--Name:      HighSchoolTranscript.xsd  -->
	<!--Version:  1.5.0-->
	<!--Date:       17-December-2014-->
	<!---->
	<!--Change Log:-->
	<!--v1.0.x 23-May-2005 Bruce Marton  - Draft version proposed by PESC High School Transcript workgroup. -->
	<!--v1.0.x 24-May-2005 Bruce Marton  -Minor corrections. -->
	<!--v1.0.x 15-September-2005 Bruce Marton  - Additional draft changes proposed by PESC High School Transcript workgroup. -->
	<!--v1.0.0 15-February-2006 Bruce Marton  - Final proposed changes for PESC High School Transcript as approved for public comment by PESC Change Control Board (reviewed - JAF). -->
	<!--v1.2.0 29-April-2011 Jeffrey Funck  -  -->
	<!--Include all changes requested from Tom Stewart -->
	<!--   Change #   TS20110329030400 -->
	<!--v1.3.0 15-June-2012 Jeffrey Funck  -  -->
	<!--Modify to pull in new versions of sector libraries -->
	<!--   Change #   TS20120305094902 -->
	<!--v1.4.0 15-October-2013 Jeffrey Funck  -  -->
	<!--Modified to use the newest version of CoreMain (v1.13.0)-->
	<!--   Change #   TS20130624000001 -->
	<!--v1.5.0 17-December-2014 Jeffrey Funck  -  -->
	<!--Modified to use the newest version of CoreMain (v1.14.0)-->
	<!--   Change #   MB20140606000001 -->
	<!--============================================================================-->
	<!---->
	<xs:element name="HighSchoolTranscript">
		<xs:complexType>
			<xs:sequence>
				<xs:element name="TransmissionData" type="AcRec:TransmissionDataType"/>
				<xs:element name="Student" type="AcRec:K12StudentType"/>
				<xs:element name="NoteMessage" type="core:NoteMessageType" minOccurs="0" maxOccurs="unbounded"/>
				<xs:element name="UserDefinedExtensions" type="core:UserDefinedExtensionsType" minOccurs="0"/>
			</xs:sequence>
		</xs:complexType>
	</xs:element>
</xs:schema>

Posted: Fri Jun 22, 2018 2:16 pm
by chulett
:idea:

FYI - code tags preserve whitespace / formatting, otherwise the forum software gets rid of all those pesky 'extra' spaces.

Posted: Wed Jun 27, 2018 1:41 pm
by irvinew
Further context:

I think I figured out why it doesn't read rows; though I've yet to get it to work.

I have multiple XSDs validating and parsing a single XML file. That XML file has other XML documents bundled into it, which are supposed to be contained in "BatchContent". Within the document root there is an element whose type refers to this bundled "BatchContent" section. Here is the code for the main XSD document:


<xs:schema xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" xmlns:AcRec="urn:org:pesc:sector:AcademicRecord:v1.9.0" xmlns:core="urn:org:pesc:core:CoreMain:v1.14.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" elementFormDefault="unqualified" attributeFormDefault="unqualified" version="v2.1.0">
<xs:import namespace="urn:org:pesc:core:CoreMain:v1.14.0" schemaLocation="CoreMain_v1.14.0.xsd"/>
<xs:import namespace="urn:org:pesc:sector:AcademicRecord:v1.9.0" schemaLocation="AcademicRecord_v1.9.0.xsd"/>
<!--============================================================================-->
<!--Name: AcademicRecordBatch-->
<!--Version: 2.1.0-->
<!--Date: 17-December-2014-->
<!---->
<!--Change Log:-->
<!-- Change # JTS20070816102300 -->
<!-- Reviewed by Jeffrey A Funck -->
<!--2.0.0 14-March-2008 Tuan Anh Do - Restructured Schema to include Transmission Data Segment for sending/receiving agencies -->
<!-- Changes for this version is not backwards compatible with v1.0.0. -->
<!-- The Batch Content is mandatory and places the data package as the child which moves it down one layer.-->
<!-- The Batch Envelope is entirely Optional -->
<!--v2.1.0 17-December-2014 Jeffrey Funck - -->
<!--Modified to use the newest version of CoreMain (v1.14.0)-->
<!-- Change # MB20140606000001 -->
<!--============================================================================-->
<!---->
<xs:element name="AcademicRecordBatch">
	<xs:complexType>
		<xs:sequence>
			<xs:element name="BatchEnvelope" type="AcRec:TransmissionBatchType" minOccurs="0"/>
			<xs:element name="BatchContent" type="core:AcademicRecordBatchType"/>
		</xs:sequence>
	</xs:complexType>
</xs:element>
</xs:schema>



The "BatchContent" element's type is defined in a separate file, the CoreMain XSD document. This is the referenced type:



<xs:complexType name="AcademicRecordBatchType">
	<xs:annotation>
		<xs:documentation>This is used to create a place holder and root element to contain multiple logical XML documents that are bundled for a single batch transmission</xs:documentation>
	</xs:annotation>
	<xs:sequence>
		<xs:any namespace="##other" processContents="strict" maxOccurs="unbounded"/>
	</xs:sequence>
</xs:complexType>
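
A quick way to check, outside DataStage, what that wildcard has to match: the xs:any with namespace="##other" is declared in CoreMain, so each child of BatchContent must be in a namespace other than CoreMain's target namespace (and must not be unqualified). The Python sketch below only lists the children of BatchContent with their namespaces. The embedded sample is a trimmed, hypothetical stand-in for the real file, and the helper names (split_qname, wildcard_children, satisfies_other) are made up for illustration; nothing here is a DataStage API.

```python
import xml.etree.ElementTree as ET

# Target namespace of the schema that declares the wildcard (CoreMain).
CORE_NS = "urn:org:pesc:core:CoreMain:v1.14.0"

# Trimmed, hypothetical stand-in for the batch file posted in this thread.
SAMPLE = """<AcRecBat:AcademicRecordBatch xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0">
  <BatchContent>
    <HSTrn:HighSchoolTranscript xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0">
      <TransmissionData>
        <DocumentID>2018-01-3111061</DocumentID>
      </TransmissionData>
    </HSTrn:HighSchoolTranscript>
  </BatchContent>
</AcRecBat:AcademicRecordBatch>
"""


def split_qname(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def wildcard_children(xml_text):
    """Return (namespace, local name) for each element under BatchContent."""
    root = ET.fromstring(xml_text)
    # BatchContent is unqualified because the schema uses elementFormDefault="unqualified".
    content = root.find("BatchContent")
    if content is None:
        return []
    return [split_qname(child.tag) for child in content]


def satisfies_other(namespace):
    """True if the namespace is allowed by ##other as declared in CoreMain."""
    return namespace not in ("", CORE_NS)


for namespace, local in wildcard_children(SAMPLE):
    verdict = "ok" if satisfies_other(namespace) else "not allowed by ##other"
    print(local, namespace, verdict)
```

If the transcript shows up here with the HighSchoolTranscript namespace, the instance side of the wildcard is fine, and the remaining question is whether the parser has a schema loaded for that namespace (processContents="strict" requires one).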




But in my parser, DataStage doesn't want to drill down into it, so all I get in the test data is this:

</BatchEnvelope>
<BatchContent><?xml version="1.0" encoding="UTF-8"?><?xml version="1.0" encoding="UTF-8"?><?xml version="1.0" encoding="UTF-8"?></BatchContent>
</AcRecBat:AcademicRecordBatch>


In the tree structure all I see is this:

ns0:BatchContent
- wildcard0
? e2res:text()


Any thoughts on how to get this to work? I did validate all files at xmlvalidator.com

Posted: Wed Jun 27, 2018 2:37 pm
by chulett
Thanks Will... I nuked the other topic. :wink:

Posted: Fri Jul 06, 2018 10:05 am
by irvinew
Update:

I got 1 row to work. It seems the reason the Hierarchical stage didn't want to drill down into the XML was that it couldn't resolve the schema location... I think.

Anyway I got it to resolve 2 different ways:

1) I butchered the XSD and turned the HighSchoolTranscript element into a complex type:

<xs:complexType name="HighSchoolTranscriptDataType">
	<xs:sequence>
		<xs:element name="TransmissionData" type="AcRec:TransmissionDataType" minOccurs="0" maxOccurs="unbounded"/>
		<xs:element name="Student" type="AcRec:K12StudentType" maxOccurs="unbounded"/>
		<xs:element name="NoteMessage" type="core:NoteMessageType" minOccurs="0"/>
		<xs:element name="UserDefinedExtensions" type="core:UserDefinedExtensionsType" minOccurs="0"/>
	</xs:sequence>
</xs:complexType>

2) I altered the schemaLocation to be a direct link

"urn:org:pesc:message:AcademicRecordBatch:v2.1.0 AcademicRecordBatch_v2.1.xsd"

became

"AcademicRecordBatch_v2.1.xsd"

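For reference, the XML Schema spec defines xsi:schemaLocation as a whitespace-separated list of namespace/location pairs, so the shortened single-token value is no longer in that form (a bare location normally goes in xsi:noNamespaceSchemaLocation instead). Why the shortened value helped in this case is not established in this thread. A minimal Python sketch of the pair structure follows; parse_schema_location is a hypothetical helper name, not part of any library.

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs.

    Raises ValueError when the tokens do not pair up, which is the case for
    a value that holds only a file name.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError(
            "expected namespace/location pairs, got %d token(s)" % len(tokens)
        )
    return list(zip(tokens[0::2], tokens[1::2]))


# The original attribute value from the test file.
original = "urn:org:pesc:message:AcademicRecordBatch:v2.1.0 AcademicRecordBatch_v2.1.xsd"
print(parse_schema_location(original))
```

Running the same function on the shortened value raises ValueError, which is a quick way to spot a schemaLocation that a stricter parser may ignore or reject.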

The AcademicRecordBatch schema is what the test file validates against; its root element is defined as follows:

<xs:element name="AcademicRecordBatch">
		<xs:complexType>
			<xs:sequence>
				<xs:element name="BatchEnvelope" type="AcRec:TransmissionBatchType" minOccurs="0"/>
				<xs:element name="BatchContent" type="core:AcademicRecordBatchType"/>
			</xs:sequence>
		</xs:complexType>
</xs:element>
Anyway, regardless of how I altered the XSD, DataStage and my XML editor would always stumble in 2 ways.

It didn't like this line:

<HSTrn:HighSchoolTranscript xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:n1="http://www.altova.com/samplexml/other-namespace" xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0" xsi:schemaLocation="urn:org:pesc:message:HighSchoolTranscript:v1.5.0 HighSchoolTranscript_v1.5.0.xsd">


This line is in the testfile; it specifies the xml for the transcript


If I took it out, it complained that it couldn't find the following 2 elements from the XSD above, which don't exist in the test file; but their minOccurs is 0, so ???

<xs:element name="NoteMessage" type="core:NoteMessageType" minOccurs="0"/>
<xs:element name="UserDefinedExtensions" type="core:UserDefinedExtensionsType" minOccurs="0"/>


Anyway, I am damned either way and can't catch a break. Can someone explain why DataStage cares about the missing files?
Why does DataStage complain about the <HSTRN:HighSchool................. line??


Sorry for the long post; trying to give as much info as possible.

Posted: Fri Jul 06, 2018 12:53 pm
by chulett
I can't really help, but I am of the opinion that we'd rather have too much information than not enough. :wink:

Out of curiosity, have you involved your official support provider yet?

Posted: Tue Jul 17, 2018 10:19 am
by eostic
Still having issues with this xml?

Solved

Posted: Tue Jul 24, 2018 9:03 am
by irvinew
I have solved my own problem.

1. DataStage does not like the XSD document I had; I had to reverse engineer one of the XSD documents, turning an element into a complex type, just so DataStage could drill down into the XML. The XML had an embedded XML document, and there could be 1 or many of those documents.

2. I set my validation to Reject

3. Whatever is in those XSD documents (there are 4 of them), DataStage doesn't like them: it consumes so much memory that it fails at some point, so by the time the parser reaches the transcripts it stops reading. Eventually the server begins to throw Java stack and heap errors and the system bogs down.

I don't know if it is a rookie mistake, but I still had sample test data for testing the parser in there. Once I took it out, it worked. At some point I made my own XSD document from the XML document to generate sample test data for the parser; that is where I stumbled on the answer. I don't know why the system gets bogged down because of this; maybe someone can chime in on that.
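
This is not a DataStage fix, but if memory is the limiting factor, one workaround outside the stage is to split the batch into its individual transcripts before parsing, so each job run only sees one small document. The Python sketch below streams the file with the standard library's iterparse and yields each HighSchoolTranscript as its own XML string. The sample data is a hypothetical stand-in (the second DocumentID is made up), and iter_transcripts is an illustrative name.

```python
import io
import xml.etree.ElementTree as ET

HS_NS = "urn:org:pesc:message:HighSchoolTranscript:v1.5.0"
TRANSCRIPT_TAG = "{%s}HighSchoolTranscript" % HS_NS

# Hypothetical two-transcript batch; the second DocumentID is invented.
SAMPLE = b"""<AcRecBat:AcademicRecordBatch xmlns:AcRecBat="urn:org:pesc:message:AcademicRecordBatch:v2.1.0" xmlns:HSTrn="urn:org:pesc:message:HighSchoolTranscript:v1.5.0">
  <BatchContent>
    <HSTrn:HighSchoolTranscript>
      <TransmissionData><DocumentID>2018-01-3111061</DocumentID></TransmissionData>
    </HSTrn:HighSchoolTranscript>
    <HSTrn:HighSchoolTranscript>
      <TransmissionData><DocumentID>2018-01-3111062</DocumentID></TransmissionData>
    </HSTrn:HighSchoolTranscript>
  </BatchContent>
</AcRecBat:AcademicRecordBatch>
"""


def iter_transcripts(source):
    """Yield each HighSchoolTranscript in the batch as a standalone XML string.

    `source` is a file name or a binary file object. Each transcript's
    subtree is cleared after it is yielded, so its children are not kept
    in memory while the rest of the batch is read.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == TRANSCRIPT_TAG:
            yield ET.tostring(elem, encoding="unicode")
            elem.clear()


docs = list(iter_transcripts(io.BytesIO(SAMPLE)))
print(len(docs), "transcript(s) extracted")
```

Each yielded string carries its own namespace declaration, so it can be written to a separate file and fed to a parser that only needs the HighSchoolTranscript schema.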

Thanks to the moderators for the help. Cheers.