How do you string a series of xml records together ?

Formerly known as "Mercator Inside Integrator 6.7", DataStage TX enables high-volume, complex transactions without the need for additional coding.

Moderators: chulett, rschirm

jazzer1
Participant
Posts: 37
Joined: Mon Mar 20, 2006 10:26 am

Post by jazzer1 »

Is there some way of extracting the data from the XML file without including the tags ?
jgibby
Participant
Posts: 42
Joined: Thu Dec 16, 2004 8:48 am

Post by jgibby »

I tested and found out the SERIESTOTEXT function will not do it. I don't know of a way to extract just the data values when you don't know what the xml tags are.
"Artificial intelligence is no match for natural stupidity."
jazzer1

I have a fix but not very pretty

Post by jazzer1 »

I have one solution but I had to hardcode some stuff. This will work for now.....

=SUBSTITUTE(LEAVEPRINT(record01),"<data1>","","</data1>","|") etc.

This returns a result of 111|222|333 etc.

However, I had to hardcode the xml tags to be replaced.
There must be a way to do a wildcard search and replace.
I'll figure that out next.
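
For readers wondering what such a wildcard search and replace would need to do, here is a minimal sketch in Python rather than in a DataStage TX map rule. It is only an illustration of the idea: the function name `strip_tags_to_pipes` and the sample record are hypothetical, and nothing here says a map rule can express the same pattern.

```python
import re

def strip_tags_to_pipes(xml_text: str) -> str:
    """Remove every opening tag and turn every closing tag into '|'."""
    # Any closing tag, e.g. </data1>, becomes the field separator.
    piped = re.sub(r"</[^>]+>", "|", xml_text)
    # Any remaining (opening) tag, e.g. <data1>, is dropped.
    no_tags = re.sub(r"<[^>]+>", "", piped)
    # Collapse separator runs left by nested closing tags, then trim the ends.
    return re.sub(r"\|+", "|", no_tags).strip("|")

# Hypothetical sample record, modelled on the 111|222|333 result above.
record = "<data1>111</data1><data2>222</data2><data3>333</data3>"
print(strip_tags_to_pipes(record))
```

Note that collapsing separator runs also hides empty elements, so this sketch only suits data where every tag carries a value.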

Thanks to all who contributed.
jgibby

Post by jgibby »

Maybe there is a better way to approach this. If it is possible that you can change the output type tree, then there is a very simple solution.

You are trying to take an xml series object and extract just the data elements into a single object expression on the output side. However, if you modify the designated output field to make it a series object as well, then the problem is solved very simply. Make the target field on the output type tree a series object infix delimited by the pipe character.

New Output Series Object Field Formula:

Code:

=FieldData:DummyRec:Input01
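
The difference an infix delimiter makes can be sketched outside the tool. The Python below is an illustration only, with hypothetical function names: an infix delimiter sits between series members, whereas appending "|" to each value leaves a trailing separator.

```python
def infix_join(values, delimiter="|"):
    """Delimiter appears only between members, as with an infix-delimited series."""
    return delimiter.join(values)

def postfix_join(values, delimiter="|"):
    """Delimiter follows every member, including the last one."""
    return "".join(value + delimiter for value in values)

values = ["111", "222", "333"]
print(infix_join(values))    # separators between members only
print(postfix_join(values))  # one extra separator at the end
```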
John
jazzer1

Post by jazzer1 »

I ended up using this:

=F_String(SUBSTITUTE(LEAVEPRINT(Record1),"<Data1>","","</Data1>","|")) etc...

The only problem being I have to hardcode the values of the tags.
The next step is to come up with a way to inspect the string in a wildcard search.

I've seen the new post.

Thanks very much to all who helped.
jazzer1

Post by jazzer1 »

John: I apologize for these basic questions, but how do I define a series object ? I have a group defined and a record(0:s) under it.
jgibby

Post by jgibby »

I take it you have fields or columns defined under the Record series object. Something like this:

Code:

File
	Record (0:s)
		Field1 (1:1)
		Field2 (1:1)
		Field3 (1:1)
		Field4 (1:1)
		FieldN (1:1)
What you are going to want to do is create another group object that will hold the field in question. Set the group's delimiter and put the field in the group's component window and set the range. Now put the new group in the record object's component window in place of the single field.
It would look something like this:

Code:

File
	Record (0:s)
		Field1 (1:1)
		Field2 (1:1)
		Field3 (1:1)
		Field4GRP (1:1)
			Field4 (0:s)
		FieldN (1:1)
I'm running on 7.5.1. If you want to send me the Type Tree, I'll mock it up for you. I'll PM you my email address.

John
jazzer1

Post by jazzer1 »

I'm close....(I think)
Input file looks like this:

<tag1><tag2>dataxxxxxxxxxxx</tag2><tag3>dataxxxxx</tag3>>tag4>dataxxxxxxxxxxxzzz</tag4></tag1>

The input is one continuous stream.

I have the input file defined like this:

Extracted_XML Group
Extract_Rec(0:s)
start_tag(0:s) initiator = <, terminator = >
data_string(0:1)
end_tag(0:s) initiator = </, terminator = >

I'm trying to write out the data_string stuff separated by a "|"

So far, the output looks like this:

dataxxxxxxxxxxx</tag2><tag3>dataxxxxx</tag3>>tag4>dataxxxxxxxxxxxzzz</tag4></tag1>

The map drops the first two tags (which is good) but writes the rest of the file as is. Am I close ?
jgibby

Post by jgibby »

For some reason, your type tree looks a little strange to me, but I have an idea. Try this:

Code:

=SERIESTOTEXT(
	EXTRACT(
		data_string:Extract_Rec:Extracted_XML
		,data_string:Extract_Rec:Extracted_XML != ""
	) + "|"
)
Looks like it might be worth a shot. Let us know.

John
jazzer1

Post by jazzer1 »

Same result. I would have thought something would change but it didn't.
jazzer1

Post by jazzer1 »

I looked at the log and the map recognizes the first two tags but then thinks the rest of the stream is data. It's not recognizing the </tag3> after the first chunk of data.
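
The symptom, where everything after the first start tags is treated as data, resembles what happens when a data field has nothing that tells the parser where it ends. The Python sketch below is an analogy only, not a description of how DataStage TX validates a type tree. The sample stream is a simplified, well-formed stand-in for the real input, and both function names are hypothetical.

```python
import re

def unbounded_data(stream: str) -> str:
    """Skip leading start tags, then let the data field run to the end of the stream."""
    match = re.match(r"(?:<[^/>][^>]*>)+(.*)", stream)
    return match.group(1) if match else ""

def bounded_data(stream: str) -> list:
    """Treat the data field as ending at the next '<', so each value is isolated."""
    return re.findall(r">([^<]+)</", stream)

# Simplified stand-in for the input stream described above.
stream = "<tag1><tag2>aaa</tag2><tag3>bbb</tag3><tag4>ccc</tag4></tag1>"

print(unbounded_data(stream))          # the rest of the stream, tags included
print("|".join(bounded_data(stream)))  # only the data values, pipe separated
```

If the analogy holds, the thing to check in the type tree would be whether data_string has any property that stops it at the "<" of the following end tag. That is an assumption, not something confirmed in this thread.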