Divide a figure-8-shaped ring into two rings

Post questions here relative to DataStage Server Edition for such areas as Server job design, DS Basic, Routines, Job Sequences, etc.

Pacific007
Participant
Posts: 35
Joined: Wed Oct 06, 2010 11:24 am

Post by Pacific007 »

Hi,

I have ring data in which one ring forms a figure 8 (or infinity symbol).

The data looks like this. A_point and B_point are two points, and each link joins them; if you draw the data below, it forms a figure 8.

Ring  Link  A_point  B_point
R1    F1    H        F
R1    F2    E        F
R1    F6    H        G
R1    F5    G        E
R1    F3    E        A
R1    F4    D        E
R1    F9    D        C
R1    F7    B        C
R1    F8    A        B

I want to divide it into two rings (R1 and R2), like this:

R1 F1 H F
R1 F2 E F
R1 F6 H G
R1 F5 G E

R2 F3 E A
R2 F4 D E
R2 F9 D C
R2 F7 B C
R2 F8 A B

Please reply if anyone has an idea.
Pacific
ray.wurlod
Participant
Posts: 54607
Joined: Wed Oct 23, 2002 10:52 pm
Location: Sydney, Australia

Post by ray.wurlod »

Transformer stage with two outputs, constrained on the value of the first column (R1 or R2).
IBM Software Services Group
Any contribution to this forum is my own opinion and does not necessarily reflect any position that IBM may hold.

Post by Pacific007 »

Hi Ray, the source data doesn't contain R2, only R1. I want R2 to appear in the output.
Pacific
chandra.shekhar@tcs.com
Premium Member
Posts: 353
Joined: Mon Jan 17, 2011 5:03 am
Location: Mumbai, India

Post by chandra.shekhar@tcs.com »

Based on what condition or logic do you want the rings to be divided?
Thanx and Regards,
ETL User

Post by ray.wurlod »

OK, so what determines which ring a row belongs in?

Post by Pacific007 »

If you draw a diagram using A_point and B_point, connecting them with the link column, you get a figure 8 or infinity symbol. I want those two loops as separate rings; that is, just update the ring name in the first column so that the data no longer forms a figure 8. In the original it is one ring twisted at one point, so we have to divide it into two separate rings: the rows in one loop of the 8 belong to one ring, and the remaining rows belong to the other.

I hope this is clearer now.
Pacific
jcthornton
Premium Member
Posts: 79
Joined: Thu Mar 22, 2007 4:58 pm
Location: USA

Post by jcthornton »

C--B--A
|     |
D-----E---F
      |   |
      G---H
Is this what you mean? And you want to be able to detect both cycles and divide it like this:

E---F
|   |
G---H
and

C--B--A
|     |
D-----E
I hope you understand that this is a very difficult puzzle for DataStage to solve, simply because of how each stage gets to look at the data. In this case, by looking at 1 record at a time, you need to combine data from 9 records to get the right answer - and if you have an extended dataset (say 1000 'rings' with 1200 links) there is no easy way to even group together the links you need to look at together.

And what happens if the A-E link does not exist? Does this still need to be split? Are these two-way links or one-way links?

You are only giving us part of the problem here. What is the full problem space?

Based on my understanding of your problem right now, I am going to have to say that DataStage is not the right tool for this part of the job. It is somewhat like using a hammer to put in a rivet. You might be able to find a way to do it, but it is going to be a lot more painful than it ought to be.
Jack Thornton
----------------
Spectacular achievement is always preceded by spectacular preparation - Robert H. Schuller

Post by Pacific007 »

Thanks Jack, for presenting my problem exactly the way I meant it. Yes, links may come in either direction; they are two-way. I know it's difficult but not impossible. We can use DataStage plus SQL to try to solve it, and I have to solve it.

I have tried with a query, but I am only halfway there, if that.

The source data shown above is in table Test.
I loaded that data into a new table, test1, converting each row into two rows and the two columns A_point and B_point into a single column, point.

Then I ran this query:

select point from (select count(point) as num, point, ring from test1 group by point, ring having count(point)>2)

In this data every point appears twice, but the crossover point appears 4 times, so the query above returns the crossover point, i.e. 'E'.
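The same count can be sketched outside the database. This is a minimal Python sketch, assuming the nine sample rows from the first post; the names `LINKS` and `crossover_points` are made up for illustration and are not part of any DataStage or Oracle API.

```python
from collections import Counter

# Sample rows from this thread: (ring, link, a_point, b_point).
LINKS = [
    ("R1", "F1", "H", "F"),
    ("R1", "F2", "E", "F"),
    ("R1", "F6", "H", "G"),
    ("R1", "F5", "G", "E"),
    ("R1", "F3", "E", "A"),
    ("R1", "F4", "D", "E"),
    ("R1", "F9", "D", "C"),
    ("R1", "F7", "B", "C"),
    ("R1", "F8", "A", "B"),
]


def crossover_points(links):
    """Return {ring: sorted points that appear on more than two links}."""
    degree = Counter()
    for ring, _link, a_point, b_point in links:
        # Equivalent to unpivoting A_point and B_point into one column.
        degree[(ring, a_point)] += 1
        degree[(ring, b_point)] += 1

    result = {}
    for (ring, point), count in degree.items():
        if count > 2:  # same test as HAVING COUNT(point) > 2
            result.setdefault(ring, []).append(point)
    return {ring: sorted(points) for ring, points in result.items()}


print(crossover_points(LINKS))
```

For the sample rows this reports 'E' as the only crossover point of ring R1, matching the query result.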

Next, using DataStage, I created one more table, test2, holding all the original rows plus a copy of each row with a_point and b_point reversed, except for rows that contain the crossover point 'E'.

On test2 I ran this query:

select link, b_point, a_point, level as num, connect_by_isleaf "IsLeaf", connect_by_iscycle "Cycle", sys_connect_by_path(link,'/') "path" from test2 where connect_by_isleaf>0 start with a_point='E' connect by nocycle prior a_point=b_point order by num

I got 4 records. Two of the path traversals are what I want, but how do I remove the unwanted paths, and after that, how do I update the data?

Please advise.
Pacific
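One way to sidestep the path filtering, sketched here in Python rather than SQL: once the crossover point is known, treat it as a cut and group the links that stay connected through the remaining points. This is a different technique from the CONNECT BY query above, not a fix to it. It assumes exactly one crossover point per ring, and `split_at_crossover`, `relabel`, and the choice of "R2" as the new ring name are illustrative only.

```python
# Sample rows from this thread: (ring, link, a_point, b_point).
LINKS = [
    ("R1", "F1", "H", "F"), ("R1", "F2", "E", "F"), ("R1", "F6", "H", "G"),
    ("R1", "F5", "G", "E"), ("R1", "F3", "E", "A"), ("R1", "F4", "D", "E"),
    ("R1", "F9", "D", "C"), ("R1", "F7", "B", "C"), ("R1", "F8", "A", "B"),
]


def split_at_crossover(links, crossover):
    """Group links into loops that touch each other only at `crossover`."""
    parent = {}

    def find(point):
        # Union-find lookup with path halving.
        parent.setdefault(point, point)
        while parent[point] != point:
            parent[point] = parent[parent[point]]
            point = parent[point]
        return point

    # Join the two ends of every link that does not touch the crossover.
    for _ring, _link, a_point, b_point in links:
        if crossover not in (a_point, b_point):
            parent[find(a_point)] = find(b_point)

    # Each link lands in the group of its non-crossover end.
    loops = {}
    for row in links:
        _ring, _link, a_point, b_point = row
        anchor = b_point if a_point == crossover else a_point
        loops.setdefault(find(anchor), []).append(row)
    return list(loops.values())  # ordered by first appearance in the input


def relabel(loops, names):
    """Assign one ring name per loop, keeping the other columns."""
    return [
        (name, link, a_point, b_point)
        for name, loop in zip(names, loops)
        for _ring, link, a_point, b_point in loop
    ]


for row in relabel(split_at_crossover(LINKS, "E"), ["R1", "R2"]):
    print(*row)
```

On the sample rows this keeps F1, F2, F6 and F5 in R1 and moves F3, F4, F9, F7 and F8 to R2. A ring with more than one crossover point would need a different approach.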

Post by jcthornton »

Hi Prasant,

With your query treating connections as one-way (A->E is not the same as E->A), and with you doubling all of your connections, it makes sense that you are getting multiple paths, with each path duplicated in reverse. (This is what is happening, correct?)

The simplest solution I can think of is to mark your original links/connections with one value (say '0') and the reversed links/connections with a different value (say '1'). Then you can use the 0/1 value of the first step on each path to keep only one record per loop: either the '0' or the '1'.
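To make the 0/1 idea concrete, here is a rough Python sketch, not the Oracle query itself. It assumes the nine sample rows from the first post, doubles every link (flag 0 for the stored direction, flag 1 for the reversed copy), walks every closed path from 'E', and keeps only the paths whose first step carries flag 0. The function names are hypothetical.

```python
# Sample rows from this thread: (ring, link, a_point, b_point).
LINKS = [
    ("R1", "F1", "H", "F"), ("R1", "F2", "E", "F"), ("R1", "F6", "H", "G"),
    ("R1", "F5", "G", "E"), ("R1", "F3", "E", "A"), ("R1", "F4", "D", "E"),
    ("R1", "F9", "D", "C"), ("R1", "F7", "B", "C"), ("R1", "F8", "A", "B"),
]


def closed_paths(root, links):
    """List every path of (link, flag) steps that leaves and returns to root."""
    steps = {}
    for _ring, link, a_point, b_point in links:
        steps.setdefault(a_point, []).append((link, b_point, 0))  # as stored
        steps.setdefault(b_point, []).append((link, a_point, 1))  # reversed copy

    found = []

    def walk(point, path, used_links, seen_points):
        for link, nxt, flag in steps.get(point, []):
            if link in used_links:
                continue  # never travel the same link twice
            if nxt == root:
                found.append(path + [(link, flag)])
            elif nxt not in seen_points:
                walk(nxt, path + [(link, flag)],
                     used_links | {link}, seen_points | {nxt})

    walk(root, [], frozenset(), frozenset([root]))
    return found


def keep_forward(paths):
    """Keep the paths whose first step uses an original (flag 0) link."""
    return [path for path in paths if path[0][1] == 0]


all_paths = closed_paths("E", LINKS)
kept = keep_forward(all_paths)
for path in kept:
    print("/".join(link for link, _flag in path))
```

With the sample rows the walk finds four closed paths, two per loop, and the filter leaves one per loop. That only works because, at 'E', each loop has one link stored with 'E' as A_point and one with 'E' as B_point.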

As before, this relies on a number of assumptions:
1. The links/connections are defined consistently with the path at the shared nodes (for example, the 'E' node is the A_point in half of its links and the B_point in the other half).

2. There are not multiple cycle bases. (Which paths are correct for the example below, starting from 'E' as before? With J also being a shared node, you won't get one good path out of that side.)

A--B-C
|    |
D----E-F-G
     |   |
     H---J--K
         |  |
         L--M

3. You can easily identify your 'root' node ('E' in the example)
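A small Python sketch of why assumption 2 matters, using the larger diagram above. The point pairs are my reading of the ASCII art (the diagram gives no link names), and `shared_points` and `chains_between` are illustrative names. Counting links per point finds two shared nodes, and cutting the graph at both of them yields four chains even though the picture has only three loops, because the middle loop is broken in two.

```python
from collections import Counter

# Point pairs read off the diagram; an assumption, not data from the thread.
EDGES = [
    ("A", "B"), ("B", "C"), ("A", "D"), ("C", "E"), ("D", "E"),
    ("E", "F"), ("F", "G"), ("E", "H"), ("G", "J"), ("H", "J"),
    ("J", "K"), ("J", "L"), ("K", "M"), ("L", "M"),
]


def shared_points(edges):
    """Points that sit on more than two links."""
    degree = Counter(point for edge in edges for point in edge)
    return sorted(point for point, count in degree.items() if count > 2)


def chains_between(edges, shared):
    """Group edges that stay connected once the shared points are cut out."""
    parent = {}

    def find(point):
        parent.setdefault(point, point)
        while parent[point] != point:
            parent[point] = parent[parent[point]]
            point = parent[point]
        return point

    for a_point, b_point in edges:
        if a_point not in shared and b_point not in shared:
            parent[find(a_point)] = find(b_point)

    groups = {}
    for a_point, b_point in edges:
        inner = [p for p in (a_point, b_point) if p not in shared]
        # An edge joining two shared points would form a chain of its own.
        key = find(inner[0]) if inner else (a_point, b_point)
        groups.setdefault(key, []).append((a_point, b_point))
    return list(groups.values())


shared = shared_points(EDGES)
chains = chains_between(EDGES, shared)
print(shared, len(chains))
```

Here the E-H-J chain and the E-F-G-J chain belong to the same loop, so a per-ring rule that expects a single crossover point cannot label them correctly without more work.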