The requirement itself is kind of strange, but I'll try and explain it. Basically there is an outbound trip and an inbound trip. Each record has a direction identifier, a sequence and various waypoints aka tp points.
The data looks like this:
create table a(dirid number, seq number, tp number);
insert into a values (1,1,101);
insert into a values (1,2,143);
insert into a values (1,3,120);
insert into a values (1,4,119);
insert into a values (4,1,61);
insert into a values (4,2,119);
insert into a values (4,3,120);
insert into a values (4,4,143);
insert into a values (4,5,101);
The requirement is for the data to be transformed
from this to this
Dirid Seq Tr Dirid Seq Tr
1 1 101 1 1 101
1 2 143 1 2 143
1 3 120 1 3 120
1 4 119 1 4 119
4 1 61 1 5 NULL
4 2 119 4 5 61
4 3 120 4 4 119
4 4 143 4 3 120
4 5 101 4 2 143
4 1 101
So I see a couple of things right off the bat. First off, we've added a row. It appears to be an endpoint to dirid = 1. Secondly we've changed the sequence values for dirid = 4. Thirdly, the sort order are different for the various dirid's. Ascending for dirid=1 and descending for dirid=4
Hmmm, how do we attack this.
Since, we're manufacturing rows, I'm thinking MODEL clause.
Next, I'll need to populate that NULL row with the maximum seq value.
I'll need to modify the seq values where dirid=4. I can do that with a CASE statement around multiple ROW_NUMBER analytics.
Lastly, I'll need to again put a CASE statement around multiple ROW_NUMBER analytics, because he sorting rules change based on dirid.
I warned you the requirement seemed a little sketchy, but fun nonetheless.
So let's start to put it all together.
First, let's add that row using the MODEL clause
SELECT dirid,seq, tp
FROM a
MODEL
DIMENSION BY ( dirid,seq )
MEASURES ( tp )
RULES UPSERT ( tp [1,0] = NULL)
which gives us the following output...so far so good.
Dirid Seq Tr
1 1 101
1 2 143
1 3 120
1 4 119
4 1 61
4 2 119
4 3 120
4 4 143
4 5 101
1 0 NULL
Next we need to populate the new row with the correct seq value. Let's use a MAX analytic function
SELECT dirid,DECODE(seq,0,MAX(seq) OVER (PARTITION BY NULL),seq) seq,tp
FROM (SELECT dirid,seq, tp
FROM a
MODEL
DIMENSION BY ( dirid,seq )
MEASURES ( tp )
RULES UPSERT ( tp [1,0] = NULL))
which gives us the following output...again, so far so good.
Dirid Seq Tr
1 1 101
1 2 143
1 3 120
1 4 119
4 1 61
4 2 119
4 3 120
4 4 143
4 5 101
1 5 NULL
Next we need to modify the seq values where dirid = 4. Let's place a CASE statement around a few ROW_NUMBER analytics.
SELECT dirid,seq,tp,
CASE
WHEN dirid = 1 THEN
row_number() OVER (PARTITION BY dirid ORDER BY seq)
ELSE
row_number() OVER (PARTITION BY dirid ORDER BY seq DESC)
END rn
FROM (SELECT dirid,DECODE(seq,0,MAX(seq) OVER (PARTITION BY NULL),seq) seq,tp
FROM (SELECT dirid,seq, tp
FROM a
MODEL
DIMENSION BY ( dirid,seq )
MEASURES ( tp )
RULES UPSERT ( tp [1,0] = NULL)))
Cool. The rn column contains the values needed for seq.
Dirid Seq Tr rn
1 1 101 1
1 2 143 2
1 3 120 3
1 4 119 4
1 5 NULL 5
4 1 61 5
4 2 119 4
4 3 120 3
4 4 143 2
4 5 101 1
Let's replace seq with rn and again wrap a CASE statement to handle the different sorting requirements of dirid=1 and dirid=4
SELECT dirid,rn seq,tp,
CASE
WHEN dirid = 1 THEN
row_number() OVER (PARTITION BY dirid ORDER BY rn)
ELSE
row_number() OVER (PARTITION BY dirid ORDER BY rn DESC)
END rn2
FROM (SELECT dirid,seq,tp,
CASE
WHEN dirid = 1 THEN
row_number() OVER (PARTITION BY dirid ORDER BY seq)
ELSE
row_number() OVER (PARTITION BY dirid ORDER BY seq DESC)
END rn
FROM (SELECT dirid,DECODE(seq,0,MAX(seq) OVER (PARTITION BY NULL),seq)seq
,tp
FROM (SELECT dirid,seq, tp
FROM a
MODEL
DIMENSION BY ( dirid,seq )
MEASURES ( tp )
RULES UPSERT ( tp [1,0] = NULL))))
Almost there.
Dirid Seq Tr rn2
1 1 101 1
1 2 143 2
1 3 120 3
1 4 119 4
1 5 NULL 5
4 1 101 5
4 2 143 4
4 3 120 3
4 4 119 2
4 5 61 1
The final step is to sort on dirid, rn2.
SELECT dirid,seq,tp
FROM ( SELECT dirid,rn seq,tp,
CASE
WHEN dirid = 1 THEN
row_number() OVER (PARTITION BY dirid ORDER BY rn)
ELSE
row_number() OVER (PARTITION BY dirid ORDER BY rn DESC)
END rn2
FROM (SELECT dirid,seq,tp,
CASE
WHEN dirid = 1 THEN
row_number() OVER (PARTITION BY dirid ORDER BY seq)
ELSE
row_number() OVER (PARTITION BY dirid ORDER BY seq DESC)
END rn
FROM (SELECT dirid,DECODE(seq,0,MAX(seq) OVER (PARTITION BY NULL),seq) seq,tp
FROM (SELECT dirid,seq, tp
FROM a
MODEL
DIMENSION BY ( dirid,seq )
MEASURES ( tp )
RULES UPSERT ( tp [1,0] = NULL)))))
ORDER BY dirid,rn2
Sweetness! The outbound trip (dirid=1) goes out to the following waypoint in seq order, then the inbound trip (dirid=4) comes back to the starting point in reverse.
Dirid Seq Tr
1 1 101
1 2 143
1 3 120
1 4 119
1 5 NULL
4 5 61
4 4 119
4 3 120
4 2 143
4 1 101
I warned you it was a weird requirement, but those are the kind of things that make you think "I can do that in a few lines of SQL"....now with a smattering of analytic functions you can too.
...our journey continues
1 comment:
That is a very weird requirement. Once you get into "throw out the actual value and construct a different value" it's tempting to just whip out the pl/sql or even (argh) the hard coding. Thanks for the nice example of MODEL clause!
-Natalka Roshak - http://rdbms-insight.com
Post a Comment