Recursive WITH, part II: Hierarchical queries
In my last post, I looked at using recursive WITH to implement simple recursive algorithms in SQL. One very common use of recursion is to traverse hierarchical data. I recently wrote a series of posts on hierarchical data, using Oracle’s CONNECT BY syntax and a fun example. In this post, I’ll be revisiting the same data using recursive WITH.
There are dozens of examples of hierarchical data, from the EMP table to the Windows Registry to binary trees, but I went with something more fun: the skeleton from the old song “Dem Dry Bones”.
Toe bone connected to the foot bone
Foot bone connected to the heel bone
Heel bone connected to the ankle bone
Ankle bone connected to the shin bone
Shin bone connected to the knee bone
Knee bone connected to the thigh bone
Thigh bone connected to the hip bone
Hip bone connected to the back bone
Back bone connected to the shoulder bone
Shoulder bone connected to the neck bone
Neck bone connected to the head bone
Since every bone has only one ancestor, and there is a root bone with no ancestor, this is hierarchical data and we can stick it in a table and query it.
SELECT * FROM skeleton; BONE CONNECTED_TO_THE ---------------------------------------- ---------------------------------------- shoulder neck back shoulder hip back thigh hip knee thigh leg knee foot heel head neck head toe foot arm shoulder wrist arm ankle leg heel ankle finger wrist a rib back b rib back c rib back
You can see that I added some ribs and an arm to make the skeleton more complete!
Using Oracle’s CONNECT BY syntax:
SQL> col bone FOR a10 SQL> col connected_to_the FOR a9 SQL> col level FOR 99 SQL> col bone_tree FOR a27 SQL> col path FOR a65 SELECT bone, connected_to_the, level, lpad(' ',2*level, ' ') || bone AS bone_tree , ltrim(sys_connect_by_path(bone,'>'),'>') AS path FROM skeleton START WITH connected_to_the IS NULL CONNECT BY prior bone=connected_to_the ORDER siblings BY 1 BONE CONNECTED LEVEL BONE_TREE PATH ---------- --------- ----- --------------------------- ----------------------------------------------------------------- head 1 head head neck head 2 neck head>neck shoulder neck 3 shoulder head>neck>shoulder arm shoulder 4 arm head>neck>shoulder>arm wrist arm 5 wrist head>neck>shoulder>arm>wrist finger wrist 6 finger head>neck>shoulder>arm>wrist>finger back shoulder 4 back head>neck>shoulder>back a rib back 5 a rib head>neck>shoulder>back>a rib b rib back 5 b rib head>neck>shoulder>back>b rib c rib back 5 c rib head>neck>shoulder>back>c rib hip back 5 hip head>neck>shoulder>back>hip thigh hip 6 thigh head>neck>shoulder>back>hip>thigh knee thigh 7 knee head>neck>shoulder>back>hip>thigh>knee leg knee 8 leg head>neck>shoulder>back>hip>thigh>knee>leg ankle leg 9 ankle head>neck>shoulder>back>hip>thigh>knee>leg>ankle heel ankle 10 heel head>neck>shoulder>back>hip>thigh>knee>leg>ankle>heel foot heel 11 foot head>neck>shoulder>back>hip>thigh>knee>leg>ankle>heel>foot toe foot 12 toe head>neck>shoulder>back>hip>thigh>knee>leg>ankle>heel>foot>toe
The above CONNECT BY query uses the LEVEL pseudocolumn and the SYS_CONNECT_BY_PATH function. With recursive WITH, there’s no need for these built-ins because these values fall naturally out of the recursion.
Let’s start with the basic hierarchical query rewritten in recursive WITH.
The hierarchical relationship in our table is:
Parent(row.bone) = row.connected_to_the
WITH skellarchy (bone, parent) AS ( SELECT bone, connected_to_the FROM skeleton WHERE bone = 'head' -- Start with the root UNION ALL SELECT s.bone, s.connected_to_the FROM skeleton s, skellarchy r WHERE r.bone = s.connected_to_the -- Parent(row.bone) = row.connected_to_the ) SELECT * FROM skellarchy; BONE PARENT ---------- ---------------------------------------- head neck head shoulder neck back shoulder arm shoulder hip back wrist arm a rib back b rib back c rib back thigh hip finger wrist knee thigh leg knee ankle leg heel ankle foot heel toe foot
Because we built up the SKELLARCHY table recursively, it’s easy to make an equivalent to the LEVEL pseudocolumn; it falls right out of the recursion:
WITH skellarchy (bone, parent, the_level) AS ( SELECT bone, connected_to_the, 0 FROM skeleton WHERE bone = 'head' UNION ALL SELECT s.bone, s.connected_to_the , r.the_level + 1 FROM skeleton s, skellarchy r WHERE r.bone = s.connected_to_the ) SELECT * FROM skellarchy; BONE PARENT THE_LEVEL ---------- ---------- ---------- head 0 neck head 1 shoulder neck 2 back shoulder 3 arm shoulder 3 hip back 4 wrist arm 4 a rib back 4 b rib back 4 c rib back 4 thigh hip 5 finger wrist 5 knee thigh 6 leg knee 7 ankle leg 8 heel ankle 9 foot heel 10 toe foot 11
and it’s also easy to build up a path from root to the current node like the “SYS_CONNECT_BY_PATH” function does for CONNECT BY queries:
WITH skellarchy (bone, parent, the_level, the_path) AS ( SELECT bone, connected_to_the, 0, CAST(bone AS varchar2(4000)) FROM skeleton WHERE bone = 'head' UNION ALL SELECT s.bone, s.connected_to_the , r.the_level + 1, r.the_path || '->' || s.bone FROM skeleton s, skellarchy r WHERE r.bone = s.connected_to_the ) SELECT * FROM skellarchy; BONE PARENT THE_LEVEL THE_PATH ---------- ---------- --------- -------------------------------------------------------------------------------- head 0 head neck head 1 head->neck shoulder neck 2 head->neck->shoulder back shoulder 3 head->neck->shoulder->back arm shoulder 3 head->neck->shoulder->arm hip back 4 head->neck->shoulder->back->hip wrist arm 4 head->neck->shoulder->arm->wrist a rib back 4 head->neck->shoulder->back->a rib b rib back 4 head->neck->shoulder->back->b rib c rib back 4 head->neck->shoulder->back->c rib thigh hip 5 head->neck->shoulder->back->hip->thigh finger wrist 5 head->neck->shoulder->arm->wrist->finger knee thigh 6 head->neck->shoulder->back->hip->thigh->knee leg knee 7 head->neck->shoulder->back->hip->thigh->knee->leg ankle leg 8 head->neck->shoulder->back->hip->thigh->knee->leg->ankle heel ankle 9 head->neck->shoulder->back->hip->thigh->knee->leg->ankle->heel foot heel 10 head->neck->shoulder->back->hip->thigh->knee->leg->ankle->heel->foot toe foot 11 head->neck->shoulder->back->hip->thigh->knee->leg->ankle->heel->foot->toe
and we can use our generated the_level column to make a nice display just as we used the level pseudocolumn with CONNECT BY:
WITH skellarchy (bone, parent, the_level) AS ( SELECT bone, connected_to_the, 0 FROM skeleton WHERE bone = 'head' UNION ALL SELECT s.bone, s.connected_to_the , r.the_level + 1 FROM skeleton s, skellarchy r WHERE r.bone = s.connected_to_the ) SELECT lpad(' ',2*the_level, ' ') || bone AS bone_tree FROM skellarchy; BONE_TREE --------------------------- head neck shoulder back arm hip wrist a rib b rib c rib thigh finger knee leg ankle heel foot toe
Now, the bones are coming out in a bit of a funny order for a skeleton. Instead of this:
shoulder back arm hip wrist a rib b rib c rib thigh finger
I want to see this:
shoulder arm wrist finger back a rib b rib c rib hip thigh
The rows are coming out in BREADTH FIRST ordering – meaning all siblings of ‘shoulder’ are printed before any children of ‘shoulder’. But I want to see them in DEPTH FIRST: going from shoulder to finger before we start on the backbone.
WITH skellarchy (bone, parent, the_level) AS ( SELECT bone, connected_to_the, 0 FROM skeleton WHERE bone = 'head' UNION ALL SELECT s.bone, s.connected_to_the , r.the_level + 1 FROM skeleton s, skellarchy r WHERE r.bone = s.connected_to_the ) SEARCH DEPTH FIRST BY bone SET bone_order SELECT lpad(' ',2*the_level, ' ') || bone AS bone_tree FROM skellarchy ORDER BY bone_order; BONE_TREE --------------------------- head neck shoulder arm wrist finger back a rib b rib c rib hip thigh knee leg ankle heel foot toe
And now the result looks more like a proper skeleton.
Now on to cycles. A cycle is a loop in the hierarchical data: a row is its own ancestor. To put a cycle in the example data, I made the skeleton bend over and connect the head to the toe:
UPDATE skeleton SET connected_to_the='toe' WHERE bone='head';
And now if we try to run the query:
ERROR at line 2: ORA-32044: cycle detected while executing recursive WITH query
With the CONNECT BY syntax, we can use CONNECT BY NOCYCLE to run a query even when cycles exist, and the pseudocolumn CONNECT_BY_IS_CYCLE to help detect cycles. For recursive WITH, Oracle provides a CYCLE clause, which is a bit more powerful as it allows us to name the column which is cycling.
WITH skellarchy (bone, parent, the_level) AS ( SELECT bone, connected_to_the, 0 FROM skeleton WHERE bone = 'head' UNION ALL SELECT s.bone, s.connected_to_the , r.the_level + 1 FROM skeleton s, skellarchy r WHERE r.bone = s.connected_to_the ) SEARCH DEPTH FIRST BY bone SET bone_order CYCLE bone SET is_a_cycle TO 'Y' DEFAULT 'N' SELECT lpad(' ',2*the_level, ' ') || bone AS bone_tree, is_a_cycle FROM skellarchy --where is_a_cycle='N' ORDER BY bone_order; BONE_TREE I ------------------------------------------------------------ - head N neck N shoulder N arm N wrist N finger N back N a rib N b rib N c rib N hip N thigh N knee N leg N ankle N heel N foot N toe N head Y
The query runs until the first cycle is detected, then stops.
The CONNECT BY syntax does provide a nice pseudocolumn, CONNECT_BY_ISLEAF, which is 1 when a row has no further children, 0 otherwise. In my next post, I’ll look at emulating this pseudocolumn with recursive WITH.
Republished with permission. Original URL: http://rdbms-insight.com/wp/?p=103
- Natalka Roshak's blog
- Log in to post comments