What are the known ways to store a tree structure in a relational DB?

Question

There is the "put a FK to your parent" method, i.e. each records points to it's parent.
Which is a hard for read actions but very easy to maintain.

And then there is a "directory structure key" method:

0001.0000.0000.0000 main branch 1
0001.0001.0000.0000 child of main branch one
etc

Which is super easy to read, but hard to maintain.
What are the other ways and their cons/pros?

See http://stackoverflow.com/questions/846201/fast-relational-method-of-storing-tree-data-for-instance-threaded-comments-on-ar for nested sets, key naming schemes and other approaches to squeezing hierarchies into relations. — Constantin, Jul 29 '10 at 13:20
We are using the old-age method of referring to the parent using FK for storing hierarchical data and are almost happy. To load lots of data, we make an XML query and deserialise to objects. — Kangkan, Jul 29 '10 at 13:02
On first blush, tree structures and RDBMs are a [horrible fit](http://dev.mysql.com/tech-resources/articles/hierarchical-data.html). If I've ever seen a use case for [Structured Storage](http://en.wikipedia.org/wiki/Nosql), this is it. — msw, Jul 29 '10 at 13:04
+1 that "horrible fit" link gives examples of how to do it using 2 structures and some suggested reading. — phkahler, Jul 29 '10 at 13:14
"Best" implementation depends on data structure and read/write characteristics. For a near-comprehensive community-updated list of options to consider what works best for you: https://stackoverflow.com/questions/4048151 — Kache, Jun 17 '18 at 19:00

score 59 · Answer 1 · edited Dec 14 '20 at 13:55

As always: there is no best solution. Each solution makes different things easier or harder. The right solution for you depends on which operation you will do most.

Naive Approach with parent-id:

Pros:

Easy to implement
Easy to move a big subtree to another parent
Insert is cheap
Needed Fields directly accessible in SQL

Cons:

Retrieving a whole tree is recursive and therefore expensive
Finding all parents is expensive too ( SQL doesn't know recursions... )

Modified Preorder Tree Traversal ( saving a start- & end-point) :

Pros:

Retrieving a whole tree is easy and cheap
Finding all parents is cheap
Needed Fields directly accessible in SQL
Bonus: you're saving the order of childnodes within its parentnode too

Cons:

Inserting / Updating can be very expensive, as you'll maybe have to update a lot of nodes

Saving a path in each Node:

Pros:

Finding all parents is cheap
Retrieving a whole tree is cheap
Inserting is cheap

Cons:

Moving a whole tree is expensive
Depending on the way you save the path, you won't be able to work with it directly in SQL, so you'll always need to fetch & parse it, if you want to change it.

Closure table

Pros:

Easy to implement
Finding all parents is cheap
Inserting is cheap
Retrieving whole trees is cheap

Cons:

Needs an additional table
Takes up a lot of space compared to other approaches
Moving a subtree is expensive

I'd prefer one of the last two, depending on how often the data changes.

See also: http://media.pragprog.com/titles/bksqla/trees.pdf

I agree with you: there's no best solution at all. Anyway, there are several different ways to represent Tree Data Structure, and several kinds of techniques to fit those representations in a relational model. Normally, I prefer to distinguish two cases: 1. read-intensive and 2. write-intensive. According to the specific case, one model fits bettern than another: for example, if write-intensive, writing a lot of pointers on a denormalized table can be quite expensive, degrading the performances. — Paolo Maresca, Apr 27 '14 at 10:30
For the third option use "Materialized Path" in your favorite search engine to find out more. Took me a while to find a name for it. The second option is also called "Nested Set". — Fredrik Boström, Mar 17 '15 at 11:39

score 21 · Answer 2 · edited Jan 15 '21 at 11:03

21

Modified Preorder Tree Traversal

This is a method which uses a non-recursive function (usually a single line of SQL) to retrieve trees from the database, at the cost of being a little trickier to update.

**

Section 2 of the Sitepoint article Storing Hierarchical Data in a Database for more detail.

edited Jan 15 '21 at 11:03

SLCH000

1,055
1
11
14

answered Jul 29 '10 at 13:31

TRiG

9,249
6
50
101

2

That article conveniently ignores that the tree must perhaps be updated later on, and would thus require that a lot of nodes be renumbered to keep it consistent. – Lasse V. Karlsen Jul 29 '10 at 13:59
5

The article doesn't ignore that at all. It explicitly says on page 3 that you have to update the node numbers and gives the SQL to do that. – siride Jul 29 '10 at 14:22
2

It should be noted that this model is great when you have considerably more reads than writes - if you're expecting large number of writes (e.g. perhaps 1 write for every 4 reads) this model can be problematic and something like a simple parent ID (and more expensive reads) may be preferable. – mindplay.dk Jul 15 '14 at 17:55
`Modified Preorder Tree Traversal` is a beautiful way to solve this problem, in case not modify the tree structure frequently. – Bin May 19 '17 at 06:32

score 5 · Answer 3 · answered Jul 29 '10 at 13:02

5

I'd say the "golden way" to store a hierarchical data structure is to use a hierarchical database. Such as, for instance, HDB. That's a relational database that handles trees quite well. If you want something more powerful, LDAP might do for you.

A SQL database is ill-suited to this abstract topology.

answered Jul 29 '10 at 13:02

Borealid

86,367
8
101
120

I know and I know. I am bound to use MySql right now, so I need to find a good solution within my constraints. – Itay Moav -Malimovka Jul 29 '10 at 13:04
@msw that's the FK method I mentioned in my question. – Itay Moav -Malimovka Jul 29 '10 at 13:08

score 0 · Answer 4 · edited Nov 17 '11 at 21:11

I don't think it's hard to build a tree structure with a relational database.

However, an object oriented database would work much better for this purpose.

Using an object oriented database:

parent has a set of child1  
child1 has a set of child2  
child2 has a set of child3  
...  
...

In an object oriented database you can build this structure pretty easily.

In a relational database, you will have to maintain foreign keys to parent.

parent  
id  
name  

child1  
parent_fk  
id  
name 

child2  
parent_fk  
id  
name  

..

Essentially, while you are building your tree structure you will have to join all these tables, or you can iterate through them.

foreach(parent in parents){
   foreach(child1 in parent.child1s)
    foreach(child2 in child1.child2s)

...

Iteratin the database in such a way can be very expensive, mainly if you have to retrieve only a subtree from a large tree structure. — GergelyPolonkai, May 01 '12 at 14:26

What are the known ways to store a tree structure in a relational DB?

4 Answers4

Modified Preorder Tree Traversal

Linked

Related