# Flat Tree

Flat Trees are the core data structure that power Dat's Hypercore feeds. They allow us to deterministically represent a tree structure as a vector. This is particularly useful because vectors map elegantly to disk and memory.

Because Flat Trees are deterministic and pre-computed, there is no overhead to using them. In effect this means that Flat Trees are a specific way of indexing into a vector more than they are their own data structure. This makes them uniquely efficient and convenient to implement in a wide range of languages.

You can represent a binary tree in a simple flat list using the following structure:

``````      3
1       5
0   2   4   6  ...
``````

Let's rotate the tree on its side for notational purposes:

`````` 0─┐
1─┐
2─┘ │
3
4─┐ │
5─┘
6─┘
``````

Each number represents an index in a flat list. Given the tree:

`````` D─┐
B─┐
E─┘ │
A
F─┐ │
C─┘
G─┘
``````

The way this would be expressed in-memory would be as the list (vector): `[D B E A F C G]` or `[0 1 2 3 4 5 6]`.

## Depth

Indexes 0, 2, 4, 6 have a depth of 0. And indexes 1 and 5 have a depth of 1.

``````depth = 2  ^        3
depth = 1  |    1       5
depth = 0  |  0   2   4   6  ...
``````

If we convert the graph to a chart we could express it as such:

``````depth = 0 | 0 2 4 6
depth = 1 | 1 5
depth = 2 | 3
depth = 3 |
``````

Now let's add numbers up to 14:

``````depth = 0 | 0 2 4 6 8 10 12 14
depth = 1 | 1 5 9 13
depth = 2 | 3 11
depth = 3 | 7
``````

### Node Kinds

You might be noticing that the numbers at `depth = 0` is vastly greater than the amount of numbers at every other depth. We refer to nodes at `depth = 0` as `leaf nodes`, and nodes at every other depth as `parent nodes`.

``````leaf nodes   | 0 2 4 6 8 10 12 14
parent nodes | 1 3 5 7 9 11 13
``````

An interesting aspect of flat trees is that the number of `leaf nodes` and number of `parent nodes` is in perfect balance. This comes to an interesting insight:

• All even indexes refer to `leaf nodes`.
• All uneven indexes refer to `parent nodes`.

The depth of a tree node can be calculated by counting the number of trailing 1s a node has in binary notation.

``````5 in binary = 101 (one trailing 1)
3 in binary = 011 (two trailing 1s)
4 in binary = 100 (zero trailing 1s)
``````

## Offset

When reading about flat-trees the word `offset` might regularly pop up. This refers to the offset from the left hand side of the tree.

In the following tree the indexes with an offset of 0 are: `[0 1 3 7]`:

``````(0)┐
(1)┐
2─┘ │
(3)┐
4─┐ │ │
5─┘ │
6─┘   │
(7)
``````

In the next tree the indexes with an offset of 1 are: `[2 5 11]`:

``````  0──┐
1──┐
(2)─┘  │
3──┐
4──┐  │  │
(5)─┘  │
6──┘     │
7
8──┐     │
9──┐  │
10──┘  │  │
(11)─┘
12──┐  │
13──┘
14──┘
``````

## Relationships Between Nodes

When describing nodes we often also talk about the relationship between nodes. This includes words such as `uncle`, and `parent`.

Take this example tree:

`````` 0─┐
1─┐
2─┘ │
3─┐
4─┐ │ │
5─┘ │
6─┘   │
7
8
``````
• parent: A parent has two children under it, and is always odd-numbered. Node 3 is the parent of 1 and 5.
• leaf: A node with no children. A leaf node is always even-numbered. Nodes 0, 2, 4, 6 and 8 are leaf nodes.
• sibling: The other node that shares a parent with the current node. For example nodes 4 and 6 are siblings.
• uncle: A parent's sibling. Node 1 is the uncle of nodes 4 and 6.
• root: A top-most node where the full tree under it is complete (e.g. all parent nodes have 2 children). Node 3 is a root node.
• span: The two nodes that are furthest away in the sub-tree. The span of node 1 is `0, 2`. The span of node 3 is `0, 6`.
• right span: The left-most node in the span. The right span of node 1 is 2. The right span of node 3 is 6.