B-Trees Of Order 5 And 2 B-tree of order m.
m-way search tree.
Not empty => root has at least 2 children.
Remaining internal nodes (if any) have at least (m/2) children.
External (or failure) nodes on same level. B-tree of order 5 is 3-4-5 tree (root may be 2-node though).
B-tree of order 2 is full binary tree.
B-Trees Extension of 2-3 and 2-3-4 trees to higher degree trees.
Used to represent very large dictionaries that reside on disk.
AVL Trees n = 230 = 109 (approx).
30 <= height <= 43.
When the AVL tree resides on a disk, up to 43 disk access are made for a search.
This takes up to (approx) 4 seconds.
Not acceptable.
Red-Black Trees n = 230 = 109 (approx).
30 <= height <= 60.
When the AVL tree resides on a disk, up to 60 disk access are made for a search.
This takes up to (approx) 6 seconds.
Not acceptable.
m-way Search Trees Each node has up to m – 1 pairs and m children.
m = 2 => binary search tree.
4-Way Search Tree 10 30 35 k < 10 10 < k < 30 30 < k < 35 k > 35
Maximum # Of Pairs Happens when all internal nodes are m-nodes.
Full degree m tree.
# of nodes = 1 + m + m2 + m3 + … + mh-1
= (mh – 1)/(m – 1).
Each node has m – 1 pairs.
So, # of pairs = mh – 1.
Capacity Of m-Way Search Tree
Definition Of B-Tree Definition assumes external nodes (extended m-way search tree).
B-tree of order m.
m-way search tree.
Not empty => root has at least 2 children.
Remaining internal nodes (if any) have at least (m/2) children.
External (or failure) nodes on same level.
2-3 And 2-3-4 Trees B-tree of order m.
m-way search tree.
Not empty => root has at least 2 children.
Remaining internal nodes (if any) have at least (m/2) children.
External (or failure) nodes on same level. 2-3 tree is B-tree of order 3.
2-3-4 tree is B-tree of order 4.
B-Trees Of Order 5 And 2 B-tree of order m.
m-way search tree.
Not empty => root has at least 2 children.
Remaining internal nodes (if any) have at least (m/2) children.
External (or failure) nodes on same level. B-tree of order 5 is 3-4-5 tree (root may be 2-node though).
B-tree of order 2 is full binary tree.
Minimum # Of Pairs n = # of pairs.
# of external nodes = n + 1.
Height = h => external nodes on level h + 1. level # of nodes 1 1 2 >= 2 3 >= 2*(m/2) h + 1 >= 2*(m/2)h-1 n + 1 >= 2*(m/2)h-1, h >= 1
Minimum # Of Pairs m = 200. n + 1 >= 2*(m/2)h-1, h >= 1 height # of pairs 2 >= 199 3 >= 19,999 4 >= 2 * 106 – 1 5 >= 2 * 108 – 1 h <= log (m/2)(n+1)/2 + 1
Choice Of m Worst-case search time.
(time to fetch a node + time to search node) * height
(a + b*m + c * log2m) * h
where a, b and c are constants. m search time 50 400
Bottom-Up Insert 15 20 8 4 1 3 5 6 30 40 9 Insertion into a full leaf triggers bottom-up node splitting pass. 16 17
Bottom-up insert/delete preferred over top down, because bottom-up does fewer node splits/joins. Hence, fewer disk accesses are made.
Split An Overfull Node ai is a pointer to a subtree.
pi is a dictionary pair.
m a0 p1 a1 p2 a2 … pm am (m/2)-1 a0 p1 a1 p2 a2 … p (m/2) -1 a (m/2) -1 m-(m/2)a(m/2) p(m/2)+1 a(m/2)+1 … pm am p(m/2) plus pointer to new node is inserted in parent.
Node Splitting
Worst-Case Disk Accesses Assume enough memory to hold all h nodes accessed on way down.
h read accesses on way down.
2s+1 write accesses on way up, s = number of nodes that split.
Total h+2s+1 disk accesses.
Max is 3h+1.
Average Disk Accesses Start with empty B-tree.
Insert n pairs.
Resulting B-tree has p nodes.
# splits <= p –2, p > 2.
# pairs >= 1+((m/2) – 1)(p – 1).
savg <= (p – 2)/(1+((m/2) – 1)(p – 1)).
So, savg < 1/((m/2) – 1).
m = 200 => savg < 1/99.
Average disk accesses < h + 2/99 + 1 ~ h + 1.
Nearly minimum.
An insert must make at least h read and 1 write access. So, h+1 is the minimum number of disk accesses for each insert. Therefore, the average is at least h+1.
Comments