Deletion from interior node is transformed into a deletion from a leaf node.
Deficient leaf triggers bottom-up borrowing and node combining pass.
Deficient node is combined with an adjacent sibling who has exactly (m/2) – 1 pairs.
After combining, the node has [(m/2) – 2] (original pairs) + [(m/2) – 1] (sibling pairs) + 1 (from parent) <= m –1 pairs.
Disk Accesses
Borrow. Merge. Minimum. 15 20 4 5 6 30 40 13 16 17 7 12 9 8 10 3
Minimum is when you delete 5, for example.
Worst-Case Disk Accesses
Assume enough memory to hold all h nodes accessed on way down.
h read accesses on way down.
h – 1 sibling read accesses on way up.
h – 2 writes of combined nodes on way up.
3 writes of root and level 2 nodes for sibling borrowing at level 2.
Total is 3h.
Average Disk Accesses
Start with B-tree that has n pairs and p nodes.
Delete the pairs one by one.
n >= 1+((m/2) – 1)(p – 1).
p <= 1 + (n – 1)/((m/2)– 1).
Upper bound on total number of disk accesses.
Each delete does a borrow.
The deletes together do at most p –1 combines/merges.
# accesses <= n(h+4) + 2(p – 1).
H+4 => h reads on way down, a sibling read, 3 writes.
Merges can only be done until the number of nodes becomes 3. So, number of merges is at most p-3. We use p-1 for simplicity.
2(p-1) => p-1 sibling reads and p-1 writes
Average Disk Accesses
Average # accesses <= [n(h+4) + 2(n – 1)/ ((m/2) – 1)]/n ~ h + 4.
Nearly minimum.
A delete must make at least h read and 1 write access. So, h+1 is the minimum number of disk accesses for each delete. Therefore, the average is at least h+1.
Worst Case
Alternating sequence of inserts and deletes.
Each insert does h splits at a cost of 3h + 1 disk accesses.
Each delete moves back up to root at a cost of 3h disk accesses.
Average for this sequence is 3h + 1 for an insert and 3h for a delete.
Internal Memory B-Trees
Cache access time vs main memory access time.
Reduce main memory accesses using a B-tree.
Node Structure
q a0 p1 a1 p2 a2 … pq aq Node operations during a search.
Search the node for a given key.
As are subtree pointers, ps are dictionary pairs
Node Operations For Insert
m a0 p1 a1 p2 a2 … pm am (m/2)-1 a0 p1 a1 p2 a2 … pceil(m/2)-1 aceil(m/2)-1 m-(m/2)a(m/2) p(m/2)+1 a(m/2)+1 … pm am Insert a dictionary pair and a pointer (p a).
Find middle pair.
3-way split around middle pair.
Node Operations For Delete
Delete a dictionary pair.
3-way join.
Node Structure
Each B-tree node is an array partitioned into indexed red-black tree nodes that will keep one dictionary pair each.
Indexed red-black tree is built using simulated pointers (integer pointers).
Complexity Of B-Tree Node Operations
Search a B-tree node … O(log m).
Find middle pair … O(log m).
Insert a pair … O(log m).
Delete a pair … O(log m).
Split a B-tree node … O(log m).
Join 2 B-tree nodes … O(m).
Need to copy indexed red-black tree that represents one B-tree node into the array space of the other B-tree node.
Comments