Newest Viewed Downloaded

Sorted Array A B C D E elements are in ascending order of key. get(theKey) O(log size) time put(theKey, theElement) O(log size) time to verify duplicate, O(size) to add. remove(theKey) O(size) time.

Dictionaries

Collection of items. Each item is a pair. (key, element) Pairs have different keys. Operations. get(Key) ( or, Search(Key)) put(Key, Element) (or, Insert(Key, Element)) remove(Key) (or, Delete(Key)) The word item is used as a synonym for element to avoid the confusion that would otherwise result by saying that a dictionary is a collection of elements and each element is a pair (key, element). A linear list is an ordered sequence of elements. A dictionary is just a collection of elements/items. Interface Dictionary has these 3 methods.

Application

Collection of student records in this class. (key, element) = (student name, linear list of assignment and exam scores) All keys are distinct. Get the element whose key is John Adams. Update the element whose key is Diana Ross. put() implemented as update when there is already a pair with the given key. remove() delete an element with a specified key value.

Dictionary With Duplicates

Keys are not required to be distinct. Word dictionary. Pairs are of the form (word, meaning). May have two or more entries for the same word. (bolt, a threaded pin) (bolt, a crash of thunder) (bolt, to shoot forth suddenly) (bolt, a gulp) (bolt, a standard roll of cloth) etc.

Represent As A Linear List

L = (e0, e1, e2, e3, …, en-1) Each ei is a pair (key, element). 5-element dictionary D = (a, b, c, d, e). a = (aKey, aElement), b = (bKey, bElement), etc. Array or linked representation. Even though a dictionary is not an ordered sequence, it may be represented as a linear list.

Array Representation

a b c d e get(theKey) O(size) time put(theKey, theElement) O(size) time to verify duplicate, O(1) to add at right end. remove(theKey) O(size) time.

Sorted Array

A B C D E elements are in ascending order of key. get(theKey) O(log size) time put(theKey, theElement) O(log size) time to verify duplicate, O(size) to add. remove(theKey) O(size) time. O(log (size)) for get() results from the use of a binary search.

Unsorted Chain

get(theKey) O(size) time put(theKey, theElement) O(size) time to verify duplicate, O(1) to add at left end. remove(theKey) O(size) time. a b c d e null firstNode

Sorted Chain

Elements are in ascending order of Key. get(theKey) O(size) time put(theKey, theElement) O(size) time to verify duplicate, O(1) to put at proper place. A B C D E null firstNode Binary search cannot be done in O(log size) time on a chain, because it takes O(size) time to locate the middle node. Worst-case performance of a sorted chain is no better than that of an unsorted chain.

Sorted Chain

Elements are in ascending order of Key. A B C D E null firstNode remove(theKey) O(size) time.

Skip Lists

maintain extra pointers for the middle elements of the chain In general, the level 0 chain includes all elements, the level 1 chain include every second elements, the level 2 chain every fourth elements, and so on. assign level numbers of new elements by probability

Skip Lists

In worst case there may be only one level. Worst-case time for get, put, and remove is O(size). Expected time is O(log size).

Hash Tables

Worst-case time for get, put, and remove is O(size). Expected time is O(1).

Ideal Hashing

Uses a 1D array (or table) table[0:b-1]. Each position of this array is a bucket. A bucket can normally hold only one dictionary pair. Uses a hash function f that converts each key k into an index in the range [0, b-1]. f(k) is the home bucket for key k. Every dictionary pair (key, element) is stored in its home bucket table[f[key]]. Bucket size equal to size of L1 cache line or of a disk track or sector may yield better performance.

Ideal Hashing Example

[0] [1] [2] [3] [4] [5] [6] [7] (85,f) (22,a) (33,c) (3,d) (73,e) get, put, and remove take O(1) time. Pairs are: (22,a), (33,c), (3,d), (73,e), (85,f). Hash table is table[0:7], b = 8. Hash function is f(key) = key/11  . Pairs are stored in table as below:

What Can Go Wrong?

[0] [1] [2] [3] [4] [5] [6] [7] (85,f) (22,a) (33,c) (3,d) (73,e) Where does (26,g) go? Keys that have the same home bucket are synonyms. 22 and 26 are synonyms with respect to the hash function that is in use. The home bucket for (26,g) is already occupied. Where does (100,h) go? There is no bucket with index 100/11 = 9. There is a problem with the hash function, because it is supposed to map all valid keys into a valid bucket.

What Can Go Wrong?

(85,f) (22,a) (33,c) (3,d) (73,e) A collision occurs when the home bucket for a new pair is occupied by a pair with a different key. An overflow occurs when there is no space in the home bucket for the new pair. When a bucket can hold only one pair, collisions and overflows occur together. Need a method to handle overflows.

Hash Table Issues

Choice of hash function. Collision handling method. Size (number of buckets) of hash table. 2/23 I3B

Hash Functions

Two parts: Convert key into an integer in case the key is not an integer. Map an integer into a home bucket. f(k) is an integer in the range [0, b-1], where b is the number of buckets in the table. Object.hashCode() returns an int based on the reference value of this. So user defined classes must over ride Object.hashCode() for things to work right. hashCode() is already defined properly for Java’s classes: Integer, Long, Float, Double, String, Character, etc. The hashCode() definition in these classes over rides the default definition in Object. Note that hashCode may return a negative int.

Map Into A Home Bucket

[0] [1] [2] [3] [4] [5] [6] [7] (85,f) (22,a) (33,c) (3,d) (73,e) Most common method is by division. homeBucket = hashCode(key) % divisor; divisor equals number of buckets b. 0 <= homeBucket < divisor = b Note: hashCode maps into an integer, not into a nonnegative integer. % works properly only with nonnegative integers.

Showing 1 - 20 of 26 items Details

Name: 
chap7-1
Author: 
Preferred Customer
Company: 
N/A
Description: 
Sorted Array A B C D E elements are in ascending order of key. get(theKey) O(log size) time put(theKey, theElement) O(log size) time to verify duplicate, O(size) to add. remove(theKey) O(size) time.
Tags: 
the | key | time | home | size | hash | bucket | into
Created: 
6/17/1995 11:31:02 PM
Slides: 
26
Views: 
0
Downloads: 
0
Rating: 
0


> Comment



Share this presentation
|

Comments

Share this presentation:

|
Sitemap