This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see HashTablesCowan in WG2's repo for R7RS-large.

Hash­Tables­Cowan

cowan
2013-05-20 00:15:08
23history
source

Hash tables

This WG2 proposal defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications. A hash table is a data structure that:

This proposal is at an intermediate level. It supports a great many convenience procedures on top of the basic hash table interfaces provided by SRFI 69 and R6RS. However, it does not mandate support for immutability or weakness, though it does provide a semi-portable hook for specifying these features.

Issues

I've heard complaints about the messiness of passing two arguments, one of which is optional, when creating a hash table. What about having a procedure that accepts an equivalence and a hash procedure, and returns a function that behaves the same as the equivalence function when called with two arguments, but when called with zero arguments, returns the hash procedure? This is used to create hash tables, saving an extra argument each time. Note that this style allows the author of an equivalence predicate to provide hashing without cluttering up the interface, and allows the predicate to be provided as transparently as built-in predicates are. (This idea was suggested by early reviewer kpreid.)

Rationale

Sources

The procedures in this proposal are drawn primarily from SRFI 69, R6RS, and the native hash tables of Racket. In addition, the following sources are acknowledged:

The procedures hash-table-replace!, hash-table-replace!/default, hash-table-difference, hash-table-difference! and most of the procedures in the section "Functional update" were added for completeness.

The native hash tables of MIT, Guile, SISC, Bigloo, Scheme48, SLIB, RScheme, Scheme 7, Scheme 9, Rep, and FemtoLisp were also investigated, but no additional procedures were incorporated.

SRFI 69 compatibility

The names used in this proposal are mostly derived from SRFI 69, with the following changes:

R6RS compatibility

The relatively few hash table procedures in R6RS are all available in this proposal under somewhat different names. This proposal adopts SRFI 69's term hash-table rather than R6RS's hashtable, because of the universal use of "hash table" rather than "hashtable" in other languages and in technical prose generally. Besides, the English word hashtable obviously means something that can be ... hashted.

In addition, the hashtable-ref and hashtable-update! of R6RS correspond to the hash-table-ref/default and hash-table-update!/default of this proposal.

It would be trivial to provide the R6RS names on top of this proposal. The only substantive difference is that R6RS hashtable-values and hashtable-entries return vectors, whereas in this proposal hash-table-values and hash-table-entries return lists.

Common Lisp compatibility

As usual, the Common Lisp names are completely different from the Scheme names. Common Lisp provides the following functions that are not in this proposal:

Reflection and hash-function procedures

SRFI 69 provides reflective procedures that, given a hash table, returns its equivalence and hash procedures, as well as procedures that expose the implementation's hash procedures suitable for the equivalence procedures eq?, equal?, string=?, and string-ci=?. The second of these can also be used for eqv?. However, if the eq? hash procedure directly exposes the address of the key, and the garbage collector moves the hash table, it must also rehash its keys. In such implementations, the hash-by-identity procedure is not idempotent, which makes it dangerous to use outside the context of implementation-provided hash tables.

R6RS eliminates this issue by providing separate constructors for eq? and eqv? hash tables, and refusing to expose the hash functions for them. However, this proposal takes the radical option of providing neither reflection nor implementation-based hash functions. They can of course be provided bby implementations as extensions.

Pronunciation

The slash in the names of some procedures can be pronounced "with".

Procedures

Constructors

(make-hash-table equivalence [ hash ] [ arg ... ])

Returns a newly allocated hash table whose equivalence procedure is equivalence and hash procedure is hash. It is the programmer's responsibility to ensure that two objects passed to the hash procedure return the same value if the same objects passed to the equivalence procedure return true; however, the converse is not required.

The ability to omit the hash argument is severely limited. If equivalence is eq?, eqv?, equal?, string=?, or string-ci=?, a suitable implementation-provided procedure will be used. The implementation may extend this list. But if any other equivalence procedure is provided without a hash procedure, an error that satisfies hash-table-error? is signaled.

The meaning of the remaining arguments is implementation-dependent. However, implementations which support the ability to specify the initial capacity of a hash table should interpret a non-negative exact integer as the specification of that capacity. In addition, if the symbols immutable, weak-keys or weak-values are present, implementations should create immutable hash tables, hash tables with weak keys, and hash tables with weak values respectively. In an implementation which does not, an error that satisfies hash-table-error? should be signaled. To avoid collision with the hash argument, none of these arguments can be procedures.

(SRFI 69 make-hash-table; R6RS make-eq-hashtable, make-eqv-hashtable, and make-hashtable; Common Lisp make-hash-table)

(hash-table equivalence hash ( key value ) ...)

Returns a newly allocated hash table, created as if by make-hash-table, whose equivalence procedure is equivalence and hash procedure is hash. For each pair of arguments, an association is created in the new hash table with key as its key and value as its value.

Predicates

(hash-table? obj)

Returns #t if obj is a hash table. (SRFI 69 hash-table?; R6RS hashtable?; Common Lisp hash-table-p)

(hash-table-contains? hash-table key)

Returns #t if there is any association to key in hash-table, and #f otherwise. Must execute in amortized O(1) time. (SRFI 69 hash-table-exists?; R6RS hashtable-contains?)

(hash-table=? key= value= hash-table1 hash-table2)

Not yet defined

Accessors

The following procedures, given a key, return the corresponding value.

(hash-table-ref hash-table key [ failure [ success ] ])

Extracts the value associated to key in hash-table, invokes the procedure success on it, and returns its result. Otherwise, invokes the procedure failure on no arguments and returns its result. If success is not provided and is required, the value itself is returned. If failure is not provided but is required, an error that satisfies hash-table-not-found? is signaled. Must execute in amortized O(1) time, not counting the time to call the procedures. (SRFI 69 hash-table-ref)

(hash-table-ref/default hash-table key default)

Semantically equivalent to, but may be more efficient than, the following code: (hash-table-ref hash-table key (lambda () default))

((SRFI69 hash-table-ref/default, R6RS hashtable-ref; Common Lisp gethash)

Mutators

The following procedures alter the associations in a hash table either unconditionally, or conditionally on the presence or absence of a specified key.

(hash-table-set! hash-table key value)

Creates a new association in hash-table that associates key with value. If there is a previous association for key, it is deleted. It is an error if hash-table is not a valid argument to the equality predicate of hash-table. Returns an unspecified value. Must execute in amortized O(1) time. (SRFI 69 hash-table-set!; R6RS hashtable-set!)

(hash-table-set-all! hash-table ( key value ) ...)

Repeatedly mutates hash-table, setting each key to the value that follows it.

(hash-table-set-entries! hash-table keys values)

Repeatedly mutates hash-table, setting each element of keys to the corresponding element of values.

(hash-table-set-alist! hash-table alist)

Repeatedly mutates hash-table using the associations of alist.

(hash-table-delete! hash-table key)

Deletes any association to key in hash-table and returns an unspecified value. It is not an error if no association for that key exists. Must execute in amortized O(1) time. (SRFI 69 hash-table-delete!; R6RS hashtable-delete!; Common Lisp remhash)

(hash-table-delete-keys! hash-table keylist)

Semantically equivalent to, but may be implemented more efficiently than, the following code:

(for-each (lambda (key) (hash-table-delete hash-table key)) keylist)

(hash-table-extend! hash-table key [ failure [ success ] ])

Invokes hash-table-ref with the given arguments and returns what it returns. If key was not found in hash-table, its value is set to the result being returned.

(hash-table-extend!/default hash-table key default)

Invokes hash-table-ref/default with the given arguments and returns what it returns. If key was not found in hash-table, its association is set to the result being returned.

(hash-table-replace! hash-table key updater [ failure [ ''success ] ])

Invokes hash-table-ref with the given arguments and returns what it returns. If key was found in hash-table, its value is set to the result being returned.

(hash-table-replace!/default hash-table key updater default`)

Invokes hash-table-ref/default with the given arguments and returns what it returns. If key was found in hash-table, its value is set to the result being returned.

(hash-table-update! hash-table key updater [ failure [ ''success ] ])

Semantically equivalent to, but may be implemented more efficiently than, the following code:

(hash-table-set! hash-table key (updater (hash-table-ref hash-table key failure success)))

Must execute in amortized O(1) time. Returns an unspecified value. (SRFI 69 hash-table-update!/default; R6RS hashtable-update!)

(hash-table-update!/default hash-table key updater default)

Semantically equivalent to, but may be implemented more efficiently than, the following code:

(hash-table-set! hash-table key (updater (hash-table-ref/default hash-table key default)))

Must execute in amortized O(1) time. Returns an unspecified value. (SRFI 69 hash-table-update!)

(hash-table-push! hash-table key value)

Semantically equivalent to, but may be more efficient than, the following code:

(hash-table-update!/default hash-table key (lambda (x) (cons value` x)) '())

(hash-table-pop! hash-table key)

If an association with key is found in hash-table, then return the car of the value, and set the value to its cdr. If the value is not found or is not a pair, signal an error.

Functional update

(hash-table-set hash-table key value)

(hash-table-set-all hash-table ( key value ) ...)

(hash-table-set-entries hash-table keys values)

(hash-table-set-alist hash-table alist)

(hash-table-delete hash-table key)

(hash-table-delete-keys hash-table keylist)

(hash-table-extendhash-table key [ failure [ ''success ] ])

(hash-table-extend/default hash-table key value default)

(hash-table-replace hash-table key [ failure [ ''success ] ])

(hash-table-replace/default hash-table key value default)

(hash-table-update hash-table key updater [ failure [ ''success ] ])

(hash-table-update/default hash-table key updater default)

These procedures are equivalent to the mutators with corresponding names, except that they return a new hash table which differs from the original in the specified ways. No guarantees are provided about their amortized execution time, as they may copy the original hash table.

The whole hash table

(hash-table-clear! hash-table)

Remove all the associations from hash-table. Should execute in amortized O(1) time. (R6RS hashtable-clear!; Common Lisp clrhash)

(hash-table-length hash-table)

Returns the number of associations in hash-table as an exact integer. Must execute in amortized O(1) time. (SRFI 69 hash-table-size, R6RS hashtable-size; Common Lisp hash-table-count)

Return number of keys in hash-table.

(hash-table-keys hash-table)

Return list of all keys in an unspecified order. (SRFI 69 hash-table-keys; R6RS hashtable-keys returns a vector)

(hash-table-values hash-table)

Return list of all values in an unspecified order, not necessarily the same order as hash-table-keys. (SRFI 69 hash-table-values)

(hash-table-entries hash-table)

Returns two values, the list of keys in an unspecified order and the list of values in the corresponding order. (R6RS hash-table-entries returns vectors)

(hash-table-find hash-table proc)

For each association of hash-table, invoke proc on its key and value in an unspecified order. If proc returns true, then return the value. If proc always returns #f, return #f.

(hash-table-count hash-table pred)

For each association of hash-table, invoke pred on its key and value in an unspecified order. Return the number of calls to pred which returned true.

(hash-table-remove! hash-table pred)

For each association of hash-table, invoke pred on its key and value in an unspecified order. If pred returns true, delete the association.

Mapping and folding

(hash-table-map equivalence hash proc merger hash-table)

Returns a newly allocated hash table as if by make-hash-table. Calls proc for every association in hash-table with two arguments: the key of the association and the value of the association. The two values returned by proc are inserted into the new hash table as a key and value. The order in which procedure is called for different associations is unspecified.

When a key already exists in the hash table, the procedure merger is called with arguments oldkey, oldvalue, newkey, newvalue and returns two values, the proper key and the proper value.

(hash-table-map! proc merger hash-table)

The same as hash-table-map, but the values returned by proc are used to mutate hash-table instead of creating a new one.

(hash-table-for-each proc hash-table)

Calls proc for every association in hash-table with two arguments: the key of the association and the value of the association. The value returned by proc is discarded. The order in which procedure is called for different associations is unspecified. Returns an unspecified value. (SRFI 69 hash-table-walk has the hash-table as the first argument; Common Lisp maphash)

(hash-table-map->list proc hash-table)

Calls proc for every association in hash-table with two arguments: the key of the association and the value of the association. The value returned by proc is accumulated into a list, which is returned. The order in which procedure is called for different associations is unspecified.

(hash-table-fold proc init hash-table)

Calls proc for every association in hash-table with three arguments: the key of the association, the value of the association, and an accumulated value val. Val is init for the first invocation of procedure, and for subsequent invocations of procedure, the returned value of the previous invocation. The value returned by hash-table-fold is the return value of the last invocation of proc. The order in which procedure is called for different associations is unspecified. (SRFI 69 hash-table-fold has the hash-table as the first argument)

(hash-table-unfold equivalence hash continue? mapper successor seed)

Create a new hash table using equivalence and hash. If the result of applying the predicate continue? to seed is #f, return the hash table. Otherwise, apply the procedure mapper to seed. Mapper returns two values, which are inserted into the hash table as the key and the value respectively. Then get a new seed by applying the procedure successor to seed, and repeat this algorithm.

Copying and conversion

(hash-table-copy hash-table)

Returns a newly allocated hash table with the same properties and associations as hash-table. (SRFI 69 hash-table-copy; R6RS hashtable-copy, passing #t as the second argument)

(hash-table->alist hash-table)

Returns an alist with the same associations as hash-table in an unspecified order.

(alist->hash-table alist equivalence hash)`

Returns a newly allocated hash-table as if by make-hash-table, initializing it from the associations of alist. (SRFI 69 alist->hash-table)

Hash tables as functions

(hash-table-accessor hash-table [ failure [ success ] ])

(hash-table-accessor/default hash-table default)

(hash-table-mutator hash-table)

(hash-table-deleter hash-table)

(hash-table-extender hash-table [ failure [ success ] ])

(hash-table-extender/default hash-table default)

(hash-table-replacer hash-table updater [ failure [ success ] ])

(hash-table-replacer/default hash-table updater default)

(hash-table-updater hash-table updater [ failure [ success ] ])

(hash-table-updater/default hash-table updater default)

These procedures allow hash tables to be used as functions with mutable behavior. They return procedures that are curried versions of hash-table-ref, hash-table-ref/default, hash-table-set!, hash-table-delete!, hash-table-extend!, hash-table-extend!/default, hash-table-replace!, hash-table-replace!/default hash-table-update!, and hash-table-update!/default, respectively on the given arguments. All such procedures accept a key and return either the corresponding value (for procedures created by hash-table-accessor, hash-table-accessor/default, hash-table-extender, or hash-table-extender/default), or an unspecified value (for all other procedures).

Hash tables as sets

(hash-table-union hash-table1 hash-table2 [ merger ])

Adds the associations of hash-table2 to a copy of hash-table1 and returns it. The values are merged using the merger procedure, which defaults to (lambda (value1 value2) value2). (SRFI 69 hash-table-merge!)

(hash-table-union! hash-table1 hash-table2 [ merger ])

Adds the associations of hash-table2 to hash-table1 and returns it. The values are merged using the merger procedure, which defaults to (lambda (value1 value2) value2).

(hash-table-difference hash-table1 hash-table2)

Removes the keys of hash-table2 from a copy of hash-table1 and returns it.

(hash-table-difference! hash-table1 hash-table2)

Removes the keys of hash-table2 from hash-table1 and returns hash-table1.

Exceptions

(hash-table-error? obj)

Returns #t if obj is an object raised by the make-hash-table procedure or any other procedure that creates hash tables, and #f otherwise.

(hash-table-not-found? obj)

Returns #t if obj is an object raised by the hash-table-ref procedure or any other procedure that accesses hash tables when the key is not found and there is no failure procedure, and #f otherwise.