This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for wiki HashTablesCowan version 14

author

cowan

comment


    

ipnr

198.185.18.207

name

HashTablesCowan

readonly

0

text

'''Work in progress, do not use yet'''

== Hash tables ==

This WG2 proposal defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications.  A hash table is a data structure that:

   * Is disjoint from all other types.
   * Provides a mapping from objects known as ''keys'' to corresponding objects known as ''values''.
     * Keys may be any Scheme objects in some kinds of hash tables, but are restricted in other kinds.
     * Values may be any Scheme objects.
   * Has no intrinsic order for the key-value ''associations'' it contains, though an order may be defined as an extension.
   * Provides an ''equivalence function'' which defines when a proposed key is the same as an existing key.  No table may contain more than one value for a given key.
   * Supports mutation as the primary means of setting the contents of a table.
   * Assumes that keys are immutable; mutating a key leads to unspecified behavior.
   * Provides key lookup and destructive update in amortised constant time.
   * Does not guarantee that whole-table operations work in the presence of concurrent mutation.

== Procedures ==

`(make-hash-table `''= hash'' `.` ''args''`)`

Returns a newly allocated hash table whose equality procedure is =, hash function is hash (defaults to a suitable function, and args mean whatever (number is initial capacity, 'immutable, 'weakkeys, 'weak-values, all implementation-dependent).  The "safe" equality procedures are eq?, eqv?, equal?, string=?, string-ci=?.

`(hash-table `''= hash'' ( ''key value'' ) ...`)`

Ditto, but with explicitly specified key/value pairs

`(hash-table? `''obj''`)`

Returns #t if obj is a hash table.

`(hash-table-ref/default `''hash-table key default''`)`

Returns the value corresponding to key in hash-table, or default if none.

`(hash-table-ref `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Looks up key in hash table.  If found, passes value to success procedure (default identity) and returns its value; if not found, invoke failure procedure (default is to signal an error) and return its value

`(hash-table-ref!/default `''hash-table key default''`)`

Same as hash-table-ref/default, but updates hash table if needed.

`(hash-table-ref! `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Same as hash-table-ref, but updates hash table if needed.

`(hash-table-exists? `''hash-table key''`)`

Returns #t if key exists, #f if not.

`(hash-table-set! `''hash-table key value''`)`

Primitive mutator.

`(hash-table-set*! `''hash-table'' ( ''key value'' ) ...`)`

Mutate key1 to value1, key2 to value2, etc.

`(hash-table-set `''hash-table key value''`)`

Like hash-table-set! but returns new hash table.

`(hash-table-set* `''hash-table'' ( ''key value'' ) ...`)`

Like hash-table-set*! but returns new hash table.

`(hash-table-delete! `''hash-table key''`)`

Delete key if it exists.

`(hash-table-delete `''hash-table key''`)`

Allocate copy without key.

`(hash-table-delete*! `''hash-table keylist''`)`

Delete each key in keylist if it exists.

`(hash-table-delete* `''hash-table keylist''`)`

Allocate copy without keys in keylist.

`(hash-table-update!/default `''hash-table key updater default''`)`

Looks up key in hash-table, using default if not found.  Invoke updater procedure and mutate the value of the same key.

`(hash-table-update! `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

Looks up key in hash-table, passing it through success procedure if found, or result of failure procedure if not found.  Pass through updater, mutate the value of the same key.

`(hash-table-update/default `''hash-table key updater default''`)`

Same as hash-table-update!/default but returns new hash table.

`(hash-table-update `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

Same as hash-table-update! but returns new hash table.

`(hash-table-replace!/default `''hash-table key updater default''`)

If key is found, replace it with value and return value, otherwise return default.

`(hash-table-replace! `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

The same, but uses failure and success instead.

`(hash-table-replace/default hash-table `''key value default''`)`

Same as hash-table-replace!/default but returns new hash table.

`(hash-table-replace `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

Same as hash-table-replace! but returns new hash table.

`(hash-table-clear! `''hash-table''`)`

Remove all keys from hash-table as efficiently as possible.

`(hash-table-length `''hash-table''`)`

Return number of keys in hash-table.

`(hash-table-keys `''hash-table''`)`

Return list of all keys in random order.

`(hash-table-values `''hash-table''`)`

Return list of all values in random order, not necessarily the same order as hash-table-keys.

`(hash-table-entries `''hash-table''`)`

Returns two values, the list of keys in random order and the list of values in the matching order.

`(hash-table-map->list `''hash-table proc''`)`

Pass each key and value as separate arguments to proc, collect results in list.  There is no `hash-table-map` in the sense of lifting a procedure to the domain of hash tables.

`(hash-table-map `''= hash proc merger hash-table'' ...`)`

For n hash-table arguments, invoke proc with 2n arguments (key, value, key, value ...) and expect it to return two values, which are used to construct a new hash table using = and hash.  Merger is called with key,,1,, value,,1,, key,,2,, value,,2,, when the two keys are equal according to =, and returns two values, the proper key and the proper value.

`(hash-table-map! `''proc merger hash-table'' ...`)`

The same, but the values are used to mutate the first hash table instead of creating a new one.

`(hash-table-for-each `''proc hash-table'' ...`)`

The same, but discards the results.

`(hash-table-copy `''hash-table''`)`

Returns a fresh copy of hash-table, same equality and hash procedures.

`(hash-table->alist `''hash-table''`)`

Returns an alist with the keys and values of hash-table.

`(alist->hash-table `''alist = hash'' `.` ''args''`)`

Returns a new hash-table, as if with make-hash-table, initializing it with alist.

`(hash-table-push! `''hash-table key value''`)`

Conses value onto the value of key, empty list if no such key.

`(hash-table-pop! `''hash-table key'' [ ''success'' [ ''failure'' ] ]`)`

Returns car of value corresponding to key, mutates key to cdr.  Failure if the key does not exist or the value is not a pair.  Issue: signal an error for non-pair instead?

`(hash-table-pop!/default `''hash-table key default''`)`

The same, but with default if key does not exist or value is not a pair.  Issue: signal an error for non-pair instead?

`(hash-table-find `''hash-table proc''`)`

Proc gets called with each key and value in random order.  If it returns #f, try another key and value, otherwise return what proc returns.

`(hash-table-accessor/default `''hash-table default''`)`

Curries hash-table-ref/default.

`(hash-table-accessor `''hash-table'' [ ''success'' [ ''failure'' ] ]`)`

Curries hash-table-ref.

`(hash-table-mutator `''hash-table''`)`

Curries hash-table-set!.

`(hash-table-deleter `''hash-table''`)`

Curries hash-table-delete.

`(hash-table-updater/default `''hash-table updater default''`)`

Curries hash-table-update!/default.

`(hash-table-updater `''hash-table updater'' [ ''failure'' [ ''success'' ] ]`)`

Curries hash-table-update!

`(hash-table-union `''hash-table,,1,, hash-table,,2,,'' [ ''merger'' ]`)`

Adds keys and values of hash-table,,2,, to a copy of hash-table,,1,, and returns it.  The values are merged using the merger procedure, which defaults to `(lambda (value,,1,, value,,2,,) value,,2)`. 

`(hash-table-union! `''hash-table,,1,, hash-table,,2,,'' [ ''merger'' ]`)`

Adds keys and values of hash-table,,2,, to a copy of hash-table,,1,, and returns it.  The values are merged using the merger procedure, which defaults to `(lambda (value,,1,, value,,2,,) value,,2)`. 

`(hash-table-difference `''hash-table,,1,, hash-table,,2,,'' [ ''merger'' ]`)`

Removes keys of hash-table,,2,, from copy of hash-table,,1,, and returns it.

`(hash-table-difference! `''hash-table,,1,, hash-table,,2,,'' [ ''merger'' ]`)`

Removes keys of hash-table,,2,, from hash-table,,1,, and returns it.

`(hash-table=? `''key= value= hash-table,,1,, hash-table,,2,,''`)`

Returns #t if hash-tables have the same keys (in the sense of key=) and the same corresponding values (in the sense of value=).

== Not yet processed ==

set!* from an alist

accept key list and value list and add them all, destructive or non-destructive, analogous to CL pairlis

hash-table-value-exists?

hash-table-unfold

count and remove/remove! by a predicate

== An idea ==

Have a procedure that accepts an equality and a hash function, and returns a function that behaves as the equality function when called with two arguments, but when called with zero arguments, returns the hash function.  This is used to create hash tables, saving an extra argument each time.  Note that this allows the author of an equality predicate to provide hashing without cluttering up the interface, and allows the predicate to be an equal peer with built-in predicates.

== Everything past this point is obsolete. ==

Only kept around as a source of useful text to cut and paste.

== Constructors ==

These constructors provide suitable hash functions for the equivalence function specified as part of the constructor name.  This proposal does not allow (semi-)arbitrary equivalence and hash functions to be specified.

`(make-eq-hash-table)`

Returns a newly allocated table with no associations whose equivalence function is `eq?`.  (SRFI-69 `(make-hash-table eq? hash-by-identity)`; R6RS `(make-eq-hashtable)`)

`(make-equal-hash-table)`

Returns a newly allocated with no associations whose equivalence function is `equal?`.  (SRFI-69 `(make-hash-table equal? hash)`; R6RS `(make-hashtable equal? equal-hash)`)

`(make-string-hash-table)`

Returns a newly allocated table with no associations whose equivalence function is `string=?`.  (SRFI-69 `(make-hash-table string=? string-hash)`; R6RS `(make-hashtable string=? string-hash)`)

`(make-string-ci-hash-table)`

Returns a newly allocated table with no associations whose equivalence function is `string-ci=?`.  (SRFI-69 `(make-hash-table string-ci=? string-ci-hash)`; R6RS `(make-hashtable string-ci=? string-ci-hash)`)

Note that there are no hash tables whose equivalence function is `eqv?`, because SRFI 69 does not support them.  Users will have to live with `eq?` or `equal?` hash tables as the case may be.

'''Issue''':  add `(hash-table-unfold `''p f g key-seed value-seed''`)`?  Or possibly just one seed and let `f` generate the two values from a single value.

== Copiers ==

`(hash-table-copy `''hash-table''`)`

Returns a newly allocated hash table with the same equivalence predicate and associations as ''hash-table''. (SRFI-69 `hash-table-copy`; R6RS `(hashtable-copy `''hash-table''` #t)`)

== Predicates ==

`(hash-table? `''obj''`)` 

Returns `#t` if ''obj'' is a hash table.  (SRFI-69 `hash-table?`; R6RS `hashtable?`)

`(hash-table-contains? `''hash-table''` `''key''`)`

Returns `#t` if there is any association to ''key'' in ''hash-table''.  Must execute in amortized O(1) time.  (SRFI-69 `hash-table-exists?`; R6RS `hashtable-contains?`)

== Accessors ==

`(hash-table-ref `''hash-table''` `''key'' ''default-proc''`)`

Returns the value associated to ''key'' in ''table''. If no value is associated to ''key'', ''default-proc'' is applied to no arguments and the result is returned.  Must execute in amortized O(1) time, not counting the time to call ''default-proc'' if necessary.  (SRFI-69 `hash-table-ref/default`; R6RS `hashtable-ref`)

`(hash-table-size `''table''`)`

Returns the number of associations in ''hash-table''.  Must execute in amortized O(1) time.  (SRFI-69 `hash-table-size`; R6RS `hashtable-size`)

== Mutators ==

`(hash-table-set! `''hash-table''` `''key''` `''value''`)`

Creates a new association in ''hash-table'' that associates ''key'' with ''value''. The previous association (if any) is deleted.  It is an error if ''hash-table'' is a string or string-ci hash table and ''key'' is not a string.  Must execute in amortized O(1) time.  Returns an unspecified value.  (SRFI-69 `hash-table-set!`; R6RS `hashtable-set!`)

`(hash-table-delete! `''hash-table''` `''key''`)`

Deletes any association to ''key'' in ''hash-table''. It is not an error if no association for that ''key'' exists.    Must execute in amortized O(1) time.  Returns an unspecified value.  (SRFI-69 `hash-table-delete!`; R6RS `hashtable-delete!`)

`(hash-table-update! `''hash-table''` `''key''` `''procedure'' ''default-proc'' ]`)`

Semantically equivalent to, but may be implemented more efficiently than, the following code:

`(hash-table-set! `''hash-table''` `''key''` (`''procedure'' `(hash-table-ref `''hash-table''` `''key''` `''default-proc''`)))`

Must execute in amortized O(1) time.  Returns an unspecified value.  (SRFI-69 `hash-table-update!/default`; R6RS `hashtable-update!`)

'''Issue''':  add `hash-table-clear`?


== Iterators ==

`(hash-table-fold `''procedure''` `''init''` `''hash-table''`)`

Calls ''procedure'' for every association in ''hash-table'' with three arguments: the key of the association, the value of the association, and an accumulated value ''val''.  ''Val'' is ''init'' for the first invocation of ''procedure'', and for subsequent invocations of ''procedure'', the return value of the previous invocation.  The value returned by `hash-table-fold` is the return value of the last invocation of f. The order in which ''procedure'' is called for different associations is unspecified.  (SRFI-69 `hash-table-fold`; not in R6RS but trivially implemented: see below.)


== Excluded features ==

The following features are '''not''' part of this proposal for various reasons.

=== Incompatible return values ===

`Hash-table-keys` is present in both SRFI 69 and R6RS, but returns a list of keys in the first, a vector of keys in the second.

`Hash-table-values` returns a list of values in SRFI 69; `hashtable-entries` returns two values, a vector of keys and a vector of values in R6RS.

`Hash-table-hash-function` always returns the hash function of a hash table in SRFI 69, but returns `#f` in the case of `eq?` (and `eqv?`) hash tables in R6RS.

`Hash-table-equivalence-function` returns the equivalence function, but in R6/R7RS you can't reliably check functions for identity, so you don't know what you've got.

=== In R6RS but not in SRFI 69 ===

`Hashtable-clear!`, which removes all associations from a hash table.

Specifying the initial capacity of a hash table.

Immutable hash tables.

Hash tables based on `eqv?`.

=== In SRFI 69 but not in R6RS ===

`Hash-table-walk`, which is the analogue of `for-each` for hash tables.

Conversion from a-lists to hash tables and back.

`Hash-table-merge`, which destructively adds all associations in a source hash table to a destination hash table.

Default values specified in the form of a thunk to call.

=== In Common Lisp but not Scheme ===

The current capacity (as opposed to size) of a hash table.

Rehash size and threshold in constructor, plus accessors and mutators for them.

Hash tables based on `equalp` (which is not in Scheme).

`With-hash-table-iterator`, a hash table external iterator implemented as a local macro.

`Sxhash`, a stable hash function.

== R6RS implementation of `hash-table-fold` ==

{{{
(define (hash-table-fold h proc init)
  (let-values (((k v) (hashtable-entries h))
               ((s) (hashtable-size h)))
    (do ((i 0 (+ 1 i))
         (val init (proc (vector-ref k i) (vector-ref v i) val)))
        ((= i s) val))))
}}}

I have added equivalences to SRFI 69 and R6RS facilities.  Systems supporting either should be able to support these hash tables trivially; it will be necessary to write `hash-table-fold` for R6RS systems.  I have adopted SRFI 69's term `hash-table` rather than R6RS's `hashtable` because of our decision in #40.  Besides, the word ''hashtable'' obviously means something that can be ... hashted.

The main annoyances of this proposal for SRFI 69 programmers will be remembering to supply a third argument to `hash-table-ref`, and for R6RS programmers will be remembering to insert a hyphen in `hashtable`.  Both will have to get used to the new constructors.

time

2013-05-18 01:46:55

version

14