Please see SRFI 114 for the predecessor to this proposal. SRFI 114 has been deprecated by its author. Here is the first draft of a simpler and better replacement, which will eventually become a new SRFI.
This SRFI provides comparators, procedure bundles that represent one or more of an equality predicate, an ordering predicate, and a hash function. By packaging these procedures together, along with a type test predicate, they can be treated as a single item for use in the implementation of data structures.
None at present.
The four procedures above have complex dependencies on one another, and it is inconvenient to have to pass them individually to other procedures that might or might not make use of all of them. For example, a set implementation by nature requires only an equality predicate, but if it is implemented using a hash table, an appropriate hash function is also required if the implementation does not provide one; alternatively, if it is implemented using a tree, an ordering predicate is required. By passing a comparator rather than a bare equality predicate, the set implementation can make use of whatever procedures are available and useful to it.
This SRFI is a simplified and enhanced rewrite of SRFI 114, and shares some of its design rationale and all of its acknowledgements. The largest change is the replacement of the comparison procedure with the ordering procedure. This allowed most of the special-purpose comparators to be removed. In addition, many of the more specialized
Special thanks to Taylan Ulrich Bayırlı/Kammer, whose insistence that SRFI 114 was over-complicated but also inadequate inspired this redesign.
The procedures in this SRFI are in the
(srfi ???) library (or
(srfi :???) on R6RS), but the sample implementation currently places them in the
(comparators) library. This means it can't be used alongside SRFI 114, but there's no reason for anyone to do that.
A comparator is an object of a disjoint type. It is a bundle of procedures that are useful for comparing two objects either for equality or for ordering. There are four procedures in the bundle:
The type test predicate returns
#tif its argument has the correct type to be passed as an argument to the other three procedures, and
The equality predicate returns
#tif the two objects are the same in the sense of the comparator, and
#fotherwise. It is the programmer's responsibility to ensure that it is reflexive, symmetric, transitive, and can handle any arguments that satisfy the type test predicate.
The ordering predicate returns
#tif the first object precedes the second in a total order, and
#fotherwise. Note that if it is true, the equality predicate must be false. It is the programmer's responsibility to ensure that it is irreflexive, asymmetric, transitive, and can handle any arguments that satisfy the type test predicate.
The hash function takes one or more arguments and returns an exact non-negative integer; the first argument is the object to be hashed. It is the programmer's responsibility to ensure that it can handle any argument that satisfies the type test predicate, and that it returns the same value on two objects if the equality predicate says they are the same (but not necessarily the converse).
If a second argument is provided, it is a non-negative exact integer, a hint from the caller of the hash function that the returned value should be bounded from above by the second argument. There is no requirement that a hash function support this hint, and the caller cannot count on its being applied. Supplying
#fas the second argument is equivalent to providing no argument.
If a third argument is provided, it is a positive exact integer that serves as a salt value to be mixed in along with the first argument when performing the hash calculation. If the caller is careful to provide a sufficiently random salt, attacks on the hash function become more difficult. In addition, callers that need more than one hash function can easily obtain it by varying the salt. Like the upper bound, the salt is a hint: no hash function is required to make use of it.
It is also the programmer's responsibility to ensure that all four procedures provide the same result whenever they are applied to the same object(s) (in the sense of
eqv?), unless the object(s) have been mutated since the last invocation. In particular, they must not depend in any way on memory addresses in implementations where the garbage collector can move objects in memory.
The comparator objects defined in this SRFI are not applicable to circular structure or to NaNs, or to objects containing any of these. Attempts to pass any such objects to any procedure defined here, or to any procedure that is part of a comparator defined here, is an error except as otherwise noted.
comparator?, comparator-ordered?, comparator-hashable?
make-comparator, make-pair-comparator, make-list-comparator, make-vector-comparator, make-eq-comparator, make-eqv-comparator, make-equal-comparator
Standard hash functions:
boolean-hash, char-hash, char-ci-hash, string-hash, string-ci-hash, symbol-hash, number-hash
comparator-type-test-procedure, comparator-equality-predicate, comparator-ordering-predicate, comparator-hash-function, comparator-check-type
=?, <?, >?, <=?, >=?
#t if obj is a comparator, and
#t if comparator has a supplied ordering predicate, and
#t if comparator has a supplied hash function, and
The following comparator constructors all supply appropriate type test functions, equality predicates, ordering predicates, and hash functions based on the supplied arguments. They are allowed to cache their results: they need not return a newly allocated object, since comparators are pure and functional.
(make-comparator type-test equality ordering hash
Returns a comparator which bundles the type-test, equality, ordering, and hash procedures provided. As a convenience, if ordering or hash is
#f, a procedure is provided that signals an error on application. The predicates
comparator-hashable?, respectively, will return
#f in these cases.
Here are calls on
make-comparator that will return useful comparators for standard Scheme types:
(make-comparator boolean? boolean=? (lambda (x y) (and (not x) y)) (lambda (x) (if x 1 0)))will return a comparator for booleans, expressing the ordering
#tand a plausible hash function.
(make-comparator real? = < (lambda (x) (exact (abs x))))will return a comparator expressing the natural ordering of real numbers and a plausible (but not optimal) hash function.
(make-comparator string? string=? string<? string-hash)will return a comparator expressing the implementation's ordering of strings. On R5RS systems this is a lexicographic extension of some implementation-defined ordering of characters; on R6RS systems it is a lexicographic extension of Unicode codepoint order; on R7RS systems it is implementation-defined.
(make-comparator string? string-ci=? string-ci<? string-ci-hashwill return a comparator expressing the implementation's case-insensitive ordering of strings.
(make-pair-comparator [ car-comparator cdr-comparator ]
This comparator compares pairs using car-comparator (a default comparator by default) on their cars. If the cars are not equal, that value is returned. If they are equal, cdr-comparator (also a default comparator by default) is used on their cdrs and that value is returned. Hashing is done on both the car and cdr sides, and the results are hashed together.
(make-list-comparator [ element-comparator [ type-test empty? head tail ] ]
Returns a comparator which lexicographically compares two sequences that satisfy type-test (
list by default), as follows:
- The empty sequence, as determined by calling empty? (
null?by default) compares equal to itself.
- The empty sequence compares less than any non-empty sequence.
- Two non-empty sequences are compared by comparing the result of calling the head procedure (
carby default). If the heads are not equal when compared using element-comparator (a default comparator by default), then the result is the result of that comparison. Otherwise, the results of calling the tail procedure (
cdrby default) are compared recursively.
(make-list-comparator) returns a comparator that compares lists using a default comparator for the elements, and
) returns a comparator that compares lists using element-comparator for the elements.
(make-vector-comparator [ element-comparator [ type-test length ref ] ]
Returns a comparator which compares two sequences that satisfy type-test (
vector? by default), using the length procedure (
vector-length by default) to determine the lengths of the sequences, and the ref procedure (
vector-ref by default) to access a particular elements. If one sequence is shorter than the other, it compares less than the other. If the sequences are the same length, they are compared element-wise using element-comparator (a default comparator by default) until they are exhausted.
(make-vector-comparator) returns a comparator that compares vectors using a default comparator for the elements, and
) returns a comparator that compares vectors using element-comparator. Similarly,
(make-vector-comparator (make-default-comparator) bytevector? bytevector-length bytevector-u8-ref) returns a comparator that compares bytevectors.
The equality predicates of these comparators are
equal? respectively. When their ordering procedures are applied to non-equal objects, their behavior is implementation-defined. The type test predicates always return
These comparators accept circular structure (in the case of
equal-comparator, provided the implementation's
does so) and NaNs.
Standard hash functions
These are hash functions for standard Scheme types, suitable for passing to
(boolean-hash obj [ bound [ salt ] ]
(char-hash obj [ bound [ salt ] ]
(char-ci-hash obj [ bound [ salt ] ]
(string-hash obj [ bound [ salt ] ]
(string-ci-hash obj [ bound [ salt ] ]
(symbol-hash obj [ bound [ salt ] ]
(number-hash obj [ bound [ salt ] ]
These are suitable hash functions for the specified types. The hash functions
string-ci-hash treat their first argument case-insensitively.
Returns a comparator known as a default comparator that accepts any two Scheme values (with the exceptions listed in the Limitations section) and orders them in some implementation-defined way, subject to the following conditions:
The following arbitrary ordering between types must hold: the empty list precedes pairs, which precede booleans, which precede characters, which precede strings, which precede symbols, which precede numbers, which precede vectors, which precede bytevectors, which precede all other objects.
When applied to pairs, it must behave the same as a comparator returned by
make-pair-comparatorwhen applied to no arguments.
When applied to booleans, it must compare them using the total order
#tand hash them using the hash function
When applied to characters, it must compare them using
char<?. In R5RS, this is an implementation-dependent order that is typically the same as Unicode codepoint order; in R6RS and R7RS, it is Unicode codepoint order. It must also hash them using the hash function
When applied to strings, it must compare them using
string<?. In R5RS, this is lexicographic order on the implementation-dependent order defined by
char<?; in R6RS it is lexicographic order order on Unicode codepoint order; in R7RS it is an implementation-defined order. It must also hash them using the hash function
When applied to symbols, it must compare them using an implementation-dependent total order. One possibility is to use the order obtained applying
symbol->stringto the symbols and comparing them using the total order implied by
string<?. It must also hash them using the hash function
symbol-hash. It is not a requirement that symbol hash functions be consistent with string hash functions, however.
When applied to numbers where either number is complex, since non-real numbers cannot be compared with
<, the following least-surprising ordering is defined: If the real parts are < or >, so are the numbers; otherwise, the numbers are ordered by their imaginary parts. This can still produce somewhat surprising results if one real part is exact and the other is inexact. It must also hash them using the hash function
When applied to real numbers, it must compare them using
<, and hash them using the hash function
When applied to vectors, it must behave the same as a comparator returned by
make-vector-comparatorwith default arguments.
When applied to bytevectors, it must behave the same as a comparator created by
(make-vector-comparator (make-comparator < = (lambda (x) x)) bytevector? bytevector-length bytevector-u8-ref).
When applied to types registered with
comparator-register-default!: it must behave the same as the comparator registered using that function.
Given disjoint types a and b, one of three conditions must hold:
- All objects of type a compare less than all objects of type b.
- All objects of type a compare greater than all objects of type b.
- All objects of either type a or type b compare equal to each other. This is not permitted for any of the standard types listed above.
Registers comparator for use by default comparators, such that if the objects being compared both satisfy the type test predicate of comparator, it will be employed by default comparators to compare them.
Returns the type test predicate of comparator.
Returns the equality predicate of comparator.
Returns the ordering predicate of comparator.
Returns the hash function of comparator.
(comparator-check-type comparator obj
Invokes the type test predicate of comparator on obj and returns true if it returns true and signals an error otherwise.
(=? comparator object1 object2 object3 ...
(<? comparator object1 object2 object3 ...
(>? comparator object1 object2 object3 ...
(<=? comparator object1 object2 object3 ...
(>=? comparator object1 object2 object3 ...
These procedures are analogous to the number, character, and string comparison predicates of Scheme. They allow the convenient use of comparators to handle arbitrary data types.
These procedures apply the comparison procedure of comparator to the objects as follows. If the specified relation returns
#t for all objecti and objectj where n is the number of objects and 1 <= i < j <= n, then the procedures return
#t, but otherwise
The order in which the values are compared is unspecified. Because the relations are transitive, it suffices to compare each object with its successor.
The sample implementation contains the following files: FIXME
comparators-impl.scm- the record type definition and most of the procedures
default.scm- a simple implementation of the default constructor, which should be improved by implementers to handle records and implementation-specific types
hash.scm- built-in hash functions
r7rs-shim.scm- procedures for R7RS compatibility, including a trivial implementation of bytevectors on top of SRFI 4 u8vectors
complex-shim.scm- a trivial implementation of
imag-partfor Schemes that don't have complex numbers
comparators.sld- an R7RS library
comparators.scm- a Chicken library
comparators-test.scm- a test file using the Chicken