# Comparators

Please see SRFI 114 for the predecessor to this proposal. SRFI 114 has been deprecated by its author. Here is the first draft of a simpler and better replacement, which will eventually become a new SRFI.

## Abstract

This SRFI provides *comparators*, procedure bundles that represent one or more of an equality predicate, an ordering predicate, and a hash function. By packaging these procedures together, along with a type test predicate, they can be treated as a single item for use in the implementation of data structures.

## Issues

None at present.

## Rationale

The four procedures above have complex dependencies on one another, and it is inconvenient to have to pass them individually to other procedures that might or might not make use of all of them. For example, a set implementation by nature requires only an equality predicate, but if it is implemented using a hash table, an appropriate hash function is also required if the implementation does not provide one; alternatively, if it is implemented using a tree, an ordering predicate is required. By passing a comparator rather than a bare equality predicate, the set implementation can make use of whatever procedures are available and useful to it.

This SRFI is a simplified and enhanced rewrite of SRFI 114, and shares some of its design rationale and all of its acknowledgements. The largest change is the replacement of the comparison procedure with the ordering procedure. This allowed most of the special-purpose comparators to be removed. In addition, many of the more specialized

Special thanks to Taylan Ulrich Bayırlı/Kammer, whose insistence that SRFI 114 was over-complicated but also inadequate inspired this redesign.

## Specification

The procedures in this SRFI are in the `(srfi ???)`

library (or `(srfi :???)`

on R6RS), but the sample implementation currently places them in the `(comparators)`

library. This means it can't be used alongside SRFI 114, but there's no reason for anyone to do that.

### Definitions

A *comparator* is an object of a disjoint type. It is a bundle of procedures that are useful for comparing two objects either for equality or for ordering. There are four procedures in the bundle:

The

*type test predicate*returns`#t`

if its argument has the correct type to be passed as an argument to the other three procedures, and`#f`

otherwise.The

*equality predicate*returns`#t`

if the two objects are the same in the sense of the comparator, and`#f`

otherwise. It is the programmer's responsibility to ensure that it is reflexive, symmetric, transitive, and can handle any arguments that satisfy the type test predicate.The

*ordering predicate*returns`#t`

if the first object precedes the second in a total order, and`#f`

otherwise. Note that if it is true, the equality predicate must be false. It is the programmer's responsibility to ensure that it is irreflexive, asymmetric, transitive, and can handle any arguments that satisfy the type test predicate.The

*hash function*takes one or more arguments and returns an exact non-negative integer; the first argument is the object to be hashed. It is the programmer's responsibility to ensure that it can handle any argument that satisfies the type test predicate, and that it returns the same value on two objects if the equality predicate says they are the same (but not necessarily the converse).If a second argument is provided, it is a non-negative exact integer, a hint from the caller of the hash function that the returned value should be bounded from above by the second argument. There is no requirement that a hash function support this hint, and the caller cannot count on its being applied. Supplying

`#f`

as the second argument is equivalent to providing no argument.If a third argument is provided, it is a positive exact integer that serves as a

*salt*value to be mixed in along with the first argument when performing the hash calculation. If the caller is careful to provide a sufficiently random salt, attacks on the hash function become more difficult. In addition, callers that need more than one hash function can easily obtain it by varying the salt. Like the upper bound, the salt is a hint: no hash function is required to make use of it.

It is also the programmer's responsibility to ensure that all four procedures provide the same result whenever they are applied to the same object(s) (in the sense of `eqv?`

), unless the object(s) have been mutated since the last invocation. In particular, they must not depend in any way on memory addresses in implementations where the garbage collector can move objects in memory.

### Limitations

The comparator objects defined in this SRFI are not applicable to circular structure or to NaNs, or to objects containing any of these. Attempts to pass any such objects to any procedure defined here, or to any procedure that is part of a comparator defined here, is an error except as otherwise noted.

### Index

Predicates:

`comparator?, comparator-ordered?, comparator-hashable?`

Constructors:

`make-comparator, make-pair-comparator, make-list-comparator, make-vector-comparator, make-eq-comparator, make-eqv-comparator, make-equal-comparator`

Standard hash functions:

`boolean-hash, char-hash, char-ci-hash, string-hash, string-ci-hash, symbol-hash, number-hash`

Default comparators:

`default-comparator, comparator-register-default!`

Accessors:

`comparator-type-test-procedure, comparator-equality-predicate, comparator-ordering-predicate, comparator-hash-function, comparator-check-type`

Comparison predicates:

`=?, <?, >?, <=?, >=?`

### Predicates

`(comparator? `

*obj*`)`

Returns `#t`

if *obj* is a comparator, and `#f`

otherwise.

`(comparator-ordered? `

*comparator*`)`

Returns `#t`

if *comparator* has a supplied ordering predicate, and `#f`

otherwise.

`(comparator-hashable? `

*comparator*`)`

Returns `#t`

if *comparator* has a supplied hash function, and `#f`

otherwise.

### Constructors

The following comparator constructors all supply appropriate type test functions, equality predicates, ordering predicates, and hash functions based on the supplied arguments. They are allowed to cache their results: they need not return a newly allocated object, since comparators are pure and functional.

`(make-comparator `

*type-test equality ordering hash*`)`

Returns a comparator which bundles the *type-test, equality, ordering*, and *hash* procedures provided. As a convenience, if *ordering* or *hash* is `#f`

, a procedure is provided that signals an error on application. The predicates `comparator-ordered?`

and/or `comparator-hashable?`

, respectively, will return `#f`

in these cases.

Here are calls on `make-comparator`

that will return useful comparators for standard Scheme types:

`(make-comparator boolean? boolean=? (lambda (x y) (and (not x) y)) (lambda (x) (if x 1 0)))`

will return a comparator for booleans, expressing the ordering`#f`

<`#t`

and a plausible hash function.`(make-comparator real? = < (lambda (x) (exact (abs x))))`

will return a comparator expressing the natural ordering of real numbers and a plausible (but not optimal) hash function.`(make-comparator string? string=? string<? string-hash)`

will return a comparator expressing the implementation's ordering of strings. On R5RS systems this is a lexicographic extension of some implementation-defined ordering of characters; on R6RS systems it is a lexicographic extension of Unicode codepoint order; on R7RS systems it is implementation-defined.`(make-comparator string? string-ci=? string-ci<? string-ci-hash`

will return a comparator expressing the implementation's case-insensitive ordering of strings.

`(make-pair-comparator `

[ `car-comparator cdr-comparator` ]`)`

This comparator compares pairs using `car-comparator` (a default comparator by default) on their cars. If the cars are not equal, that value is returned. If they are equal, `cdr-comparator` (also a default comparator by default) is used on their cdrs and that value is returned. Hashing is done on both the car and cdr sides, and the results are hashed together.

`(make-list-comparator `

[ *element-comparator* [ *type-test* *empty? head tail* ] ]`)`

Returns a comparator which lexicographically compares two sequences that satisfy *type-test* (`list`

by default), as follows:

- The empty sequence, as determined by calling
`empty?`(`null?`

by default) compares equal to itself. - The empty sequence compares less than any non-empty sequence.
- Two non-empty sequences are compared by comparing the result of calling the
`head`procedure (`car`

by default). If the heads are not equal when compared using`element-comparator`(a default comparator by default), then the result is the result of that comparison. Otherwise, the results of calling the`tail`procedure (`cdr`

by default) are compared recursively.

In particular, `(make-list-comparator)`

returns a comparator that compares lists using a default comparator for the elements, and `(make-list-comparator `

`element-comparator``)`

returns a comparator that compares lists using `element-comparator for the elements.`

`(make-vector-comparator `

[ *element-comparator* [ *type-test* *length ref* ] ]`)`

Returns a comparator which compares two sequences that satisfy *type-test* (`vector?`

by default), using the *length* procedure (`vector-length`

by default) to determine the lengths of the sequences, and the *ref* procedure (`vector-ref`

by default) to access a particular elements. If one sequence is shorter than the other, it compares less than the other. If the sequences are the same length, they are compared element-wise using `element-comparator` (a default comparator by default) until they are exhausted.

In particular, `(make-vector-comparator)`

returns a comparator that compares vectors using a default comparator for the elements, and `(make-vector-comparator `

`element-comparator``)`

returns a comparator that compares vectors using `element-comparator`. Similarly, `(make-vector-comparator (make-default-comparator) bytevector? bytevector-length bytevector-u8-ref)`

returns a comparator that compares bytevectors.

`(make-eq-comparator)`

`(make-eqv-comparator)`

`(make-equal-comparator)`

The equality predicates of these comparators are `eq?`

, `eqv?`

, and `equal?`

respectively. When their ordering procedures are applied to non-equal objects, their behavior is implementation-defined. The type test predicates always return `#t`

.

These comparators accept circular structure (in the case of
`equal-comparator`

, provided the implementation's `equal?`

predicate
does so) and NaNs.

### Standard hash functions

These are hash functions for standard Scheme types, suitable for passing to `make-comparator`

.

`(boolean-hash`

obj [ `bound` [ `salt` ] ] `)`

`(char-hash`

obj [ `bound` [ `salt` ] ] `)`

`(char-ci-hash`

obj [ `bound` [ `salt` ] ] `)`

`(string-hash`

obj [ `bound` [ `salt` ] ] `)`

`(string-ci-hash`

obj [ `bound` [ `salt` ] ] `)`

`(symbol-hash`

obj [ `bound` [ `salt` ] ] `)`

`(number-hash`

obj [ `bound` [ `salt` ] ] `)`

These are suitable hash functions for the specified types. The hash functions `char-ci-hash`

and `string-ci-hash`

treat their first argument case-insensitively.

### Default comparators

`(make-default-comparator)`

Returns a comparator known as a `default comparator` that accepts any two Scheme values (with the exceptions listed in the Limitations section) and orders them in some implementation-defined way, subject to the following conditions:

The following arbitrary ordering between types must hold: the empty list precedes pairs, which precede booleans, which precede characters, which precede strings, which precede symbols, which precede numbers, which precede vectors, which precede bytevectors, which precede all other objects.

When applied to pairs, it must behave the same as a comparator returned by

`make-pair-comparator`

when applied to no arguments.When applied to booleans, it must compare them using the total order

`#f`

<`#t`

and hash them using the hash function`boolean-hash`

.When applied to characters, it must compare them using

`char=?`

and`char<?`

. In R5RS, this is an implementation-dependent order that is typically the same as Unicode codepoint order; in R6RS and R7RS, it is Unicode codepoint order. It must also hash them using the hash function`char-hash`

.When applied to strings, it must compare them using

`string=?`

and`string<?`

. In R5RS, this is lexicographic order on the implementation-dependent order defined by`char<?`

; in R6RS it is lexicographic order order on Unicode codepoint order; in R7RS it is an implementation-defined order. It must also hash them using the hash function`string-hash`

.When applied to symbols, it must compare them using an implementation-dependent total order. One possibility is to use the order obtained applying

`symbol->string`

to the symbols and comparing them using the total order implied by`string<?`

. It must also hash them using the hash function`symbol-hash`

. It is not a requirement that symbol hash functions be consistent with string hash functions, however.When applied to numbers where either number is complex, since non-real numbers cannot be compared with

`<`

, the following least-surprising ordering is defined: If the real parts are < or >, so are the numbers; otherwise, the numbers are ordered by their imaginary parts. This can still produce somewhat surprising results if one real part is exact and the other is inexact. It must also hash them using the hash function`number-hash`

.When applied to real numbers, it must compare them using

`=`

and`<`

, and hash them using the hash function`number-hash`

.When applied to vectors, it must behave the same as a comparator returned by

`make-vector-comparator`

with default arguments.When applied to bytevectors, it must behave the same as a comparator created by

`(make-vector-comparator (make-comparator < = (lambda (x) x)) bytevector? bytevector-length bytevector-u8-ref)`

.When applied to types registered with

`comparator-register-default!`

: it must behave the same as the comparator registered using that function.Given disjoint types

*a*and*b*, one of three conditions must hold:- All objects of type
*a*compare less than all objects of type*b*. - All objects of type
*a*compare greater than all objects of type*b*. - All objects of either type
*a*or type*b*compare equal to each other. This is not permitted for any of the standard types listed above.

- All objects of type

`(comparator-register-default! `

`comparator``)`

Registers `comparator` for use by default comparators, such that if the objects being compared
both satisfy the type test predicate of `comparator`, it will be employed by default comparators
to compare them.

### Accessors

`(comparator-type-test-procedure `

*comparator*`)`

Returns the type test predicate of *comparator*.

`(comparator-equality-predicate `

*comparator*`)`

Returns the equality predicate of *comparator*.

`(comparator-ordering-predicate `

*comparator*`)`

Returns the ordering predicate of *comparator*.

`(comparator-hash-function `

*comparator*`)`

Returns the hash function of *comparator*.

`(comparator-check-type `

*comparator obj*`)`

Invokes the type test predicate of *comparator* on *obj* and returns true if it returns true and signals an error otherwise.

### Comparison predicates

`(=? `

*comparator* *object _{1} object_{2} object_{3}* ...

`)`

`(<? `

*comparator* *object _{1} object_{2} object_{3}* ...

`)`

`(>? `

*comparator* *object _{1} object_{2} object_{3}* ...

`)`

`(<=? `

*comparator* *object _{1} object_{2} object_{3}* ...

`)`

`(>=? `

*comparator* *object _{1} object_{2} object_{3}* ...

`)`

These procedures are analogous to the number, character, and string comparison predicates of Scheme. They allow the convenient use of comparators to handle arbitrary data types.

These procedures apply the comparison procedure of *comparator* to the *objects* as follows. If the specified relation returns `#t`

for all *object _{i}* and

*object*where

_{j}*n*is the number of objects and 1 <=

*i < j <= n*, then the procedures return

`#t`

, but otherwise `#f`

.The order in which the values are compared is unspecified. Because the relations are transitive, it suffices to compare each object with its successor.

## Implementation

The sample implementation contains the following files: FIXME

`comparators-impl.scm`

- the record type definition and most of the procedures`default.scm`

- a simple implementation of the default constructor, which should be improved by implementers to handle records and implementation-specific types`hash.scm`

- built-in hash functions`r7rs-shim.scm`

- procedures for R7RS compatibility, including a trivial implementation of bytevectors on top of SRFI 4 u8vectors`complex-shim.scm`

- a trivial implementation of`real-part`

and`imag-part`

for Schemes that don't have complex numbers`comparators.sld`

- an R7RS library`comparators.scm`

- a Chicken library`comparators-test.scm`

- a test file using the Chicken`test`

egg