This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see ImmutableDataStructuresWortman in WG2's repo for R7RS-large.

Immutable­Data­Structures­Wortman

kevinwortman
2013-06-04 13:18:56
1First draft of deque and set; introduction and map are still TODOhistory
source

Introduction

(TODO)

Efficiency bounds

We specify required time efficiency upper bounds using big-O notation. We note when, in some cases, there is "slack" between the required bound and the theoretically optimal bound for an operation. Implementations may use data structures with amortized time bounds, but should document which bounds hold in only an amortized sense. The use of randomized data structures with expected time bounds is discouraged.

Deque

A deque (conventionally pronounced "deck") is a queue data structure that supports efficient manipulation of either of its ends. It may be used in place of a (LIFO) stack or (FIFO) queue.

The unlabeled finger tree data structure can meet all these requirements rather conveniently.

Construction

(ideque [element...])

Create a new deque containing element.... The leftmost element (if any) will be at the front of the deque and the rightmost element (if any) will be at the back. Takes O(n) time, where n is the number of elements.

Queries

(ideque? x) (ideque-empty? deque) (ideque-size deque)

ideque?}}} and {{{ideque-empty?}}} take O(1) time; {{{ideque-size may take O(n) time, though O(1) is optimal.

Queue operations

(ideque-front deque) (ideque-back deque)

Return the front/back element. It is an error for dequeue to be empty. Each takes O(1) time.

(ideque-push-front deque x) (ideque-push-back deque x)

Return a new deque with x pushed to the front/back of {{{deque}}. Each takes O(1) time.

Miscellaneous

(ideque-append deque...)

Return a new deque with the contents of the first argument deque followed by the others. Takes O(kn) time, where k is the number of deques and n is the number of elements involved, though O(k log n) is possible.

Conversion

(list->ideque list) (ideque->list deque)

Conversion between deque and list structures. FIFO order is preserved, so the front of a list corresponds to the front of a deque. Each operation takes O(n) time.

Set

A sorted set data structure stores a finite collection of unique orderable (a.k.a. comparable or sortable) elements.

These requirements can be satisfied by many flavors of self-balancing binary trees. Red-black trees, 1-2 brother trees, and labeled 2-3 finger trees are particularly convenient in an immutable context.

Order predicates

An order predicate for a set of orderable elements is a procedure that takes two arguments x}}} and {{{y}}}, both elements of the set, and returns non-{{{#false}}} iff {{{x}}} precedes {{{y}}} in the set's ordering. It returns {{{#false}}} when {{{x}}} succeeds (comes after) {{{y or when the elements are tied for the same order.

Note that <}}}, {{{char<?}}}, and {{{string<? are valid order predicates for sets of numbers, characters, and strings, respectively.

Any procedure argument called precedes? must be an order predicate.

It is essential that order predicates return #false in the case of equal elements, as implementations may test for element equality with expressions of the form

(and (not (precedes? x y)) (not (precedes? y x)))

The time bounds stated below are all based on the assumption that evaluating an order predicate takes O(1) time.

Construction

(iset precedes? [element...])

Return a new set using the order predicate precedes?}}} and containing elements {{{element...}}}. If any subset of element arguments are equal according to {{{precedes? then only the leftmost of them is a member of the returned set. Takes O(n log n) time.

Set queries

(iset? x) (iset-empty? set) (iset-size set) (iset-order-predicate set)

iset-size may take O(n) time, though O(1) is optimal. The other procedures take O(1) time.

Set operations

(iset-union set...) (iset-intersection set...) (iset-difference left right...) (iset-symmetric-difference left right...)

Return a new set containing the union/intersection/difference/symmetric difference of the arguments. All the arguments must be sets sharing an equivalent order predicate. Runs in O(kn log n) time, where k is the number of sets and n is the number of elements involved; O(kn) time is optimal.

Element queries

(iset-member? set x)

Returns non-#false}}} iff {{{x}}} is an element of {{{set}}}, or {{{#false otherwise. Takes O(log n) time.

(iset-min set) (iset-max set)

Returns the least/greatest element of set}}}. It is an error for {{{set to be empty. Takes O(log n) time; O(1) is optimal.

Element operations

(iset-search set x present-proc absent-proc)

A continuation-based universal lookup procedure. Attemps to find an element y}}} equal to {{{x}}}. When such a {{{y is found, calls

(present-proc y update remove)

which may call

(update new-y)

to replace y}}} with {{{new-y, or

(remove)

to remove y}}} from {{{set.

When no such y is found, calls

(absent-proc insert)

where insert}}} is a nullary function that inserts {{{x}}} into {{{set.

In either case, iset-search returns

(values new-set ret)

where new-set}}} is the result of modifying {{{set}}} as indicated, and {{{ret}}} is the value returned by {{{present-proc}}} or {{{absent-proc.

Runs in O(log n) time.

(This procedure is based on a similar procedure for hash tables suggested by Alexey Radul and attributed to Taylor Campbell.)

(iset-find iset x [absent-thunk])

Returns the element y}}} equal to {{{x}}} in {{{set}}}, or the result of evaluating {{{(absent-thunk)}}} if no such element exists. If {{{absent-thunk}}} is unspecified then a default procedure that always returns {{{#false is used. This operation is equivalent to, but may be more efficient than,

(iset-search set x (lambda (y update remove) y) (lambda (insert) (absent-thunk)))

It takes O(log n) time.

(iset-include set x) (iset-exclude set x)

Return a new set that certainly does/doesn't include x}}}. Each operation takes O(log n) time. {{{iset-include is equivalent to, but may be more efficient than,

(iset-search set x (lambda (y update remove) y) (lambda (insert) (insert)))

and iset-exclude is equivalent to, but may be more efficient than,

(iset-search set x (lambda (y update remove) (remove)) (lambda (insert) *unspecified-value*))

Each operation takes O(log n) time.

(iset-predecessor set x [absent-thunk]) (iset-successor set x [absent-thunk])

Returns the element that immediately precedes/succeeds x}}} in {{{set}}}'s ordering. If no such element exists because {{{x}}} is the minimum/maximum element, or because {{{set}}} is empty, returns the result of evaluating {{{(absent-thunk)}}}. If {{{absent-thunk}}} is unspecified then a default procedure that always returns {{{#false is used. Takes O(log n) time.

Range queries

(iset-between set least include-least most include-most)

Returns a new set containing the elements of set}}} in the interval between {{{least}}} and {{{most}}}. If {{{include-least}}}/{{{include-most}}} is non-{{{#false}}} then the result includes elements equal to {{{least}}}/{{{most respectively; otherwise those elements are not returned. Takes O(k log k + log n) time, where n is the number of elements in the set and k is the number of elements returned; O(k + log n) is optimal.

(iset-range< set x) (iset-range<= set x) (iset-range= set x) (iset-range>= set x) (iset-range> set x)

Returns a new set containing only the elements of set}}} that are less/less-or-equal/equal/greater-or-equal/greater than {{{x. Takes O(k log k + log n) time, where n is the number of elements in the set and k is the number of elements returned; O(k + log n) is optimal.

Note that since set elements are unique, iset-range= returns a set of at most one element.

Filter, fold, map

(iset-filter predicate? set)

Returns a new set containing only those elements x}}} for which {{{(predicate? x)}}} returns non-{{{#false. Takes O(n log n) time; O(n) is optimal.

(iset-fold proc base set)

The fundamental set iterator. Equivalent to, but may be more efficient than,

(fold proc base (iset->ordered-list set))

Takes O(n) time.

(iset-map/monotone proc set)

Returns a new set containing the elements (proc x)}}} for every {{{x}}} in {{{set}}}. {{{proc must be a monotone unary procedure that preserves the order and uniqueness of set elements. Takes O(n) time.

(iset-map proc set)

Returns a new set containing the elements (proc x)}}} for every {{{x}}} in {{{set}}}. {{{proc}}} must be a unary procedure, but is not required to be monotone, and so the result of the map operation may be out of order and may contain duplicates. When a subset of elements in {{{set all map to the same result value, only the least of those elements is a member, in mapped form, of the resulting set. This operation is equivalent to, but may be more efficient than,

(list->iset (set-order-predicate set) (map proc (set->ordered-list set)))

It takes O(n log n) time.

Conversion

(iset->ordered-list set)

Returns a list containing the elements of set in increasing order. Takes O(n) time.

(ordered-list->iset precedes? list)

Returns a set containing the elements of list}}} and using order predicate {{{precedes?}}}. It is an error for {{{list to be anything other than a proper list of elements in increasing order. Takes O(n log n) time; O(n) is optimal.

(list->iset precedes? list)

Returns a set containing the elements of list}}} and using order predicate {{{precedes?}}}. {{{list}}} must be a proper list, but may contain duplicates and need not be in order. When a subset of input elements are equal according to {{{precedes?}}}, only the first such element in {{{list is a member of the returned set. Takes O(n log n) time.

Element comparison

(iset-element<? set x y) (iset-element=? set x y) (iset-element>? set x y)

These procedures compare x}}} to {{{y}}} according to {{{set}}}'s order predicate and the conventional meaning of {{{<}}}, {{{=}}}, and {{{>.

Maps

(TODO)

Open questions

Should there be an explicitly-immutable pair and list?