Changes between Version 13 and Version 14 of ImmutableDataStructuresWortman


Ignore:
Timestamp:
07/03/13 19:56:09 (4 years ago)
Author:
kevinwortman
Comment:

Port iset to use comparators instead of order predicates

Legend:

Unmodified
Added
Removed
Modified
  • ImmutableDataStructuresWortman

    v13 v14  
    7272= Set = 
    7373 
    74 A sorted set data structure stores a finite collection of unique orderable (a.k.a. comparable or sortable) elements. 
     74A sorted set data structure stores a finite collection of unique elements with a defined comparator ( http://ccil.org/~cowan/temp/srfi-114.html ). 
    7575 
    7676These requirements can be satisfied by many flavors of ''self-balancing binary trees.'' Red-black trees, 1-2 brother trees, and labeled 2-3 finger trees are particularly convenient in an immutable context. 
    7777 
    78 === Order predicates === 
    79  
    80 An ''order predicate'' for a universe of orderable elements is a procedure {{{precedes?}}} such that, for set elements of the universe {{{x}}} and {{{y}}}, 
    81 {{{#!scheme 
    82 (precedes? x y) 
    83 }}} 
    84 returns non-{{{#false}}} iff {{{x}}} precedes {{{y}}} in the set's ordering. It returns {{{#false}}} when {{{x}}} succeeds (comes after) {{{y}}} or when the elements are tied for the same position. 
    85  
    86 It is essential that order predicates return {{{#false}}} in the case of equal elements, as implementations may test for element equality with expressions of the form 
    87 {{{#!scheme 
    88 (and (not (precedes? x y)) 
    89      (not (precedes? y x))) 
    90 }}} 
    91  
    92 Note that {{{<}}}, {{{char<?}}}, and {{{string<?}}} are valid order predicates for sets of numbers, characters, and strings, respectively. 
    93  
    94 The time bounds stated below are all based on the assumption that evaluating an order predicate takes O(1) time. 
    95  
    96 === Deduplication policy === 
    97  
    98 All elements of a set must be unique, which complicates operations that produce sets from collections that may contain duplicates. The procedures described consintely follow a ''least recently used'' removal policy. That means that, when an operation encounters two elements that are equal according to the order predicate, the one that was encountered earlier is removed from the set and replaced with the element that was encountered later. 
     78=== Set merger procedures === 
     79 
     80Some set operations involve transforming input that may include duplicate elements into a set that must not contain duplicates. These operations take a ''merger'' procedure which decides which of the duplicates to retain. A set merger procedure is a binary procedure 
     81 
     82{{{#!scheme 
     83(merger left right) 
     84}}} 
     85 
     86that returns {{{left}}} or {{{right}}}. 
     87 
     88The order in which merger procedures is called is undefined. 
     89 
     90Two trivial set merger procedures are defined: 
     91{{{#!scheme 
     92(define (set-merger-left left right) 
     93  left) 
     94(define (set-merger-right left right) 
     95  right) 
     96}}} 
    9997 
    10098=== Construction === 
    10199 
    102100{{{#!scheme 
    103 (iset precedes? [element...]) 
    104 }}} 
    105 Return a new set using the order predicate {{{precedes?}}} and containing elements {{{element...}}}. If any elements are equal according to {{{precedes?}}}, only the rightmost of them is retained. O(n log n) time. 
     101(iset/merger comparator merger [element...]) 
     102(iset comparator [element...]) 
     103}}} 
     104Return a new set using containing elements {{{element...}}}. {{{iset}}} uses {{{set-merger-right}}}. O(n log n) time. 
    106105 
    107106=== Set-wide queries === 
     
    111110(iset-empty? set) 
    112111(iset-size set) 
    113 (iset-order-predicate set) 
     112(iset-comparator set) 
    114113}}} 
    115114 
     
    119118 
    120119{{{#!scheme 
    121 (iset-union set...) 
    122 (iset-intersection set...) 
     120(iset-union set... [merger]) 
     121(iset-intersection set... [merger]) 
    123122(iset-difference left set...) 
    124123(iset-xor set1 set2) 
    125124}}} 
    126125 
    127 Return a new set containing the union/intersection/difference/symmetric difference of the arguments. All the arguments must be sets sharing an equivalent order predicate. The set operator is applied in a left-associative order. When an element is a member of multiple arguments to {{{set-union}}}, the object from the rightmost argument is retained. May take O(kn log n) time, where k is the number of sets and n is the number of elements involved, though O(kn) time is optimal. 
     126Return a new set containing the union/intersection/difference/symmetric difference of the arguments. All the arguments must be sets sharing an equivalent order predicate. The set operator is applied in a left-associative order. May take O(kn log n) time, where k is the number of sets and n is the number of elements involved, though O(kn) time is optimal. 
    128127 
    129128=== Element queries === 
     
    190189 
    191190{{{#!scheme 
    192 (iset-include set x) 
     191(iset-include set x [merger]) 
    193192(iset-exclude set x) 
    194193}}} 
    195 Return a new set that certainly does/doesn't include {{{x}}}. {{{iset-include}}} may replace an object equal to {{{x}}} already in {{{set}}}, if any. Each operation takes O(log n) time. 
     194Return a new set that certainly does/doesn't include {{{x}}}. {{{merger}}} is called when {{{set}}} already includes an element equal to {{{x}}} according to the set's comparator, with {{{left}}} as the object already in {{{set}}} and {{{right}}} as {{{x}}}. Each operation takes O(log n) time. 
    196195 
    197196{{{#!scheme 
     
    239238 
    240239{{{#!scheme 
    241 (iset-map/monotone proc set [precedes?]) 
    242 }}} 
    243 Returns a new set containing the elements {{{(proc x)}}} for every {{{x}}} in {{{set}}}. {{{proc}}} must be a ''monotone'' unary procedure that preserves the order of set elements. Observe that mapping a set of unique elements with a monotone function yields a new set of unique elements, so element uniqueness follows from the monotonicity assumption. If {{{precedes?}}} is given, it is the order predicate of the new set; otherwise the new set uses the same order predicate as {{{set}}}. Takes O(n) time. 
    244  
    245 {{{#!scheme 
    246 (iset-map proc set [precedes?]) 
    247 }}} 
    248 As {{{iset-map/monotone}}}, except the {{{proc}}} is not required to be monotone. When multiple elements are mapped to the same value, only the value originating from the greatest unmapped element is retained. O(n log n) time. 
     240(iset-map/monotone proc set [comparator]) 
     241}}} 
     242Returns a new set containing the elements {{{(proc x)}}} for every {{{x}}} in {{{set}}}. {{{proc}}} must be a ''monotone'' unary procedure that preserves the order of set elements. Observe that mapping a set of unique elements with a monotone function yields a new set of unique elements, so element uniqueness follows from the monotonicity assumption. If {{{comparator}}} is given, it is the comparator of the new set; otherwise the new set uses the same order predicate as {{{set}}}. Takes O(n) time. 
     243 
     244{{{#!scheme 
     245(iset-map proc set [merger [comparator]]) 
     246}}} 
     247As {{{iset-map/monotone}}}, except the {{{proc}}} is not required to be monotone. {{{merger}}} is used to select among any duplicates that might be created, and defaults to {{{set-merger-right}}}. O(n log n) time. 
    249248 
    250249=== Conversion === 
     
    257256 
    258257{{{#!scheme 
    259 (ordered-list->iset precedes? list) 
    260 }}} 
    261  
    262 Returns a set containing the elements of {{{list}}} and using order predicate {{{precedes?}}}. It is an error for {{{list}}} to be anything other than a proper list of elements in increasing order. O(n log n) time; O(n) is optimal. 
    263  
    264 {{{#!scheme 
    265 (list->iset precedes? list) 
    266 }}} 
    267  
    268 Returns a set containing the elements of {{{list}}} and using order predicate {{{precedes?}}}. {{{list}}} must be a proper list, but may contain duplicates and need not be in order. When a subset of input elements are equal according to {{{precedes?}}}, only the last occurrence is retained. O(n log n) time. 
    269  
    270 === Element comparison === 
    271  
    272 {{{#!scheme 
    273 (iset-element<? set x y) 
    274 (iset-element=? set x y) 
    275 (iset-element>? set x y) 
    276 }}} 
    277  
    278 These procedures compare {{{x}}} to {{{y}}} according to {{{set}}}'s order predicate and the conventional meaning of {{{<}}}, {{{=}}}, and {{{>}}}. 
     258(ordered-list->iset comparator list) 
     259}}} 
     260 
     261Returns a set containing the elements of {{{list}}} and using comparator {{{comparator}}}. It is an error for {{{list}}} to be anything other than a proper list of elements in increasing order. O(n log n) time; O(n) is optimal. 
     262 
     263{{{#!scheme 
     264(list->iset comparator list [merger]) 
     265}}} 
     266 
     267Returns a set containing the elements of {{{list}}} and using comparator {{{comparator}}}. {{{list}}} must be a proper list, but may contain duplicates and need not be in order. {{{merger}}} defaults to {{{set-merger-right}}}. O(n log n) time. 
    279268 
    280269= Maps =