This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for wiki StringTitlecaseCowan version 4

author

cowan

comment


    

ipnr

127.11.51.1

name

StringTitlecaseCowan

readonly

0

text

== Rationale ==

This SRFI explains how to implement Unicode-correct, R7RS-style `char-titlecase?`, `char-titlecase`, and `string-titlecase` procedures similar to those specified in [http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-2.html#node_sec_1.2 R6RS] and [http://srfi.schemers.org/srfi-13/srfi-13.html SRFI 13].  The algorithm does not depend on the availability of full Unicode, however, and works just as well with a purely ASCII repertoire.

Consider the string `ﬂoo bar`, which begins with the single ligature character `ﬂ` (U+FB02) rather than the two characters `fl`.  The Unicode way of titlecasing this string is to treat the ligature the same as `fl`, in which case the result is `Floo Bar`.  However, by the strict letter of R6RS, the `ﬂ` character must be passed to `char-titlecase`, which in this case will return its argument unchanged, and the result is `ﬂoo Bar`.  What is more, if the `ﬂ` character is not even seen as a casing letter, then the result will be `ﬂOo Bar`.  Different Schemes show all of these behaviors.
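The three outcomes described above can be reproduced with a small sketch. It is written in Python only because Python's string methods expose the Unicode case mappings; it is not part of the proposal. The helper names (`titlecase`, `full_title`, `simple_title`, `cased_without_ligature`) are hypothetical, and `simple_title` merely approximates a single-character titlecase mapping by discarding any multi-character result.

```python
LIGATURE_FL = "\ufb02"  # LATIN SMALL LIGATURE FL


def is_cased(ch):
    # Approximation of "cased": lowercase, uppercase, or titlecase.
    return ch.islower() or ch.isupper() or ch.istitle()


def titlecase(s, title_char, cased=is_cased):
    # Titlecase a character that follows a non-cased character (or
    # nothing at all); lowercase every other character.
    out = []
    prev_cased = False
    for ch in s:
        out.append(ch.lower() if prev_cased else title_char(ch))
        prev_cased = cased(ch)
    return "".join(out)


def full_title(ch):
    # Python applies the full (possibly multi-character) mapping.
    return ch.title()


def simple_title(ch):
    # Assumption: a character whose full mapping is longer than one
    # character has no single-character titlecase mapping.
    mapped = ch.title()
    return mapped if len(mapped) == 1 else ch


def cased_without_ligature(ch):
    # Models an implementation that does not treat the ligature as cased.
    return ch != LIGATURE_FL and is_cased(ch)


text = LIGATURE_FL + "oo bar"
unicode_result = titlecase(text, full_title)
strict_result = titlecase(text, simple_title)
uncased_result = titlecase(text, simple_title, cased_without_ligature)
```

Under these assumptions, `unicode_result` corresponds to the Unicode behavior, `strict_result` to the strict R6RS reading, and `uncased_result` to an implementation that does not see the ligature as a casing letter.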

== Specification ==

`(char-titlecase? `''char''`)`

Returns `#t` if ''char'' is a Unicode titlecase character, and `#f` otherwise.  (Not in R6RS.)
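As an illustration only, the predicate can be modeled in Python by testing for the Unicode general category `Lt` (Letter, titlecase). The name `char_titlecase_p` is a hypothetical stand-in for `char-titlecase?`.

```python
import unicodedata


def char_titlecase_p(ch: str) -> bool:
    # True only for characters whose general category is Lt.
    return unicodedata.category(ch) == "Lt"


# U+01C5 (capital D with small z with caron) is a titlecase character.
print(char_titlecase_p("\u01c5"))  # True
# An ordinary capital letter is uppercase, not titlecase.
print(char_titlecase_p("A"))       # False
```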

`(char-titlecase `''char''`)`

Returns the titlecase version of ''char'', if that character exists in the implementation.  Note that this is not necessarily a titlecase character: for most values of ''char'', it is an uppercase character.  Note that language-sensitive mappings are not used.  (The same as the R6RS equivalent.)
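The following Python sketch illustrates why the result is usually an uppercase character rather than a titlecase one. `char_titlecase` is a hypothetical stand-in for the procedure above; it approximates the single-character mapping by keeping Python's result only when it is one character long, which is an assumption of this sketch rather than something the specification states.

```python
def char_titlecase(ch: str) -> str:
    # Python's str.title() on one character yields the full mapping;
    # keep it only if it is a single character, else return ch unchanged.
    mapped = ch.title()
    return mapped if len(mapped) == 1 else ch


plain = char_titlecase("a")          # "A": an uppercase character
digraph = char_titlecase("\u01c6")   # U+01C5: a true titlecase character
ligature = char_titlecase("\ufb02")  # unchanged: no single-character mapping
```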

`(string-titlecase `''string'' [ ''start'' [ ''end'' ] ]`)`

When strict compatibility with R6RS is desired, the ''start'' and ''end'' arguments must not be used.

This procedure applies the Unicode full string lowercasing algorithm to the substring of its argument beginning with ''start'' and ending with ''end''.  However, any character preceded by a non-cased character, or by no character at all, is processed by a different algorithm.  If such a character has a multi-character titlecase mapping specified by Unicode, and all the characters of the mapping are supported by the implementation, then it is replaced by that mapping.  Otherwise, it is replaced by its single-character titlecase mapping as if by `char-titlecase`.  The result of the application of these algorithms is returned.
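The algorithm above can be sketched roughly as follows. This is an approximation in Python, not a reference implementation: `string_titlecase` and `is_cased` are hypothetical names, lowercasing is done one character at a time (so context-sensitive rules such as final sigma are ignored), and the sketch assumes every character of a multi-character mapping is supported.

```python
from typing import Optional


def is_cased(ch: str) -> bool:
    # Approximation of "cased": lowercase, uppercase, or titlecase.
    return ch.islower() or ch.isupper() or ch.istitle()


def string_titlecase(s: str, start: int = 0, end: Optional[int] = None) -> str:
    if end is None:
        end = len(s)
    out = []
    prev_cased = False  # "no character at all" counts as non-cased
    for ch in s[start:end]:
        if prev_cased:
            # Inside a run of cased characters: lowercase.
            out.append(ch.lower())
        else:
            # After a non-cased character: full titlecase mapping,
            # which may be more than one character long.
            out.append(ch.title())
        prev_cased = is_cased(ch)
    return "".join(out)


print(string_titlecase("hello wORLD"))     # Hello World
print(string_titlecase("hello world", 6))  # World
```

Only the substring between ''start'' and ''end'' is processed and returned in this sketch, matching the wording of the paragraph above.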

In certain cases, the result differs in length from the argument. If the result is equal to the argument in the sense of `string=?`, the argument may be returned. Note that language-sensitive mappings are not used.  (The R6RS version does not make use of multi-character titlecase mappings.)
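Both points can be seen in a short Python sketch. Python's built-in `str.title` is used here as a stand-in because it also applies multi-character titlecase mappings; it is not the procedure specified here, and `titlecase_or_same` is a hypothetical helper.

```python
def titlecase_or_same(s: str) -> str:
    # Return the argument itself when titlecasing changes nothing.
    result = s.title()
    return s if result == s else result


ligature_word = "\ufb02oo"  # three characters, starting with U+FB02
titled = titlecase_or_same(ligature_word)
print(len(ligature_word), len(titled))  # 3 4

already = "Floo Bar"
print(titlecase_or_same(already) is already)  # True
```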

time

2015-11-17 09:51:58

version

4