This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see StringTitlecaseCowan in WG2's repo for R7RS-large.

StringTitlecaseCowan

cowan
2015-11-13 06:38:49

Rationale

This SRFI explains how to implement a Unicode-correct string-titlecase procedure similar to those specified in R6RS and SRFI 13. The algorithm does not depend on the availability of full Unicode, however, and will work just as well with a purely ASCII repertoire.

Consider the string ﬂoo bar, which begins with the ligature ﬂ (U+FB02), a single character that combines f and l. The right way of titlecasing this string is to treat the ligature the same as the two characters fl, in which case the result is Floo Bar. However, by the strict letter of R6RS, the ﬂ character must be passed to char-titlecase, which in this case will return its argument unchanged, and the result is ﬂoo Bar. What is more, if the ﬂ character is not even seen as a cased letter, then the result will be ﬂOo Bar.
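The two failure modes described above can be reproduced with a small sketch. It is written in Python rather than Scheme purely for illustration, and it is not taken from R6RS or from any implementation. The names char_titlecase, naive_titlecase, cased and cased_ascii_only are hypothetical, and char_titlecase only approximates a single-character mapping by rejecting any mapping longer than one character.

```python
LIGATURE_STRING = "\ufb02oo bar"  # starts with U+FB02 LATIN SMALL LIGATURE FL

def char_titlecase(ch):
    # Hypothetical stand-in for a single-character char-titlecase:
    # accept Python's mapping only if it is exactly one character long,
    # otherwise return the argument unchanged.
    mapped = ch.title()
    return mapped if len(mapped) == 1 else ch

def naive_titlecase(s, is_cased):
    # Titlecase a character that follows a non-cased character (or nothing),
    # lowercase every other character, using single-character mappings only.
    out = []
    prev_cased = False
    for ch in s:
        out.append(ch.lower() if prev_cased else char_titlecase(ch))
        prev_cased = is_cased(ch)
    return "".join(out)

def cased(ch):
    # Treats the ligature as a cased (lowercase) letter.
    return ch.isupper() or ch.islower()

def cased_ascii_only(ch):
    # Assumption: an implementation that only recognizes ASCII letters as cased.
    return ord(ch) < 128 and ch.isalpha()

first = naive_titlecase(LIGATURE_STRING, cased)              # ligature kept, "Bar" capitalized
second = naive_titlecase(LIGATURE_STRING, cased_ascii_only)  # the "o" after the ligature is capitalized too
```

In the first call the ligature survives unchanged at the start of the word. In the second, the ligature is not recognized as cased, so the following o is treated as the start of a word.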

Specification

(string-titlecase string [ start [ end ] ])

When strict compatibility with R6RS is desired, the start and end arguments must not be supported.

This procedure applies the Unicode full string lowercasing algorithm to the substring of its argument beginning with start and ending with end. However, any character preceded by a non-cased character, or by no character at all, is processed by a different algorithm. If it has a multi-character titlecase mapping, it is replaced by that mapping. Otherwise, it is replaced by its single-character titlecase mapping. Note that with four exceptions the single-character titlecase mapping is the same as the uppercase mapping. The result of the application of these algorithms is returned.
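The procedure described above might be sketched as follows. This is an illustration in Python, not a reference implementation, and it rests on several assumptions: is_cased is a hypothetical helper that approximates the Unicode Cased property, Python's str.title() and str.lower() on a one-character string are used as the titlecase and lowercase mappings, lowercasing is done one character at a time (so context-sensitive rules such as final sigma are not applied), and only the processed substring is returned, which is one possible reading of the text.

```python
import unicodedata

def is_cased(ch):
    # Approximation of the Unicode "Cased" property (an assumption of this sketch):
    # uppercase, lowercase, or titlecase (general category Lt).
    return ch.isupper() or ch.islower() or unicodedata.category(ch) == "Lt"

def string_titlecase(s, start=0, end=None):
    # Optional start/end select the substring to process, as in the signature above.
    if end is None:
        end = len(s)
    out = []
    prev_cased = False  # "no character at all" counts as non-cased
    for ch in s[start:end]:
        if prev_cased:
            # Inside a word: lowercase mapping.
            out.append(ch.lower())
        else:
            # Start of a word: titlecase mapping, which may be several
            # characters long (for example a ligature expanding to two letters).
            out.append(ch.title())
        prev_cased = is_cased(ch)
    return "".join(out)

print(string_titlecase("hELLO wORLD"))       # Hello World
print(string_titlecase("\ufb02oo bar"))      # Floo Bar
print(string_titlecase("hello world", 6))    # World
```

In the second call the input has seven characters and the result has eight, because the ligature U+FB02 is replaced by its two-character titlecase mapping. The digraph character U+01C6 is one case where the titlecase mapping (U+01C5) is not the same as the uppercase mapping (U+01C4).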

In certain cases, the result differs in length from the argument. If the result is equal to the argument in the sense of string=?, the argument may be returned. Note that language-sensitive mappings are not used.