Changes between Initial Version and Version 1 of BinaryData


Timestamp:
04/29/10 09:39:51
Author:
alexshinn

  • BinaryData

Most Scheme implementations provide one or more
ways to represent blocks of data which are fundamentally
binary in nature - they are opaque sequences of 8-bit
bytes whose structure is to be interpreted at the
application level.  This is particularly important
for efficient I/O, and for other host system interfaces
such as pathnames which may not be valid strings.

SRFI-4 provides multiple uniform vector data-types, of
which the u8vector often gets special treatment as a
general container of binary data.  R6RS provides only
a bytevector data-type, similar to the u8vector, with
an API that allows accessing other machine numeric types
from any offset.  We need to decide what, if any, binary
data type we will provide in WG1, including any read/write
representations.

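To make the R6RS approach concrete, here is a minimal
sketch, assuming an R6RS implementation that provides the
(rnrs) library.  It reads the same bytes first as a single
byte and then as a 16-bit integer at an odd offset.  The
variable names are chosen for this illustration only.

```scheme
(import (rnrs))

;; The seven bytes of the running example "ABC", NUL, "DEF".
(define bv (u8-list->bytevector '(65 66 67 0 68 69 70)))

;; Byte-level access, comparable to SRFI-4's u8vector-ref.
(define first-byte (bytevector-u8-ref bv 0))

;; 16-bit unsigned reads starting at offset 1 (bytes 66 and 67).
;; The endianness is passed explicitly.
(define u16-big    (bytevector-u16-ref bv 1 (endianness big)))
(define u16-little (bytevector-u16-ref bv 1 (endianness little)))

(display first-byte) (newline)   ; 65
(display u16-big)    (newline)   ; 66*256 + 67 = 16963
(display u16-little) (newline)   ; 66 + 67*256 = 17218
```

Offset 1 is deliberately not a multiple of 2: the accessors
that take an explicit endianness do not require alignment.
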
If you view the bytes as primarily textual (as in the
pathname case), then it makes sense to provide an
external representation which allows ASCII.  PLT, for
instance, provides

  #"ABC\0DEF"

where the \0 indicates a NUL byte.

On the other hand, if you view the bytes as primarily
binary, then it makes sense to encode each of the bytes
as an integer, so the above example becomes

  #vu8(65 66 67 0 68 69 70)

Erlang's binary syntax allows mixing both: numbers are
taken as individual bytes and ASCII strings are flattened
into their bytes.  Adopting that style, the same example
becomes

  #vu8("ABC" 0 "DEF")

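The flattening rule can be sketched as an ordinary
procedure.  This is an illustration only, assuming an R6RS
implementation with the (rnrs) library; flatten-bytes is a
hypothetical helper, not part of R6RS, SRFI-4 or any
proposal, and it assumes that strings contain only ASCII
characters.

```scheme
(import (rnrs))

;; Hypothetical helper: flattens a list whose elements are either
;; byte values (exact integers 0..255) or ASCII strings into a
;; single bytevector.
(define (flatten-bytes parts)
  (u8-list->bytevector
   (apply append
          (map (lambda (part)
                 (cond ((string? part)
                        ;; Each ASCII character contributes one byte.
                        (map char->integer (string->list part)))
                       ((and (integer? part) (exact? part) (<= 0 part 255))
                        (list part))
                       (else
                        (assertion-violation 'flatten-bytes
                                             "expected a byte or a string"
                                             part))))
               parts))))

(define mixed (flatten-bytes '("ABC" 0 "DEF")))
(define plain (u8-list->bytevector '(65 66 67 0 68 69 70)))

;; The mixed form denotes the same seven bytes as the integer form.
(display (bytevector=? mixed plain)) (newline)   ; #t
```

A reader supporting the mixed syntax would apply the same
rule at read time rather than through a procedure call.
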
#vu8 is the R6RS syntax; SRFI-4 uses #u8.  The former
has the advantage that it takes up only one letter (v)
after the #, leaving room for future syntax extensions,
whereas SRFI-4 uses #u, #s and #f.  The latter has the
advantage that it's more widely implemented.