--- CBOR-XS/README	2013/10/29 15:56:31	1.7
+++ CBOR-XS/README	2016/02/08 04:37:12	1.16
@@ -23,26 +23,31 @@
      }
 
 DESCRIPTION
-    WARNING! This module is very new, and not very well tested (that's up to
-    you to do). Furthermore, details of the implementation might change
-    freely before version 1.0. And lastly, the object serialisation protocol
-    depends on a pending IANA assignment, and until that assignment is
-    official, this implementation is not interoperable with other
-    implementations (even future versions of this module) until the
-    assignment is done.
-
-    You are still invited to try out CBOR, and this module.
-
     This module converts Perl data structures to the Concise Binary Object
     Representation (CBOR) and vice versa. CBOR is a fast binary
-    serialisation format that aims to use a superset of the JSON data model,
-    i.e. when you can represent something in JSON, you should be able to
-    represent it in CBOR.
+    serialisation format that aims to use an (almost) superset of the JSON
+    data model, i.e. when you can represent something useful in JSON, you
+    should be able to represent it in CBOR.
 
-    In short, CBOR is a faster and very compact binary alternative to JSON,
+    In short, CBOR is a faster and quite compact binary alternative to JSON,
     with the added ability of supporting serialisation of Perl objects.
     (JSON often compresses better than CBOR though, so if you plan to
-    compress the data later you might want to compare both formats first).
+    compress the data later and speed is less important, you might want to
+    compare both formats first).
+
+    To give you a general idea about speed, with texts in the megabyte
+    range, "CBOR::XS" usually encodes roughly twice as fast as Storable or
+    JSON::XS and decodes about 15%-30% faster than those. The shorter the
+    data, the worse Storable performs in comparison.
+
+    Regarding compactness, "CBOR::XS"-encoded data structures are usually
+    about 20% smaller than the same data encoded as (compact) JSON or
+    Storable.
+
+    In addition to the core CBOR data format, this module implements a
+    number of extensions, to support cyclic and shared data structures (see
+    "allow_sharing" and "allow_cycles"), string deduplication (see
+    "pack_strings") and scalar references (always enabled).
 
     The primary goal of this module is to be *correct* and the secondary
     goal is to be *fast*. To reach the latter goal it was written in C.
@@ -74,7 +79,7 @@
         The mutators for flags all return the CBOR object again and thus
         calls can be chained:
 
-        #TODO my $cbor = CBOR::XS->new->encode ({a => [1,2]});
+           my $cbor = CBOR::XS->new->encode ({a => [1,2]});
 
     $cbor = $cbor->max_depth ([$maximum_nesting_depth])
     $max_depth = $cbor->get_max_depth
@@ -115,6 +120,159 @@
         See SECURITY CONSIDERATIONS, below, for more info on why this is
         useful.
 
+    $cbor = $cbor->allow_unknown ([$enable])
+    $enabled = $cbor->get_allow_unknown
+        If $enable is true (or missing), then "encode" will *not* throw an
+        exception when it encounters values it cannot represent in CBOR (for
+        example, filehandles) but instead will encode a CBOR "error" value.
+
+        If $enable is false (the default), then "encode" will throw an
+        exception when it encounters anything it cannot encode as CBOR.
+
+        This option does not affect "decode" in any way, and it is
+        recommended to leave it off unless you know your communications
+        partner.
+
+    $cbor = $cbor->allow_sharing ([$enable])
+    $enabled = $cbor->get_allow_sharing
+        If $enable is true (or missing), then "encode" will not
+        double-encode values that have been referenced before (e.g. when the
+        same object, such as an array, is referenced multiple times), but
+        instead will emit a reference to the earlier value.
+
+        This means that such values will only be encoded once, and will not
+        result in a deep cloning of the value on decode, in decoders
+        supporting the value sharing extension. This also makes it possible
+        to encode cyclic data structures (which need "allow_cycles" to be
+        enabled to be decoded by this module).
+
+        It is recommended to leave it off unless you know your communication
+        partner supports the value sharing extensions to CBOR
+        (<http://cbor.schmorp.de/value-sharing>), as without decoder
+        support, the resulting data structure might be unusable.
+
+        Detecting shared values incurs a runtime overhead when values are
+        encoded that have a reference counter larger than one, and might
+        unnecessarily increase the encoded size, as potentially shared
+        values are encoded as shareable whether or not they are actually
+        shared.
+
+        At the moment, only targets of references can be shared (e.g.
+        scalars, arrays or hashes pointed to by a reference). Weirder
+        constructs, such as an array with multiple "copies" of the *same*
+        string, which are hard but not impossible to create in Perl, are not
+        supported (this is the same as with Storable).
+
+        If $enable is false (the default), then "encode" will encode shared
+        data structures repeatedly, unsharing them in the process. Cyclic
+        data structures cannot be encoded in this mode.
+
+        This option does not affect "decode" in any way - shared values and
+        references will always be decoded properly if present.
+
+    $cbor = $cbor->allow_cycles ([$enable])
+    $enabled = $cbor->get_allow_cycles
+        If $enable is true (or missing), then "decode" will happily decode
+        self-referential (cyclic) data structures. By default these will not
+        be decoded, as they need manual cleanup to avoid memory leaks, so
+        code that isn't prepared for this will not leak memory.
+
+        If $enable is false (the default), then "decode" will throw an error
+        when it encounters a self-referential/cyclic data structure.
+
+        FUTURE DIRECTION: the motivation behind this option is to avoid
+        *real* cycles - future versions of this module might choose to decode
+        cyclic data structures using weak references when this option is
+        off, instead of throwing an error.
+
+        This option does not affect "encode" in any way - shared values and
+        references will always be encoded properly if present.
+
+    $cbor = $cbor->pack_strings ([$enable])
+    $enabled = $cbor->get_pack_strings
+        If $enable is true (or missing), then "encode" will try not to
+        encode the same string twice, but will instead encode a reference to
+        the string. Depending on your data format, this can save a
+        lot of space, but also results in a very large runtime overhead
+        (expect encoding times to be 2-4 times as high as without).
+
+        It is recommended to leave it off unless you know your
+        communications partner supports the stringref extension to CBOR
+        (<http://cbor.schmorp.de/stringref>), as without decoder support,
+        the resulting data structure might not be usable.
+
+        If $enable is false (the default), then "encode" will encode strings
+        the standard CBOR way.
+
+        This option does not affect "decode" in any way - string references
+        will always be decoded properly if present.
+
+    $cbor = $cbor->validate_utf8 ([$enable])
+    $enabled = $cbor->get_validate_utf8
+        If $enable is true (or missing), then "decode" will validate that
+        elements (text strings) containing UTF-8 data in fact contain valid
+        UTF-8 data (instead of blindly accepting it). This validation
+        obviously takes extra time during decoding.
+
+        The concept of "valid UTF-8" used is perl's concept, which is a
+        superset of the official UTF-8.
+
+        If $enable is false (the default), then "decode" will blindly accept
+        UTF-8 data, marking them as valid UTF-8 in the resulting data
+        structure regardless of whether that's true or not.
+
+        Perl isn't too happy about corrupted UTF-8 in strings, but should
+        generally not crash or do similarly evil things. Extensions might not
+        be so forgiving, so it's recommended to turn on this setting if you
+        receive untrusted CBOR.
+
+        This option does not affect "encode" in any way - strings that are
+        supposedly valid UTF-8 will simply be dumped into the resulting CBOR
+        string without checking whether that is, in fact, true or not.
+
+    $cbor = $cbor->filter ([$cb->($tag, $value)])
+    $cb_or_undef = $cbor->get_filter
+        Sets or replaces the tagged value decoding filter (when $cb is
+        specified) or clears the filter (if no argument or "undef" is
+        provided).
+
+        The filter callback is called only during decoding, when a
+        non-enforced tagged value has been decoded (see "TAG HANDLING AND
+        EXTENSIONS" for a list of enforced tags). For specific tags, it's
+        often better to provide a default converter using the
+        %CBOR::XS::FILTER hash (see below).
+
+        The first argument is the numerical tag, the second is the (decoded)
+        value that has been tagged.
+
+        The filter function should return either exactly one value, which
+        will replace the tagged value in the decoded data structure, or no
+        values, which will result in default handling, which currently means
+        the decoder creates a "CBOR::XS::Tagged" object to hold the tag and
+        the value.
+
+        When the filter is cleared (the default state), the default filter
+        function, "CBOR::XS::default_filter", is used. This function simply
+        looks up the tag in the %CBOR::XS::FILTER hash. If an entry exists
+        it must be a code reference that is called with tag and value, and
+        is responsible for decoding the value. If no entry exists, it
+        returns no values.
+
+        Example: decode all tags not handled internally into
+        "CBOR::XS::Tagged" objects, with no other special handling (useful
+        when working with potentially "unsafe" CBOR data).
+
+           CBOR::XS->new->filter (sub { })->decode ($cbor_data);
+
+        Example: provide a global filter for tag 1347375694, converting the
+        value into some string form.
+
+           $CBOR::XS::FILTER{1347375694} = sub {
+              my ($tag, $value) = @_;
+
+              "tag 1347375694 value $value"
+           };
+
     $cbor_data = $cbor->encode ($perl_scalar)
         Converts the given Perl data structure (a scalar value) to its CBOR
         representation.
@@ -136,6 +294,64 @@
            CBOR::XS->new->decode_prefix ("......")
            => ("...", 3)
 
+  INCREMENTAL PARSING
+    In some cases, there is the need for incremental parsing of CBOR data.
+    While this module always has to keep both CBOR data and resulting Perl
+    data structure in memory at one time, it does allow you to parse a CBOR
+    stream incrementally. This is similar to using "decode_prefix" to see if
+    a full CBOR object is available, but is much more efficient.
+
+    It basically works by parsing as much of a CBOR string as possible - if
+    the CBOR data is not complete yet, the parser will remember where it
+    was, to be able to restart when more data has been accumulated. Once
+    enough data is available to either decode a complete CBOR value or raise
+    an error, a real decode will be attempted.
+
+    A typical use case would be a network protocol that consists of sending
+    and receiving CBOR-encoded messages. The solution that works with CBOR
+    and about anything else is by prepending a length to every CBOR value,
+    so the receiver knows how many octets to read. More compact (and
+    slightly slower) would be to just send CBOR values back-to-back, as
+    "CBOR::XS" knows where a CBOR value ends, and doesn't need an explicit
+    length.
+
+    The following methods help with this:
+
+    @decoded = $cbor->incr_parse ($buffer)
+        This method attempts to decode exactly one CBOR value from the
+        beginning of the given $buffer. The value is removed from the
+        $buffer on success. When $buffer doesn't contain a complete value
+        yet, it returns nothing. Finally, when the $buffer doesn't start
+        with something that could ever be a valid CBOR value, it raises an
+        exception, just as "decode" would. In the latter case the decoder
+        state is undefined and must be reset before being able to parse
+        further.
+
+        This method modifies the $buffer in place. When no CBOR value can be
+        decoded, the decoder stores the current string offset. On the next
+        call, it continues decoding at the place where it stopped before. For
+        this to make sense, the $buffer must begin with the same octets as
+        on previous unsuccessful calls.
+
+        You can call this method in scalar context, in which case it either
+        returns a decoded value or "undef". This makes it impossible to
+        distinguish between CBOR null values (which decode to "undef") and
+        an unsuccessful decode, which is often acceptable.
+
+    @decoded = $cbor->incr_parse_multiple ($buffer)
+        Same as "incr_parse", but attempts to decode as many CBOR values as
+        possible in one go, instead of at most one. Calls to "incr_parse"
+        and "incr_parse_multiple" can be interleaved.
+
+    $cbor->incr_reset
+        Resets the incremental decoder. This throws away any saved state, so
+        that subsequent calls to "incr_parse" or "incr_parse_multiple" start
+        to parse a new CBOR value from the beginning of the $buffer again.
+
+        This method can be called at any time, but it *must* be called if you
+        want to change your $buffer or there was a decoding error and you
+        want to reuse the $cbor object for future incremental parsings.
+
 MAPPING
     This section describes how CBOR::XS maps Perl values to CBOR values and
     vice versa. These mappings are designed to "do the right thing" in most
@@ -152,7 +368,7 @@
         support, 64 bit integers will be truncated or otherwise corrupted.
 
     byte strings
-        Byte strings will become octet strings in Perl (the byte values
+        Byte strings will become octet strings in Perl (the byte values
         0..255 will simply become characters of the same value in Perl).
 
     UTF-8 strings
@@ -176,23 +392,11 @@
         numbers 1 and 0 (for true and false) or to throw an exception on
         access (for error). See the Types::Serialiser manpage for details.
 
-    CBOR tag 256 (perl object)
-        The tag value 256 (TODO: pending iana registration) will be used to
-        deserialise a Perl object serialised with "FREEZE". See OBJECT
-        SERIALISATION, below, for details.
-
-    CBOR tag 55799 (magic header)
-        The tag 55799 is ignored (this tag implements the magic header).
-
-    other CBOR tags
-        Tagged items consists of a numeric tag and another CBOR value. Tags
-        not handled internally are currently converted into a
-        CBOR::XS::Tagged object, which is simply a blessed array reference
-        consisting of the numeric tag value followed by the (decoded) CBOR
-        value.
+    tagged values
+        Tagged items consist of a numeric tag and another CBOR value.
 
-        In the future, support for user-supplied conversions might get
-        added.
+        See "TAG HANDLING AND EXTENSIONS" and the description of "->filter"
+        for details on which tags are handled how.
 
     anything else
         Anything else (e.g. unsupported simple values) will raise a decoding
@@ -200,13 +404,14 @@
 
   PERL -> CBOR
     The mapping from Perl to CBOR is slightly more difficult, as Perl is a
-    truly typeless language, so we can only guess which CBOR type is meant
-    by a Perl value.
+    typeless language. That means this module can only guess which CBOR type
+    is meant by a perl value.
 
     hash references
         Perl hash references become CBOR maps. As there is no inherent
         ordering in hash keys (or CBOR maps), they will usually be encoded
-        in a pseudo-random order.
+        in a pseudo-random order. This order can be different each time a
+        hash is encoded.
 
         Currently, tied hashes will use the indefinite-length format, while
         normal hashes will use the fixed-length format.
@@ -215,14 +420,17 @@
         Perl array references become fixed-length CBOR arrays.
 
     other references
-        Other unblessed references are generally not allowed and will cause
-        an exception to be thrown, except for references to the integers 0
-        and 1, which get turned into false and true in CBOR.
+        Other unblessed references will be represented using the indirection
+        tag extension (tag value 22098,
+        <http://cbor.schmorp.de/indirection>). CBOR decoders are guaranteed
+        to be able to decode these values somehow, by either "doing the
+        right thing", decoding into a generic tagged object, simply ignoring
+        the tag, or something else.
 
     CBOR::XS::Tagged objects
         Objects of this type must be arrays consisting of a single "[tag,
         value]" pair. The (numerical) tag will be encoded as a CBOR tag, the
-        value will be encoded as appropriate for the value. You cna use
+        value will be encoded as appropriate for the value. You must use
         "CBOR::XS::tag" to create such objects.
 
     Types::Serialiser::true, Types::Serialiser::false,
@@ -233,11 +441,12 @@
 
     other blessed objects
         Other blessed objects are serialised via "TO_CBOR" or "FREEZE". See
-        "OBJECT SERIALISATION", below, for details.
+        "TAG HANDLING AND EXTENSIONS" for specific classes handled by this
+        module, and "OBJECT SERIALISATION" for generic object serialisation.
 
     simple scalars
-        TODO Simple Perl scalars (any scalar that is not a reference) are
-        the most difficult objects to encode: CBOR::XS will encode undefined
+        Simple Perl scalars (any scalar that is not a reference) are the
+        most difficult objects to encode: CBOR::XS will encode undefined
         scalars as CBOR null values, scalars that have last been used in a
         string context before encoding as CBOR strings, and anything else as
         number value:
@@ -247,7 +456,7 @@
            encode_cbor [-3.0e17]                # yields [-3e+17]
            my $value = 5; encode_cbor [$value]  # yields [5]
 
-           # used as string, so dump as string
+           # used as string, so dump as string (either byte or text)
            print $value;
            encode_cbor [$value]                 # yields ["5"]
 
@@ -261,6 +470,16 @@
            $x .= "";    # another, more awkward way to stringify
            print $x;    # perl does it for you, too, quite often
 
+        You can force whether a string is encoded as byte or text string by
+        using "utf8::upgrade" and "utf8::downgrade":
+
+          utf8::upgrade $x;   # encode $x as text string
+          utf8::downgrade $x; # encode $x as byte string
+
+        Perl doesn't define what operations up- and downgrade strings, so if
+        the difference between byte and text is important, you should up- or
+        downgrade your string as late as possible before encoding.
+
         You can force the type to be a CBOR number by numifying it:
 
            my $x = "3"; # some variable containing a string
@@ -279,10 +498,15 @@
         might suffer loss of precision.
 
   OBJECT SERIALISATION
+    This module implements both a CBOR-specific and the generic
+    Types::Serialiser object serialisation protocol. The following
+    subsections explain both methods.
+
+   ENCODING
     This module knows two way to serialise a Perl object: The CBOR-specific
     way, and the generic way.
 
-    Whenever the encoder encounters a Perl object that it cnanot serialise
+    Whenever the encoder encounters a Perl object that it cannot serialise
     directly (most of them), it will first look up the "TO_CBOR" method on
     it.
 
@@ -297,12 +521,17 @@
     The "FREEZE" method can return any number of values (i.e. zero or more).
     These will be encoded as CBOR perl object, together with the classname.
 
+    These methods *MUST NOT* change the data structure that is being
+    serialised. Failure to comply with this can result in memory corruption -
+    and worse.
+
     If an object supports neither "TO_CBOR" nor "FREEZE", encoding will fail
     with an error.
 
-    Objects encoded via "TO_CBOR" cannot be automatically decoded, but
-    objects encoded via "FREEZE" can be decoded using the following
-    protocol:
+   DECODING
+    Objects encoded via "TO_CBOR" cannot (normally) be automatically
+    decoded, but objects encoded via "FREEZE" can be decoded using the
+    following protocol:
 
     When an encoded CBOR perl object is encountered by the decoder, it will
     look up the "THAW" method, by using the stored classname, and will fail
@@ -333,7 +562,7 @@
          my ($self) = @_;
          my $uri = "$self"; # stringify uri
          utf8::upgrade $uri; # make sure it will be encoded as UTF-8 string
-         CBOR::XS::tagged 32, "$_[0]"
+         CBOR::XS::tag 32, $uri
       }
 
     This will encode URIs as a UTF-8 string with tag 32, which indicates an
@@ -378,10 +607,10 @@
     There is no way to distinguish CBOR from other formats programmatically.
     To make it easier to distinguish CBOR from other formats, the CBOR
     specification has a special "magic string" that can be prepended to any
-    CBOR string without changing it's meaning.
+    CBOR string without changing its meaning.
 
     This string is available as $CBOR::XS::MAGIC. This module does not
-    prepend this string tot he CBOR data it generates, but it will ignroe it
+    prepend this string to the CBOR data it generates, but it will ignore it
     if present, so users can prepend this string as a "file type" indicator
     as required.
 
@@ -443,6 +672,109 @@
           CBOR::XS::tag 24,
              encode_cbor [1, 2, 3];
 
+TAG HANDLING AND EXTENSIONS
+    This section describes how this module handles specific tagged values
+    and extensions. If a tag is not mentioned here and no additional filters
+    are provided for it, then the default handling applies (creating a
+    CBOR::XS::Tagged object on decoding, and only encoding the tag when
+    explicitly requested).
+
+    Tags not handled specifically are currently converted into a
+    CBOR::XS::Tagged object, which is simply a blessed array reference
+    consisting of the numeric tag value followed by the (decoded) CBOR
+    value.
+
+    Future versions of this module reserve the right to special case
+    additional tags (such as base64url).
+
+  ENFORCED TAGS
+    These tags are always handled when decoding, and their handling cannot
+    be overridden by the user.
+
+    26 (perl-object, <http://cbor.schmorp.de/perl-object>)
+        This tag is automatically created (and decoded) for serialisable
+        objects using the "FREEZE/THAW" methods (the Types::Serialiser object
+        serialisation protocol). See "OBJECT SERIALISATION" for details.
+
+    28, 29 (shareable, sharedref, <http://cbor.schmorp.de/value-sharing>)
+        These tags are automatically decoded when encountered (and they do
+        not result in a cyclic data structure, see "allow_cycles"),
+        resulting in shared values in the decoded object. They are only
+        encoded, however, when "allow_sharing" is enabled.
+
+        Not all shared values can be successfully decoded: values that
+        reference themselves will *currently* decode as "undef" (this is not
+        the same as a reference pointing to itself, which will be
+        represented as a value that contains an indirect reference to itself
+        - these will be decoded properly).
+
+        Note that considerably more shared value data structures can be
+        decoded than will be encoded - currently, only values pointed to by
+        references will be shared, others will not. While non-reference
+        shared values can be generated in Perl with some effort, they were
+        considered too unimportant to be supported in the encoder. The
+        decoder, however, will decode these values as shared values.
+
+    256, 25 (stringref-namespace, stringref,
+    <http://cbor.schmorp.de/stringref>)
+        These tags are automatically decoded when encountered. They are only
+        encoded, however, when "pack_strings" is enabled.
+
+    22098 (indirection, <http://cbor.schmorp.de/indirection>)
+        This tag is automatically generated when a reference is encountered
+        (with the exception of hash and array references). It is converted to
+        a reference when decoding.
+
+    55799 (self-describe CBOR, RFC 7049)
+        This value is not generated on encoding (unless explicitly requested
+        by the user), and is simply ignored when decoding.
+
+  NON-ENFORCED TAGS
+    These tags have default filters provided when decoding. Their handling
+    can be overridden by changing the %CBOR::XS::FILTER entry for the tag, or
+    by providing a custom "filter" callback when decoding.
+
+    When they result in decoding into a specific Perl class, the module
+    usually provides a corresponding "TO_CBOR" method as well.
+
+    When any of these need to load additional modules that are not part of
+    the perl core distribution (e.g. URI), it is (currently) up to the user
+    to provide these modules. The decoding usually fails with an exception
+    if the required module cannot be loaded.
+
+    0, 1 (date/time string, seconds since the epoch)
+        These tags are decoded into Time::Piece objects. The corresponding
+        "Time::Piece::TO_CBOR" method always encodes into tag 1 values
+        currently.
+
+        The Time::Piece API is generally surprisingly bad, and fractional
+        seconds are only accidentally kept intact, so watch out. On the plus
+        side, the module comes with perl since 5.10, which has to count for
+        something.
+
+    2, 3 (positive/negative bignum)
+        These tags are decoded into Math::BigInt objects. The corresponding
+        "Math::BigInt::TO_CBOR" method encodes "small" bigints into normal
+        CBOR integers, and others into positive/negative CBOR bignums.
+
+    4, 5 (decimal fraction/bigfloat)
+        Both decimal fractions and bigfloats are decoded into Math::BigFloat
+        objects. The corresponding "Math::BigFloat::TO_CBOR" method *always*
+        encodes into a decimal fraction.
+
+        CBOR cannot represent bigfloats with *very* large exponents -
+        conversion of such big float objects is undefined.
+
+        Also, NaN and infinities are not encoded properly.
+
+    21, 22, 23 (expected later JSON conversion)
+        CBOR::XS is not a CBOR-to-JSON converter, and will simply ignore
+        these tags.
+
+    32 (URI)
+        These objects decode into URI objects. The corresponding
+        "URI::TO_CBOR" method again results in a CBOR URI value.
+
 CBOR and JSON
     CBOR is supposed to implement a superset of the JSON data model, and is,
     with some coercion, able to represent all JSON texts (something that
@@ -507,6 +839,14 @@
 
     Strict mode and canonical mode are not implemented.
 
+LIMITATIONS ON PERLS WITHOUT 64-BIT INTEGER SUPPORT
+    On perls that were built without 64 bit integer support (these are rare
+    nowadays, even on 32 bit architectures, as all major Perl distributions
+    are built with 64 bit integer support), support for any kind of 64 bit
+    integer in CBOR is very limited - most likely, these 64 bit values will
+    be truncated, corrupted, or otherwise not decoded correctly. This also
+    includes string, array and map sizes that are stored as 64 bit integers.
+
 THREADS
     This module is *not* guaranteed to be thread safe and there are no plans
     to change this until Perl gets thread support (as opposed to the