--- JSON-XS/README	2007/03/24 19:42:14	1.6
+++ JSON-XS/README	2007/07/26 11:33:35	1.16
@@ -4,12 +4,13 @@
 SYNOPSIS
      use JSON::XS;
 
-     # exported functions, croak on error
+     # exported functions, they croak on error
+     # and expect/generate UTF-8
 
      $utf8_encoded_json_text = to_json $perl_hash_or_arrayref;
      $perl_hash_or_arrayref  = from_json $utf8_encoded_json_text;
 
-     # oo-interface
+     # OO-interface
 
      $coder = JSON::XS->new->ascii->pretty->allow_nonref;
      $pretty_printed_unencoded = $coder->encode ($perl_scalar);
@@ -32,14 +33,15 @@
     vice versa.
 
   FEATURES
-    * correct handling of unicode issues
+    * correct unicode handling
         This module knows how to handle Unicode, and even documents how and
         when it does so.
 
     * round-trip integrity
         When you serialise a perl data structure using only datatypes
         supported by JSON, the deserialised data structure is identical on
-        the Perl level. (e.g. the string "2.0" doesn't suddenly become "2").
+        the Perl level. (e.g. the string "2.0" doesn't suddenly become "2"
+        just because it looks like a number).
 
     * strict checking of JSON correctness
         There is no guessing, no generating of illegal JSON texts by
@@ -57,9 +59,10 @@
     * reasonably versatile output formats
         You can choose between the most compact guarenteed single-line
         format possible (nice for simple line-based protocols), a pure-ascii
-        format (for when your transport is not 8-bit clean), or a
-        pretty-printed format (for when you want to read that stuff). Or you
-        can combine those features in whatever way you like.
+        format (for when your transport is not 8-bit clean; it still
+        supports the whole Unicode range), or a pretty-printed format (for
+        when you want to read that stuff). Or you can combine those features
+        in whatever way you like.
 
 FUNCTIONAL INTERFACE
     The following convinience methods are provided by this module. They are
@@ -87,6 +90,15 @@
 
         except being faster.
 
+    $is_boolean = JSON::XS::is_bool $scalar
+        Returns true if the passed scalar represents either JSON::XS::true
+        or JSON::XS::false, two constants that act like 1 and 0,
+        respectively, and are used to represent JSON "true" and "false"
+        values in Perl.
+
+        See MAPPING, below, for more information on how JSON values are
+        mapped to Perl.
+
 OBJECT-ORIENTED INTERFACE
     The object oriented interface lets you configure your own encoding or
     decoding style, within the limits of supported formats.
@@ -107,15 +119,47 @@
         generate characters outside the code range 0..127 (which is ASCII).
         Any unicode characters outside that range will be escaped using
         either a single \uXXXX (BMP characters) or a double \uHHHH\uLLLLL
-        escape sequence, as per RFC4627.
+        escape sequence, as per RFC4627. The resulting encoded JSON text can
+        be treated as a native Unicode string, an ASCII-encoded,
+        latin1-encoded or UTF-8 encoded string, or a string in any other
+        encoding that is a superset of ASCII.
 
         If $enable is false, then the "encode" method will not escape
-        Unicode characters unless required by the JSON syntax. This results
-        in a faster and more compact format.
+        Unicode characters unless required by the JSON syntax or other
+        flags. This results in a faster and more compact format.
+
+        The main use for this flag is to produce JSON texts that can be
+        transmitted over a 7-bit channel, as the encoded JSON texts will not
+        contain any 8-bit characters.
 
           JSON::XS->new->ascii (1)->encode ([chr 0x10401])
           => ["\ud801\udc01"]
 
+    $json = $json->latin1 ([$enable])
+        If $enable is true (or missing), then the "encode" method will
+        encode the resulting JSON text as latin1 (or iso-8859-1), escaping
+        any characters outside the code range 0..255. The resulting string
+        can be treated as a latin1-encoded JSON text or a native unicode
+        string. The "decode" method will not be affected in any way by this
+        flag, as "decode" by default expects unicode, which is a strict
+        superset of latin1.
+
+        If $enable is false, then the "encode" method will not escape
+        Unicode characters unless required by the JSON syntax or other
+        flags.
+
+        The main use for this flag is efficiently encoding binary data as
+        JSON text, as most octets will not be escaped, resulting in a
+        smaller encoded size. The disadvantage is that the resulting JSON
+        text is encoded in latin1 (and must correctly be treated as such
+        when storing and transferring), a rare encoding for JSON. It is
+        therefore most useful when you want to store data structures known
+        to contain binary data efficiently in files or databases, not when
+        talking to other JSON encoders/decoders.
+
+          JSON::XS->new->latin1->encode (["\x{89}\x{abc}"])
+          => ["\x{89}\\u0abc"]    # (perl syntax, U+abc escaped, U+89 not)
+
     $json = $json->utf8 ([$enable])
         If $enable is true (or missing), then the "encode" method will
         encode the JSON result into UTF-8, as required by many protocols,
@@ -232,6 +276,116 @@
            JSON::XS->new->allow_nonref->encode ("Hello, World!")
            => "Hello, World!"
 
+    $json = $json->allow_blessed ([$enable])
+        If $enable is true (or missing), then the "encode" method will not
+        barf when it encounters a blessed reference. Instead, the value of
+        the "convert_blessed" option will decide whether "null"
+        ("convert_blessed" disabled or no "TO_JSON" method found) or a
+        representation of the object ("convert_blessed" enabled and
+        "TO_JSON" method found) is encoded. Has no effect on "decode".
+
+        If $enable is false (the default), then "encode" will throw an
+        exception when it encounters a blessed object.
+
+    $json = $json->convert_blessed ([$enable])
+        If $enable is true (or missing), then "encode", upon encountering a
+        blessed object, will check for the availability of the "TO_JSON"
+        method on the object's class. If found, it will be called in scalar
+        context and the resulting scalar will be encoded instead of the
+        object. If no "TO_JSON" method is found, the value of
+        "allow_blessed" will decide what to do.
+
+        The "TO_JSON" method may safely call die if it wants. If "TO_JSON"
+        returns other blessed objects, those will be handled in the same
+        way. "TO_JSON" must take care of not causing an endless recursion
+        cycle (== crash) in this case. The name of "TO_JSON" was chosen
+        because other methods called by the Perl core (== not by the user of
+        the object) are usually in upper case letters and to avoid
+        collisions with the "to_json" function.
+
+        This setting does not yet influence "decode" in any way, but in the
+        future, global hooks might get installed that influence "decode" and
+        are enabled by this setting.
+
+        If $enable is false, then the "allow_blessed" setting will decide
+        what to do when a blessed object is found.
+
+    $json = $json->filter_json_object ([$coderef->($hashref)])
+        When $coderef is specified, it will be called from "decode" each
+        time it decodes a JSON object. The only argument is a reference to
+        the newly-created hash. If the code reference returns a single
+        scalar (which need not be a reference), this value (i.e. a copy of
+        that scalar to avoid aliasing) is inserted into the deserialised
+        data structure. If it returns an empty list (NOTE: *not* "undef",
+        which is a valid scalar), the original deserialised hash will be
+        inserted. This setting can slow down decoding considerably.
+
+        When $coderef is omitted or undefined, any existing callback will be
+        removed and "decode" will not change the deserialised hash in any
+        way.
+
+        Example, convert all JSON objects into the integer 5:
+
+           my $js = JSON::XS->new->filter_json_object (sub { 5 });
+           # returns [5]
+           $js->decode ('[{}]');
+           # throws an exception because allow_nonref is not enabled
+           # so a lone 5 is not allowed.
+           $js->decode ('{"a":1, "b":2}');
+
+    $json = $json->filter_json_single_key_object ($key [=>
+    $coderef->($value)])
+        Works somewhat like "filter_json_object", but is only called
+        for JSON objects having a single key named $key.
+
+        This $coderef is called before the one specified via
+        "filter_json_object", if any. It gets passed the single value in the
+        JSON object. If it returns a single value, it will be inserted into
+        the data structure. If it returns nothing (not even "undef" but the
+        empty list), the callback from "filter_json_object" will be called
+        next, as if no single-key callback were specified.
+
+        If $coderef is omitted or undefined, the corresponding callback will
+        be disabled. There can only ever be one callback for a given key.
+
+        As this callback gets called less often than the
+        "filter_json_object" one, decoding speed will not usually suffer as
+        much. Therefore, single-key objects make excellent targets to
+        serialise Perl objects into, especially as single-key JSON objects
+        are as close to the type-tagged value concept as JSON gets (it's
+        basically an ID/VALUE tuple). Of course, JSON does not support this
+        in any way, so you need to make sure your data never looks like a
+        serialised Perl hash.
+
+        Typical names for the single object key are "__class_whatever__", or
+        "$__dollars_are_rarely_used__$" or "}ugly_brace_placement", or even
+        things like "__class_md5sum(classname)__", to reduce the risk of
+        clashing with real hashes.
+
+        Example, decode JSON objects of the form "{ "__widget__": <id> }"
+        into the corresponding $WIDGET{<id>} object:
+
+           # return whatever is in $WIDGET{5}:
+           JSON::XS
+              ->new
+              ->filter_json_single_key_object (__widget__ => sub {
+                    $WIDGET{ $_[0] }
+                 })
+              ->decode ('{"__widget__": 5}')
+
+           # this can be used with a TO_JSON method in some "widget" class
+           # for serialisation to json:
+           sub WidgetBase::TO_JSON {
+              my ($self) = @_;
+
+              unless ($self->{id}) {
+                 $self->{id} = ..get..some..id..;
+                 $WIDGET{$self->{id}} = $self;
+              }
+
+              { __widget__ => $self->{id} }
+           }
+
     $json = $json->shrink ([$enable])
         Perl usually over-allocates memory a bit when allocating space for
         strings. This flag optionally resizes strings generated by either
@@ -240,7 +394,12 @@
         many short strings. It will also try to downgrade any strings to
         octet-form if possible: perl stores strings internally either in an
         encoding called UTF-X or in octet-form. The latter cannot store
-        everything but uses less space in general.
+        everything but uses less space in general (and some buggy Perl or C
+        code might even rely on that internal representation being used).
+
+        The actual definition of what shrink does might change in future
+        versions, but it will always try to save space at the expense of
+        time.
 
         If $enable is true (or missing), the string returned by "encode"
         will be shrunk-to-fit, while all strings generated by "decode" will
@@ -254,6 +413,42 @@
         or floats internally (there is no difference on the Perl level),
         saving space.
 
+    $json = $json->max_depth ([$maximum_nesting_depth])
+        Sets the maximum nesting level (default 512) accepted while encoding
+        or decoding. If the JSON text or Perl data structure has an equal or
+        higher nesting level then this limit, then the encoder and decoder
+        will stop and croak at that point.
+
+        Nesting level is defined by number of hash- or arrayrefs that the
+        encoder needs to traverse to reach a given point or the number of
+        "{" or "[" characters without their matching closing bracket
+        crossed to reach a given character in a string.
+
+        Setting the maximum depth to one disallows any nesting, which
+        ensures that the object is only a single hash/object or array.
+
+        The argument to "max_depth" will be rounded up to the next highest
+        power of two. If no argument is given, the highest possible setting
+        will be used, which is rarely useful.
+
+        See SECURITY CONSIDERATIONS, below, for more info on why this is
+        useful.
+
+    $json = $json->max_size ([$maximum_string_size])
+        Set the maximum length a JSON text may have (in bytes) where
+        decoding is being attempted. The default is 0, meaning no limit.
+        When "decode" is called on a string longer than this number of
+        characters it will not attempt to decode the string but throw an
+        exception. This setting has no effect on "encode" (yet).
+
+        The argument to "max_size" will be rounded up to the next highest
+        power of two (so may be more than requested). If no argument is
+        given, the limit check will be deactivated (same as when 0 is
+        specified).
+
+        See SECURITY CONSIDERATIONS, below, for more info on why this is
+        useful.
+
     $json_text = $json->encode ($perl_scalar)
         Converts the given Perl data structure (a simple scalar or a
         reference to a hash or array) to its JSON representation. Simple
@@ -271,6 +466,19 @@
         become Perl arrayrefs and JSON objects become Perl hashrefs. "true"
         becomes 1, "false" becomes 0 and "null" becomes "undef".
 
+    ($perl_scalar, $characters) = $json->decode_prefix ($json_text)
+        This works like the "decode" method, but instead of raising an
+        exception when there is trailing garbage after the first JSON
+        object, it will silently stop parsing there and return the number of
+        characters consumed so far.
+
+        This is useful if your JSON texts are not delimited by an outer
+        protocol (which is not the brightest thing to do in the first place)
+        and you need to know where the JSON text ends.
+
+           JSON::XS->new->decode_prefix ("[1] the tail")
+           => ([1], 3)
+
 MAPPING
     This section describes how JSON::XS maps Perl values to JSON values and
     vice versa. These mappings are designed to "do the right thing" in most
@@ -296,18 +504,31 @@
         so no manual decoding is necessary.
 
     number
-        A JSON number becomes either an integer or numeric (floating point)
-        scalar in perl, depending on its range and any fractional parts. On
-        the Perl level, there is no difference between those as Perl handles
-        all the conversion details, but an integer may take slightly less
-        memory and might represent more values exactly than (floating point)
-        numbers.
+        A JSON number becomes either an integer, numeric (floating point) or
+        string scalar in perl, depending on its range and any fractional
+        parts. On the Perl level, there is no difference between those as
+        Perl handles all the conversion details, but an integer may take
+        slightly less memory and might represent more values exactly than
+        (floating point) numbers.
+
+        If the number consists of digits only, JSON::XS will try to
+        represent it as an integer value. If that fails, it will try to
+        represent it as a numeric (floating point) value if that is possible
+        without loss of precision. Otherwise it will preserve the number as
+        a string value.
+
+        Numbers containing a fractional or exponential part will always be
+        represented as numeric (floating point) values, possibly at a loss
+        of precision.
+
+        This might create round-tripping problems as numbers might become
+        strings, but as Perl is typeless there is no other way to do it.
 
     true, false
-        These JSON atoms become 0, 1, respectively. Information is lost in
-        this process. Future versions might represent those values
-        differently, but they will be guarenteed to act like these integers
-        would normally in Perl.
+        These JSON atoms become "JSON::XS::true" and "JSON::XS::false",
+        respectively. They are overloaded to act almost exactly like the
+        numbers 1 and 0. You can check whether a scalar is a JSON boolean by
+        using the "JSON::XS::is_bool" function.
 
     null
         A JSON null atom becomes "undef" in Perl.
@@ -319,17 +540,32 @@
 
     hash references
         Perl hash references become JSON objects. As there is no inherent
-        ordering in hash keys, they will usually be encoded in a
-        pseudo-random order that can change between runs of the same program
-        but stays generally the same within a single run of a program.
-        JSON::XS can optionally sort the hash keys (determined by the
-        *canonical* flag), so the same datastructure will serialise to the
-        same JSON text (given same settings and version of JSON::XS), but
-        this incurs a runtime overhead.
+        ordering in hash keys (or JSON objects), they will usually be
+        encoded in a pseudo-random order that can change between runs of the
+        same program but stays generally the same within a single run of a
+        program. JSON::XS can optionally sort the hash keys (determined by
+        the *canonical* flag), so the same datastructure will serialise to
+        the same JSON text (given same settings and version of JSON::XS),
+        but this incurs a runtime overhead and is only rarely useful, e.g.
+        when you want to compare some JSON text against another for
+        equality.
 
     array references
         Perl array references become JSON arrays.
 
+    other references
+        Other unblessed references are generally not allowed and will cause
+        an exception to be thrown, except for references to the integers 0
+        and 1, which get turned into "false" and "true" atoms in JSON. You
+        can also use "JSON::XS::false" and "JSON::XS::true" to improve
+        readability.
+
+           to_json [\0,JSON::XS::true]      # yields [false,true]
+
+    JSON::XS::true, JSON::XS::false
+        These special values become JSON true and JSON false values,
+        respectively. You can also use "\1" and "\0" directly if you want.
+
     blessed objects
         Blessed objects are not allowed. JSON::XS currently tries to encode
         their underlying representation (hash- or arrayref), but this
@@ -370,9 +606,6 @@
         You can not currently output JSON booleans or force the type in
         other, less obscure, ways. Tell me if you need this capability.
 
-    circular data structures
-        Those will be encoded until memory or stackspace runs out.
-
 COMPARISON
     As already mentioned, this module was created because none of the
     existing JSON modules could be made to work correctly. First I will
@@ -452,64 +685,135 @@
 
         Does not check input for validity.
 
+  JSON and YAML
+    You often hear that JSON is a subset (or a close subset) of YAML. This
+    is, however, a mass hysteria and very far from the truth. In general,
+    there is no way to configure JSON::XS to output a data structure as
+    valid YAML.
+
+    If you really must use JSON::XS to generate YAML, you should use this
+    algorithm (subject to change in future versions):
+
+       my $to_yaml = JSON::XS->new->utf8->space_after (1);
+       my $yaml = $to_yaml->encode ($ref) . "\n";
+
+    This will usually generate JSON texts that also parse as valid YAML.
+    Please note that YAML has hardcoded limits on (simple) object key
+    lengths that JSON doesn't have, so you should make sure that your hash
+    keys are noticeably shorter than the 1024 characters YAML allows.
+
+    There might be other incompatibilities that I am not aware of. In
+    general you should not try to generate YAML with a JSON generator or
+    vice versa, or try to parse JSON with a YAML parser or vice versa:
+    chances are high that you will run into severe interoperability
+    problems.
+
   SPEED
     It seems that JSON::XS is surprisingly fast, as shown in the following
     tables. They have been generated with the help of the "eg/bench" program
     in the JSON::XS distribution, to make it easy to compare on your own
     system.
 
-    First comes a comparison between various modules using a very short JSON
-    string (83 bytes), showing the number of encodes/decodes per second
-    (JSON::XS is the functional interface, while JSON::XS/2 is the OO
-    interface with pretty-printing and hashkey sorting enabled). Higher is
-    better:
+    First comes a comparison between various modules using a very short
+    single-line JSON string:
 
+       {"method": "handleMessage", "params": ["user1", "we were just talking"], \
+       "id": null, "array":[1,11,234,-5,1e5,1e7, true,  false]}
+
+    It shows the number of encodes/decodes per second (JSON::XS uses the
+    functional interface, while JSON::XS/2 uses the OO interface with
+    pretty-printing and hashkey sorting enabled, JSON::XS/3 enables shrink).
+    Higher is better:
+
+       Storable   |  15779.925 |  14169.946 |
+       -----------+------------+------------+
        module     |     encode |     decode |
        -----------|------------|------------|
-       JSON       |      14006 |       6820 |
-       JSON::DWIW |     200937 |     120386 |
-       JSON::PC   |      85065 |     129366 |
-       JSON::Syck |      59898 |      44232 |
-       JSON::XS   |    1171478 |     342435 |
-       JSON::XS/2 |     730760 |     328714 |
+       JSON       |   4990.842 |   4088.813 |
+       JSON::DWIW |  51653.990 |  71575.154 |
+       JSON::PC   |  65948.176 |  74631.744 |
+       JSON::PP   |   8931.652 |   3817.168 |
+       JSON::Syck |  24877.248 |  27776.848 |
+       JSON::XS   | 388361.481 | 227951.304 |
+       JSON::XS/2 | 227951.304 | 218453.333 |
+       JSON::XS/3 | 338250.323 | 218453.333 |
+       Storable   |  16500.016 | 135300.129 |
        -----------+------------+------------+
 
-    That is, JSON::XS is 6 times faster than than JSON::DWIW and about 80
-    times faster than JSON, even with pretty-printing and key sorting.
+    That is, JSON::XS is about five times faster than JSON::DWIW on
+    encoding, about three times faster on decoding, and over forty times
+    faster than JSON, even with pretty-printing and key sorting. It also
+    compares favourably to Storable for small amounts of data.
 
     Using a longer test string (roughly 18KB, generated from Yahoo! Locals
     search API (http://nanoref.com/yahooapis/mgPdGg):
 
        module     |     encode |     decode |
        -----------|------------|------------|
-       JSON       |        673 |         38 |
-       JSON::DWIW |       5271 |        770 |
-       JSON::PC   |       9901 |       2491 |
-       JSON::Syck |       2360 |        786 |
-       JSON::XS   |      37398 |       3202 |
-       JSON::XS/2 |      13765 |       3153 |
+       JSON       |     55.260 |     34.971 |
+       JSON::DWIW |    825.228 |   1082.513 |
+       JSON::PC   |   3571.444 |   2394.829 |
+       JSON::PP   |    210.987 |     32.574 |
+       JSON::Syck |    552.551 |    787.544 |
+       JSON::XS   |   5780.463 |   4854.519 |
+       JSON::XS/2 |   3869.998 |   4798.975 |
+       JSON::XS/3 |   5862.880 |   4798.975 |
+       Storable   |   4445.002 |   5235.027 |
        -----------+------------+------------+
 
-    Again, JSON::XS leads by far in the encoding case, while still beating
-    every other module in the decoding case.
+    Again, JSON::XS leads by far (except for Storable, which unsurprisingly
+    decodes faster).
 
-    On large strings containing lots of unicode characters, some modules
-    (such as JSON::PC) decode faster than JSON::XS, but the result will be
-    broken due to missing unicode handling. Others refuse to decode or
-    encode properly, so it was impossible to prepare a fair comparison table
-    for that case.
-
-RESOURCE LIMITS
-    JSON::XS does not impose any limits on the size of JSON texts or Perl
-    values they represent - if your machine can handle it, JSON::XS will
-    encode or decode it. Future versions might optionally impose structure
-    depth and memory use resource limits.
+    On large strings containing lots of high unicode characters, some
+    modules (such as JSON::PC) seem to decode faster than JSON::XS, but the
+    result will be broken due to missing (or wrong) unicode handling. Others
+    refuse to decode or encode properly, so it was impossible to prepare a
+    fair comparison table for that case.
+
+SECURITY CONSIDERATIONS
+    When you are using JSON in a protocol, talking to untrusted potentially
+    hostile creatures requires relatively few measures.
+
+    First of all, your JSON decoder should be secure, that is, should not
+    have any buffer overflows. Obviously, this module should ensure that and
+    I am trying hard to make that true, but you never know.
+
+    Second, you need to avoid resource-starving attacks. That means you
+    should limit the size of JSON texts you accept, or make sure that when
+    your resources run out, that's just fine (e.g. by using a separate
+    process that can crash safely). The size of a JSON text in octets or
+    characters is usually a good indication of the size of the resources
+    required to decode it into a Perl structure. While JSON::XS can check
+    the size of the JSON text, it might be too late when you already have it
+    in memory, so you might want to check the size before you accept the
+    string.
+
+    Third, JSON::XS recurses using the C stack when decoding objects and
+    arrays. The C stack is a limited resource: for instance, on my amd64
+    machine with 8MB of stack size I can decode around 180k nested arrays
+    but only 14k nested JSON objects (due to perl itself recursing deeply on
+    croak to free the temporary). If that is exceeded, the program crashes.
+    To be conservative, the default nesting limit is set to 512. If your
+    process has a smaller stack, you should adjust this setting accordingly
+    with the "max_depth" method.
+
+    And last but least, something else could bomb you that I forgot to think
+    of. In that case, you get to keep the pieces. I am always open for
+    hints, though...
+
+    If you are using JSON::XS to return packets for consumption by
+    JavaScript in a browser, you should have a look at
+    <http://jpsykes.com/47/practical-csrf-and-json-security> to see whether
+    you are vulnerable to some common attack vectors (which really are
+    browser design bugs, but it is still you who will have to deal with it,
+    as major browser developers care only for features, not about doing
+    security right).
 
 BUGS
     While the goal of this module is to be correct, that unfortunately does
     not mean its bug-free, only that I think its design is bug-free. It is
-    still very young and not well-tested. If you keep reporting bugs they
-    will be fixed swiftly, though.
+    still relatively early in its development. If you keep reporting bugs
+    they will be fixed swiftly, though.
 
 AUTHOR
      Marc Lehmann <schmorp@schmorp.de>