--- JSON-XS/XS.pm 2007/03/22 21:13:58 1.4
+++ JSON-XS/XS.pm 2007/03/23 17:40:29 1.10
@@ -20,13 +20,17 @@

 See COMPARISON, below, for a comparison to some other JSON modules.

+See MAPPING, below, on how JSON::XS maps perl values to JSON values and
+vice versa.
+
 =head2 FEATURES

 =over 4

 =item * correct handling of unicode issues

-This module knows how to handle Unicode, and even documents how it does so.
+This module knows how to handle Unicode, and even documents how and when
+it does so.

 =item * round-trip integrity

@@ -37,11 +41,13 @@
 =item * strict checking of JSON correctness

 There is no guessing, no generating of illegal JSON strings by default,
-and only JSON is accepted as input (the latter is a security feature).
+and only JSON is accepted as input by default (the latter is a security
+feature).

 =item * fast

-compared to other JSON modules, this module compares favourably.
+Compared to other JSON modules, this module compares favourably in terms
+of speed, too.

 =item * simple to use

@@ -50,8 +56,10 @@

 =item * reasonably versatile output formats

-You can choose between the most compact format possible, a pure-ascii
-format, or a pretty-printed format. Or you can combine those features in
+You can choose between the most compact guaranteed single-line format
+possible (nice for simple line-based protocols), a pure-ascii format (for
+when your transport is not 8-bit clean), or a pretty-printed format (for
+when you want to read that stuff). Or you can combine those features in
 whatever way you like.

 =back
@@ -61,7 +69,7 @@
 package JSON::XS;

 BEGIN {
-   $VERSION = '0.1';
+   $VERSION = '0.3';
    @ISA = qw(Exporter);

    @EXPORT = qw(to_json from_json);
@@ -84,8 +92,7 @@
 a hash or array) to a UTF-8 encoded, binary string (that is, the string
 contains octets only). Croaks on error.

-This function call is functionally identical to C<< JSON::XS->new->utf8
-(1)->encode ($perl_scalar) >>.
+This function call is functionally identical to C<< JSON::XS->new->utf8->encode ($perl_scalar) >>.

 =item $perl_scalar = from_json $json_string

@@ -93,8 +100,7 @@
 parse that as an UTF-8 encoded JSON string, returning the resulting simple
 scalar or reference. Croaks on error.

-This function call is functionally identical to C<< JSON::XS->new->utf8
-(1)->decode ($json_string) >>.
+This function call is functionally identical to C<< JSON::XS->new->utf8->decode ($json_string) >>.

 =back

@@ -116,12 +122,13 @@
    my $json = JSON::XS->new->utf8(1)->space_after(1)->encode ({a => [1,2]})
    => {"a": [1, 2]}

-=item $json = $json->ascii ($enable)
+=item $json = $json->ascii ([$enable])

-If C<$enable> is true, then the C<encode> method will not generate
-characters outside the code range C<0..127>. Any unicode characters
-outside that range will be escaped using either a single \uXXXX (BMP
-characters) or a double \uHHHH\uLLLLL escape sequence, as per RFC4627.
+If C<$enable> is true (or missing), then the C<encode> method will
+not generate characters outside the code range C<0..127>. Any unicode
+characters outside that range will be escaped using either a single
+\uXXXX (BMP characters) or a double \uHHHH\uLLLL escape sequence, as per
+RFC4627.

 If C<$enable> is false, then the C<encode> method will not escape Unicode
 characters unless necessary.
@@ -129,20 +136,20 @@
    JSON::XS->new->ascii (1)->encode (chr 0x10401)
    => \ud801\udc01

-=item $json = $json->utf8 ($enable)
+=item $json = $json->utf8 ([$enable])

-If C<$enable> is true, then the C<encode> method will encode the JSON
-string into UTF-8, as required by many protocols, while the C<decode>
-method expects to be handled an UTF-8-encoded string. Please note that
-UTF-8-encoded strings do not contain any characters outside the range
-C<0..255>, they are thus useful for bytewise/binary I/O.
+If C<$enable> is true (or missing), then the C<encode> method will encode
+the JSON string into UTF-8, as required by many protocols, while the
+C<decode> method expects to be handed an UTF-8-encoded string. Please
+note that UTF-8-encoded strings do not contain any characters outside the
+range C<0..255>, they are thus useful for bytewise/binary I/O.

 If C<$enable> is false, then the C<encode> method will return the JSON
 string as a (non-encoded) unicode string, while C<decode> expects thus a
 unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs
 to be done yourself, e.g. using the Encode module.

-=item $json = $json->pretty ($enable)
+=item $json = $json->pretty ([$enable])

 This enables (or disables) all of the C<indent>, C<space_before> and
 C<space_after> (and in the future possibly more) flags in one call to
@@ -157,9 +164,9 @@
       ]
    }

-=item $json = $json->indent ($enable)
+=item $json = $json->indent ([$enable])

-If C<$enable> is true, then the C<encode> method will use a multiline
+If C<$enable> is true (or missing), then the C<encode> method will use a multiline
 format as output, putting every array member or object/hash key-value
 pair into its own line, identing them properly.

@@ -168,9 +175,9 @@

 This setting has no effect when decoding JSON strings.

-=item $json = $json->space_before ($enable)
+=item $json = $json->space_before ([$enable])

-If C<$enable> is true, then the C<encode> method will add an extra
+If C<$enable> is true (or missing), then the C<encode> method will add an extra
 optional space before the C<:> separating keys from values in JSON objects.

 If C<$enable> is false, then the C<encode> method will not add any extra
@@ -179,9 +186,9 @@
 This setting has no effect when decoding JSON strings. You will also most
 likely combine this setting with C<space_after>.

-=item $json = $json->space_after ($enable)
+=item $json = $json->space_after ([$enable])

-If C<$enable> is true, then the C<encode> method will add an extra
+If C<$enable> is true (or missing), then the C<encode> method will add an extra
 optional space after the C<:> separating keys from values in JSON objects
 and extra whitespace after the C<,> separating key-value pairs and array
 members.
@@ -191,9 +198,9 @@

 This setting has no effect when decoding JSON strings.

-=item $json = $json->canonical ($enable)
+=item $json = $json->canonical ([$enable])

-If C<$enable> is true, then the C<encode> method will output JSON objects
+If C<$enable> is true (or missing), then the C<encode> method will output JSON objects
 by sorting their keys. This is adding a comparatively high overhead.

 If C<$enable> is false, then the C<encode> method will output key-value
@@ -207,9 +214,9 @@

 This setting has no effect when decoding JSON strings.

-=item $json = $json->allow_nonref ($enable)
+=item $json = $json->allow_nonref ([$enable])

-If C<$enable> is true, then the C<encode> method can convert a
+If C<$enable> is true (or missing), then the C<encode> method can convert a
 non-reference into its corresponding string, number or null JSON value,
 which is an extension to RFC4627. Likewise, C<decode> will accept those JSON
 values instead of croaking.
@@ -219,6 +226,27 @@
 or array. Likewise, C<decode> will croak if given something that is not a
 JSON object or array.

+=item $json = $json->shrink ([$enable])
+
+Perl usually over-allocates memory a bit when allocating space for
+strings. This flag optionally resizes strings generated by either
+C<encode> or C<decode> to their minimum size possible. This can save
+memory when your JSON strings are either very very long or you have many
+short strings. It will also try to downgrade any strings to octet-form
+if possible: perl stores strings internally either in an encoding called
+UTF-X or in octet-form. The latter cannot store everything but uses less
+space in general.
+
+If C<$enable> is true (or missing), the string returned by C<encode> will be shrunk-to-fit,
+while all strings generated by C<decode> will also be shrunk-to-fit.
+
+If C<$enable> is false, then the normal perl allocation algorithms are used.
+If you work with your data, then this is likely to be faster.
+
+In the future, this setting might control other things, such as converting
+strings that look like integers or floats into integers or floats
+internally (there is no difference on the Perl level), saving space.
+
 =item $json_string = $json->encode ($perl_scalar)

 Converts the given Perl data structure (a simple scalar or a reference
@@ -239,6 +267,122 @@

 =back

+=head1 MAPPING
+
+This section describes how JSON::XS maps Perl values to JSON values and
+vice versa. These mappings are designed to "do the right thing" in most
+circumstances automatically, preserving round-tripping characteristics
+(what you put in comes out as something equivalent).
+
+For the more enlightened: note that in the following descriptions,
+lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl>
+refers to the abstract Perl language itself.
+
+=head2 JSON -> PERL
+
+=over 4
+
+=item object
+
+A JSON object becomes a reference to a hash in Perl. No ordering of object
+keys is preserved.
+
+=item array
+
+A JSON array becomes a reference to an array in Perl.
+
+=item string
+
+A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON
+are represented by the same codepoints in the Perl string, so no manual
+decoding is necessary.
+
+=item number
+
+A JSON number becomes either an integer or numeric (floating point)
+scalar in perl, depending on its range and any fractional parts. On the
+Perl level, there is no difference between those as Perl handles all the
+conversion details, but an integer may take slightly less memory and might
+represent more values exactly than (floating point) numbers.
+
+=item true, false
+
+These JSON atoms become C<1> and C<0>, respectively.
+Information is lost in this process. Future versions might represent
+those values differently, but they will be guaranteed to act like these
+integers would normally in Perl.
+
+=item null
+
+A JSON null atom becomes C<undef> in Perl.
+
+=back
+
+=head2 PERL -> JSON
+
+The mapping from Perl to JSON is slightly more difficult, as Perl is a
+truly typeless language, so we can only guess which JSON type is meant by
+a Perl value.
+
+=over 4
+
+=item hash references
+
+Perl hash references become JSON objects. As there is no inherent ordering
+in hash keys, they will usually be encoded in a pseudo-random order that
+can change between runs of the same program but stays generally the same
+within the single run of a program. JSON::XS can optionally sort the hash
+keys (determined by the I<canonical> flag), so the same datastructure
+will serialise to the same JSON text (given same settings and version of
+JSON::XS), but this incurs a runtime overhead.
+
+=item array references
+
+Perl array references become JSON arrays.
+
+=item blessed objects
+
+Blessed objects are not allowed. JSON::XS currently tries to encode their
+underlying representation (hash- or arrayref), but this behaviour might
+change in future versions.
+
+=item simple scalars
+
+Simple Perl scalars (any scalar that is not a reference) are the most
+difficult objects to encode: JSON::XS will encode undefined scalars as
+JSON null value, scalars that have last been used in a string context
+before encoding as JSON strings and anything else as number value:
+
+   # dump as number
+   to_json [2]                      # yields [2]
+   to_json [-3.0e17]                # yields [-3e+17]
+   my $value = 5; to_json [$value]  # yields [5]
+
+   # used as string, so dump as string
+   print $value;
+   to_json [$value]                 # yields ["5"]
+
+   # undef becomes null
+   to_json [undef]                  # yields [null]
+
+You can force the type to be a string by stringifying it:
+
+   my $x = 3.1; # some variable containing a number
+   "$x";        # stringified
+   $x .= "";    # another, more awkward way to stringify
+   print $x;    # perl does it for you, too, quite often
+
+You can force the type to be a number by numifying it:
+
+   my $x = "3"; # some variable containing a string
+   $x += 0;     # numify it, ensuring it will be dumped as a number
+   $x *= 1;     # same thing, the choice is yours.
+
+You can not currently output JSON booleans or force the type in other,
+less obscure, ways. Tell me if you need this capability.
+
+=back
+
 =head1 COMPARISON

 As already mentioned, this module was created because none of the existing
@@ -249,7 +393,7 @@

 =over 4

-=item JSON
+=item JSON 1.07

 Slow (but very portable, as it is written in pure Perl).

@@ -261,7 +405,7 @@
 the string C<2.0> will encode to C<2.0> instead of C<"2.0">, and that will
 decode into the number 2.

-=item JSON::PC
+=item JSON::PC 0.01

 Very fast.

@@ -278,7 +422,7 @@
 Unmaintained (maintainer unresponsive for many months, bugs are not
 getting fixed).

-=item JSON::Syck
+=item JSON::Syck 0.21

 Very buggy (often crashes).

@@ -307,7 +451,7 @@
 good protocol will at least recover, that is extra unnecessary work and
 the transaction will still not succeed).

-=item JSON::DWIW
+=item JSON::DWIW 0.04

 Very fast. Very natural. Very nice.