[ViewVC] Diff of: cvs/cvsroot/libecb/ecb.pod

Comparing cvsroot/libecb/ecb.pod (file contents):
Revision 1.94 by root, Sat Jul 31 16:13:30 2021 UTC vs.
Revision 1.101 by root, Mon Nov 22 17:15:50 2021 UTC

 =item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)
 These two families of functions return the value of C<x> after rotating
 all the bits by C<count> positions to the right (C<ecb_rotr>) or left
 (C<ecb_rotl>). There are no restrictions on the value C<count>, i.e. both
-zero and values equal or larger than the word width work correctly.
+zero and values equal to or larger than the word width work correctly. Also,
+notwithstanding C<count> being unsigned, negative numbers work and rotate
+in the opposite direction.
 Current GCC/clang versions understand these functions and usually compile
 them to "optimal" code (e.g. a single C<rol> or a combination of C<shld>
 on x86).
 =item T ecb_rotr (T x, unsigned int count) [C++]
 Overloaded C++ rotl/rotr functions.
 C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
+=back
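The rotate semantics described above (any C<count> is accepted, including zero, values at or beyond the word width, and negative values converted to unsigned) can be illustrated with a minimal sketch. The names C<rotl32_sketch> and C<rotr32_sketch> are hypothetical stand-ins, and the masking idiom shown is a common portable technique, not necessarily what libecb does internally.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not libecb's code: rotate x left by count bits.
   Masking with 31 keeps both shift amounts in 0..31, so no shift is
   undefined, even for count == 0 or count >= 32. */
uint32_t rotl32_sketch (uint32_t x, unsigned int count)
{
  return (x << (count & 31)) | (x >> (-count & 31));
}

/* Hypothetical helper: rotate x right by count bits, same masking idea. */
uint32_t rotr32_sketch (uint32_t x, unsigned int count)
{
  return (x >> (count & 31)) | (x << (-count & 31));
}

/* Small self-check of the properties the text describes. */
void rot_sketch_check (void)
{
  uint32_t x = 0x80000001u;

  assert (rotl32_sketch (x, 0) == x);                      /* zero works */
  assert (rotl32_sketch (x, 32) == x);                     /* full width works */
  assert (rotl32_sketch (x, 33) == rotl32_sketch (x, 1));  /* counts wrap */
  /* a negative count, converted to unsigned, rotates the other way */
  assert (rotl32_sketch (x, (unsigned int)-1) == rotr32_sketch (x, 1));
}
```

Because unsigned arithmetic wraps modulo a power of two that is a multiple of 32, C<-count & 31> is the complementary shift amount for every possible C<count>, which is why negative inputs end up rotating in the opposite direction.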
+=head2 BIT MIXING, HASHING
+Sometimes you have an integer and want to distribute its bits well, for
+example, to use it as a hash in a hashtable. A common example is pointer
+values, which often only have a limited range (e.g. low and high bits are
+often zero).
+The following functions try to mix the bits to get a good bias-free
+distribution. They were mainly made for pointers, but the underlying
+integer functions are exposed as well.
+As an added benefit, the functions are reversible, so if you find it
+convenient to store only the hash value, you can recover the original
+pointer from the hash ("unmix"), as long as your pointers are 32 or 64 bit
+(if this isn't the case on your platform, drop us a note and we will add
+functions for other bit widths).
+The unmix functions are very slightly slower than the mix functions, so
+it is equally very slightly preferable to store the original values when
+convenient.
+The underlying algorithm is subject to change, so currently these
+functions are not suitable for persistent hash tables, as their result
+value can change between different versions of libecb.
+=over
+=item uintptr_t ecb_ptrmix (void *ptr)
+Mixes the bits of a pointer so the result is suitable for hash table
+lookups. In other words, this hashes the pointer value.
+=item uintptr_t ecb_ptrmix (T *ptr) [C++]
+Overload the C<ecb_ptrmix> function to work for any pointer in C++.
+=item void *ecb_ptrunmix (uintptr_t v)
+Unmix the hash value into the original pointer. This only works as long
+as the hash value is not truncated, i.e. you used C<uintptr_t> (or
+equivalent) throughout to store it.
+=item T *ecb_ptrunmix<T> (uintptr_t v) [C++]
+The somewhat less useful template version of C<ecb_ptrunmix> for
+C++. Example:
+   sometype *myptr;
+   uintptr_t hash = ecb_ptrmix (myptr);
+   sometype *orig = ecb_ptrunmix<sometype> (hash);
+=item uint32_t ecb_mix32 (uint32_t v)
+=item uint64_t ecb_mix64 (uint64_t v)
+Sometimes you don't have a pointer but an integer whose values are very
+badly distributed. In this case you can use these integer versions of the
+mixing function. No C++ template is provided currently.
+=item uint32_t ecb_unmix32 (uint32_t v)
+=item uint64_t ecb_unmix64 (uint64_t v)
+The reverse of the C<ecb_mix> functions - they take a mixed/hashed value
+and recover the original value.
 =back
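To illustrate why a mixing function can be reversible at all, here is a minimal sketch of a 32 bit mix/unmix pair. It is explicitly not libecb's algorithm (which the text says is subject to change): the names C<sketch_mix32> and C<sketch_unmix32> and the multiplier are assumptions chosen for this example. Each step (xor with a right shift of half the word, multiplication by an odd constant) is a bijection on 32 bit values, so the steps can be undone in reverse order.

```c
#include <assert.h>
#include <stdint.h>

/* Arbitrary odd multiplier chosen for this sketch; NOT libecb's constant. */
#define SKETCH_MUL 0x45d9f3bu

/* Multiplicative inverse of an odd a modulo 2^32, by Newton iteration.
   For odd a, a * a == 1 (mod 8), so inv = a is correct to 3 bits and
   every step doubles the number of correct bits (3, 6, 12, 24, 48). */
uint32_t sketch_inverse32 (uint32_t a)
{
  uint32_t inv = a;

  for (int i = 0; i < 5; ++i)  /* 4 steps suffice, one extra for margin */
    inv *= 2 - a * inv;

  return inv;
}

/* Hypothetical mixer: xorshift, odd multiply, xorshift. */
uint32_t sketch_mix32 (uint32_t v)
{
  v ^= v >> 16;
  v *= SKETCH_MUL;
  v ^= v >> 16;
  return v;
}

/* Undo sketch_mix32 by inverting each step in reverse order.
   v ^= v >> 16 is its own inverse because 16 is half the word width. */
uint32_t sketch_unmix32 (uint32_t v)
{
  v ^= v >> 16;
  v *= sketch_inverse32 (SKETCH_MUL);
  v ^= v >> 16;
  return v;
}

/* Round-trip check on a value with few significant bits. */
void mix_sketch_check (void)
{
  uint32_t original = 0x00001000u;

  assert (sketch_unmix32 (sketch_mix32 (original)) == original);
}
```

A real mixer would typically use more rounds and carefully chosen constants to get a good distribution; this sketch only demonstrates the reversibility property, and only for 32 bit values.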
 =head2 HOST ENDIANNESS CONVERSION
 Similar to their 32 bit counterparts, these take a 64 bit argument.
 =item ECB_I2A_MAX_DIGITS (=21)
-Instead of using a type specific length macro, youi can just use
+Instead of using a type specific length macro, you can just use
 C<ECB_I2A_MAX_DIGITS>, which is good enough for any C<ecb_i2a> function.
 =back
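As a rough sanity check of why a single length of 21 is enough for every integer type up to 64 bit, the sketch below measures the longest decimal representations with the standard C<snprintf>. C<SKETCH_MAX_DIGITS> is a hypothetical stand-in for C<ECB_I2A_MAX_DIGITS>; the sketch does not call any libecb function, and whether the C<ecb_i2a> functions write a terminator is not stated in this excerpt.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for ECB_I2A_MAX_DIGITS, documented above as 21. */
#define SKETCH_MAX_DIGITS 21

/* Characters needed for the most negative signed 64 bit value,
   including the minus sign but excluding any terminator. */
size_t sketch_len_i64_min (void)
{
  char buf[SKETCH_MAX_DIGITS];
  int n = snprintf (buf, sizeof buf, "%" PRId64, (int64_t)INT64_MIN);

  return (size_t)n;
}

/* Characters needed for the largest unsigned 64 bit value. */
size_t sketch_len_u64_max (void)
{
  char buf[SKETCH_MAX_DIGITS];
  int n = snprintf (buf, sizeof buf, "%" PRIu64, (uint64_t)UINT64_MAX);

  return (size_t)n;
}

/* Both extremes fit into SKETCH_MAX_DIGITS characters. */
void digits_sketch_check (void)
{
  assert (sketch_len_i64_min () <= SKETCH_MAX_DIGITS);
  assert (sketch_len_u64_max () <= SKETCH_MAX_DIGITS);
}
```

Both extremes need 20 characters, so a buffer of 21 leaves one character to spare.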
 =head3 LOW-LEVEL API
 The functions above use a number of low-level APIs which have some strict
-limitations, but can be used as building blocks (study of C<ecb_i2a_i32>
+limitations, but can be used as building blocks (studying C<ecb_i2a_i32>
 and related functions is recommended).
 There are three families of functions: functions that convert a number
 to a fixed number of digits with leading zeroes (C<ecb_i2a_0N>, C<0>
 for "leading zeroes"), functions that generate up to N digits, skipping
 =item char *ecb_i2a_08  (char *ptr, uint32_t value) // 64 bit
 =item char *ecb_i2a_09  (char *ptr, uint32_t value) // 64 bit
-The C<< ecb_i2a_0I<N> > functions take an unsigned I<value> and convert
+The C<< ecb_i2a_0I<N> >> functions take an unsigned I<value> and convert
 it to exactly I<N> digits, returning a pointer to the first character
 after the digits. The I<value> must be in range. The functions marked with
 I<32 bit> do their calculations internally in 32 bit, the ones marked with
 I<64 bit> internally use 64 bit integers, which might be slow on 32 bit
 architectures (the high level API decides on 32 vs. 64 bit versions using
 =item char *ecb_i2a_8   (char *ptr, uint32_t value) // 64 bit
 =item char *ecb_i2a_9   (char *ptr, uint32_t value) // 64 bit
-Similarly, the C<< ecb_i2a_I<N> > functions take an unsigned I<value>
+Similarly, the C<< ecb_i2a_I<N> >> functions take an unsigned I<value>
 and convert it to at most I<N> digits, suppressing leading zeroes, and
 returning a pointer to the first character after the digits.
 =item ECB_I2A_MAX_X5 (=59074)
 =item ECB_I2A_MAX_X10 (=2932500665)
 =item char *ecb_i2a_x10 (char *ptr, uint32_t value) // 64 bit
-The C<< ecb_i2a_xI<N> >> functions are similar to the C<< ecb_i2a_I<N> >
+The C<< ecb_i2a_xI<N> >> functions are similar to the C<< ecb_i2a_I<N> >>
 functions, but they can generate one digit more, as long as the number
 is within range, which is given by the symbols C<ECB_I2A_MAX_X5> (almost
 16 bit range) and C<ECB_I2A_MAX_X10> (a bit more than 31 bit range),
 respectively.
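The calling convention shared by these low-level functions (write digits at C<ptr>, return a pointer to the first character after them) can be sketched as follows. C<sketch_i2a_0n> and C<sketch_i2a_n> are hypothetical, deliberately naive helpers that take the digit count as a runtime parameter; they are not libecb's implementations. The sketch assumes no terminator is written and that a value of zero yields a single C<0> in the zero-suppressing variant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: write value as exactly n decimal digits with leading
   zeroes and return a pointer just past them. The caller must ensure
   value < 10^n ("in range") and 1 <= n <= 10. No terminator is written. */
char *sketch_i2a_0n (char *ptr, uint32_t value, int n)
{
  for (int i = n - 1; i >= 0; --i)
    {
      ptr[i] = (char)('0' + value % 10);
      value /= 10;
    }

  return ptr + n;
}

/* Hypothetical: write value as at most n digits, suppressing leading
   zeroes (assumption: a value of 0 still produces a single '0'). */
char *sketch_i2a_n (char *ptr, uint32_t value, int n)
{
  char tmp[10]; /* a uint32_t has at most 10 decimal digits */
  char *end = sketch_i2a_0n (tmp, value, n);
  char *p = tmp;

  while (p < end - 1 && *p == '0')
    ++p;

  size_t len = (size_t)(end - p);
  memcpy (ptr, p, len);
  return ptr + len;
}

/* The returned pointer makes it easy to chain several fields. */
void i2a_sketch_check (void)
{
  char buf[16];
  char *p = buf;

  p = sketch_i2a_n (p, 7, 2);   /* "7"  */
  *p++ = ':';
  p = sketch_i2a_0n (p, 5, 2);  /* "05" */

  assert (p - buf == 4);
  assert (memcmp (buf, "7:05", 4) == 0);
}
```

Returning the end pointer rather than a length is what lets callers concatenate fields without recomputing offsets.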
 IEEE compliant, of course at a speed and code size penalty, and of course
 also within reasonable limits (it tries to convert NaNs, infinities and
 denormals, but will likely convert negative zero to positive zero).
 On all modern platforms (where C<ECB_STDFP> is true), the compiler should
-be able to optimise away this function completely.
+be able to completely optimise away the 32 and 64 bit functions.
 These functions can be helpful when serialising floats to the network - you
 can serialise the return value like a normal uint16_t/uint32_t/uint64_t.
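The serialisation use case can be sketched without libecb on a platform where C<float> already is IEEE binary32 (the situation the text associates with C<ECB_STDFP>). The helper names below are hypothetical; the sketch copies the bit pattern with C<memcpy> and then writes it in big-endian byte order, as one would for any C<uint32_t>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float is a 32 bit IEEE binary32 value. */
_Static_assert (sizeof (float) == sizeof (uint32_t), "float must be 32 bit");

/* Hypothetical helper: reinterpret the bits of a float as an integer. */
uint32_t sketch_float_to_bits (float f)
{
  uint32_t u;

  memcpy (&u, &f, sizeof u);
  return u;
}

/* Hypothetical helper: the reverse reinterpretation. */
float sketch_bits_to_float (uint32_t u)
{
  float f;

  memcpy (&f, &u, sizeof f);
  return f;
}

/* Write a 32 bit value in big-endian ("network") byte order. */
void sketch_put_be32 (unsigned char *out, uint32_t u)
{
  out[0] = (unsigned char)(u >> 24);
  out[1] = (unsigned char)(u >> 16);
  out[2] = (unsigned char)(u >> 8);
  out[3] = (unsigned char)u;
}

/* Read a 32 bit value stored in big-endian byte order. */
uint32_t sketch_get_be32 (const unsigned char *in)
{
  return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
       | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Round trip: float -> bits -> bytes -> bits -> float. */
void float_sketch_check (void)
{
  unsigned char wire[4];

  sketch_put_be32 (wire, sketch_float_to_bits (-2.5f));
  assert (sketch_bits_to_float (sketch_get_be32 (wire)) == -2.5f);
}
```

On platforms where the native format is not IEEE, a plain bit copy like this would not produce a portable encoding, which is the gap the documented conversion functions are meant to cover.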
 Another use for these functions is to manipulate floating point values
 intended to be internal-use only, some of which we forgot to document, and
 some of which we hide because we are not sure we will keep the interface
 stable.
 While you are welcome to rummage around and use whatever you find useful
-(we can't stop you), keep in mind that we will change undocumented
+(we don't want to stop you), keep in mind that we will change undocumented
 functionality in incompatible ways without thinking twice, while we are
 considerably more conservative with documented things.
 =head1 AUTHORS
 C<libecb> is designed and maintained by:
    Emanuele Giaquinta <e.giaquinta@glauco.it>
    Marc Alexander Lehmann <schmorp@schmorp.de>
