--- libecb/ecb.pod	2011/05/26 23:32:41	1.16
+++ libecb/ecb.pod	2011/06/01 00:57:14	1.26
@@ -17,6 +17,10 @@
 it provides a number of other lowlevel C utilities, such as endianness
 detection, byte swapping or bit rotations.
 
+Or in other words, things that should be built into any standard C system,
+but aren't, implemented as efficiently as possible with GCC, and still
+correct with other compilers.
+
 More might come.
 
 =head2 ABOUT THE HEADER
@@ -54,7 +58,21 @@
 
 =head2 GCC ATTRIBUTES
 
-blabla where to put, what others
+A major part of libecb deals with GCC attributes. These are additional
+attributes that you can assign to functions, variables and sometimes even
+types - much like C<const> or C<volatile> in C.
+
+GCC allows attribute declarations to show up in many surprising places,
+but not in many expected places, so the safest way is to put attribute
+declarations before the whole declaration:
+
+   ecb_const int mysqrt (int a);
+   ecb_unused int i;
+
+For variables, it is often nicer to put the attribute after the name, and
+avoid multiple declarations using commas:
+
+   int i ecb_unused;
 
 =over 4
 
@@ -93,16 +111,113 @@
 
 =item ecb_noreturn
 
+Marks a function as "not returning, ever". Some typical functions that
+don't return are C<exit> or C<abort> (which really works hard to not
+return), and now you can make your own:
+
+   ecb_noreturn void
+   my_abort (const char *errline)
+   {
+     puts (errline);
+     abort ();
+   }
+
+In this case, the compiler would probably be smart enough to deduce it on
+its own, so this is mainly useful for declarations.
+
 =item ecb_const
 
+Declares that the function only depends on the values of its arguments,
+much like a mathematical function. It specifically does not read or write
+any memory any arguments might point to, global variables, or call any
+non-const functions. It also must not have any side effects.
+
+Such a function can be optimised much more aggressively by the compiler -
+for example, multiple calls with the same arguments can be optimised into
+a single call, which wouldn't be possible if the compiler had to expect
+any side effects.
+
+It is best suited for functions in the sense of mathematical functions,
+such as a function returning the square root of its input argument.
+
+Not suited would be a function that calculates the hash of some memory
+area you pass in, prints some messages or looks at a global variable to
+decide on rounding.
+
+See C<ecb_pure> for a slightly less restrictive class of functions.
+
 =item ecb_pure
 
+Similar to C<ecb_const>, declares a function that has no side
+effects. Unlike C<ecb_const>, the function is allowed to examine global
+variables and any other memory areas (such as the ones passed to it via
+pointers).
+
+While these functions cannot be optimised as aggressively as C<ecb_const>
+functions, they can still be optimised away on many occasions, and the
+compiler has more freedom in moving calls to them around.
+
+Typical examples for such functions would be C<strlen> or C<memcmp>. A
+function that calculates the MD5 sum of some input and updates some MD5
+state passed as argument would I<NOT> be pure, however, as it would modify
+some memory area that is not the return value.
+
 =item ecb_hot
 
+This declares a function as "hot" with regard to the cache - the function
+is used so often that it is very beneficial to keep it in the cache if
+possible.
+
+The compiler reacts by trying to place hot functions near to each other in
+memory.
+
+Whether a function is hot or not often depends on the whole program,
+and less on the function itself. C<ecb_cold> is likely more useful in
+practice.
+
 =item ecb_cold
 
+The opposite of C<ecb_hot> - declares a function as "cold" with regard to
+the cache, or in other words, this function is not called often, or not at
+speed-critical times, and keeping it in the cache might be a waste of said
+cache.
+
+In addition to placing cold functions together (or at least away from hot
+functions), this knowledge can be used in other ways: for example, the
+function will be optimised for size, as opposed to speed, and codepaths
+leading to calls to it can automatically be marked as if
+C<ecb_unlikely> had been used to reach them.
+
+Good examples for such functions would be error reporting functions, or
+functions only called in exceptional or rare cases.
+
 =item ecb_artificial
 
+Declares the function as "artificial", in this case meaning that this
+function is not really meant to be a function, but more like an accessor
+- many methods in C++ classes are mere accessor functions, and having a
+crash reported in such a method, or single-stepping through them, is not
+usually so helpful, especially when it's inlined to just a few instructions.
+
+Marking them as artificial will instruct the debugger about just this,
+leading to happier debugging and thus happier lives.
+
+Example: in some kind of smart-pointer class, mark the pointer accessor as
+artificial, so that the whole class acts more like a pointer and less like
+some C++ abstraction monster.
+
+  template<typename T>
+  struct my_smart_ptr
+  {
+    T *value;
+
+    ecb_artificial
+    operator T *()
+    {
+      return value;
+    }
+  };
+
 =back
 
 =head2 OPTIMISATION HINTS
@@ -274,13 +389,15 @@
 (most-significant byte first) or little endian (least-significant byte
 first) respectively.
 
+On systems that are neither, their return values are unspecified.
+
 =item int ecb_ctz32 (uint32_t x)
 
 Returns the index of the least significant bit set in C<x> (or
-equivalently the number of bits set to 0 before the least significant
-bit set), starting from 0. If C<x> is 0 the result is undefined. A
-common use case is to compute the integer binary logarithm, i.e.,
-floor(log2(n)). For example:
+equivalently the number of bits set to 0 before the least significant bit
+set), starting from 0. If C<x> is 0 the result is undefined. A common use
+case is to compute the integer binary logarithm of a power of two, i.e.,
+C<floor (log2 (n))>. For example:
 
   ecb_ctz32 (3) = 0
   ecb_ctz32 (6) = 1
@@ -296,16 +413,19 @@
 
 =item uint32_t ecb_bswap32 (uint32_t x)
 
-These two functions return the value of the 16-bit (32-bit) variable
-C<x> after reversing the order of bytes.
+These two functions return the 16-bit (32-bit) value C<x> after reversing
+the order of its bytes (0x11223344 becomes 0x44332211).
 
 =item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
 
 =item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
 
-These two functions return the value of C<x> after shifting all the bits
+These two functions return the value of C<x> after rotating all the bits
 by C<count> positions to the right or left respectively.
 
+Current GCC versions understand these functions and usually compile them
+to "optimal" code (e.g. a single C<roll> on x86).
+
 =back
 
 =head2 ARITHMETIC
@@ -314,13 +434,29 @@
 
 =item x = ecb_mod (m, n)
 
-Returns the positive remainder of the modulo operation between C<m> and
-C<n>. Unlike the C modulo operator C<%>, this function ensures that the
-return value is always positive).
+Returns C<m> modulo C<n>, which is the same as the non-negative remainder
+of the division operation between C<m> and C<n>, using floored
+division. Unlike the C remainder operator C<%>, this function ensures that
+the return value is never negative and that the two numbers I<m> and
+I<m' = m + i * n> result in the same value modulo I<n> - in other words,
+C<ecb_mod> implements the mathematical modulo operation, which is missing
+in the language.
 
-C<n> must be strictly positive (i.e. C<< >1 >>), while C<m> must be
+C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
 negatable, that is, both C<m> and C<-m> must be representable in its
-type.
+type (this typically excludes the minimum signed integer value, the same
+limitation as for C</> and C<%> in C).
+
+Current GCC versions compile this into an efficient branchless sequence on
+many systems.
+
+For example, when you want to rotate forward through the members of an
+array for increasing C<m> (which might be negative), then you should use
+C<ecb_mod>, as the C<%> operator might give either negative results, or
+change direction for negative values:
+
+   for (m = -100; m <= 100; ++m)
+     elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
 
 =back
 
@@ -328,7 +464,7 @@
 
 =over 4
 
-=item element_count = ecb_array_length (name) [MACRO]
+=item element_count = ecb_array_length (name)
 
 Returns the number of elements in the array C<name>. For example: