--- libecb/ecb.pod 2011/05/26 23:32:41 1.16
+++ libecb/ecb.pod 2011/12/10 11:58:38 1.39
@@ -17,6 +17,10 @@
 it provides a number of other lowlevel C utilities, such as endianness
 detection, byte swapping or bit rotations.
 
+Or in other words, things that should be built into any standard C system,
+but aren't, implemented as efficiently as possible with GCC, and still
+correct with other compilers.
+
 More might come.
 
 =head2 ABOUT THE HEADER
@@ -54,7 +58,21 @@
 
 =head2 GCC ATTRIBUTES
 
-blabla where to put, what others
+A major part of libecb deals with GCC attributes. These are additional
+attributes that you can assign to functions, variables and sometimes even
+types - much like C<const> or C<volatile> in C.
+
+GCC allows attribute declarations to show up in many surprising places,
+but not in many expected places, so the safest way is to put attribute
+declarations before the whole declaration:
+
+   ecb_const int mysqrt (int a);
+   ecb_unused int i;
+
+For variables, it is often nicer to put the attribute after the name, and
+avoid multiple declarations using commas:
+
+   int i ecb_unused;
 
 =over 4
 
@@ -85,6 +103,21 @@
      #endif
    }
 
+=item ecb_inline
+
+This is not actually an attribute, but you use it like one. It expands
+either to C<static inline> or to just C<static>, if inline isn't
+supported. It should be used to declare functions that should be inlined,
+for code size or speed reasons.
+
+Example: inline this function, it surely will reduce codesize.
+
+   ecb_inline int
+   negmul (int a, int b)
+   {
+     return - (a * b);
+   }
+
 =item ecb_noinline
 
 Prevent a function from being inlined - it might be optimised away, but
@@ -93,16 +126,113 @@
 
 =item ecb_noreturn
 
+Marks a function as "not returning, ever". Some typical functions that
+don't return are C<exit> or C<abort> (which really works hard to not
+return), and now you can make your own:
+
+   ecb_noreturn void
+   my_abort (const char *errline)
+   {
+     puts (errline);
+     abort ();
+   }
+
+In this case, the compiler would probably be smart enough to deduce it on
+its own, so this is mainly useful for declarations.
+
 =item ecb_const
 
+Declares that the function only depends on the values of its arguments,
+much like a mathematical function. It specifically does not read or write
+any memory any arguments might point to, global variables, or call any
+non-const functions. It also must not have any side effects.
+
+Such a function can be optimised much more aggressively by the compiler -
+for example, multiple calls with the same arguments can be optimised into
+a single call, which wouldn't be possible if the compiler had to
+expect any side effects.
+
+It is best suited for functions in the sense of mathematical functions,
+such as a function returning the square root of its input argument.
+
+Not suited would be a function that calculates the hash of some memory
+area you pass in, prints some messages or looks at a global variable to
+decide on rounding.
+
+See C<ecb_pure> for a slightly less restrictive class of functions.
+
 =item ecb_pure
 
+Similar to C<ecb_const>, declares a function that has no side
+effects. Unlike C<ecb_const>, the function is allowed to examine global
+variables and any other memory areas (such as the ones passed to it via
+pointers).
+
+While these functions cannot be optimised as aggressively as C<ecb_const>
+functions, they can still be optimised away on many occasions, and the
+compiler has more freedom in moving calls to them around.
+
+Typical examples for such functions would be C<strlen> or C<memcmp>. A
+function that calculates the MD5 sum of some input and updates some MD5
+state passed as argument would I<NOT> be pure, however, as it would modify
+some memory area that is not the return value.
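As a rough sketch of the difference between the two classes, the following uses the underlying GCC attributes directly. The macro names C<my_const> and C<my_pure> and both functions are made up for this illustration and are not part of libecb; the assumption is only that, with GCC, C<ecb_const> and C<ecb_pure> boil down to attributes of this kind.

```c
#include <stddef.h>

/* Hypothetical stand-ins for ecb_const and ecb_pure (assumption: with GCC
   they map to attributes like these; other compilers get nothing). */
#if defined(__GNUC__)
  #define my_const __attribute__ ((__const__))
  #define my_pure  __attribute__ ((__pure__))
#else
  #define my_const
  #define my_pure
#endif

/* Depends only on the value of its argument: a candidate for "const". */
my_const int
square (int a)
{
  return a * a;
}

/* Reads memory through a pointer but modifies nothing: "pure", not "const". */
my_pure size_t
count_char (const char *s, char c)
{
  size_t n = 0;

  for (; *s; ++s)
    if (*s == c)
      ++n;

  return n;
}
```

C<count_char> cannot be marked const because its result depends on the memory C<s> points to, not just on the pointer value itself.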
+
 =item ecb_hot
 
+This declares a function as "hot" with regards to the cache - the function
+is used so often, that it is very beneficial to keep it in the cache if
+possible.
+
+The compiler reacts by trying to place hot functions near to each other in
+memory.
+
+Whether a function is hot or not often depends on the whole program,
+and less on the function itself. C<ecb_cold> is likely more useful in
+practise.
+
 =item ecb_cold
 
+The opposite of C<ecb_hot> - declares a function as "cold" with regards to
+the cache, or in other words, this function is not called often, or not at
+speed-critical times, and keeping it in the cache might be a waste of said
+cache.
+
+In addition to placing cold functions together (or at least away from hot
+functions), this knowledge can be used in other ways, for example, the
+function will be optimised for size, as opposed to speed, and codepaths
+leading to calls to those functions can automatically be marked as if
+C<ecb_expect_false> had been used to reach them.
+
+Good examples for such functions would be error reporting functions, or
+functions only called in exceptional or rare cases.
+
 =item ecb_artificial
 
+Declares the function as "artificial", in this case meaning that this
+function is not really meant to be a function, but more like an accessor
+- many methods in C++ classes are mere accessor functions, and having a
+crash reported in such a method, or single-stepping through them, is not
+usually so helpful, especially when it's inlined to just a few instructions.
+
+Marking them as artificial will instruct the debugger about just this,
+leading to happier debugging and thus happier lives.
+
+Example: in some kind of smart-pointer class, mark the pointer accessor as
+artificial, so that the whole class acts more like a pointer and less like
+some C++ abstraction monster.
+
+   template<typename T>
+   struct my_smart_ptr
+   {
+     T *value;
+
+     ecb_artificial
+     operator T *()
+     {
+       return value;
+     }
+   };
+
 =back
 
 =head2 OPTIMISATION HINTS
@@ -143,12 +273,12 @@
 the C<expr> evaluates to C<value> a lot, which can be used for static
 branch optimisations.
 
-Usually, you want to use the more intuitive C<ecb_likely> and
-C<ecb_unlikely> functions instead.
+Usually, you want to use the more intuitive C<ecb_expect_true> and
+C<ecb_expect_false> functions instead.
 
-=item bool ecb_likely (cond)
+=item bool ecb_expect_true (cond)
 
-=item bool ecb_unlikely (cond)
+=item bool ecb_expect_false (cond)
 
 These two functions expect a expression that is true or false and return
 C<1> or C<0>, respectively, so when used in the condition of an C<if> or
@@ -156,18 +286,18 @@
 
    /* these two do the same thing */
    if (some_condition) ...;
-   if (ecb_likely (some_condition)) ...;
+   if (ecb_expect_true (some_condition)) ...;
 
-However, by using C<ecb_likely>, you tell the compiler that the condition
-is likely to be true (and for C<ecb_unlikely>, that it is unlikely to be
-true).
+However, by using C<ecb_expect_true>, you tell the compiler that the
+condition is likely to be true (and for C<ecb_expect_false>, that it is
+unlikely to be true).
 
 For example, when you check for a null pointer and expect this to be a
-rare, exceptional, case, then use C<ecb_unlikely>:
+rare, exceptional, case, then use C<ecb_expect_false>:
 
    void my_free (void *ptr)
    {
-     if (ecb_unlikely (ptr == 0))
+     if (ecb_expect_false (ptr == 0))
        return;
    }
 
@@ -175,6 +305,12 @@
 tell the compiler what the hot path through a function is can increase
 performance considerably.
 
+You might know these functions under the names C<likely> and C<unlikely>
+- while these are common aliases, we find that the expect name is easier
+to understand when quickly skimming code. If you wish, you can use
+C<ecb_likely> instead of C<ecb_expect_true> and C<ecb_unlikely> instead of
+C<ecb_expect_false> - these are simply aliases.
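Hint macros of this kind are commonly built on GCC's C<__builtin_expect>. The following is a minimal sketch under that assumption; the C<my_expect*> names and the C<sum_ints> function are hypothetical and are not libecb's actual definitions.

```c
#include <stddef.h>

/* Hypothetical hint macros, not libecb's definitions (assumption: GCC's
   __builtin_expect is available; elsewhere the hint is simply dropped). */
#if defined(__GNUC__)
  #define my_expect(expr, value) __builtin_expect ((expr), (value))
#else
  #define my_expect(expr, value) (expr)
#endif

/* !! normalises any true value to 1 so it can match the expected constant */
#define my_expect_true(cond)  my_expect (!!(cond), 1)
#define my_expect_false(cond) my_expect (!!(cond), 0)

/* Sums an array; the null-pointer case is marked as the rare path. */
long
sum_ints (const int *values, size_t count)
{
  long sum = 0;
  size_t i;

  if (my_expect_false (values == NULL))
    return 0;

  for (i = 0; i < count; ++i)
    sum += values [i];

  return sum;
}
```

The hint never changes the value of the condition, only the layout the compiler may choose for the two branches.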
+
 A very good example is in a function that reserves more space for some
 memory block (for example, inside an implementation of a string stream) -
 each time something is added, you have to check for a buffer overrun, but
@@ -184,7 +320,7 @@
    ecb_inline void
    reserve (int size)
    {
-     if (ecb_unlikely (current + size > end))
+     if (ecb_expect_false (current + size > end))
        real_reserve_method (size); /* presumably noinline */
    }
 
@@ -197,13 +333,13 @@
 conditions that might improve code generation, but which are impossible to
 deduce form the code itself.
 
-For example, the example reservation function from the C<ecb_unlikely>
+For example, the example reservation function from the C<ecb_expect_false>
 description could be written thus (only C<ecb_assume> was added):
 
    ecb_inline void
    reserve (int size)
    {
-     if (ecb_unlikely (current + size > end))
+     if (ecb_expect_false (current + size > end))
        real_reserve_method (size); /* presumably noinline */
 
      ecb_assume (current + size <= end);
@@ -262,7 +398,7 @@
 
 =back
 
-=head2 BIT FIDDLING / BITSTUFFS
+=head2 BIT FIDDLING / BIT WIZARDRY
 
 =over 4
 
@@ -274,37 +410,100 @@
 (most-significant byte first) or little endian (least-significant byte
 first) respectively.
 
+On systems that are neither, their return values are unspecified.
+
 =item int ecb_ctz32 (uint32_t x)
 
+=item int ecb_ctz64 (uint64_t x)
+
 Returns the index of the least significant bit set in C<x> (or
-equivalently the number of bits set to 0 before the least significant
-bit set), starting from 0. If C<x> is 0 the result is undefined. A
-common use case is to compute the integer binary logarithm, i.e.,
-floor(log2(n)). For example:
+equivalently the number of bits set to 0 before the least significant bit
+set), starting from 0. If C<x> is 0 the result is undefined.
+
+For smaller types than C<uint32_t> you can safely use C<ecb_ctz32>.
+
+For example:
 
    ecb_ctz32 (3) = 0
    ecb_ctz32 (6) = 1
 
+=item int ecb_ld32 (uint32_t x)
+
+=item int ecb_ld64 (uint64_t x)
+
+Returns the index of the most significant bit set in C<x>, or one less
+than the number of digits the number requires in binary (so that C<< 2**ld
+<= x < 2**(ld+1) >>). If C<x> is 0 the result is undefined. A common use
+case is to compute the integer binary logarithm, i.e. C<floor (log2 (x))>,
+for example to see how many bits a certain number requires to be encoded.
+
+This function is similar to the "count leading zero bits" function, except
+that that one returns how many zero bits are "in front" of the number (in
+the given data type), while C<ecb_ld> returns the index of the highest bit
+the number itself requires.
+
+For smaller types than C<uint32_t> you can safely use C<ecb_ld32>.
+
 =item int ecb_popcount32 (uint32_t x)
 
-Returns the number of bits set to 1 in C<x>. For example:
+=item int ecb_popcount64 (uint64_t x)
+
+Returns the number of bits set to 1 in C<x>.
+
+For smaller types than C<uint32_t> you can safely use C<ecb_popcount32>.
+
+For example:
 
    ecb_popcount32 (7) = 3
    ecb_popcount32 (255) = 8
 
+=item uint8_t ecb_bitrev8 (uint8_t x)
+
+=item uint16_t ecb_bitrev16 (uint16_t x)
+
+=item uint32_t ecb_bitrev32 (uint32_t x)
+
+Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes LSB+1
+and so on.
+
+Example:
+
+   ecb_bitrev8 (0xa7) = 0xe5
+   ecb_bitrev32 (0xffcc4411) = 0x882233ff
+
 =item uint32_t ecb_bswap16 (uint32_t x)
 
 =item uint32_t ecb_bswap32 (uint32_t x)
 
-These two functions return the value of the 16-bit (32-bit) variable
-C<x> after reversing the order of bytes.
+=item uint64_t ecb_bswap64 (uint64_t x)
 
-=item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
+These functions return the value of the 16-bit (32-bit, 64-bit) value
+C<x> after reversing the order of bytes (0x11223344 becomes 0x44332211 in
+C<ecb_bswap32>).
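To make the semantics of these functions concrete, here are slow but portable reference versions for the 32-bit variants. This is only a sketch: the C<ref_> names are hypothetical, and the loops are not how libecb implements these functions (with GCC, built-ins would be the natural choice).

```c
#include <stdint.h>

/* Portable reference sketches; the ref_ names are hypothetical and this is
   not how libecb implements these functions. */

/* Index of the lowest set bit; x must not be 0 (the loop would not end). */
int
ref_ctz32 (uint32_t x)
{
  int n = 0;

  while (!(x & 1))
    {
      x >>= 1;
      ++n;
    }

  return n;
}

/* Index of the highest set bit, i.e. floor (log2 (x)); x must not be 0. */
int
ref_ld32 (uint32_t x)
{
  int n = 0;

  while (x >>= 1)
    ++n;

  return n;
}

/* Number of bits set to 1. */
int
ref_popcount32 (uint32_t x)
{
  int n = 0;

  for (; x; x &= x - 1) /* clears the lowest set bit each round */
    ++n;

  return n;
}

/* Reverses the order of all 32 bits. */
uint32_t
ref_bitrev32 (uint32_t x)
{
  uint32_t r = 0;
  int i;

  for (i = 0; i < 32; ++i, x >>= 1)
    r = (r << 1) | (x & 1);

  return r;
}

/* Reverses the order of the four bytes. */
uint32_t
ref_bswap32 (uint32_t x)
{
  return (x >> 24)
       | ((x >> 8) & 0x0000ff00)
       | ((x << 8) & 0x00ff0000)
       | (x << 24);
}
```

Versions like these can serve as a test oracle when checking an optimised implementation against the documented behaviour.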
+
+=item uint8_t ecb_rotl8 (uint8_t x, unsigned int count)
+
+=item uint16_t ecb_rotl16 (uint16_t x, unsigned int count)
 
 =item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
 
-These two functions return the value of C<x> after shifting all the bits
-by C<count> positions to the right or left respectively.
+=item uint64_t ecb_rotl64 (uint64_t x, unsigned int count)
+
+=item uint8_t ecb_rotr8 (uint8_t x, unsigned int count)
+
+=item uint16_t ecb_rotr16 (uint16_t x, unsigned int count)
+
+=item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
+
+=item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)
+
+These two families of functions return the value of C<x> after rotating
+all the bits by C<count> positions to the right (C<ecb_rotr>) or left
+(C<ecb_rotl>).
+
+Current GCC versions understand these functions and usually compile them
+to "optimal" code (e.g. a single C<rol> or a combination of C<shld> on
+x86).
 
 =back
 
@@ -314,13 +513,38 @@
 
 =item x = ecb_mod (m, n)
 
-Returns the positive remainder of the modulo operation between C<m> and
-C<n>. Unlike the C modulo operator C<%>, this function ensures that the
-return value is always positive).
+Returns C<m> modulo C<n>, which is the same as the non-negative remainder
+of the division operation between C<m> and C<n>, using floored
+division. Unlike the C remainder operator C<%>, this function ensures that
+the return value is never negative and that the two numbers I<m> and
+I<m' = m + i * n> result in the same value modulo I<n> - in other words,
+C<ecb_mod> implements the mathematical modulo operation, which is missing
+in the language.
 
-C<n> must be strictly positive (i.e. C<< >1 >>), while C<m> must be
+C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
 negatable, that is, both C<m> and C<-m> must be representable in its
-type.
+type (this typically excludes the minimum signed integer value, the same
+limitation as for C</> and C<%> in C).
+
+Current GCC versions compile this into an efficient branchless sequence on
+almost all CPUs.
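The difference to the C<%> operator can be illustrated with a plain helper for C<int>. This is a sketch only: C<ref_mod> is a hypothetical name, it uses an ordinary conditional rather than any branchless trick, and it is not libecb's C<ecb_mod>.

```c
/* Hypothetical helper showing mathematical (floored) modulo for int;
   n must be >= 1. This is a sketch, not libecb's ecb_mod. */
int
ref_mod (int m, int n)
{
  int r = m % n; /* % truncates towards zero, so r is negative for some m < 0 */

  return r < 0 ? r + n : r;
}
```

With this definition C<ref_mod (-1, 3)> is 2, whereas C<-1 % 3> is -1 in C99 and later.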
+
+For example, when you want to rotate forward through the members of an
+array for increasing C<m> (which might be negative), then you should use
+C<ecb_mod>, as the C<%> operator might give either negative results, or
+change direction for negative values:
+
+   for (m = -100; m <= 100; ++m)
+     int elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
+
+=item x = ecb_div_rd (val, div)
+
+=item x = ecb_div_ru (val, div)
+
+Returns C<val> divided by C<div> rounded down or up, respectively.
+C<val> and C<div> must have integer types and C<div> must be strictly
+positive. Note that these functions are implemented with macros in C
+and with function templates in C++.
 
 =back
 
@@ -328,7 +552,7 @@
 
 =over 4
 
-=item element_count = ecb_array_length (name) [MACRO]
+=item element_count = ecb_array_length (name)
 
 Returns the number of elements in the array C<name>. For example: