/cvs/libecb/ecb.pod
Revision: 1.83
Committed: Mon Jan 20 21:08:19 2020 UTC (4 years, 5 months ago) by root
Branch: MAIN
Changes since 1.82: +3 -0 lines
Log Message:
*** empty log message ***

File Contents

# User Rev Content
1 root 1.14 =head1 LIBECB - e-C-Builtins
2 root 1.3
3 root 1.14 =head2 ABOUT LIBECB
4    
5     Libecb is currently a simple header file that doesn't require any
6     configuration to use or include in your project.
7    
8 sf-exg 1.16 It's part of the e-suite of libraries, other members of which include
9 root 1.14 libev and libeio.
10    
11     Its homepage can be found here:
12    
13     http://software.schmorp.de/pkg/libecb
14    
15     It mainly provides a number of wrappers around GCC built-ins, together
16     with replacement functions for other compilers. In addition to this,
17 sf-exg 1.16 it provides a number of other low-level C utilities, such as endianness
18 root 1.14 detection, byte swapping or bit rotations.
19    
20 root 1.24 Or in other words, things that should be built into any standard C system,
21     but aren't, implemented as efficiently as possible with GCC, and still
22     correct with other compilers.
23 root 1.17
24 root 1.14 More might come.
25 root 1.3
26     =head2 ABOUT THE HEADER
27    
28 root 1.14 At the moment, all you have to do is copy F<ecb.h> somewhere where your
29     compiler can find it and include it:
30    
31     #include <ecb.h>
32    
33     The header should work fine for both C and C++ compilation, and gives you
34     all of F<inttypes.h> in addition to the ECB symbols.
35    
36 sf-exg 1.16 There are currently no object files to link to - future versions might
37 root 1.14 come with an (optional) object code library to link against, to reduce
38     code size or gain access to additional features.
39    
40     It also currently includes everything from F<inttypes.h>.
41    
42     =head2 ABOUT THIS MANUAL / CONVENTIONS
43    
44     This manual mainly describes each (public) function available after
45     including the F<ecb.h> header. The header might define other symbols than
46     these, but these are not part of the public API, and not supported in any
47     way.
48    
49     When the manual mentions a "function" then this could be defined either as
50     an inline function, a macro, or an external symbol.
51    
52     When functions use a concrete standard type, such as C<int> or
53     C<uint32_t>, then the corresponding function works only with that type. If
54     only a generic name is used (C<expr>, C<cond>, C<value> and so on), then
55     the corresponding function relies on C to implement the correct types, and
56     is usually implemented as a macro. Specifically, a "bool" in this manual
57     refers to any kind of boolean value, not a specific type.
58 root 1.1
59 root 1.40 =head2 TYPES / TYPE SUPPORT
60    
61     ecb.h makes sure that the following types are defined (in the expected way):
62    
63 root 1.76 int8_t uint8_t
64     int16_t uint16_t
65     int32_t uint32_t
66     int64_t uint64_t
67     int_fast8_t uint_fast8_t
68     int_fast16_t uint_fast16_t
69     int_fast32_t uint_fast32_t
70     int_fast64_t uint_fast64_t
71     intptr_t uintptr_t
72 root 1.40
73     The macro C<ECB_PTRSIZE> is defined to the size of a pointer on this
74 root 1.45 platform (currently C<4> or C<8>) and can be used in preprocessor
75     expressions.
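As a rough illustration of why a preprocessor-visible pointer size is handy, the sketch below derives a similar constant from C<UINTPTR_MAX> and uses it to select a type at preprocessing time. The names C<MY_PTRSIZE> and C<my_hash_t> are hypothetical, and this is not necessarily how F<ecb.h> computes C<ECB_PTRSIZE>; with the header included you would test C<ECB_PTRSIZE> directly.

```c
#include <stdint.h>

/* MY_PTRSIZE is a hypothetical stand-in for ECB_PTRSIZE */
#if UINTPTR_MAX > 0xffffffffU
# define MY_PTRSIZE 8
#else
# define MY_PTRSIZE 4
#endif

/* select a wider hash type on 64 bit platforms, at preprocessing time */
#if MY_PTRSIZE >= 8
typedef uint64_t my_hash_t;
#else
typedef uint32_t my_hash_t;
#endif
```

Unlike C<sizeof (void *)>, such a constant can be used inside C<#if>, which is the point of having it as a macro.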
76 root 1.40
77 root 1.74 For C<ptrdiff_t> and C<size_t> use C<stddef.h>/C<cstddef>.
78 root 1.49
79 root 1.62 =head2 LANGUAGE/ENVIRONMENT/COMPILER VERSIONS
80 root 1.43
81 sf-exg 1.46 All the following symbols expand to an expression that can be tested in
82 root 1.44 preprocessor instructions as well as treated as a boolean (use C<!!> to
83     ensure it's either C<0> or C<1> if you need that).
84    
85 root 1.43 =over 4
86    
87 root 1.44 =item ECB_C
88    
89     True if the implementation defines the C<__STDC__> macro to a true value,
90 root 1.82 while not claiming to be C++, i.e. C, but not C++.
91 root 1.44
92 root 1.43 =item ECB_C99
93    
94 root 1.47 True if the implementation claims to be compliant to C99 (ISO/IEC
95 root 1.55 9899:1999) or any later version, while not claiming to be C++.
96 root 1.47
97     Note that later versions (ECB_C11) remove core features again (for
98     example, variable length arrays).
99 root 1.43
100 root 1.74 =item ECB_C11, ECB_C17
101 root 1.43
102 root 1.74 True if the implementation claims to be compliant to C11/C17 (ISO/IEC
103     9899:2011, :2018) or any later version, while not claiming to be C++.
104 root 1.44
105     =item ECB_CPP
106    
107     True if the implementation defines the C<__cplusplus> macro to a true
108     value, which is typically true for C++ compilers.
109    
110 root 1.74 =item ECB_CPP11, ECB_CPP14, ECB_CPP17
111 root 1.44
112 root 1.74 True if the implementation claims to be compliant to C++11/C++14/C++17
113     (ISO/IEC 14882:2011, :2014, :2017) or any later version.
114 root 1.43
115 root 1.83 Note that many C++20 features will likely have their own feature test
116     macros (see e.g. L<http://eel.is/c++draft/cpp.predefined#1.8>).
117    
118 root 1.81 =item ECB_OPTIMIZE_SIZE
119    
120     Is C<1> when the compiler optimizes for size, C<0> otherwise. This symbol
121     can also be defined before including F<ecb.h>, in which case it will be
122     unchanged.
123    
124 root 1.57 =item ECB_GCC_VERSION (major, minor)
125 root 1.43
126     Expands to a true value (suitable for testing by the preprocessor)
127 sf-exg 1.46 if the compiler used is GNU C and the version is the given version, or
128 root 1.43 higher.
129    
130     This macro tries to return false on compilers that claim to be GCC
131     compatible but aren't.
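A version test of this kind is commonly written along the following lines. C<MY_GCC_VERSION> and C<my_noinline> are hypothetical names, and this simplified sketch does not try to filter out compilers that merely claim GCC compatibility, which the real macro attempts to do.

```c
/* MY_GCC_VERSION is a hypothetical, simplified stand-in for ECB_GCC_VERSION */
#if defined __GNUC__ && defined __GNUC_MINOR__
# define MY_GCC_VERSION(major,minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
# define MY_GCC_VERSION(major,minor) 0
#endif

/* only use the attribute when the compiler is new enough */
#if MY_GCC_VERSION (3,1)
# define my_noinline __attribute__ ((__noinline__))
#else
# define my_noinline
#endif

static my_noinline int
add_one (int x)
{
  return x + 1;
}
```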
132    
133 root 1.50 =item ECB_EXTERN_C
134    
135     Expands to C<extern "C"> in C++, and a simple C<extern> in C.
136    
137     This can be used to declare a single external C function:
138    
139     ECB_EXTERN_C int printf (const char *format, ...);
140    
141     =item ECB_EXTERN_C_BEG / ECB_EXTERN_C_END
142    
143     These two macros can be used to wrap multiple C<extern "C"> definitions -
144     they expand to nothing in C.
145    
146     They are most useful in header files:
147    
148     ECB_EXTERN_C_BEG
149    
150     int mycfun1 (int x);
151     int mycfun2 (int x);
152    
153     ECB_EXTERN_C_END
154    
155     =item ECB_STDFP
156    
157     If this evaluates to a true value (suitable for testing by the
158     preprocessor), then C<float> and C<double> use IEEE 754 single/binary32
159     and double/binary64 representations internally I<and> the endianness of
160     both types match the endianness of C<uint32_t> and C<uint64_t>.
161    
162     This means you can just copy the bits of a C<float> (or C<double>) to an
163     C<uint32_t> (or C<uint64_t>) and get the raw IEEE 754 bit representation
164     without having to think about format or endianness.
165    
166     This is true for basically all modern platforms, although F<ecb.h> might
167     not be able to deduce this correctly everywhere and might err on the safe
168     side.
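The bit copy described above can be sketched with C<memcpy>. The function names are hypothetical, and the result is only the IEEE 754 binary32 pattern on platforms where the condition described for C<ECB_STDFP> holds (which this sketch simply assumes, along with a 4 byte C<float>).

```c
#include <stdint.h>
#include <string.h>

/* returns the raw bit pattern of f (assumes a 4 byte float) */
static uint32_t
float_to_bits (float f)
{
  uint32_t bits;

  memcpy (&bits, &f, sizeof bits); /* avoids aliasing problems of pointer casts */
  return bits;
}

/* the inverse operation */
static float
bits_to_float (uint32_t bits)
{
  float f;

  memcpy (&f, &bits, sizeof f);
  return f;
}
```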
169    
170 root 1.54 =item ECB_AMD64, ECB_AMD64_X32
171    
172     These two macros are defined to C<1> on the x86_64/amd64 ABI and the X32
173     ABI, respectively, and undefined elsewhere.
174    
175     The designers of the new X32 ABI for some inexplicable reason decided to
176     make it look exactly like amd64, even though it's completely incompatible
177     to that ABI, breaking about every piece of software that assumed that
178     C<__x86_64> stands for, well, the x86-64 ABI, making these macros
179     necessary.
180    
181 root 1.43 =back
182    
183 root 1.62 =head2 MACRO TRICKERY
184    
185     =over 4
186    
187     =item ECB_CONCAT (a, b)
188    
189     Expands any macros in C<a> and C<b>, then concatenates the result to form
190     a single token. This is mainly useful to form identifiers from components,
191     e.g.:
192    
193     #define S1 str
194     #define S2 cpy
195    
196     ECB_CONCAT (S1, S2)(dst, src); // == strcpy (dst, src);
197    
198     =item ECB_STRINGIFY (arg)
199    
200     Expands any macros in C<arg> and returns the stringified version of
201     it. This is mainly useful to get the contents of a macro in string form,
202     e.g.:
203    
204     #define SQL_LIMIT 100
205     sql_exec ("select * from table limit " ECB_STRINGIFY (SQL_LIMIT));
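The "expands any macros first" behaviour of both macros is usually achieved with a two-level definition. The following sketch uses hypothetical C<MY_> names and is not necessarily how F<ecb.h> defines them:

```c
#include <string.h>

/* hypothetical stand-ins for ECB_CONCAT and ECB_STRINGIFY */
#define MY_CONCAT_(a, b) a ## b
#define MY_CONCAT(a, b) MY_CONCAT_ (a, b)  /* arguments are expanded first */
#define MY_STRINGIFY_(a) # a
#define MY_STRINGIFY(a) MY_STRINGIFY_ (a)  /* argument is expanded first */

#define S1 str
#define S2 len
#define SQL_LIMIT 100

static size_t
demo_concat (void)
{
  return MY_CONCAT (S1, S2) ("abc"); /* calls strlen ("abc") */
}

static const char *
demo_stringify (void)
{
  return "limit " MY_STRINGIFY (SQL_LIMIT); /* "limit 100" */
}
```

Using the inner C<MY_STRINGIFY_> directly would yield C<"SQL_LIMIT"> instead, because C<#> suppresses expansion of its operand.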
206    
207 root 1.64 =item ECB_STRINGIFY_EXPR (expr)
208    
209     Like C<ECB_STRINGIFY>, but additionally evaluates C<expr> to make sure it
210     is a valid expression. This is useful to catch typos or cases where the
211     macro isn't available:
212    
213     #include <errno.h>
214    
215     ECB_STRINGIFY (EDOM); // "33" (on my system at least)
216     ECB_STRINGIFY_EXPR (EDOM); // "33"
217    
218     // now imagine we had a typo:
219    
220     ECB_STRINGIFY (EDAM); // "EDAM"
221     ECB_STRINGIFY_EXPR (EDAM); // error: EDAM undefined
222    
223 root 1.62 =back
224    
225 sf-exg 1.60 =head2 ATTRIBUTES
226 root 1.1
227 sf-exg 1.60 A major part of libecb deals with additional attributes that can be
228     assigned to functions, variables and sometimes even types - much like
229     C<const> or C<volatile> in C. They are implemented using either GCC
230     attributes or other compiler/language specific features. Attribute
231     declarations must be put before the whole declaration:
232 root 1.20
233     ecb_const int mysqrt (int a);
234     ecb_unused int i;
235    
236 root 1.1 =over 4
237    
238 root 1.3 =item ecb_unused
239    
240     Marks a function or a variable as "unused", which simply suppresses a
241     warning by GCC when it detects it as unused. This is useful when you e.g.
242     declare a variable but do not always use it:
243    
244 root 1.15 {
245 sf-exg 1.61 ecb_unused int var;
246 root 1.3
247 root 1.15 #ifdef SOMECONDITION
248     var = ...;
249     return var;
250     #else
251     return 0;
252     #endif
253     }
254 root 1.3
255 root 1.56 =item ecb_deprecated
256    
257     Similar to C<ecb_unused>, but marks a function, variable or type as
258     deprecated. This makes some compilers warn when it is used.
259    
260 root 1.62 =item ecb_deprecated_message (message)
261    
262 root 1.67 Same as C<ecb_deprecated>, but if possible, the specified diagnostic is
263 root 1.62 used instead of a generic deprecation message when the object is being
264     used.
265    
266 root 1.31 =item ecb_inline
267 root 1.29
268 root 1.73 Expands either to (a compiler-specific equivalent of) C<static inline> or
269     to just C<static>, if inline isn't supported. It should be used to declare
270     functions that should be inlined, for code size or speed reasons.
271 root 1.29
272     Example: inline this function, it surely will reduce codesize.
273    
274 root 1.31 ecb_inline int
275 root 1.29 negmul (int a, int b)
276     {
277     return - (a * b);
278     }
279    
280 root 1.2 =item ecb_noinline
281    
282 sf-exg 1.66 Prevents a function from being inlined - it might be optimised away, but
283 root 1.3 not inlined into other functions. This is useful if you know your function
284     is rarely called and large enough for inlining not to be helpful.
285    
286 root 1.2 =item ecb_noreturn
287    
288 root 1.17 Marks a function as "not returning, ever". Some typical functions that
289     don't return are C<exit> or C<abort> (which really works hard to not
290     return), and now you can make your own:
291    
292     ecb_noreturn void
293     my_abort (const char *errline)
294     {
295     puts (errline);
296     abort ();
297     }
298    
299 sf-exg 1.19 In this case, the compiler would probably be smart enough to deduce it on
300     its own, so this is mainly useful for declarations.
301 root 1.17
302 root 1.53 =item ecb_restrict
303    
304     Expands to the C<restrict> keyword or equivalent on compilers that support
305     them, and to nothing on others. Must be specified on a pointer type or
306     an array index to indicate that the memory doesn't alias with any other
307     restricted pointer in the same scope.
308    
309     Example: multiply a vector, and allow the compiler to parallelise the
310     loop, because it knows it doesn't overwrite input values.
311    
312     void
313 sf-exg 1.61 multiply (ecb_restrict float *src,
314     ecb_restrict float *dst,
315 root 1.53 int len, float factor)
316     {
317     int i;
318    
319     for (i = 0; i < len; ++i)
320     dst [i] = src [i] * factor;
321     }
322    
323 root 1.2 =item ecb_const
324    
325 sf-exg 1.19 Declares that the function only depends on the values of its arguments,
326 root 1.17 much like a mathematical function. It specifically does not read or write
327     any memory any arguments might point to, global variables, or call any
328     non-const functions. It also must not have any side effects.
329    
330     Such a function can be optimised much more aggressively by the compiler -
331     for example, multiple calls with the same arguments can be optimised into
332     a single call, which wouldn't be possible if the compiler would have to
333     expect any side effects.
334    
335     It is best suited for functions in the sense of mathematical functions,
336 sf-exg 1.19 such as a function returning the square root of its input argument.
337 root 1.17
338     Not suited would be a function that calculates the hash of some memory
339     area you pass in, prints some messages or looks at a global variable to
340     decide on rounding.
341    
342     See C<ecb_pure> for a slightly less restrictive class of functions.
343    
344 root 1.2 =item ecb_pure
345    
346 root 1.17 Similar to C<ecb_const>, declares a function that has no side
347     effects. Unlike C<ecb_const>, the function is allowed to examine global
348     variables and any other memory areas (such as the ones passed to it via
349     pointers).
350    
351     While these functions cannot be optimised as aggressively as C<ecb_const>
352     functions, they can still be optimised away on many occasions, and the
353     compiler has more freedom in moving calls to them around.
354    
355     Typical examples for such functions would be C<strlen> or C<memcmp>. A
356     function that calculates the MD5 sum of some input and updates some MD5
357     state passed as argument would I<NOT> be pure, however, as it would modify
358     some memory area that is not the return value.
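To make the difference between the two classes concrete, here is a small sketch. C<my_const> and C<my_pure> are hypothetical stand-ins that map to the corresponding GCC attributes when available; the two example functions are likewise made up for illustration.

```c
#include <stddef.h>

/* hypothetical stand-ins for ecb_const and ecb_pure */
#if defined __GNUC__
# define my_const __attribute__ ((__const__))
# define my_pure  __attribute__ ((__pure__))
#else
# define my_const
# define my_pure
#endif

/* depends only on the value of its argument: a candidate for "const" */
my_const static unsigned int
isqrt (unsigned int n)
{
  unsigned int r = 0;

  while (r + 1 <= n / (r + 1)) /* (r + 1) * (r + 1) <= n, without overflow */
    ++r;

  return r;
}

/* reads memory through its argument, but has no side effects: "pure" */
my_pure static size_t
count_char (const char *s, char c)
{
  size_t n = 0;

  for (; *s; ++s)
    n += *s == c;

  return n;
}
```

C<count_char> must not be marked const, because its result depends on the memory C<s> points to, not just on the pointer value.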
359    
360 root 1.2 =item ecb_hot
361    
362 root 1.17 This declares a function as "hot" with regards to the cache - the function
363     is used so often, that it is very beneficial to keep it in the cache if
364     possible.
365    
366     The compiler reacts by trying to place hot functions near to each other in
367     memory.
368    
369 sf-exg 1.19 Whether a function is hot or not often depends on the whole program,
370 root 1.17 and less on the function itself. C<ecb_cold> is likely more useful in
371     practice.
372    
373 root 1.2 =item ecb_cold
374    
375 root 1.17 The opposite of C<ecb_hot> - declares a function as "cold" with regards to
376     the cache, or in other words, this function is not called often, or not at
377     speed-critical times, and keeping it in the cache might be a waste of said
378     cache.
379    
380     In addition to placing cold functions together (or at least away from hot
381     functions), this knowledge can be used in other ways, for example, the
382     function will be optimised for size, as opposed to speed, and codepaths
383     leading to calls to those functions can automatically be marked as if
384 root 1.27 C<ecb_expect_false> had been used to reach them.
385 root 1.17
386     Good examples for such functions would be error reporting functions, or
387     functions only called in exceptional or rare cases.
388    
389 root 1.2 =item ecb_artificial
390    
391 root 1.17 Declares the function as "artificial", in this case meaning that this
392 root 1.52 function is not really meant to be a function, but more like an accessor
393 root 1.17 - many methods in C++ classes are mere accessor functions, and having a
394     crash reported in such a method, or single-stepping through them, is not
395     usually so helpful, especially when it's inlined to just a few instructions.
396    
397     Marking them as artificial will instruct the debugger about just this,
398     leading to happier debugging and thus happier lives.
399    
400     Example: in some kind of smart-pointer class, mark the pointer accessor as
401     artificial, so that the whole class acts more like a pointer and less like
402     some C++ abstraction monster.
403    
404     template<typename T>
405     struct my_smart_ptr
406     {
407     T *value;
408    
409     ecb_artificial
410     operator T *()
411     {
412     return value;
413     }
414     };
415    
416 root 1.2 =back
417 root 1.1
418     =head2 OPTIMISATION HINTS
419    
420     =over 4
421    
422 root 1.58 =item bool ecb_is_constant (expr)
423 root 1.1
424 root 1.3 Returns true iff the expression can be deduced to be a compile-time
425     constant, and false otherwise.
426    
427     For example, when you have a C<rndm16> function that returns a 16 bit
428     random number, and you have a function that maps this to a range from
429 root 1.5 0..n-1, then you could use this inline function in a header file:
430 root 1.3
431     ecb_inline uint32_t
432     rndm (uint32_t n)
433     {
434 root 1.6 return (n * (uint32_t)rndm16 ()) >> 16;
435 root 1.3 }
436    
437     However, for powers of two, you could use a normal mask, but that is only
438     worth it if, at compile time, you can detect this case. This is the case
439     when the passed number is a constant and also a power of two (C<(n & (n -
440     1)) == 0>):
441    
442     ecb_inline uint32_t
443     rndm (uint32_t n)
444     {
445     return ecb_is_constant (n) && !(n & (n - 1))
446     ? rndm16 () & (n - 1)
447 root 1.6 : (n * (uint32_t)rndm16 ()) >> 16;
448 root 1.3 }
449    
450 root 1.62 =item ecb_expect (expr, value)
451 root 1.1
452 root 1.7 Evaluates C<expr> and returns it. In addition, it tells the compiler that
453     the C<expr> evaluates to C<value> a lot, which can be used for static
454     branch optimisations.
455 root 1.1
456 root 1.27 Usually, you want to use the more intuitive C<ecb_expect_true> and
457     C<ecb_expect_false> functions instead.
458 root 1.1
459 root 1.27 =item bool ecb_expect_true (cond)
460 root 1.1
461 root 1.27 =item bool ecb_expect_false (cond)
462 root 1.1
463 root 1.7 These two functions expect an expression that is true or false and return
464     C<1> or C<0>, respectively, so when used in the condition of an C<if> or
465     other conditional statement, it will not change the program:
466    
467     /* these two do the same thing */
468     if (some_condition) ...;
469 root 1.27 if (ecb_expect_true (some_condition)) ...;
470 root 1.7
471 root 1.27 However, by using C<ecb_expect_true>, you tell the compiler that the
472     condition is likely to be true (and for C<ecb_expect_false>, that it is
473     unlikely to be true).
474 root 1.7
475 root 1.9 For example, when you check for a null pointer and expect this to be a
476 root 1.27 rare, exceptional, case, then use C<ecb_expect_false>:
477 root 1.7
478     void my_free (void *ptr)
479     {
480 root 1.27 if (ecb_expect_false (ptr == 0))
481 root 1.7 return;
482     }
483    
484     Consistent use of these functions to mark away exceptional cases or to
485     tell the compiler what the hot path through a function is can increase
486     performance considerably.
487    
488 root 1.27 You might know these functions under the name C<likely> and C<unlikely>
489     - while these are common aliases, we find that the expect name is easier
490     to understand when quickly skimming code. If you wish, you can use
491     C<ecb_likely> instead of C<ecb_expect_true> and C<ecb_unlikely> instead of
492     C<ecb_expect_false> - these are simply aliases.
493    
494 root 1.7 A very good example is in a function that reserves more space for some
495     memory block (for example, inside an implementation of a string stream) -
496 root 1.9 each time something is added, you have to check for a buffer overrun, but
497 root 1.7 you expect that most checks will turn out to be false:
498    
499     /* make sure we have "size" extra room in our buffer */
500     ecb_inline void
501     reserve (int size)
502     {
503 root 1.27 if (ecb_expect_false (current + size > end))
504 root 1.7 real_reserve_method (size); /* presumably noinline */
505     }
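For readers who know these hints as C<likely>/C<unlikely>, the following sketch shows how such macros are typically built on GCC's C<__builtin_expect>. All C<my_> names and C<safe_len> are hypothetical; F<ecb.h> may define its versions differently.

```c
#include <stddef.h>

/* hypothetical stand-ins for ecb_expect, ecb_expect_true, ecb_expect_false */
#if defined __GNUC__
# define my_expect(expr, value) __builtin_expect ((expr), (value))
#else
# define my_expect(expr, value) (expr)
#endif

/* !! normalises the condition to 0 or 1 */
#define my_expect_true(cond)  my_expect (!!(cond), 1)
#define my_expect_false(cond) my_expect (!!(cond), 0)

/* returns the length of s, treating a null pointer as the rare case */
static size_t
safe_len (const char *s)
{
  size_t n = 0;

  if (my_expect_false (s == 0))
    return 0;

  while (s [n])
    ++n;

  return n;
}
```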
506    
507 root 1.62 =item ecb_assume (cond)
508 root 1.7
509 sf-exg 1.66 Tries to tell the compiler that some condition is true, even if it's not
510 root 1.65 obvious. This is not a function, but a statement: it cannot be used in
511     another expression.
512 root 1.7
513     This can be used to teach the compiler about invariants or other
514     conditions that might improve code generation, but which are impossible to
515     deduce from the code itself.
516    
517 root 1.27 For example, the example reservation function from the C<ecb_expect_false>
518 root 1.7 description could be written thus (only C<ecb_assume> was added):
519    
520     ecb_inline void
521     reserve (int size)
522     {
523 root 1.27 if (ecb_expect_false (current + size > end))
524 root 1.7 real_reserve_method (size); /* presumably noinline */
525    
526     ecb_assume (current + size <= end);
527     }
528    
529     If you then call this function twice, like this:
530    
531     reserve (10);
532     reserve (1);
533    
534     Then the compiler I<might> be able to optimise out the second call
535     completely, as it knows that C<< current + 1 > end >> is false and the
536     call will never be executed.
537    
538 root 1.62 =item ecb_unreachable ()
539 root 1.7
540     This function does nothing itself, except tell the compiler that it will
541 root 1.9 never be executed. Apart from suppressing a warning in some cases, this
542 root 1.65 function can be used to implement C<ecb_assume> or similar functionality.
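One plausible way to build an "assume" statement on top of an "unreachable" primitive is sketched below. C<my_unreachable>, C<my_assume> and C<mod8> are hypothetical names, and the version check is an assumption about when C<__builtin_unreachable> is available.

```c
/* hypothetical stand-ins for ecb_unreachable and ecb_assume */
#if defined __clang__ || (defined __GNUC__ && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)))
# define my_unreachable() __builtin_unreachable ()
#else
# define my_unreachable() ((void)0)
#endif

/* a statement, not an expression */
#define my_assume(cond) do { if (!(cond)) my_unreachable (); } while (0)

static unsigned int
mod8 (unsigned int x)
{
  my_assume (x < 64); /* the caller promises this; undefined behaviour otherwise */
  return x % 8;
}
```

Note that a violated assumption is not checked at run time: if the condition is ever false, behaviour is undefined.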
543 root 1.7
544 root 1.62 =item ecb_prefetch (addr, rw, locality)
545 root 1.7
546     Tells the compiler to try to prefetch memory at the given C<addr>ess
547 root 1.10 for either reading (C<rw> = 0) or writing (C<rw> = 1). A C<locality> of
548 root 1.7 C<0> means that there will only be one access later, C<3> means that
549     the data will likely be accessed very often, and values in between mean
550     something... in between. The memory pointed to by the address does not
551     need to be accessible (it could be a null pointer for example), but C<rw>
552     and C<locality> must be compile-time constants.
553    
554 root 1.65 This is a statement, not a function: you cannot use it as part of an
555     expression.
556    
557 root 1.7 An obvious way to use this is to prefetch some data far away, in a big
558 root 1.9 array you loop over. This prefetches memory some 128 array elements later,
559 root 1.7 in the hope that it will be ready when the CPU arrives at that location.
560    
561     int sum = 0;
562    
563     for (i = 0; i < N; ++i)
564     {
565     sum += arr [i];
566     ecb_prefetch (arr + i + 128, 0, 0);
567     }
568    
569     It's hard to predict how far to prefetch, and most CPUs that can prefetch
570     are often good enough to predict this kind of behaviour themselves. It
571     gets more interesting with linked lists, especially when you do some fair
572     processing on each list element:
573    
574     for (node *n = start; n; n = n->next)
575     {
576     ecb_prefetch (n->next, 0, 0);
577     ... do medium amount of work with *n
578     }
579    
580     After processing the node, (part of) the next node might already be in
581     cache.
582 root 1.1
583 root 1.2 =back
584 root 1.1
585 root 1.36 =head2 BIT FIDDLING / BIT WIZARDRY
586 root 1.1
587 root 1.4 =over 4
588    
589 root 1.3 =item bool ecb_big_endian ()
590    
591     =item bool ecb_little_endian ()
592    
593 sf-exg 1.11 These two functions return true if the byte order is big endian
594     (most-significant byte first) or little endian (least-significant byte
595     first) respectively.
596    
597 root 1.24 On systems that are neither, their return values are unspecified.
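What "byte order" means here can be shown with a run-time check that inspects the first byte of a known value. This is only an illustrative sketch with hypothetical names; it says nothing about how F<ecb.h> itself detects endianness.

```c
#include <stdint.h>
#include <string.h>

/* returns the first byte in memory of the 32 bit value 0x11223344 */
static unsigned char
first_byte (void)
{
  uint32_t v = 0x11223344;
  unsigned char b [sizeof v];

  memcpy (b, &v, sizeof v);
  return b [0];
}

/* most-significant byte first */
static int my_big_endian    (void) { return first_byte () == 0x11; }
/* least-significant byte first */
static int my_little_endian (void) { return first_byte () == 0x44; }
```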
598    
599 root 1.3 =item int ecb_ctz32 (uint32_t x)
600    
601 root 1.35 =item int ecb_ctz64 (uint64_t x)
602    
603 root 1.77 =item int ecb_ctz (T x) [C++]
604    
605 sf-exg 1.11 Returns the index of the least significant bit set in C<x> (or
606 root 1.24 equivalently the number of bits set to 0 before the least significant bit
607 root 1.35 set), starting from 0. If C<x> is 0 the result is undefined.
608    
609 root 1.36 For smaller types than C<uint32_t> you can safely use C<ecb_ctz32>.
610    
611 root 1.77 The overloaded C++ C<ecb_ctz> function supports C<uint8_t>, C<uint16_t>,
612     C<uint32_t> and C<uint64_t> types.
613    
614 root 1.35 For example:
615 sf-exg 1.11
616 root 1.15 ecb_ctz32 (3) = 0
617     ecb_ctz32 (6) = 1
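The semantics can be pinned down with a deliberately slow, portable sketch. C<my_ctz32> is a hypothetical name; a real implementation would use a compiler built-in where available.

```c
#include <stdint.h>

/* counts the zero bits below the lowest set bit; x must not be 0 */
static int
my_ctz32 (uint32_t x)
{
  int n = 0;

  while (!(x & 1))
    {
      x >>= 1;
      ++n;
    }

  return n;
}
```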
618 sf-exg 1.11
619 root 1.41 =item bool ecb_is_pot32 (uint32_t x)
620    
621     =item bool ecb_is_pot64 (uint64_t x)
622    
623 root 1.77 =item bool ecb_is_pot (T x) [C++]
624    
625 sf-exg 1.66 Returns true iff C<x> is a power of two or C<x == 0>.
626 root 1.41
627 sf-exg 1.66 For smaller types than C<uint32_t> you can safely use C<ecb_is_pot32>.
628 root 1.41
629 root 1.77 The overloaded C++ C<ecb_is_pot> function supports C<uint8_t>, C<uint16_t>,
630     C<uint32_t> and C<uint64_t> types.
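The classic test behind this function clears the lowest set bit and checks whether anything remains. The sketch below uses the hypothetical name C<my_is_pot32>:

```c
#include <stdint.h>

/* true for powers of two and, like the documented function, for 0 */
static int
my_is_pot32 (uint32_t x)
{
  /* the parentheses matter: == binds tighter than & */
  return (x & (x - 1)) == 0;
}
```

If zero must be rejected, add an explicit C<x != 0> check in the caller.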
631    
632 root 1.35 =item int ecb_ld32 (uint32_t x)
633    
634     =item int ecb_ld64 (uint64_t x)
635    
636 root 1.77 =item int ecb_ld (T x) [C++]
637    
638 root 1.35 Returns the index of the most significant bit set in C<x>, or the number
639     of digits the number requires in binary (so that C<< 2**ld <= x <
640     2**(ld+1) >>). If C<x> is 0 the result is undefined. A common use case is
641     to compute the integer binary logarithm, i.e. C<floor (log2 (n))>, for
642     example to see how many bits a certain number requires to be encoded.
643    
644     This function is similar to the "count leading zero bits" function, except
645     that that one returns how many zero bits are "in front" of the number (in
646     the given data type), while C<ecb_ld> returns how many bits the number
647     itself requires.
648    
649 root 1.36 For smaller types than C<uint32_t> you can safely use C<ecb_ld32>.
650    
651 root 1.77 The overloaded C++ C<ecb_ld> function supports C<uint8_t>, C<uint16_t>,
652     C<uint32_t> and C<uint64_t> types.
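A simple portable sketch of the integer binary logarithm, under the hypothetical name C<my_ld32>, looks like this:

```c
#include <stdint.h>

/* index of the most significant set bit, i.e. floor (log2 (x)) for x > 0 */
static int
my_ld32 (uint32_t x)
{
  int r = 0;

  while (x >>= 1)
    ++r;

  return r;
}
```

This sketch happens to return 0 for an input of 0; the documented functions leave that case undefined, so callers should not rely on it.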
653    
654 root 1.3 =item int ecb_popcount32 (uint32_t x)
655    
656 root 1.35 =item int ecb_popcount64 (uint64_t x)
657    
658 root 1.77 =item int ecb_popcount (T x) [C++]
659    
660 root 1.36 Returns the number of bits set to 1 in C<x>.
661    
662     For smaller types than C<uint32_t> you can safely use C<ecb_popcount32>.
663    
664 root 1.77 The overloaded C++ C<ecb_popcount> function supports C<uint8_t>, C<uint16_t>,
665     C<uint32_t> and C<uint64_t> types.
666    
667 root 1.36 For example:
668 sf-exg 1.11
669 root 1.15 ecb_popcount32 (7) = 3
670     ecb_popcount32 (255) = 8
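For reference, a portable population count can be written by repeatedly clearing the lowest set bit. C<my_popcount32> is a hypothetical name for this sketch:

```c
#include <stdint.h>

/* counts the bits set to 1 in x */
static int
my_popcount32 (uint32_t x)
{
  int n = 0;

  while (x)
    {
      x &= x - 1; /* clears the lowest set bit */
      ++n;
    }

  return n;
}
```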
671 sf-exg 1.11
672 root 1.39 =item uint8_t ecb_bitrev8 (uint8_t x)
673    
674     =item uint16_t ecb_bitrev16 (uint16_t x)
675    
676     =item uint32_t ecb_bitrev32 (uint32_t x)
677    
678 root 1.77 =item T ecb_bitrev (T x) [C++]
679    
680 root 1.39 Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes LSB+1
681     and so on.
682    
683 root 1.77 The overloaded C++ C<ecb_bitrev> function supports C<uint8_t>, C<uint16_t> and C<uint32_t> types.
684    
685 root 1.39 Example:
686    
687     ecb_bitrev8 (0xa7) = 0xe5
688     ecb_bitrev32 (0xffcc4411) = 0x882233ff
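A straightforward (and slow) way to express bit reversal is to shift bits out of one value and into another. The generic helper C<my_bitrev> below is hypothetical and takes the width as a parameter, so it can model the 8, 16 and 32 bit variants:

```c
#include <stdint.h>

/* reverses the lowest "bits" bits of x (bits must be 1..32) */
static uint32_t
my_bitrev (uint32_t x, int bits)
{
  uint32_t r = 0;
  int i;

  for (i = 0; i < bits; ++i)
    {
      r = (r << 1) | (x & 1); /* the LSB of x becomes the new LSB of r */
      x >>= 1;
    }

  return r;
}
```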
689    
690 root 1.77 =item T ecb_bitrev (T x) [C++]
691    
692     Overloaded C++ bitrev function.
693    
694     C<T> must be one of C<uint8_t>, C<uint16_t> or C<uint32_t>.
695    
696 root 1.8 =item uint32_t ecb_bswap16 (uint32_t x)
697    
698 root 1.3 =item uint32_t ecb_bswap32 (uint32_t x)
699    
700 root 1.34 =item uint64_t ecb_bswap64 (uint64_t x)
701 sf-exg 1.13
702 root 1.78 =item T ecb_bswap (T x) [C++]
703    
704 root 1.34 These functions return the value of the 16-bit (32-bit, 64-bit) value
705     C<x> after reversing the order of bytes (0x11223344 becomes 0x44332211 in
706     C<ecb_bswap32>).
707    
708 root 1.77 The overloaded C++ C<ecb_bswap> function supports C<uint8_t>, C<uint16_t>,
709     C<uint32_t> and C<uint64_t> types.
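The byte reversal can be spelled out with shifts and masks. C<my_bswap32> is a hypothetical name for this portable sketch:

```c
#include <stdint.h>

/* reverses the order of the four bytes in x */
static uint32_t
my_bswap32 (uint32_t x)
{
  return  (x >> 24)
       | ((x >>  8) & 0x0000ff00u)
       | ((x <<  8) & 0x00ff0000u)
       |  (x << 24);
}
```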
710 root 1.76
711 root 1.34 =item uint8_t ecb_rotl8 (uint8_t x, unsigned int count)
712    
713     =item uint16_t ecb_rotl16 (uint16_t x, unsigned int count)
714 root 1.3
715     =item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
716    
717 root 1.34 =item uint64_t ecb_rotl64 (uint64_t x, unsigned int count)
718    
719     =item uint8_t ecb_rotr8 (uint8_t x, unsigned int count)
720    
721     =item uint16_t ecb_rotr16 (uint16_t x, unsigned int count)
722    
723     =item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
724    
725 root 1.33 =item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)
726    
727 root 1.34 These two families of functions return the value of C<x> after rotating
728     all the bits by C<count> positions to the right (C<ecb_rotr>) or left
729     (C<ecb_rotl>).
730 sf-exg 1.11
731 root 1.20 Current GCC versions understand these functions and usually compile them
732 root 1.34 to "optimal" code (e.g. a single C<rol> or a combination of C<shld> on
733     x86).
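A common portable formulation of rotation, which compilers tend to recognise, masks the shift counts so that a count of 0 (or a multiple of the width) does not shift by the full width. The C<my_> names are hypothetical:

```c
#include <stdint.h>

/* rotates x left by count bits; the count is taken modulo 32 */
static uint32_t
my_rotl32 (uint32_t x, unsigned int count)
{
  return (x << (count & 31)) | (x >> (-count & 31));
}

/* rotates x right by count bits; the count is taken modulo 32 */
static uint32_t
my_rotr32 (uint32_t x, unsigned int count)
{
  return (x >> (count & 31)) | (x << (-count & 31));
}
```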
734 root 1.20
735 root 1.77 =item T ecb_rotl (T x, unsigned int count) [C++]
736    
737     =item T ecb_rotr (T x, unsigned int count) [C++]
738    
739     Overloaded C++ rotl/rotr functions.
740    
741     C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
742    
743 root 1.3 =back
744 root 1.1
745 root 1.76 =head2 HOST ENDIANNESS CONVERSION
746    
747     =over 4
748    
749     =item uint_fast16_t ecb_be_u16_to_host (uint_fast16_t v)
750    
751     =item uint_fast32_t ecb_be_u32_to_host (uint_fast32_t v)
752    
753     =item uint_fast64_t ecb_be_u64_to_host (uint_fast64_t v)
754    
755     =item uint_fast16_t ecb_le_u16_to_host (uint_fast16_t v)
756    
757     =item uint_fast32_t ecb_le_u32_to_host (uint_fast32_t v)
758    
759     =item uint_fast64_t ecb_le_u64_to_host (uint_fast64_t v)
760    
761     Convert an unsigned 16, 32 or 64 bit value from big or little endian to host byte order.
762    
763     The naming convention is C<ecb_>(C<be>|C<le>)C<_u>C<16|32|64>C<_to_host>,
764 root 1.79 where C<be> and C<le> stand for big endian and little endian, respectively.
765 root 1.76
766     =item uint_fast16_t ecb_host_to_be_u16 (uint_fast16_t v)
767    
768     =item uint_fast32_t ecb_host_to_be_u32 (uint_fast32_t v)
769    
770     =item uint_fast64_t ecb_host_to_be_u64 (uint_fast64_t v)
771    
772     =item uint_fast16_t ecb_host_to_le_u16 (uint_fast16_t v)
773    
774     =item uint_fast32_t ecb_host_to_le_u32 (uint_fast32_t v)
775    
776     =item uint_fast64_t ecb_host_to_le_u64 (uint_fast64_t v)
777    
778     Like above, but converts I<from> host byte order to the specified
779     endianness.
780    
781     =back
782    
783 root 1.77 In C++ the following additional template functions are supported:
784 root 1.76
785     =over 4
786    
787     =item T ecb_be_to_host (T v)
788    
789     =item T ecb_le_to_host (T v)
790    
791     =item T ecb_host_to_be (T v)
792    
793     =item T ecb_host_to_le (T v)
794    
795 root 1.77 These functions work like their C counterparts, above, but use templates,
796     which make them useful in generic code.
797 root 1.76
798     C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>
799     (so unlike their C counterparts, there is a version for C<uint8_t>, which
800     again can be useful in generic code).
801    
=back

802     =head2 UNALIGNED LOAD/STORE
803    
804     These functions load or store unaligned multi-byte values.
805    
806     =over 4
807    
808     =item uint_fast16_t ecb_peek_u16_u (const void *ptr)
809    
810     =item uint_fast32_t ecb_peek_u32_u (const void *ptr)
811    
812     =item uint_fast64_t ecb_peek_u64_u (const void *ptr)
813    
814     These functions load an unaligned, unsigned 16, 32 or 64 bit value from
815     memory.
816    
817     =item uint_fast16_t ecb_peek_be_u16_u (const void *ptr)
818    
819     =item uint_fast32_t ecb_peek_be_u32_u (const void *ptr)
820    
821     =item uint_fast64_t ecb_peek_be_u64_u (const void *ptr)
822    
823     =item uint_fast16_t ecb_peek_le_u16_u (const void *ptr)
824    
825     =item uint_fast32_t ecb_peek_le_u32_u (const void *ptr)
826    
827     =item uint_fast64_t ecb_peek_le_u64_u (const void *ptr)
828    
829     Like above, but additionally convert from big endian (C<be>) or little
830     endian (C<le>) byte order to host byte order while doing so.
831    
832     =item ecb_poke_u16_u (void *ptr, uint16_t v)
833    
834     =item ecb_poke_u32_u (void *ptr, uint32_t v)
835    
836     =item ecb_poke_u64_u (void *ptr, uint64_t v)
837    
838     These functions store an unaligned, unsigned 16, 32 or 64 bit value to
839     memory.
840    
841     =item ecb_poke_be_u16_u (void *ptr, uint_fast16_t v)
842    
843     =item ecb_poke_be_u32_u (void *ptr, uint_fast32_t v)
844    
845     =item ecb_poke_be_u64_u (void *ptr, uint_fast64_t v)
846    
847     =item ecb_poke_le_u16_u (void *ptr, uint_fast16_t v)
848    
849     =item ecb_poke_le_u32_u (void *ptr, uint_fast32_t v)
850    
851     =item ecb_poke_le_u64_u (void *ptr, uint_fast64_t v)
852    
853     Like above, but additionally convert from host byte order to big endian
854     (C<be>) or little endian (C<le>) byte order while doing so.
855    
856     =back
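To show the idea behind unaligned access, the following C sketch loads and stores 32 bit values through C<memcpy> and byte-wise access, both of which are valid for any address. All C<sketch_> names are hypothetical and not part of F<ecb.h>. This is one possible approach and not necessarily how the library implements these functions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of ecb.h. */

/* Load a 32 bit value in host byte order from a possibly unaligned address. */
static uint32_t
sketch_peek_u32_u (const void *ptr)
{
  uint32_t v;

  memcpy (&v, ptr, sizeof (v)); /* memcpy has no alignment requirement */
  return v;
}

/* Load a big endian 32 bit value from a possibly unaligned address. */
static uint32_t
sketch_peek_be_u32_u (const void *ptr)
{
  const unsigned char *p = (const unsigned char *)ptr;

  return ((uint32_t)p[0] << 24)
       | ((uint32_t)p[1] << 16)
       | ((uint32_t)p[2] <<  8)
       |  (uint32_t)p[3];
}

/* Store v in little endian byte order at a possibly unaligned address. */
static void
sketch_poke_le_u32_u (void *ptr, uint32_t v)
{
  unsigned char *p = (unsigned char *)ptr;

  p[0] = (unsigned char)(v      ); /* least significant byte first */
  p[1] = (unsigned char)(v >>  8);
  p[2] = (unsigned char)(v >> 16);
  p[3] = (unsigned char)(v >> 24);
}
```

A typical use would be parsing a packed network or file header, where fields sit at arbitrary byte offsets inside a buffer.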
857    
858 root 1.77 In C++ the following additional template functions are supported:
859 root 1.76
860     =over 4
861    
862 root 1.80 =item T ecb_peek<T> (const void *ptr)
863 root 1.76
864 root 1.80 =item T ecb_peek_be<T> (const void *ptr)
865 root 1.76
866 root 1.80 =item T ecb_peek_le<T> (const void *ptr)
867 root 1.76
868 root 1.80 =item T ecb_peek_u<T> (const void *ptr)
869 root 1.76
870 root 1.80 =item T ecb_peek_be_u<T> (const void *ptr)
871 root 1.76
872 root 1.80 =item T ecb_peek_le_u<T> (const void *ptr)
873 root 1.76
874     Similarly to their C counterparts, these functions load an unsigned 8, 16,
875     32 or 64 bit value from memory, with optional conversion from big/little
876     endian.
877    
878 root 1.80 Since the type cannot be deduced, it has to be specified explicitly, e.g.
879 root 1.76
880     uint_fast16_t v = ecb_peek<uint16_t> (ptr);
881    
882     C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
883    
884     Unlike their C counterparts, these functions support 8 bit quantities
885     (C<uint8_t>) and also have an aligned version (without the C<_u> suffix),
886     all of which hopefully makes them more useful in generic code.
887    
888     =item ecb_poke (void *ptr, T v)
889    
890     =item ecb_poke_be (void *ptr, T v)
891    
892     =item ecb_poke_le (void *ptr, T v)
893    
894     =item ecb_poke_u (void *ptr, T v)
895    
896     =item ecb_poke_be_u (void *ptr, T v)
897    
898     =item ecb_poke_le_u (void *ptr, T v)
899    
900     Again, similarly to their C counterparts, these functions store an
901     unsigned 8, 16, 32 or 64 bit value to memory, with optional conversion to
902     big/little endian.
903    
904     C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
905    
906     Unlike their C counterparts, these functions support 8 bit quantities
907     (C<uint8_t>) and also have an aligned version (without the C<_u> suffix),
908     all of which hopefully makes them more useful in generic code.
909    
910     =back
911    
912 root 1.50 =head2 FLOATING POINT FIDDLING
913    
914     =over 4
915    
916 root 1.71 =item ECB_INFINITY [-UECB_NO_LIBM]
917 root 1.62
918     Evaluates to positive infinity if supported by the platform, otherwise to
919     a truly huge number.
920    
921 root 1.71 =item ECB_NAN [-UECB_NO_LIBM]
922 root 1.62
923     Evaluates to a quiet NaN if supported by the platform, otherwise to
924     C<ECB_INFINITY>.
925    
926 root 1.71 =item float ecb_ldexpf (float x, int exp) [-UECB_NO_LIBM]
927 root 1.62
928     Same as C<ldexpf>, but always available.
929    
930 root 1.71 =item uint16_t ecb_float_to_binary16 (float x) [-UECB_NO_LIBM]
931    
932 root 1.50 =item uint32_t ecb_float_to_binary32 (float x) [-UECB_NO_LIBM]
933    
934     =item uint64_t ecb_double_to_binary64 (double x) [-UECB_NO_LIBM]
935    
936     These functions each take an argument in the native C<float> or C<double>
937 root 1.71 type and return the IEEE 754 bit representation of it (binary16/half,
938     binary32/single or binary64/double precision).
939 root 1.50
940     The bit representation is just as IEEE 754 defines it, i.e. the sign bit
941     will be the most significant bit, followed by exponent and mantissa.
942    
943     These functions should work even when the native floating point format
944     isn't IEEE compliant, of course at a speed and code size penalty, and of
945     course also within reasonable limits (they try to convert NaNs, infinities
946     and denormals, but will likely convert negative zero to positive zero).
947    
948     On all modern platforms (where C<ECB_STDFP> is true), the compiler should
949     be able to optimise away these functions completely.
950    
951     These functions can be helpful when serialising floats to the network - you
952 root 1.71 can serialise the return value like a normal uint16_t/uint32_t/uint64_t.
953 root 1.50
954     Another use for these functions is to manipulate floating point values
955     directly.
956    
957     Silly example: toggle the sign bit of a float.
958    
959     /* On gcc-4.7 on amd64, */
960     /* this results in a single add instruction to toggle the bit, and 4 extra */
961     /* instructions to move the float value to an integer register and back. */
962    
963     x = ecb_binary32_to_float (ecb_float_to_binary32 (x) ^ 0x80000000U);
964    
965 root 1.58 =item float ecb_binary16_to_float (uint16_t x) [-UECB_NO_LIBM]
966    
967 root 1.50 =item float ecb_binary32_to_float (uint32_t x) [-UECB_NO_LIBM]
968    
969 root 1.70 =item double ecb_binary64_to_double (uint64_t x) [-UECB_NO_LIBM]
970 root 1.50
971 sf-exg 1.59 The reverse operation of the previous functions - these take the bit
972 root 1.71 representation of an IEEE binary16, binary32 or binary64 number (half,
973     single or double precision) and convert it to the native C<float> or
974     C<double> format.
975 root 1.50
976     These functions should work even when the native floating point format
977     isn't IEEE compliant, of course at a speed and code size penalty, and of
978     course also within reasonable limits (they try to convert normals and
979     denormals, and might be lucky for infinities, and with extraordinary luck,
980     also for negative zero).
981    
982     On all modern platforms (where C<ECB_STDFP> is true), the compiler should
983     be able to optimise away these functions completely.
984    
985 root 1.71 =item uint16_t ecb_binary32_to_binary16 (uint32_t x)
986    
987     =item uint32_t ecb_binary16_to_binary32 (uint16_t x)
988    
989     Convert an IEEE binary32/single precision value to binary16/half format, and vice
990 root 1.72 versa, handling all details (round-to-nearest-even, subnormals, infinity
991     and NaNs) correctly.
992 root 1.71
993     These functions are available even under C<-DECB_NO_LIBM>, since
994     they do not rely on the platform floating point format. The
995     C<ecb_float_to_binary16> and C<ecb_binary16_to_float> functions are
996     usually what you want.
997    
998 root 1.50 =back
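The sign bit example above can be tried without F<ecb.h> using the following C sketch. It assumes that the native C<float> already is an IEEE 754 binary32 value, so the conversion is a plain copy of the bits. The C<sketch_> names are hypothetical. Unlike this sketch, the documented functions are also meant to cope with non-IEEE formats.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of ecb.h.
   Assumption: float is a 32 bit IEEE 754 binary32 value on this platform. */

/* Return the bit representation of x. */
static uint32_t
sketch_float_to_binary32 (float x)
{
  uint32_t r;

  memcpy (&r, &x, sizeof (r)); /* copy the bits, no numeric conversion */
  return r;
}

/* Return the float that has the bit representation x. */
static float
sketch_binary32_to_float (uint32_t x)
{
  float r;

  memcpy (&r, &x, sizeof (r));
  return r;
}

/* Toggle the sign bit, which is the most significant bit. */
static float
sketch_toggle_sign (float x)
{
  return sketch_binary32_to_float (sketch_float_to_binary32 (x) ^ 0x80000000U);
}
```

Copying through C<memcpy> avoids the aliasing problems of pointer casts, and compilers commonly reduce it to a register move.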
999    
1000 root 1.1 =head2 ARITHMETIC
1001    
1002 root 1.3 =over 4
1003    
1004 root 1.14 =item x = ecb_mod (m, n)
1005 root 1.3
1006 root 1.25 Returns C<m> modulo C<n>, which is the same as the non-negative remainder
1007     of the division operation between C<m> and C<n>, using floored
1008     division. Unlike the C remainder operator C<%>, this function ensures that
1009     the return value is never negative and that the two numbers I<m> and
1010     I<m' = m + i * n> result in the same value modulo I<n> - in other words,
1011     C<ecb_mod> implements the mathematical modulo operation, which is missing
1012     in the language.
1013 root 1.14
1014 sf-exg 1.23 C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
1015 root 1.14 negatable, that is, both C<m> and C<-m> must be representable in its
1016 root 1.30 type (this typically excludes the minimum signed integer value, the same
1017 root 1.25 limitation as for C</> and C<%> in C).
1018 sf-exg 1.11
1019 root 1.24 Current GCC versions compile this into an efficient branchless sequence on
1020 root 1.28 almost all CPUs.
1021 root 1.24
1022     For example, when you want to rotate forward through the members of an
1023     array for increasing C<m> (which might be negative), then you should use
1024     C<ecb_mod>, as the C<%> operator might give either negative results, or
1025     change direction for negative values:
1026    
1027     for (m = -100; m <= 100; ++m)
1028       elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
1029    
1030 sf-exg 1.37 =item x = ecb_div_rd (val, div)
1031    
1032     =item x = ecb_div_ru (val, div)
1033    
1034     Returns C<val> divided by C<div> rounded down or up, respectively.
1035     C<val> and C<div> must have integer types and C<div> must be strictly
1036 sf-exg 1.38 positive. Note that these functions are implemented with macros in C
1037     and with function templates in C++.
1038 sf-exg 1.37
1039 root 1.3 =back
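The difference between these operations and the plain C operators C</> and C<%> can be seen in the following C sketch. The C<sketch_> functions are hypothetical stand-ins written for C<int> only. They are not the F<ecb.h> implementation, which is generic and branchless where possible. The sketch assumes C99 or later, where integer division truncates towards zero.

```c
/* Hypothetical helpers, not the ecb.h implementation.
   Assumption: C99 semantics, n and divisor are strictly positive. */

/* Mathematical modulo: the result is always in the range 0 .. n - 1. */
static int
sketch_mod (int m, int n)
{
  int r = m % n; /* the C remainder has the sign of m */

  return r < 0 ? r + n : r;
}

/* val / divisor, rounded down (towards negative infinity). */
static int
sketch_div_rd (int val, int divisor)
{
  int q = val / divisor; /* truncates towards zero */

  return val % divisor < 0 ? q - 1 : q;
}

/* val / divisor, rounded up (towards positive infinity). */
static int
sketch_div_ru (int val, int divisor)
{
  int q = val / divisor; /* truncates towards zero */

  return val % divisor > 0 ? q + 1 : q;
}
```

For example, C<-1 % 5> is C<-1> in C, while the modulo sketch yields C<4>, which is what makes it usable as an array index.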
1040 root 1.1
1041     =head2 UTILITY
1042    
1043 root 1.3 =over 4
1044    
1045 sf-exg 1.23 =item element_count = ecb_array_length (name)
1046 root 1.3
1047 sf-exg 1.13 Returns the number of elements in the array C<name>. For example:
1048    
1049     int primes[] = { 2, 3, 5, 7, 11 };
1050     int i, sum = 0;
1051    
1052     for (i = 0; i < ecb_array_length (primes); i++)
1053     sum += primes [i];
1054    
1055 root 1.3 =back
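A macro of this kind is commonly built on C<sizeof>, as in the following C sketch. C<SKETCH_ARRAY_LENGTH> and C<sketch_sum_primes> are hypothetical names, and the actual F<ecb.h> definition may contain additional checks.

```c
#include <stddef.h>

/* Hypothetical macro, not the ecb.h definition: total size of the array
   divided by the size of one element. Only valid for true arrays, not
   for pointers, as sizeof on a pointer yields the pointer size. */
#define SKETCH_ARRAY_LENGTH(name) (sizeof (name) / sizeof ((name)[0]))

/* Sum the elements of a fixed array without hardcoding its length. */
static int
sketch_sum_primes (void)
{
  static const int primes[] = { 2, 3, 5, 7, 11 };
  int sum = 0;
  size_t i;

  for (i = 0; i < SKETCH_ARRAY_LENGTH (primes); i++)
    sum += primes[i];

  return sum;
}
```

Since the length is derived from the array itself, adding or removing initialisers does not require touching the loop.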
1056 root 1.1
1057 root 1.43 =head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF
1058    
1059     These symbols need to be defined before including F<ecb.h> for the first time.
1060    
1061     =over 4
1062    
1063 root 1.51 =item ECB_NO_THREADS
1064 root 1.43
1065     If F<ecb.h> is never used from multiple threads, then this symbol can
1066     be defined, in which case memory fences (and similar constructs) are
1067     completely removed, leading to more efficient code and fewer dependencies.
1068    
1069     Setting this symbol to a true value implies C<ECB_NO_SMP>.
1070    
1071     =item ECB_NO_SMP
1072    
1073     The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from
1074     multiple threads, but never concurrently (e.g. if the system the program
1075     runs on has only a single CPU with a single core, no hyperthreading and so
1076     on), then this symbol can be defined, leading to more efficient code and
1077     fewer dependencies.
1078    
1079 root 1.50 =item ECB_NO_LIBM
1080    
1081     When defined to C<1>, do not export any functions that might introduce
1082     dependencies on the math library (usually called F<-lm>) - these are
1083     marked with [-UECB_NO_LIBM].
1084    
1085 sf-exg 1.69 =back
1086    
1087 root 1.68 =head1 UNDOCUMENTED FUNCTIONALITY
1088    
1089     F<ecb.h> is full of undocumented functionality as well, some of which is
1090     intended to be internal-use only, some of which we forgot to document, and
1091     some of which we hide because we are not sure we will keep the interface
1092     stable.
1093    
1094     While you are welcome to rummage around and use whatever you find useful
1095     (we can't stop you), keep in mind that we will change undocumented
1096     functionality in incompatible ways without thinking twice, while we are
1097     considerably more conservative with documented things.
1098    
1099     =head1 AUTHORS
1100    
1101     C<libecb> is designed and maintained by:
1102    
1103     Emanuele Giaquinta <e.giaquinta@glauco.it>
1104     Marc Alexander Lehmann <schmorp@schmorp.de>
1105    
1106 root 1.1