Comparing libecb/ecb.pod (file contents):
Revision 1.1 by root, Thu May 26 19:39:40 2011 UTC vs.
Revision 1.43 by root, Tue May 29 14:09:49 2012 UTC

=head1 LIBECB - e-C-Builtins

=head2 ABOUT LIBECB

Libecb is currently a simple header file that doesn't require any
configuration to use or include in your project.

It's part of the e-suite of libraries, other members of which include
libev and libeio.

Its homepage can be found here:

   http://software.schmorp.de/pkg/libecb

It mainly provides a number of wrappers around GCC built-ins, together
with replacement functions for other compilers. In addition to this,
it provides a number of other low-level C utilities, such as endianness
detection, byte swapping or bit rotations.

Or in other words, things that should be built into any standard C system,
but aren't, implemented as efficiently as possible with GCC, and still
correct with other compilers.

More might come.

=head2 ABOUT THE HEADER

At the moment, all you have to do is copy F<ecb.h> somewhere where your
compiler can find it and include it:

   #include <ecb.h>

The header should work fine for both C and C++ compilation, and gives you
all of F<inttypes.h> in addition to the ECB symbols.

There are currently no object files to link to - future versions might
come with an (optional) object code library to link against, to reduce
code size or gain access to additional features.

=head2 ABOUT THIS MANUAL / CONVENTIONS

This manual mainly describes each (public) function available after
including the F<ecb.h> header. The header might define symbols other than
these, but those are not part of the public API, and not supported in any
way.

When the manual mentions a "function" then this could be defined either as
an inline function, a macro, or an external symbol.

When functions use a concrete standard type, such as C<int> or
C<uint32_t>, then the corresponding function works only with that type. If
only a generic name is used (C<expr>, C<cond>, C<value> and so on), then
the corresponding function relies on C to implement the correct types, and
is usually implemented as a macro. Specifically, a "bool" in this manual
refers to any kind of boolean value, not a specific type.

=head2 TYPES / TYPE SUPPORT

ecb.h makes sure that the following types are defined (in the expected way):

   int8_t uint8_t int16_t uint16_t
   int32_t uint32_t int64_t uint64_t
   intptr_t uintptr_t ptrdiff_t

The macro C<ECB_PTRSIZE> is defined to the size of a pointer on this
platform (currently C<4> or C<8>).

=head2 LANGUAGE/COMPILER VERSIONS

=over 4

=item ECB_C99

Expands to a true value (suitable for testing by the preprocessor)
if the environment claims to be C99 compliant.

=item ECB_C11

Expands to a true value (suitable for testing by the preprocessor)
if the environment claims to be C11 compliant.

=item ECB_GCC_VERSION(major,minor)

Expands to a true value (suitable for testing by the preprocessor)
if the compiler used is GNU C and the version is the given version, or
higher.

This macro tries to return false on compilers that claim to be GCC
compatible but aren't.

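As an illustration of how such a version test is typically used, here is a
minimal sketch. C<MY_GCC_VERSION> and C<my_popcount> are hypothetical
stand-ins, not the definitions from F<ecb.h>, and unlike C<ECB_GCC_VERSION>
this sketch makes no attempt to reject compilers that merely claim GCC
compatibility.

```c
/* Hypothetical stand-in for ECB_GCC_VERSION; not the macro from ecb.h. */
#if defined __GNUC__ && defined __GNUC_MINOR__
  #define MY_GCC_VERSION(major,minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
  #define MY_GCC_VERSION(major,minor) 0
#endif

/* Use the builtin only where the version test says it should exist. */
static int
my_popcount (unsigned int x)
{
#if MY_GCC_VERSION(3,4)
  return __builtin_popcount (x);
#else
  int n = 0;

  for (; x; x &= x - 1) /* each round clears the lowest set bit */
    ++n;

  return n;
#endif
}
```

Because the macro expands to a plain integer expression, it can be used both
in C<#if> lines and in ordinary C conditions.
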
=back

=head2 GCC ATTRIBUTES

A major part of libecb deals with GCC attributes. These are additional
attributes that you can assign to functions, variables and sometimes even
types - much like C<const> or C<volatile> in C.

GCC allows attributes to show up in many surprising places, but not in
many expected places, so the safest way is to put attribute declarations
before the whole declaration:

   ecb_const int mysqrt (int a);
   ecb_unused int i;

For variables, it is often nicer to put the attribute after the name, and
avoid multiple declarations using commas:

   int i ecb_unused;

=over 4

=item ecb_attribute ((attrs...))

A simple wrapper that expands to C<__attribute__((attrs))> on GCC, and to
nothing on other compilers, so the effect is that only GCC sees these.

Example: use the C<deprecated> attribute on a function.

   ecb_attribute((__deprecated__)) void
   do_not_use_me_anymore (void);

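A wrapper of this kind can be sketched in a few lines. The names
C<my_attribute> and C<my_noinline> below are hypothetical and only mirror
the behaviour described above. The real definitions in F<ecb.h> may differ.

```c
/* Hypothetical wrapper in the spirit of ecb_attribute. */
#if defined __GNUC__
  #define my_attribute(attrlist) __attribute__ (attrlist)
#else
  #define my_attribute(attrlist) /* other compilers see nothing */
#endif

/* Shorthands are then built on top of the wrapper. */
#define my_noinline my_attribute ((__noinline__))

my_noinline static int
add_one (int x)
{
  return x + 1;
}
```

The double parentheses at the call site are what allow an attribute list
containing commas to pass through a single macro parameter.
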
=item ecb_unused

Marks a function or a variable as "unused", which simply suppresses a
warning by GCC when it detects it as unused. This is useful when you e.g.
declare a variable but do not always use it:

   {
     int var ecb_unused;

     #ifdef SOMECONDITION
        var = ...;
        return var;
     #else
        return 0;
     #endif
   }

=item ecb_inline

This is not actually an attribute, but you use it like one. It expands
either to C<static inline> or to just C<static>, if inline isn't
supported. It should be used to declare functions that should be inlined,
for code size or speed reasons.

Example: inline this function, it surely will reduce codesize.

   ecb_inline int
   negmul (int a, int b)
   {
     return - (a * b);
   }

=item ecb_noinline

Prevent a function from being inlined - it might be optimised away, but
not inlined into other functions. This is useful if you know your function
is rarely called and large enough for inlining not to be helpful.

=item ecb_noreturn

Marks a function as "not returning, ever". Some typical functions that
don't return are C<exit> or C<abort> (which really works hard to not
return), and now you can make your own:

   ecb_noreturn void
   my_abort (const char *errline)
   {
     puts (errline);
     abort ();
   }

In this case, the compiler would probably be smart enough to deduce it on
its own, so this is mainly useful for declarations.

=item ecb_const

Declares that the function only depends on the values of its arguments,
much like a mathematical function. It specifically does not read or write
any memory any arguments might point to, global variables, or call any
non-const functions. It also must not have any side effects.

Such a function can be optimised much more aggressively by the compiler -
for example, multiple calls with the same arguments can be optimised into
a single call, which wouldn't be possible if the compiler had to expect
side effects.

It is best suited for functions in the sense of mathematical functions,
such as a function returning the square root of its input argument.

Not suited would be a function that calculates the hash of some memory
area you pass in, prints some messages or looks at a global variable to
decide on rounding.

See C<ecb_pure> for a slightly less restrictive class of functions.

=item ecb_pure

Similar to C<ecb_const>, declares a function that has no side
effects. Unlike C<ecb_const>, the function is allowed to examine global
variables and any other memory areas (such as the ones passed to it via
pointers).

While these functions cannot be optimised as aggressively as C<ecb_const>
functions, they can still be optimised away on many occasions, and the
compiler has more freedom in moving calls to them around.

Typical examples for such functions would be C<strlen> or C<memcmp>. A
function that calculates the MD5 sum of some input and updates some MD5
state passed as argument would I<NOT> be pure, however, as it would modify
some memory area that is not the return value.

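To make the distinction between the two attributes concrete, here is a small
sketch using the underlying GCC attributes directly. C<my_const>,
C<my_pure>, C<square> and C<count_zeroes> are illustrative names only, not
part of libecb.

```c
#include <stddef.h>

/* Hypothetical equivalents of ecb_const and ecb_pure. */
#if defined __GNUC__
  #define my_const __attribute__ ((__const__))
  #define my_pure  __attribute__ ((__pure__))
#else
  #define my_const
  #define my_pure
#endif

/* const: the result depends only on the argument value itself */
my_const static int
square (int x)
{
  return x * x;
}

/* pure: reads memory through the pointer, but writes nothing */
my_pure static size_t
count_zeroes (const unsigned char *buf, size_t len)
{
  size_t n = 0, i;

  for (i = 0; i < len; ++i)
    n += buf [i] == 0;

  return n;
}
```

Marking C<count_zeroes> as const instead of pure would be wrong, because its
result can change when the buffer contents change even though the pointer
value stays the same.
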
=item ecb_hot

This declares a function as "hot" with regards to the cache - the function
is used so often, that it is very beneficial to keep it in the cache if
possible.

The compiler reacts by trying to place hot functions near to each other in
memory.

Whether a function is hot or not often depends on the whole program,
and less on the function itself. C<ecb_cold> is likely more useful in
practice.

=item ecb_cold

The opposite of C<ecb_hot> - declares a function as "cold" with regards to
the cache, or in other words, this function is not called often, or not at
speed-critical times, and keeping it in the cache might be a waste of said
cache.

In addition to placing cold functions together (or at least away from hot
functions), this knowledge can be used in other ways, for example, the
function will be optimised for size, as opposed to speed, and codepaths
leading to calls to those functions can automatically be marked as if
C<ecb_expect_false> had been used to reach them.

Good examples for such functions would be error reporting functions, or
functions only called in exceptional or rare cases.

=item ecb_artificial

Declares the function as "artificial", in this case meaning that this
function is not really meant to be a function, but more like an accessor
- many methods in C++ classes are mere accessor functions, and having a
crash reported in such a method, or single-stepping through them, is not
usually so helpful, especially when it's inlined to just a few instructions.

Marking them as artificial will instruct the debugger about just this,
leading to happier debugging and thus happier lives.

Example: in some kind of smart-pointer class, mark the pointer accessor as
artificial, so that the whole class acts more like a pointer and less like
some C++ abstraction monster.

   template<typename T>
   struct my_smart_ptr
   {
     T *value;

     ecb_artificial
     operator T *()
     {
       return value;
     }
   };

=back

=head2 OPTIMISATION HINTS

=over 4

=item bool ecb_is_constant(expr)

Returns true iff the expression can be deduced to be a compile-time
constant, and false otherwise.

For example, when you have a C<rndm16> function that returns a 16 bit
random number, and you have a function that maps this to a range from
0..n-1, then you could use this inline function in a header file:

   ecb_inline uint32_t
   rndm (uint32_t n)
   {
     return (n * (uint32_t)rndm16 ()) >> 16;
   }

However, for powers of two, you could use a normal mask, but that is only
worth it if, at compile time, you can detect this case. This is the case
when the passed number is a constant and also a power of two (C<(n & (n -
1)) == 0>):

   ecb_inline uint32_t
   rndm (uint32_t n)
   {
     return ecb_is_constant (n) && !(n & (n - 1))
       ? rndm16 () & (n - 1)
       : (n * (uint32_t)rndm16 ()) >> 16;
   }

=item bool ecb_expect (expr, value)

Evaluates C<expr> and returns it. In addition, it tells the compiler that
the C<expr> evaluates to C<value> a lot, which can be used for static
branch optimisations.

Usually, you want to use the more intuitive C<ecb_expect_true> and
C<ecb_expect_false> functions instead.

=item bool ecb_expect_true (cond)

=item bool ecb_expect_false (cond)

These two functions expect an expression that is true or false and return
C<1> or C<0>, respectively, so when used in the condition of an C<if> or
other conditional statement, it will not change the program:

   /* these two do the same thing */
   if (some_condition) ...;
   if (ecb_expect_true (some_condition)) ...;

However, by using C<ecb_expect_true>, you tell the compiler that the
condition is likely to be true (and for C<ecb_expect_false>, that it is
unlikely to be true).

For example, when you check for a null pointer and expect this to be a
rare, exceptional, case, then use C<ecb_expect_false>:

   void my_free (void *ptr)
   {
     if (ecb_expect_false (ptr == 0))
       return;
   }

Consistent use of these functions to mark away exceptional cases or to
tell the compiler what the hot path through a function is can increase
performance considerably.

You might know these functions under the name C<likely> and C<unlikely>
- while these are common aliases, we find that the expect name is easier
to understand when quickly skimming code. If you wish, you can use
C<ecb_likely> instead of C<ecb_expect_true> and C<ecb_unlikely> instead of
C<ecb_expect_false> - these are simply aliases.

A very good example is in a function that reserves more space for some
memory block (for example, inside an implementation of a string stream) -
each time something is added, you have to check for a buffer overrun, but
you expect that most checks will turn out to be false:

   /* make sure we have "size" extra room in our buffer */
   ecb_inline void
   reserve (int size)
   {
     if (ecb_expect_false (current + size > end))
       real_reserve_method (size); /* presumably noinline */
   }

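On GCC, hints like these are commonly built on C<__builtin_expect>. The
following is a sketch under that assumption, with hypothetical C<my_>
names. It is not necessarily how F<ecb.h> defines its macros.

```c
/* Hypothetical equivalents of ecb_expect and friends. */
#if defined __GNUC__
  #define my_expect(expr,value) __builtin_expect ((expr), (value))
#else
  #define my_expect(expr,value) (expr)
#endif

/* !! normalises any true value to 1, so the result is always 0 or 1 */
#define my_expect_true(cond)  my_expect (!!(cond), 1)
#define my_expect_false(cond) my_expect (!!(cond), 0)

static int
checked_div (int a, int b)
{
  if (my_expect_false (b == 0))
    return 0; /* rare error path */

  return a / b;
}
```

The hint only influences code layout and branch prediction heuristics. The
value of the condition, and therefore the program's behaviour, is unchanged.
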
=item bool ecb_assume (cond)

Try to tell the compiler that some condition is true, even if it's not
obvious.

This can be used to teach the compiler about invariants or other
conditions that might improve code generation, but which are impossible to
deduce from the code itself.

For example, the example reservation function from the C<ecb_expect_false>
description could be written thus (only C<ecb_assume> was added):

   ecb_inline void
   reserve (int size)
   {
     if (ecb_expect_false (current + size > end))
       real_reserve_method (size); /* presumably noinline */

     ecb_assume (current + size <= end);
   }

If you then call this function twice, like this:

   reserve (10);
   reserve (1);

Then the compiler I<might> be able to optimise out the second call
completely, as it knows that C<< current + 1 > end >> is false and the
call will never be executed.

=item bool ecb_unreachable ()

This function does nothing itself, except tell the compiler that it will
never be executed. Apart from suppressing a warning in some cases, this
function can be used to implement C<ecb_assume> or similar functions.

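The relationship between the two can be sketched as follows, assuming GCC's
C<__builtin_unreachable> (available from GCC 4.5). The names C<my_assume>,
C<my_unreachable> and C<low_digit> are hypothetical, and on compilers that
fail the version test the sketch simply degrades to doing nothing.

```c
/* Hypothetical equivalents of ecb_unreachable and ecb_assume. */
#if defined __GNUC__ && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
  #define my_unreachable() __builtin_unreachable ()
#else
  #define my_unreachable() ((void)0) /* no hint available */
#endif

/* "cond is true" is expressed as "the !cond branch is never reached" */
#define my_assume(cond) do { if (!(cond)) my_unreachable (); } while (0)

static unsigned int
low_digit (unsigned int x)
{
  my_assume (x < 100); /* invariant promised by the (hypothetical) caller */

  return x % 10;
}
```

Note that an assumption which turns out to be false at run time leads to
undefined behaviour when the builtin is in use, so it should only state
conditions that really are guaranteed.
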
=item bool ecb_prefetch (addr, rw, locality)

Tells the compiler to try to prefetch memory at the given C<addr>ess
for either reading (C<rw> = 0) or writing (C<rw> = 1). A C<locality> of
C<0> means that there will only be one access later, C<3> means that
the data will likely be accessed very often, and values in between mean
something... in between. The memory pointed to by the address does not
need to be accessible (it could be a null pointer for example), but C<rw>
and C<locality> must be compile-time constants.

An obvious way to use this is to prefetch some data far away, in a big
array you loop over. This prefetches memory some 128 array elements later,
in the hope that it will be ready when the CPU arrives at that location.

   int sum = 0;

   for (i = 0; i < N; ++i)
     {
       sum += arr [i];
       ecb_prefetch (arr + i + 128, 0, 0);
     }

It's hard to predict how far to prefetch, and most CPUs that can prefetch
are often good enough to predict this kind of behaviour themselves. It
gets more interesting with linked lists, especially when you do a fair
amount of processing on each list element:

   for (node *n = start; n; n = n->next)
     {
       ecb_prefetch (n->next, 0, 0);
       ... do medium amount of work with *n
     }

After processing the node, (part of) the next node might already be in
cache.

=back

=head2 BIT FIDDLING / BIT WIZARDRY

=over 4

=item bool ecb_big_endian ()

=item bool ecb_little_endian ()

These two functions return true if the byte order is big endian
(most-significant byte first) or little endian (least-significant byte
first) respectively.

On systems that are neither, their return values are unspecified.

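One portable way to perform such a check at run time is to look at the
first byte of a known multi-byte value. This is only a sketch with
hypothetical names (C<my_first_byte>, C<my_little_endian>,
C<my_big_endian>). F<ecb.h> may well use a different technique.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest address of 0x11223344. */
static unsigned char
my_first_byte (void)
{
  uint32_t probe = 0x11223344;
  unsigned char first;

  memcpy (&first, &probe, 1);

  return first;
}

static int
my_little_endian (void)
{
  return my_first_byte () == 0x44; /* least-significant byte first */
}

static int
my_big_endian (void)
{
  return my_first_byte () == 0x11; /* most-significant byte first */
}
```

On a mixed-endian system both functions in this sketch return false.
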
=item int ecb_ctz32 (uint32_t x)

=item int ecb_ctz64 (uint64_t x)

Returns the index of the least significant bit set in C<x> (or
equivalently the number of bits set to 0 before the least significant bit
set), starting from 0. If C<x> is 0 the result is undefined.

For smaller types than C<uint32_t> you can safely use C<ecb_ctz32>.

For example:

   ecb_ctz32 (3) = 0
   ecb_ctz32 (6) = 1

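For compilers without a suitable builtin, the same result can be computed
with a simple loop. The function C<my_ctz32> below is a hypothetical
reference version meant to show the semantics, not the replacement code
actually used by libecb.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the zero bits below the least significant set bit. */
static int
my_ctz32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result is undefined for 0 */

  while (!(x & 1))
    {
      x >>= 1;
      ++r;
    }

  return r;
}
```
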
=item bool ecb_is_pot32 (uint32_t x)

=item bool ecb_is_pot64 (uint64_t x)

Return true iff C<x> is a power of two or C<x == 0>.

For smaller types than C<uint32_t> you can safely use C<ecb_is_pot32>.

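The usual bit trick behind such a test is that clearing the lowest set bit
of a power of two leaves zero. A minimal sketch, with the hypothetical name
C<my_is_pot32>:

```c
#include <stdint.h>

/* True for 0 and for every value with exactly one bit set. */
static int
my_is_pot32 (uint32_t x)
{
  return (x & (x - 1)) == 0; /* x - 1 flips the lowest set bit and below */
}
```

As with the documented function, zero is reported as true, so callers that
need a strict power-of-two test have to exclude zero themselves.
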
=item int ecb_ld32 (uint32_t x)

=item int ecb_ld64 (uint64_t x)

Returns the index of the most significant bit set in C<x>, which is one
less than the number of digits the number requires in binary (so that
C<< 2**ld <= x < 2**(ld+1) >>). If C<x> is 0 the result is undefined. A
common use case is to compute the integer binary logarithm, i.e.
C<floor (log2 (n))>, for example to see how many bits a certain number
requires to be encoded.

This function is similar to the "count leading zero bits" function, except
that that one returns how many zero bits are "in front" of the number (in
the given data type), while C<ecb_ld> relates to how many bits the number
itself requires.

For smaller types than C<uint32_t> you can safely use C<ecb_ld32>.

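The semantics can be illustrated with a straightforward loop. C<my_ld32> is
a hypothetical reference version, not the code from F<ecb.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the most significant set bit, i.e. floor (log2 (x)). */
static int
my_ld32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result is undefined for 0 */

  while (x >>= 1) /* one shift per bit above bit 0 */
    ++r;

  return r;
}
```

For a 32 bit value, the number of leading zero bits is then C<31 - ld>.
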
=item int ecb_popcount32 (uint32_t x)

=item int ecb_popcount64 (uint64_t x)

Returns the number of bits set to 1 in C<x>.

For smaller types than C<uint32_t> you can safely use C<ecb_popcount32>.

For example:

   ecb_popcount32 (7) = 3
   ecb_popcount32 (255) = 8

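A well-known branchless way to count bits adds neighbouring bit groups in
parallel. The sketch below uses the hypothetical name C<my_popcount32> and
is shown for illustration only. Whether libecb's fallback looks like this is
an assumption.

```c
#include <stdint.h>

/* Parallel bit count: sums 1-bit, then 2-bit, then 4-bit groups. */
static int
my_popcount32 (uint32_t x)
{
  x -= (x >> 1) & 0x55555555u;                      /* 2-bit sums */
  x  = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
  x  = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */

  /* the multiplication adds the four byte sums into the top byte */
  return (int)((x * 0x01010101u) >> 24);
}
```
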
=item uint8_t ecb_bitrev8 (uint8_t x)

=item uint16_t ecb_bitrev16 (uint16_t x)

=item uint32_t ecb_bitrev32 (uint32_t x)

Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes LSB+1
and so on.

Example:

   ecb_bitrev8 (0xa7) = 0xe5
   ecb_bitrev32 (0xffcc4411) = 0x882233ff

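The 8 bit case can be checked by hand with a simple shift loop. C<my_bitrev8>
is a hypothetical reference version. Faster table-based or mask-based
variants exist, and nothing here is meant to describe libecb's own code.

```c
#include <stdint.h>

/* Moves bits out of x at the low end and into r at the low end,
   so the first bit taken ends up as the most significant bit. */
static uint8_t
my_bitrev8 (uint8_t x)
{
  uint8_t r = 0;
  int i;

  for (i = 0; i < 8; ++i)
    {
      r = (uint8_t)((r << 1) | (x & 1));
      x >>= 1;
    }

  return r;
}
```

For C<0xa7> (binary 10100111) this yields binary 11100101, which is C<0xe5>.
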
=item uint32_t ecb_bswap16 (uint32_t x)

=item uint32_t ecb_bswap32 (uint32_t x)

=item uint64_t ecb_bswap64 (uint64_t x)

These functions return the value of the 16-bit (32-bit, 64-bit) value
C<x> after reversing the order of bytes (0x11223344 becomes 0x44332211 in
C<ecb_bswap32>).

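Without a builtin, a byte swap is a matter of shifting and masking. The
functions C<my_bswap16> and C<my_bswap32> below are hypothetical portable
sketches of the operation described above.

```c
#include <stdint.h>

/* Swaps the two low bytes; higher bits of the input are ignored. */
static uint32_t
my_bswap16 (uint32_t x)
{
  return ((x >> 8) & 0xffu) | ((x & 0xffu) << 8);
}

/* Reverses the order of all four bytes. */
static uint32_t
my_bswap32 (uint32_t x)
{
  return  (x >> 24)
        | ((x >>  8) & 0x0000ff00u)
        | ((x <<  8) & 0x00ff0000u)
        |  (x << 24);
}
```
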
=item uint8_t ecb_rotl8 (uint8_t x, unsigned int count)

=item uint16_t ecb_rotl16 (uint16_t x, unsigned int count)

=item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)

=item uint64_t ecb_rotl64 (uint64_t x, unsigned int count)

=item uint8_t ecb_rotr8 (uint8_t x, unsigned int count)

=item uint16_t ecb_rotr16 (uint16_t x, unsigned int count)

=item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)

=item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)

These two families of functions return the value of C<x> after rotating
all the bits by C<count> positions to the right (C<ecb_rotr>) or left
(C<ecb_rotl>).

Current GCC versions understand these functions and usually compile them
to "optimal" code (e.g. a single C<rol> or a combination of C<shld> on
x86).

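A rotation is usually written as two shifts joined with an or. The sketch
below uses hypothetical names and masks the count so that a count of 0 or 32
never produces an out-of-range shift. That masking is a choice made for this
example, not documented behaviour of the libecb functions.

```c
#include <stdint.h>

static uint32_t
my_rotl32 (uint32_t x, unsigned int count)
{
  count &= 31;

  /* bits shifted out at the top re-enter at the bottom */
  return (x << count) | (x >> ((32 - count) & 31));
}

static uint32_t
my_rotr32 (uint32_t x, unsigned int count)
{
  count &= 31;

  /* bits shifted out at the bottom re-enter at the top */
  return (x >> count) | (x << ((32 - count) & 31));
}
```
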
=back

=head2 ARITHMETIC

=over 4

=item x = ecb_mod (m, n)

Returns C<m> modulo C<n>, which is the same as the non-negative remainder
of the division operation between C<m> and C<n>, using floored
division. Unlike the C remainder operator C<%>, this function ensures that
the return value is never negative and that the two numbers I<m> and
I<m' = m + i * n> result in the same value modulo I<n> - in other words,
C<ecb_mod> implements the mathematical modulo operation, which is missing
in the language.

C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
negatable, that is, both C<m> and C<-m> must be representable in its
type (this typically excludes the minimum signed integer value, the same
limitation as for C</> and C<%> in C).

Current GCC versions compile this into an efficient branchless sequence on
almost all CPUs.

For example, when you want to rotate forward through the members of an
array for increasing C<m> (which might be negative), then you should use
C<ecb_mod>, as the C<%> operator might give either negative results, or
change direction for negative values:

   for (m = -100; m <= 100; ++m)
     {
       int elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
       /* use elem */
     }

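The difference to C<%> is easiest to see in code. The following sketch
implements the same mathematical operation for C<int> under the hypothetical
name C<my_mod>. It assumes C99 semantics, where C<%> truncates toward zero,
and uses a branch where the real macro is said to compile branchlessly.

```c
#include <assert.h>

/* Floored modulo for a strictly positive n: result is in 0 .. n-1. */
static int
my_mod (int m, int n)
{
  int r;

  assert (n > 0);

  r = m % n; /* C99: the remainder has the sign of m */

  return r < 0 ? r + n : r;
}
```

For example, C<-1 % 5> is C<-1> in C, whereas the modulo operation yields
C<4>, which is what stepping backwards through a five-element ring needs.
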
=item x = ecb_div_rd (val, div)

=item x = ecb_div_ru (val, div)

Returns C<val> divided by C<div> rounded down or up, respectively.
C<val> and C<div> must have integer types and C<div> must be strictly
positive. Note that these functions are implemented with macros in C
and with function templates in C++.

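Since C division truncates toward zero, rounding down or up needs a
correction that depends on the sign of the remainder. The functions
C<my_div_rd> and C<my_div_ru> below are hypothetical C<int>-only sketches of
that idea, not the macros from F<ecb.h>.

```c
#include <assert.h>

/* Division rounded toward negative infinity, for div > 0. */
static int
my_div_rd (int val, int div)
{
  assert (div > 0);

  /* truncation rounded up whenever the remainder is negative */
  return val / div - (val % div < 0);
}

/* Division rounded toward positive infinity, for div > 0. */
static int
my_div_ru (int val, int div)
{
  assert (div > 0);

  /* truncation rounded down whenever the remainder is positive */
  return val / div + (val % div > 0);
}
```
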
=back

=head2 UTILITY

=over 4

=item element_count = ecb_array_length (name)

Returns the number of elements in the array C<name>. For example:

   int primes[] = { 2, 3, 5, 7, 11 };
   int sum = 0;

   for (i = 0; i < ecb_array_length (primes); i++)
     sum += primes [i];

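In plain C, an element count like this is normally derived from C<sizeof>.
The macro C<my_array_length> below is a hypothetical sketch of that idiom.
It only works on real arrays, not on pointers, and any extra safety checks
that F<ecb.h> might perform are not shown.

```c
#include <stddef.h>

/* Total size of the array divided by the size of one element. */
#define my_array_length(name) (sizeof (name) / sizeof ((name) [0]))

static const int primes[] = { 2, 3, 5, 7, 11 };

static int
sum_primes (void)
{
  int sum = 0;
  size_t i;

  for (i = 0; i < my_array_length (primes); i++)
    sum += primes [i];

  return sum;
}
```
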
=back

=head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF

These symbols need to be defined before including F<ecb.h> the first time.

=over 4

=item ECB_NO_THREADS

If F<ecb.h> is never used from multiple threads, then this symbol can
be defined, in which case memory fences (and similar constructs) are
completely removed, leading to more efficient code and fewer dependencies.

Setting this symbol to a true value implies C<ECB_NO_SMP>.

=item ECB_NO_SMP

The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from
multiple threads, but never concurrently (e.g. if the system the program
runs on has only a single CPU with a single core, no hyperthreading and so
on), then this symbol can be defined, leading to more efficient code and
fewer dependencies.

=back