/cvs/libecb/ecb.pod
Revision: 1.45
Committed: Tue May 29 14:35:43 2012 UTC (12 years ago) by root
Branch: MAIN
Changes since 1.44: +2 -1 lines

=head1 LIBECB - e-C-Builtins

=head2 ABOUT LIBECB

Libecb is currently a simple header file that doesn't require any
configuration to use or include in your project.

It's part of the e-suite of libraries, other members of which include
libev and libeio.

Its homepage can be found here:

   http://software.schmorp.de/pkg/libecb

It mainly provides a number of wrappers around GCC built-ins, together
with replacement functions for other compilers. In addition to this,
it provides a number of other low-level C utilities, such as endianness
detection, byte swapping or bit rotations.

Or in other words, things that should be built into any standard C system,
but aren't, implemented as efficiently as possible with GCC, and still
correct with other compilers.

More might come.

=head2 ABOUT THE HEADER

At the moment, all you have to do is copy F<ecb.h> somewhere where your
compiler can find it and include it:

   #include <ecb.h>

The header should work fine for both C and C++ compilation, and gives you
all of F<inttypes.h> in addition to the ECB symbols.

There are currently no object files to link to - future versions might
come with an (optional) object code library to link against, to reduce
code size or gain access to additional features.

=head2 ABOUT THIS MANUAL / CONVENTIONS

This manual mainly describes each (public) function available after
including the F<ecb.h> header. The header might define other symbols than
these, but these are not part of the public API, and not supported in any
way.

When the manual mentions a "function" then this could be defined either as
an inline function, a macro, or an external symbol.

When functions use a concrete standard type, such as C<int> or
C<uint32_t>, then the corresponding function works only with that type. If
only a generic name is used (C<expr>, C<cond>, C<value> and so on), then
the corresponding function relies on C to implement the correct types, and
is usually implemented as a macro. Specifically, a "bool" in this manual
refers to any kind of boolean value, not a specific type.

=head2 TYPES / TYPE SUPPORT

ecb.h makes sure that the following types are defined (in the expected way):

   int8_t   uint8_t   int16_t  uint16_t
   int32_t  uint32_t  int64_t  uint64_t
   intptr_t uintptr_t ptrdiff_t

The macro C<ECB_PTRSIZE> is defined to the size of a pointer on this
platform (currently C<4> or C<8>) and can be used in preprocessor
expressions.

=head2 LANGUAGE/COMPILER VERSIONS

All the following symbols expand to an expression that can be tested in
preprocessor instructions as well as treated as a boolean (use C<!!> to
ensure it's either C<0> or C<1> if you need that).

=over 4

=item ECB_C

True if the implementation defines the C<__STDC__> macro to a true value,
which is typically true for both C and C++ compilers.

=item ECB_C99

True if the implementation claims to be C99 compliant.

=item ECB_C11

True if the implementation claims to be C11 compliant.

=item ECB_CPP

True if the implementation defines the C<__cplusplus> macro to a true
value, which is typically true for C++ compilers.

=item ECB_CPP98

True if the implementation claims to be compliant to ISO/IEC 14882:1998
(the first C++ ISO standard) or any later version. Typically true for all
C++ compilers.

=item ECB_CPP11

True if the implementation claims to be compliant to ISO/IEC 14882:2011
(C++11) or any later version.

=item ECB_GCC_VERSION(major,minor)

Expands to a true value (suitable for testing by the preprocessor)
if the compiler used is GNU C and the version is the given version, or
higher.

This macro tries to return false on compilers that claim to be GCC
compatible but aren't.

=back

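The following is a minimal sketch of how a version test of this kind might be built and used, in the preprocessor and as a boolean. C<MY_GCC_VERSION>, C<my_noinline> and C<have_gcc_3_1> are hypothetical names chosen so that the example compiles without F<ecb.h>; the sketch does not attempt to reject compilers that merely claim to be GCC, which C<ECB_GCC_VERSION> tries to do.

```c
/* MY_GCC_VERSION is a hypothetical stand-in for ECB_GCC_VERSION. */
#if defined __GNUC__ && defined __GNUC_MINOR__
  #define MY_GCC_VERSION(major,minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
  #define MY_GCC_VERSION(major,minor) 0
#endif

/* usable in preprocessor conditionals ... */
#if MY_GCC_VERSION(3,1)
  #define my_noinline __attribute__ ((__noinline__))
#else
  #define my_noinline
#endif

/* ... and as an ordinary expression, normalised to 0 or 1 with !! */
static int
have_gcc_3_1 (void)
{
  return !!MY_GCC_VERSION (3,1);
}
```
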
=head2 GCC ATTRIBUTES

A major part of libecb deals with GCC attributes. These are additional
attributes that you can assign to functions, variables and sometimes even
types - much like C<const> or C<volatile> in C.

GCC allows attribute declarations in many surprising places, but not in
many expected ones, so the safest way is to put them before the whole
declaration:

   ecb_const int mysqrt (int a);
   ecb_unused int i;

For variables, it is often nicer to put the attribute after the name, and
avoid multiple declarations using commas:

   int i ecb_unused;

=over 4

=item ecb_attribute ((attrs...))

A simple wrapper that expands to C<__attribute__((attrs))> on GCC, and to
nothing on other compilers, so the effect is that only GCC sees these.

Example: use the C<deprecated> attribute on a function.

   ecb_attribute((__deprecated__)) void
   do_not_use_me_anymore (void);

=item ecb_unused

Marks a function or a variable as "unused", which simply suppresses a
warning by GCC when it detects it as unused. This is useful when you e.g.
declare a variable but do not always use it:

   {
     int var ecb_unused;

     #ifdef SOMECONDITION
        var = ...;
        return var;
     #else
        return 0;
     #endif
   }

=item ecb_inline

This is not actually an attribute, but you use it like one. It expands
either to C<static inline> or to just C<static>, if inline isn't
supported. It should be used to declare functions that should be inlined,
for code size or speed reasons.

Example: inline this function, it surely will reduce codesize.

   ecb_inline int
   negmul (int a, int b)
   {
     return - (a * b);
   }

=item ecb_noinline

Prevent a function from being inlined - it might be optimised away, but
not inlined into other functions. This is useful if you know your function
is rarely called and large enough for inlining not to be helpful.

=item ecb_noreturn

Marks a function as "not returning, ever". Some typical functions that
don't return are C<exit> or C<abort> (which really works hard to not
return), and now you can make your own:

   ecb_noreturn void
   my_abort (const char *errline)
   {
     puts (errline);
     abort ();
   }

In this case, the compiler would probably be smart enough to deduce it on
its own, so this is mainly useful for declarations.

=item ecb_const

Declares that the function only depends on the values of its arguments,
much like a mathematical function. It specifically does not read or write
any memory any arguments might point to, global variables, or call any
non-const functions. It also must not have any side effects.

Such a function can be optimised much more aggressively by the compiler -
for example, multiple calls with the same arguments can be optimised into
a single call, which wouldn't be possible if the compiler would have to
expect any side effects.

It is best suited for functions in the sense of mathematical functions,
such as a function returning the square root of its input argument.

Not suited would be a function that calculates the hash of some memory
area you pass in, prints some messages or looks at a global variable to
decide on rounding.

See C<ecb_pure> for a slightly less restrictive class of functions.

=item ecb_pure

Similar to C<ecb_const>, declares a function that has no side
effects. Unlike C<ecb_const>, the function is allowed to examine global
variables and any other memory areas (such as the ones passed to it via
pointers).

While these functions cannot be optimised as aggressively as C<ecb_const>
functions, they can still be optimised away in many occasions, and the
compiler has more freedom in moving calls to them around.

Typical examples for such functions would be C<strlen> or C<memcmp>. A
function that calculates the MD5 sum of some input and updates some MD5
state passed as argument would I<NOT> be pure, however, as it would modify
some memory area that is not the return value.

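To make the difference between the two classes concrete, here is a small sketch. C<MY_CONST> and C<MY_PURE> are hypothetical stand-ins that map directly to the GCC attributes (and to nothing elsewhere), so that the example compiles without F<ecb.h>; C<square> and C<count_char> are illustration-only functions.

```c
#include <stddef.h>

/* MY_CONST and MY_PURE are hypothetical stand-ins for ecb_const and ecb_pure. */
#ifdef __GNUC__
  #define MY_CONST __attribute__ ((__const__))
  #define MY_PURE  __attribute__ ((__pure__))
#else
  #define MY_CONST
  #define MY_PURE
#endif

/* depends only on the value of its argument: a candidate for "const" */
MY_CONST static int
square (int a)
{
  return a * a;
}

/* reads memory through a pointer but writes nothing: "pure", not "const" */
MY_PURE static size_t
count_char (const char *s, char c)
{
  size_t n = 0;

  for (; *s; ++s)
    n += *s == c;

  return n;
}
```

A compiler that honours the attributes may merge two calls to C<square (x)> with the same C<x>, while calls to C<count_char> can only be merged if the memory that C<s> points to is known not to change in between.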
=item ecb_hot

This declares a function as "hot" with regards to the cache - the function
is used so often, that it is very beneficial to keep it in the cache if
possible.

The compiler reacts by trying to place hot functions near to each other in
memory.

Whether a function is hot or not often depends on the whole program,
and less on the function itself. C<ecb_cold> is likely more useful in
practice.

=item ecb_cold

The opposite of C<ecb_hot> - declares a function as "cold" with regards to
the cache, or in other words, this function is not called often, or not at
speed-critical times, and keeping it in the cache might be a waste of said
cache.

In addition to placing cold functions together (or at least away from hot
functions), this knowledge can be used in other ways, for example, the
function will be optimised for size, as opposed to speed, and codepaths
leading to calls to those functions can automatically be marked as if
C<ecb_expect_false> had been used to reach them.

Good examples for such functions would be error reporting functions, or
functions only called in exceptional or rare cases.

=item ecb_artificial

Declares the function as "artificial", in this case meaning that this
function is not really meant to be a function, but more like an accessor
- many methods in C++ classes are mere accessor functions, and having a
crash reported in such a method, or single-stepping through them, is not
usually so helpful, especially when it's inlined to just a few instructions.

Marking them as artificial will instruct the debugger about just this,
leading to happier debugging and thus happier lives.

Example: in some kind of smart-pointer class, mark the pointer accessor as
artificial, so that the whole class acts more like a pointer and less like
some C++ abstraction monster.

   template<typename T>
   struct my_smart_ptr
   {
     T *value;

     ecb_artificial
     operator T *()
     {
       return value;
     }
   };

=back

=head2 OPTIMISATION HINTS

=over 4

=item bool ecb_is_constant(expr)

Returns true iff the expression can be deduced to be a compile-time
constant, and false otherwise.

For example, when you have a C<rndm16> function that returns a 16 bit
random number, and you have a function that maps this to a range from
0..n-1, then you could use this inline function in a header file:

   ecb_inline uint32_t
   rndm (uint32_t n)
   {
     return (n * (uint32_t)rndm16 ()) >> 16;
   }

However, for powers of two, you could use a normal mask, but that is only
worth it if, at compile time, you can detect this case. This is the case
when the passed number is a constant and also a power of two
(C<(n & (n - 1)) == 0>):

   ecb_inline uint32_t
   rndm (uint32_t n)
   {
     return ecb_is_constant (n) && !(n & (n - 1))
            ? rndm16 () & (n - 1)
            : (n * (uint32_t)rndm16 ()) >> 16;
   }

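The idea can be tried without F<ecb.h> using the sketch below. It assumes that a constant test of this kind can be expressed with GCC's C<__builtin_constant_p>; C<my_is_constant> is a hypothetical stand-in, and C<rndm16> is a placeholder generator written only for this illustration. Both branches return a value in the range 0..n-1 for C<n> up to 65536, although not the same value.

```c
#include <stdint.h>

/* my_is_constant is a hypothetical stand-in for ecb_is_constant */
#ifdef __GNUC__
  #define my_is_constant(expr) __builtin_constant_p (expr)
#else
  #define my_is_constant(expr) 0
#endif

/* placeholder 16 bit generator (a simple LCG), for illustration only */
static uint32_t rndm16_state = 1;

static uint32_t
rndm16 (void)
{
  rndm16_state = rndm16_state * 1103515245u + 12345u;
  return (rndm16_state >> 16) & 0xffffu;
}

/* map a 16 bit random number to 0..n-1, for n <= 65536 */
static inline uint32_t
rndm (uint32_t n)
{
  return my_is_constant (n) && !(n & (n - 1))
         ? rndm16 () & (n - 1)      /* constant power of two: mask */
         : (n * rndm16 ()) >> 16;   /* general case: scale */
}
```

Whether the mask branch is actually selected depends on the compiler inlining C<rndm> and seeing a constant argument, so it typically requires optimisation to be enabled.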
=item bool ecb_expect (expr, value)

Evaluates C<expr> and returns it. In addition, it tells the compiler that
the C<expr> evaluates to C<value> a lot, which can be used for static
branch optimisations.

Usually, you want to use the more intuitive C<ecb_expect_true> and
C<ecb_expect_false> functions instead.

=item bool ecb_expect_true (cond)

=item bool ecb_expect_false (cond)

These two functions expect an expression that is true or false and return
C<1> or C<0>, respectively, so when used in the condition of an C<if> or
other conditional statement, they will not change the program:

   /* these two do the same thing */
   if (some_condition) ...;
   if (ecb_expect_true (some_condition)) ...;

However, by using C<ecb_expect_true>, you tell the compiler that the
condition is likely to be true (and for C<ecb_expect_false>, that it is
unlikely to be true).

For example, when you check for a null pointer and expect this to be a
rare, exceptional, case, then use C<ecb_expect_false>:

   void my_free (void *ptr)
   {
     if (ecb_expect_false (ptr == 0))
       return;
   }

Consistent use of these functions to mark away exceptional cases or to
tell the compiler what the hot path through a function is can increase
performance considerably.

You might know these functions under the name C<likely> and C<unlikely>
- while these are common aliases, we find that the expect name is easier
to understand when quickly skimming code. If you wish, you can use
C<ecb_likely> instead of C<ecb_expect_true> and C<ecb_unlikely> instead of
C<ecb_expect_false> - these are simply aliases.

A very good example is in a function that reserves more space for some
memory block (for example, inside an implementation of a string stream) -
each time something is added, you have to check for a buffer overrun, but
you expect that most checks will turn out to be false:

   /* make sure we have "size" extra room in our buffer */
   ecb_inline void
   reserve (int size)
   {
     if (ecb_expect_false (current + size > end))
       real_reserve_method (size); /* presumably noinline */
   }

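As a self-contained illustration, the sketch below assumes that such hints can be expressed with GCC's C<__builtin_expect>. The C<my_expect_true> and C<my_expect_false> macros are hypothetical stand-ins, and C<count_valid> is an illustration-only function; the hint affects code layout only, never the result.

```c
#include <stddef.h>

/* my_expect_* are hypothetical stand-ins for ecb_expect_true/false */
#ifdef __GNUC__
  #define my_expect_true(cond)  __builtin_expect (!!(cond), 1)
  #define my_expect_false(cond) __builtin_expect (!!(cond), 0)
#else
  #define my_expect_true(cond)  (!!(cond))
  #define my_expect_false(cond) (!!(cond))
#endif

/* count the non-null entries, hinting that null entries are rare */
static size_t
count_valid (const char *const *items, size_t n)
{
  size_t i, valid = 0;

  for (i = 0; i < n; ++i)
    {
      if (my_expect_false (items [i] == 0))
        continue; /* rare path */

      ++valid;
    }

  return valid;
}
```
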
=item bool ecb_assume (cond)

Try to tell the compiler that some condition is true, even if it's not
obvious.

This can be used to teach the compiler about invariants or other
conditions that might improve code generation, but which are impossible to
deduce from the code itself.

For example, the example reservation function from the C<ecb_expect_false>
description could be written thus (only C<ecb_assume> was added):

   ecb_inline void
   reserve (int size)
   {
     if (ecb_expect_false (current + size > end))
       real_reserve_method (size); /* presumably noinline */

     ecb_assume (current + size <= end);
   }

If you then call this function twice, like this:

   reserve (10);
   reserve (1);

Then the compiler I<might> be able to optimise out the second call
completely, as it knows that C<< current + 1 > end >> is false and the
call will never be executed.

=item bool ecb_unreachable ()

This function does nothing itself, except tell the compiler that it will
never be executed. Apart from suppressing a warning in some cases, this
function can be used to implement C<ecb_assume> or similar functions.

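One way an "assume" can be layered on top of an "unreachable" hint is sketched below. This is an assumption about a possible construction, not a description of how F<ecb.h> does it; C<my_unreachable>, C<my_assume> and C<sum4> are hypothetical names. Note that if the assumed condition is ever false, the behaviour of the program is undefined.

```c
/* my_unreachable and my_assume are hypothetical stand-ins */
#if defined __GNUC__ && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
  #define my_unreachable() __builtin_unreachable ()
#else
  #define my_unreachable() ((void)0)
#endif

/* "cond is true": the false branch is declared impossible */
#define my_assume(cond) do { if (!(cond)) my_unreachable (); } while (0)

/* the caller promises that n is a positive multiple of 4 */
static int
sum4 (const int *v, int n)
{
  int i, sum = 0;

  my_assume (n > 0 && n % 4 == 0);

  for (i = 0; i < n; ++i)
    sum += v [i];

  return sum;
}
```

With the promise in place, an optimising compiler is free to drop its handling of leftover elements when it unrolls or vectorises the loop.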
=item bool ecb_prefetch (addr, rw, locality)

Tells the compiler to try to prefetch memory at the given C<addr>ess
for either reading (C<rw> = 0) or writing (C<rw> = 1). A C<locality> of
C<0> means that there will only be one access later, C<3> means that
the data will likely be accessed very often, and values in between mean
something... in between. The memory pointed to by the address does not
need to be accessible (it could be a null pointer for example), but C<rw>
and C<locality> must be compile-time constants.

An obvious way to use this is to prefetch some data far away, in a big
array you loop over. This prefetches memory some 128 array elements later,
in the hope that it will be ready when the CPU arrives at that location.

   int sum = 0;

   for (i = 0; i < N; ++i)
     {
       sum += arr [i];
       ecb_prefetch (arr + i + 128, 0, 0);
     }

It's hard to predict how far to prefetch, and most CPUs that can prefetch
are often good enough to predict this kind of behaviour themselves. It
gets more interesting with linked lists, especially when you do some fair
processing on each list element:

   for (node *n = start; n; n = n->next)
     {
       ecb_prefetch (n->next, 0, 0);
       ... do medium amount of work with *n
     }

After processing the node, (part of) the next node might already be in
cache.

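The linked list case can be written out as a complete function, as in the sketch below. It assumes the prefetch can be expressed with GCC's C<__builtin_prefetch>; C<my_prefetch>, C<struct node> and C<sum_list> are hypothetical names used only for this illustration.

```c
#include <stddef.h>

/* my_prefetch is a hypothetical stand-in for ecb_prefetch */
#ifdef __GNUC__
  #define my_prefetch(addr,rw,locality) __builtin_prefetch (addr, rw, locality)
#else
  #define my_prefetch(addr,rw,locality) ((void)0)
#endif

struct node
{
  struct node *next;
  int value;
};

static long
sum_list (const struct node *start)
{
  const struct node *n;
  long sum = 0;

  for (n = start; n; n = n->next)
    {
      my_prefetch (n->next, 0, 0); /* read once; a null pointer is fine here */
      sum += n->value;
    }

  return sum;
}
```
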
=back

=head2 BIT FIDDLING / BIT WIZARDRY

=over 4

=item bool ecb_big_endian ()

=item bool ecb_little_endian ()

These two functions return true if the byte order is big endian
(most-significant byte first) or little endian (least-significant byte
first) respectively.

On systems that are neither, their return values are unspecified.

=item int ecb_ctz32 (uint32_t x)

=item int ecb_ctz64 (uint64_t x)

Returns the index of the least significant bit set in C<x> (or
equivalently the number of bits set to 0 before the least significant bit
set), starting from 0. If C<x> is 0 the result is undefined.

For smaller types than C<uint32_t> you can safely use C<ecb_ctz32>.

For example:

   ecb_ctz32 (3) = 0
   ecb_ctz32 (6) = 1

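What this function computes can be spelled out with a portable loop, as in the sketch below. C<my_ctz32> is a hypothetical reference version written for illustration; it makes no claim about how F<ecb.h> implements the operation, and a real implementation would normally use a compiler built-in where available.

```c
#include <assert.h>
#include <stdint.h>

/* my_ctz32 is a hypothetical, portable reference for ecb_ctz32 */
static int
my_ctz32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result for 0 is undefined */

  /* shift right until the lowest set bit reaches position 0 */
  while (!(x & 1))
    {
      x >>= 1;
      ++r;
    }

  return r;
}
```
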
=item bool ecb_is_pot32 (uint32_t x)

=item bool ecb_is_pot64 (uint64_t x)

Return true iff C<x> is a power of two or C<x == 0>.

For smaller types than C<uint32_t> you can safely use C<ecb_is_pot32>.

=item int ecb_ld32 (uint32_t x)

=item int ecb_ld64 (uint64_t x)

Returns the index of the most significant bit set in C<x>, or the number
of digits the number requires in binary (so that C<< 2**ld <= x <
2**(ld+1) >>). If C<x> is 0 the result is undefined. A common use case is
to compute the integer binary logarithm, i.e. C<floor (log2 (n))>, for
example to see how many bits a certain number requires to be encoded.

This function is similar to the "count leading zero bits" function, except
that that one returns how many zero bits are "in front" of the number (in
the given data type), while C<ecb_ld> returns how many bits the number
itself requires.

For smaller types than C<uint32_t> you can safely use C<ecb_ld32>.

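Both operations are easy to state as portable reference code. The sketch below uses the hypothetical names C<my_is_pot32> and C<my_ld32> and follows the definitions given above (including C<0> counting as a power of two, and C<< 2**ld <= x < 2**(ld+1) >>); it is not taken from F<ecb.h>.

```c
#include <assert.h>
#include <stdint.h>

/* true for powers of two and for 0: clearing the lowest set bit leaves nothing */
static int
my_is_pot32 (uint32_t x)
{
  return !(x & (x - 1));
}

/* index of the most significant set bit, i.e. floor (log2 (x)) */
static int
my_ld32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result for 0 is undefined */

  while (x >>= 1)
    ++r;

  return r;
}
```
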
=item int ecb_popcount32 (uint32_t x)

=item int ecb_popcount64 (uint64_t x)

Returns the number of bits set to 1 in C<x>.

For smaller types than C<uint32_t> you can safely use C<ecb_popcount32>.

For example:

   ecb_popcount32 (7) = 3
   ecb_popcount32 (255) = 8

=item uint8_t ecb_bitrev8 (uint8_t x)

=item uint16_t ecb_bitrev16 (uint16_t x)

=item uint32_t ecb_bitrev32 (uint32_t x)

Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes LSB+1
and so on.

Example:

   ecb_bitrev8 (0xa7) = 0xe5
   ecb_bitrev32 (0xffcc4411) = 0x882233ff

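A common way to reverse bits is to swap progressively larger groups: adjacent bits, then bit pairs, then nibbles. The sketch below shows this for 8 bits under the hypothetical name C<my_bitrev8>; it illustrates the operation and is not necessarily the method F<ecb.h> uses.

```c
#include <stdint.h>

/* my_bitrev8 is a hypothetical reference for ecb_bitrev8 */
static uint8_t
my_bitrev8 (uint8_t x)
{
  x = (uint8_t)(((x >> 1) & 0x55) | ((x & 0x55) << 1)); /* swap adjacent bits */
  x = (uint8_t)(((x >> 2) & 0x33) | ((x & 0x33) << 2)); /* swap bit pairs */
  x = (uint8_t)( (x >> 4)         |  (x << 4));         /* swap nibbles */

  return x;
}
```

For C<0xa7> (binary C<10100111>) this yields C<0xe5> (binary C<11100101>).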
=item uint16_t ecb_bswap16 (uint16_t x)

=item uint32_t ecb_bswap32 (uint32_t x)

=item uint64_t ecb_bswap64 (uint64_t x)

These functions return the value of the 16-bit (32-bit, 64-bit) value
C<x> after reversing the order of bytes (0x11223344 becomes 0x44332211 in
C<ecb_bswap32>).

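Written out with shifts and masks, the byte reversal looks like the sketch below. C<my_bswap16> and C<my_bswap32> are hypothetical portable references for illustration; on GCC a built-in would normally be preferred.

```c
#include <stdint.h>

/* hypothetical portable references for ecb_bswap16 and ecb_bswap32 */
static uint16_t
my_bswap16 (uint16_t x)
{
  return (uint16_t)((x >> 8) | (x << 8));
}

static uint32_t
my_bswap32 (uint32_t x)
{
  return  (x >> 24)                 /* byte 3 -> byte 0 */
       | ((x >>  8) & 0x0000ff00u)  /* byte 2 -> byte 1 */
       | ((x <<  8) & 0x00ff0000u)  /* byte 1 -> byte 2 */
       |  (x << 24);                /* byte 0 -> byte 3 */
}
```
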
=item uint8_t ecb_rotl8 (uint8_t x, unsigned int count)

=item uint16_t ecb_rotl16 (uint16_t x, unsigned int count)

=item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)

=item uint64_t ecb_rotl64 (uint64_t x, unsigned int count)

=item uint8_t ecb_rotr8 (uint8_t x, unsigned int count)

=item uint16_t ecb_rotr16 (uint16_t x, unsigned int count)

=item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)

=item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)

These two families of functions return the value of C<x> after rotating
all the bits by C<count> positions to the right (C<ecb_rotr>) or left
(C<ecb_rotl>).

Current GCC versions understand these functions and usually compile them
to "optimal" code (e.g. a single C<rol> or a combination of C<shld> on
x86).

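A rotation is usually written as two shifts joined with an "or", a pattern compilers tend to recognise. The sketch below uses the hypothetical names C<my_rotl32> and C<my_rotr32>. Reducing C<count> modulo 32 is a choice made for this sketch, to keep every shift well defined in C; this manual does not say how the C<ecb_rot> functions treat counts of 32 or more.

```c
#include <stdint.h>

/* hypothetical portable references for ecb_rotl32 and ecb_rotr32 */
static uint32_t
my_rotl32 (uint32_t x, unsigned int count)
{
  count &= 31; /* assumption of this sketch: counts wrap around */

  return (x << count) | (x >> ((32 - count) & 31));
}

static uint32_t
my_rotr32 (uint32_t x, unsigned int count)
{
  count &= 31;

  return (x >> count) | (x << ((32 - count) & 31));
}
```
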
=back

=head2 ARITHMETIC

=over 4

=item x = ecb_mod (m, n)

Returns C<m> modulo C<n>, which is the same as the non-negative remainder
of the division operation between C<m> and C<n>, using floored
division. Unlike the C remainder operator C<%>, this function ensures that
the return value is never negative and that the two numbers I<m> and
I<m' = m + i * n> result in the same value modulo I<n> - in other words,
C<ecb_mod> implements the mathematical modulo operation, which is missing
in the language.

C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
negatable, that is, both C<m> and C<-m> must be representable in its
type (this typically excludes the minimum signed integer value, the same
limitation as for C</> and C<%> in C).

Current GCC versions compile this into an efficient branchless sequence on
almost all CPUs.

For example, when you want to rotate forward through the members of an
array for increasing C<m> (which might be negative), then you should use
C<ecb_mod>, as the C<%> operator might give either negative results, or
change direction for negative values:

   for (m = -100; m <= 100; ++m)
     {
       int elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
     }

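The difference to C<%> can be seen in a small reference version for C<int>. C<my_mod> is a hypothetical name, and the sketch relies on the C99 rule that C<%> truncates towards zero; it shows the semantics only, not the branchless form mentioned above.

```c
#include <assert.h>

/* my_mod is a hypothetical reference for ecb_mod, restricted to int */
static int
my_mod (int m, int n)
{
  int r;

  assert (n > 0); /* n must be strictly positive */

  r = m % n; /* in C99, r is zero or has the sign of m */

  return r < 0 ? r + n : r;
}
```

For instance, C<-1 % 5> is C<-1> in C99, while C<my_mod (-1, 5)> is C<4>, so stepping C<m> upwards through negative values keeps walking forward through an array.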
=item x = ecb_div_rd (val, div)

=item x = ecb_div_ru (val, div)

Returns C<val> divided by C<div> rounded down or up, respectively.
C<val> and C<div> must have integer types and C<div> must be strictly
positive. Note that these functions are implemented with macros in C
and with function templates in C++.

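Since C99 integer division truncates towards zero, rounding down or up only needs a correction when there is a remainder of the right sign. The sketch below shows this for C<int> under the hypothetical names C<my_div_rd> and C<my_div_ru>; it is an illustration of the semantics, not the macro or template that F<ecb.h> provides.

```c
#include <assert.h>

/* hypothetical references for ecb_div_rd and ecb_div_ru, restricted to int */
static int
my_div_rd (int val, int div)
{
  assert (div > 0);

  /* truncation rounds negative quotients up, so step down if r < 0 */
  return val / div - (val % div < 0);
}

static int
my_div_ru (int val, int div)
{
  assert (div > 0);

  /* truncation rounds positive quotients down, so step up if r > 0 */
  return val / div + (val % div > 0);
}
```
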
=back

=head2 UTILITY

=over 4

=item element_count = ecb_array_length (name)

Returns the number of elements in the array C<name>. For example:

   int primes[] = { 2, 3, 5, 7, 11 };
   int i, sum = 0;

   for (i = 0; i < ecb_array_length (primes); i++)
     sum += primes [i];

=back

=head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF

These symbols need to be defined before including F<ecb.h> the first time.

=over 4

=item ECB_NO_THREADS

If F<ecb.h> is never used from multiple threads, then this symbol can
be defined, in which case memory fences (and similar constructs) are
completely removed, leading to more efficient code and fewer dependencies.

Setting this symbol to a true value implies C<ECB_NO_SMP>.

=item ECB_NO_SMP

The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from
multiple threads, but never concurrently (e.g. if the system the program
runs on has only a single CPU with a single core, no hyperthreading and so
on), then this symbol can be defined, leading to more efficient code and
fewer dependencies.

=back
