/cvs/libecb/ecb.pod
Revision: 1.43
Committed: Tue May 29 14:09:49 2012 UTC (12 years ago) by root
Branch: MAIN
Changes since 1.42: +49 -0 lines
Log Message:
*** empty log message ***

1 =head1 LIBECB - e-C-Builtins
2
3 =head2 ABOUT LIBECB
4
5 Libecb is currently a simple header file that doesn't require any
6 configuration to use or include in your project.
7
8 It's part of the e-suite of libraries, other members of which include
9 libev and libeio.
10
11 Its homepage can be found here:
12
13 http://software.schmorp.de/pkg/libecb
14
15 It mainly provides a number of wrappers around GCC built-ins, together
16 with replacement functions for other compilers. In addition to this,
17 it provides a number of other low-level C utilities, such as endianness
18 detection, byte swapping or bit rotations.
19
20 Or in other words, things that should be built into any standard C system,
21 but aren't, implemented as efficiently as possible with GCC, and still
22 correct with other compilers.
23
24 More might come.
25
26 =head2 ABOUT THE HEADER
27
28 At the moment, all you have to do is copy F<ecb.h> somewhere where your
29 compiler can find it and include it:
30
31 #include <ecb.h>
32
33 The header should work fine for both C and C++ compilation, and gives you
34 all of F<inttypes.h> in addition to the ECB symbols.
35
36 There are currently no object files to link to - future versions might
37 come with an (optional) object code library to link against, to reduce
38 code size or gain access to additional features.
39
40 It also currently includes everything from F<inttypes.h>.
41
42 =head2 ABOUT THIS MANUAL / CONVENTIONS
43
44 This manual mainly describes each (public) function available after
45 including the F<ecb.h> header. The header might define other symbols than
46 these, but these are not part of the public API, and not supported in any
47 way.
48
49 When the manual mentions a "function" then this could be defined either as
50 an inline function, a macro, or an external symbol.
51
52 When functions use a concrete standard type, such as C<int> or
53 C<uint32_t>, then the corresponding function works only with that type. If
54 only a generic name is used (C<expr>, C<cond>, C<value> and so on), then
55 the corresponding function relies on C to implement the correct types, and
56 is usually implemented as a macro. Specifically, a "bool" in this manual
57 refers to any kind of boolean value, not a specific type.
58
59 =head2 TYPES / TYPE SUPPORT
60
61 ecb.h makes sure that the following types are defined (in the expected way):
62
63 int8_t uint8_t int16_t uint16_t
64 int32_t uint32_t int64_t uint64_t
65 intptr_t uintptr_t ptrdiff_t
66
67 The macro C<ECB_PTRSIZE> is defined to the size of a pointer on this
68 platform (currently C<4> or C<8>).
69
70 =head2 LANGUAGE/COMPILER VERSIONS
71
72 =over 4
73
74 =item ECB_C99
75
76 Expands to a true value (suitable for testing by the preprocessor)
77 if the environment claims to be C99 compliant.
78
79 =item ECB_C11
80
81 Expands to a true value (suitable for testing by the preprocessor)
82 if the environment claims to be C11 compliant.
83
84 =item ECB_GCC_VERSION(major,minor)
85
86 Expands to a true value (suitable for testing by the preprocessor)
87 if the compiler used is GNU C and the version is the given version, or
88 higher.
89
90 This macro tries to return false on compilers that claim to be GCC
91 compatible but aren't.
92
93 =back
94
95 =head2 GCC ATTRIBUTES
96
97 A major part of libecb deals with GCC attributes. These are additional
98 attributes that you can assign to functions, variables and sometimes even
99 types - much like C<const> or C<volatile> in C.
100
101 GCC allows attribute declarations to show up in many surprising places,
102 but not in many expected places, so the safest way is to put attribute
103 declarations before the whole declaration:
104
105 ecb_const int mysqrt (int a);
106 ecb_unused int i;
107
108 For variables, it is often nicer to put the attribute after the name, and
109 avoid multiple declarations using commas:
110
111 int i ecb_unused;
112
113 =over 4
114
115 =item ecb_attribute ((attrs...))
116
117 A simple wrapper that expands to C<__attribute__((attrs))> on GCC, and to
118 nothing on other compilers, so the effect is that only GCC sees these.
119
120 Example: use the C<deprecated> attribute on a function.
121
122 ecb_attribute((__deprecated__)) void
123 do_not_use_me_anymore (void);
124
125 =item ecb_unused
126
127 Marks a function or a variable as "unused", which simply suppresses a
128 warning by GCC when it detects it as unused. This is useful when you e.g.
129 declare a variable but do not always use it:
130
131 {
132 int var ecb_unused;
133
134 #ifdef SOMECONDITION
135 var = ...;
136 return var;
137 #else
138 return 0;
139 #endif
140 }
141
142 =item ecb_inline
143
144 This is not actually an attribute, but you use it like one. It expands
145 either to C<static inline> or to just C<static>, if inline isn't
146 supported. It should be used to declare functions that should be inlined,
147 for code size or speed reasons.
148
149 Example: inline this function, as it will surely reduce code size.
150
151 ecb_inline int
152 negmul (int a, int b)
153 {
154 return - (a * b);
155 }
156
157 =item ecb_noinline
158
159 Prevent a function from being inlined - it might be optimised away, but
160 not inlined into other functions. This is useful if you know your function
161 is rarely called and large enough for inlining not to be helpful.
162
163 =item ecb_noreturn
164
165 Marks a function as "not returning, ever". Some typical functions that
166 don't return are C<exit> or C<abort> (which really works hard to not
167 return), and now you can make your own:
168
169 ecb_noreturn void
170 my_abort (const char *errline)
171 {
172 puts (errline);
173 abort ();
174 }
175
176 In this case, the compiler would probably be smart enough to deduce it on
177 its own, so this is mainly useful for declarations.
178
179 =item ecb_const
180
181 Declares that the function only depends on the values of its arguments,
182 much like a mathematical function. It specifically does not read or write
183 any memory any arguments might point to, global variables, or call any
184 non-const functions. It also must not have any side effects.
185
186 Such a function can be optimised much more aggressively by the compiler -
187 for example, multiple calls with the same arguments can be optimised into
188 a single call, which wouldn't be possible if the compiler had to
189 expect any side effects.
190
191 It is best suited for functions in the sense of mathematical functions,
192 such as a function returning the square root of its input argument.
193
194 Not suited would be a function that calculates the hash of some memory
195 area you pass in, prints some messages or looks at a global variable to
196 decide on rounding.
197
198 See C<ecb_pure> for a slightly less restrictive class of functions.
199
200 =item ecb_pure
201
202 Similar to C<ecb_const>, declares a function that has no side
203 effects. Unlike C<ecb_const>, the function is allowed to examine global
204 variables and any other memory areas (such as the ones passed to it via
205 pointers).
206
207 While these functions cannot be optimised as aggressively as C<ecb_const>
208 functions, they can still be optimised away on many occasions, and the
209 compiler has more freedom in moving calls to them around.
210
211 Typical examples for such functions would be C<strlen> or C<memcmp>. A
212 function that calculates the MD5 sum of some input and updates some MD5
213 state passed as argument would I<NOT> be pure, however, as it would modify
214 some memory area that is not the return value.
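
The difference between the two attributes can be sketched without F<ecb.h>. The C<SKETCH_CONST> and C<SKETCH_PURE> macros and both functions below are hypothetical stand-ins invented for this illustration, not part of libecb; they assume a compiler that understands GCC-style attributes and expand to nothing elsewhere.

```c
#include <stddef.h>

/* Hypothetical stand-ins for ecb_const / ecb_pure (not libecb code). */
#if defined(__GNUC__)
#  define SKETCH_CONST __attribute__((__const__))
#  define SKETCH_PURE  __attribute__((__pure__))
#else
#  define SKETCH_CONST
#  define SKETCH_PURE
#endif

/* Depends only on the value of its argument: a candidate for "const". */
SKETCH_CONST static int
sketch_square (int a)
{
  return a * a;
}

/* Reads memory through a pointer but modifies nothing:
   acceptable as "pure", but not as "const". */
SKETCH_PURE static size_t
sketch_count_zeros (const unsigned char *buf, size_t len)
{
  size_t i, zeros = 0;

  for (i = 0; i < len; ++i)
    zeros += buf [i] == 0;

  return zeros;
}
```

A function that wrote to C<buf>, or to any global state, would qualify for neither attribute.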
215
216 =item ecb_hot
217
218 This declares a function as "hot" with regard to the cache - the function
219 is used so often, that it is very beneficial to keep it in the cache if
220 possible.
221
222 The compiler reacts by trying to place hot functions near to each other in
223 memory.
224
225 Whether a function is hot or not often depends on the whole program,
226 and less on the function itself. C<ecb_cold> is likely more useful in
227 practice.
228
229 =item ecb_cold
230
231 The opposite of C<ecb_hot> - declares a function as "cold" with regard to
232 the cache, or in other words, this function is not called often, or not at
233 speed-critical times, and keeping it in the cache might be a waste of said
234 cache.
235
236 In addition to placing cold functions together (or at least away from hot
237 functions), this knowledge can be used in other ways, for example, the
238 function will be optimised for size, as opposed to speed, and codepaths
239 leading to calls to those functions can automatically be marked as if
240 C<ecb_expect_false> had been used to reach them.
241
242 Good examples for such functions would be error reporting functions, or
243 functions only called in exceptional or rare cases.
244
245 =item ecb_artificial
246
247 Declares the function as "artificial", in this case meaning that this
248 function is not really meant to be a function, but more like an accessor
249 - many methods in C++ classes are mere accessor functions, and having a
250 crash reported in such a method, or single-stepping through them, is not
251 usually so helpful, especially when it's inlined to just a few instructions.
252
253 Marking them as artificial will instruct the debugger about just this,
254 leading to happier debugging and thus happier lives.
255
256 Example: in some kind of smart-pointer class, mark the pointer accessor as
257 artificial, so that the whole class acts more like a pointer and less like
258 some C++ abstraction monster.
259
260 template<typename T>
261 struct my_smart_ptr
262 {
263 T *value;
264
265 ecb_artificial
266 operator T *()
267 {
268 return value;
269 }
270 };
271
272 =back
273
274 =head2 OPTIMISATION HINTS
275
276 =over 4
277
278 =item bool ecb_is_constant(expr)
279
280 Returns true iff the expression can be deduced to be a compile-time
281 constant, and false otherwise.
282
283 For example, when you have a C<rndm16> function that returns a 16 bit
284 random number, and you have a function that maps this to a range from
285 0..n-1, then you could use this inline function in a header file:
286
287 ecb_inline uint32_t
288 rndm (uint32_t n)
289 {
290 return (n * (uint32_t)rndm16 ()) >> 16;
291 }
292
293 However, for powers of two, you could use a normal mask, but that is only
294 worth it if, at compile time, you can detect this case. This is the case
295 when the passed number is a constant and also a power of two (C<(n & (n -
296 1)) == 0>):
297
298 ecb_inline uint32_t
299 rndm (uint32_t n)
300 {
301 return ecb_is_constant (n) && !(n & (n - 1))
302 ? rndm16 () & (n - 1)
303 : (n * (uint32_t)rndm16 ()) >> 16;
304 }
305
306 =item bool ecb_expect (expr, value)
307
308 Evaluates C<expr> and returns it. In addition, it tells the compiler that
309 the C<expr> evaluates to C<value> a lot, which can be used for static
310 branch optimisations.
311
312 Usually, you want to use the more intuitive C<ecb_expect_true> and
313 C<ecb_expect_false> functions instead.
314
315 =item bool ecb_expect_true (cond)
316
317 =item bool ecb_expect_false (cond)
318
319 These two functions expect an expression that is true or false and return
320 C<1> or C<0>, respectively, so when used in the condition of an C<if> or
321 other conditional statement, it will not change the program:
322
323 /* these two do the same thing */
324 if (some_condition) ...;
325 if (ecb_expect_true (some_condition)) ...;
326
327 However, by using C<ecb_expect_true>, you tell the compiler that the
328 condition is likely to be true (and for C<ecb_expect_false>, that it is
329 unlikely to be true).
330
331 For example, when you check for a null pointer and expect this to be a
332 rare, exceptional, case, then use C<ecb_expect_false>:
333
334 void my_free (void *ptr)
335 {
336 if (ecb_expect_false (ptr == 0))
337 return;
338 }
339
340 Consistent use of these functions to mark away exceptional cases or to
341 tell the compiler what the hot path through a function is can increase
342 performance considerably.
343
344 You might know these functions under the names C<likely> and C<unlikely>
345 - while these are common aliases, we find that the expect name is easier
346 to understand when quickly skimming code. If you wish, you can use
347 C<ecb_likely> instead of C<ecb_expect_true> and C<ecb_unlikely> instead of
348 C<ecb_expect_false> - these are simply aliases.
349
350 A very good example is in a function that reserves more space for some
351 memory block (for example, inside an implementation of a string stream) -
352 each time something is added, you have to check for a buffer overrun, but
353 you expect that most checks will turn out to be false:
354
355 /* make sure we have "size" extra room in our buffer */
356 ecb_inline void
357 reserve (int size)
358 {
359 if (ecb_expect_false (current + size > end))
360 real_reserve_method (size); /* presumably noinline */
361 }
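
For readers without F<ecb.h> at hand, one common way to build such hints is on top of GCC's C<__builtin_expect>. The C<sketch_*> names below are hypothetical and exist only for this illustration; whether libecb defines its macros exactly like this is an assumption.

```c
/* Hypothetical stand-ins (not libecb code): hint on GCC-compatible
   compilers, plain evaluation everywhere else. */
#if defined(__GNUC__)
#  define sketch_expect(expr, value) __builtin_expect ((expr), (value))
#else
#  define sketch_expect(expr, value) (expr)
#endif

/* Normalise the condition to 0 or 1 before handing it to the hint. */
#define sketch_expect_true(cond)  sketch_expect (!!(cond), 1)
#define sketch_expect_false(cond) sketch_expect (!!(cond), 0)

/* Division with the error path marked as unlikely.
   Returns 0 on success and -1 when the divisor is zero. */
static int
sketch_checked_div (int a, int b, int *out)
{
  if (sketch_expect_false (b == 0))
    return -1;

  *out = a / b;
  return 0;
}
```

The hint never changes the value of the condition, so removing it leaves program behaviour unchanged.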
362
363 =item bool ecb_assume (cond)
364
365 Try to tell the compiler that some condition is true, even if it's not
366 obvious.
367
368 This can be used to teach the compiler about invariants or other
369 conditions that might improve code generation, but which are impossible to
370 deduce from the code itself.
371
372 For example, the example reservation function from the C<ecb_expect_false>
373 description could be written thus (only C<ecb_assume> was added):
374
375 ecb_inline void
376 reserve (int size)
377 {
378 if (ecb_expect_false (current + size > end))
379 real_reserve_method (size); /* presumably noinline */
380
381 ecb_assume (current + size <= end);
382 }
383
384 If you then call this function twice, like this:
385
386 reserve (10);
387 reserve (1);
388
389 Then the compiler I<might> be able to optimise out the second call
390 completely, as it knows that C<< current + 1 > end >> is false and the
391 call will never be executed.
392
393 =item bool ecb_unreachable ()
394
395 This function does nothing itself, except tell the compiler that it will
396 never be executed. Apart from suppressing a warning in some cases, this
397 function can be used to implement C<ecb_assume> or similar functions.
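
A minimal sketch of how an "assume" can be layered on an "unreachable" marker, as the paragraph above suggests. The C<sketch_*> names are hypothetical, and the use of C<__builtin_unreachable> (available in GCC 4.5 and later) is an assumption about one possible implementation, not a description of libecb's own.

```c
/* Hypothetical stand-ins (not libecb code). */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
#  define sketch_unreachable() __builtin_unreachable ()
#else
#  define sketch_unreachable() ((void)0)
#endif

/* If cond were false we would reach "unreachable" code, so the
   compiler may conclude that cond is always true here. */
#define sketch_assume(cond) \
  do { if (!(cond)) sketch_unreachable (); } while (0)

/* The caller promises n < 100; breaking that promise is undefined. */
static unsigned int
sketch_low_digit (unsigned int n)
{
  sketch_assume (n < 100);
  return n % 10;
}
```

Unlike an assertion, nothing is checked at run time, so a wrong assumption silently produces undefined behaviour.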
398
399 =item bool ecb_prefetch (addr, rw, locality)
400
401 Tells the compiler to try to prefetch memory at the given C<addr>ess
402 for either reading (C<rw> = 0) or writing (C<rw> = 1). A C<locality> of
403 C<0> means that there will only be one access later, C<3> means that
404 the data will likely be accessed very often, and values in between mean
405 something... in between. The memory pointed to by the address does not
406 need to be accessible (it could be a null pointer for example), but C<rw>
407 and C<locality> must be compile-time constants.
408
409 An obvious way to use this is to prefetch some data far away, in a big
410 array you loop over. This prefetches memory some 128 array elements later,
411 in the hope that it will be ready when the CPU arrives at that location.
412
413 int sum = 0;
414
415 for (i = 0; i < N; ++i)
416 {
417 sum += arr [i];
418 ecb_prefetch (arr + i + 128, 0, 0);
419 }
420
421 It's hard to predict how far to prefetch, and most CPUs that can prefetch
422 are often good enough to predict this kind of behaviour themselves. It
423 gets more interesting with linked lists, especially when you do some fair
424 processing on each list element:
425
426 for (node *n = start; n; n = n->next)
427 {
428 ecb_prefetch (n->next, 0, 0);
429 ... do medium amount of work with *n
430 }
431
432 After processing the node, (part of) the next node might already be in
433 cache.
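
The linked-list case can be written out as a complete function. This sketch does not include F<ecb.h>; C<sketch_prefetch> is a hypothetical stand-in that assumes GCC's C<__builtin_prefetch> and does nothing on other compilers, and the node type is invented for the example.

```c
/* Hypothetical stand-in for a prefetch hint (not libecb code). */
#if defined(__GNUC__)
#  define sketch_prefetch(addr, rw, locality) \
     __builtin_prefetch ((addr), (rw), (locality))
#else
#  define sketch_prefetch(addr, rw, locality) ((void)0)
#endif

struct sketch_node
{
  int value;
  struct sketch_node *next;
};

/* Sums all values, asking for the next node while working on this one.
   Prefetching a null pointer at the end of the list is harmless. */
static long
sketch_sum_list (const struct sketch_node *n)
{
  long sum = 0;

  for (; n; n = n->next)
    {
      sketch_prefetch (n->next, 0, 0);
      sum += n->value;
    }

  return sum;
}
```

With so little work per node the hint is unlikely to help; it stands in for the "medium amount of work" case described above.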
434
435 =back
436
437 =head2 BIT FIDDLING / BIT WIZARDRY
438
439 =over 4
440
441 =item bool ecb_big_endian ()
442
443 =item bool ecb_little_endian ()
444
445 These two functions return true if the byte order is big endian
446 (most-significant byte first) or little endian (least-significant byte
447 first) respectively.
448
449 On systems that are neither, their return values are unspecified.
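
What these tests mean can be sketched with a run-time probe. The functions below are hypothetical illustrations, not libecb's implementation (which may well decide this at compile time); they inspect the first byte of a known 32-bit value.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest address of 0x11223344. */
static unsigned char
sketch_first_byte (void)
{
  uint32_t probe = 0x11223344;
  unsigned char first;

  memcpy (&first, &probe, 1);
  return first;
}

/* Least-significant byte first. */
static int
sketch_little_endian (void)
{
  return sketch_first_byte () == 0x44;
}

/* Most-significant byte first. */
static int
sketch_big_endian (void)
{
  return sketch_first_byte () == 0x11;
}
```

On a mixed-endian machine both probes return false, which is one way the "neither" case could show up.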
450
451 =item int ecb_ctz32 (uint32_t x)
452
453 =item int ecb_ctz64 (uint64_t x)
454
455 Returns the index of the least significant bit set in C<x> (or
456 equivalently the number of bits set to 0 before the least significant bit
457 set), starting from 0. If C<x> is 0 the result is undefined.
458
459 For smaller types than C<uint32_t> you can safely use C<ecb_ctz32>.
460
461 For example:
462
463 ecb_ctz32 (3) = 0
464 ecb_ctz32 (6) = 1
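
A portable reference loop shows what is being computed. C<sketch_ctz32> is a hypothetical name for this illustration and is far slower than a compiler built-in; it is not how libecb implements the function.

```c
#include <stdint.h>

/* Counts the zero bits below the lowest set bit.
   The caller must not pass 0: the loop would never terminate,
   which matches the "undefined for 0" rule in the text. */
static int
sketch_ctz32 (uint32_t x)
{
  int count = 0;

  while (!(x & 1))
    {
      x >>= 1;
      ++count;
    }

  return count;
}
```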
465
466 =item bool ecb_is_pot32 (uint32_t x)
467
468 =item bool ecb_is_pot64 (uint64_t x)
469
470 Return true iff C<x> is a power of two or C<x == 0>.
471
472 For smaller types than C<uint32_t> you can safely use C<ecb_is_pot32>.
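
The test is commonly written as a single bit trick, sketched here under the hypothetical name C<sketch_is_pot32>: clearing the lowest set bit of a power of two leaves nothing behind. Whether libecb uses exactly this expression is an assumption.

```c
#include <stdint.h>

/* x & (x - 1) clears the lowest set bit, so the result is 0
   exactly when x has at most one bit set (including x == 0). */
static int
sketch_is_pot32 (uint32_t x)
{
  return !(x & (x - 1));
}
```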
473
474 =item int ecb_ld32 (uint32_t x)
475
476 =item int ecb_ld64 (uint64_t x)
477
478 Returns the index of the most significant bit set in C<x>, that is, one
479 less than the number of binary digits C<x> requires (so that C<< 2**ld <= x <
480 2**(ld+1) >>). If C<x> is 0 the result is undefined. A common use case is
481 to compute the integer binary logarithm, i.e. C<floor (log2 (n))>, for
482 example to find how many bits (C<ld + 1>) a number requires to be encoded.
483
484 This function is similar to the "count leading zero bits" function, except
485 that that one returns how many zero bits are "in front" of the number (in
486 the given data type), while C<ecb_ld> returns the index of the highest bit
487 the number itself uses.
488
489 For smaller types than C<uint32_t> you can safely use C<ecb_ld32>.
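
The relation C<< 2**ld <= x < 2**(ld+1) >> can be checked against a simple reference loop. C<sketch_ld32> is a hypothetical name used only here, not libecb's implementation.

```c
#include <stdint.h>

/* Index of the highest set bit, i.e. floor (log2 (x)).
   As in the text, the result for x == 0 is not meaningful. */
static int
sketch_ld32 (uint32_t x)
{
  int ld = 0;

  while (x >>= 1)
    ++ld;

  return ld;
}
```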
490
491 =item int ecb_popcount32 (uint32_t x)
492
493 =item int ecb_popcount64 (uint64_t x)
494
495 Returns the number of bits set to 1 in C<x>.
496
497 For smaller types than C<uint32_t> you can safely use C<ecb_popcount32>.
498
499 For example:
500
501 ecb_popcount32 (7) = 3
502 ecb_popcount32 (255) = 8
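
For comparison, a portable reference version can clear one set bit per iteration. C<sketch_popcount32> is a hypothetical name for this illustration and says nothing about how libecb computes the result.

```c
#include <stdint.h>

/* Each x &= x - 1 removes the lowest set bit, so the loop
   runs once per bit that is set to 1. */
static int
sketch_popcount32 (uint32_t x)
{
  int count = 0;

  while (x)
    {
      x &= x - 1;
      ++count;
    }

  return count;
}
```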
503
504 =item uint8_t ecb_bitrev8 (uint8_t x)
505
506 =item uint16_t ecb_bitrev16 (uint16_t x)
507
508 =item uint32_t ecb_bitrev32 (uint32_t x)
509
510 Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes LSB+1
511 and so on.
512
513 Example:
514
515 ecb_bitrev8 (0xa7) = 0xe5
516 ecb_bitrev32 (0xffcc4411) = 0x882233ff
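
One well-known way to reverse eight bits is to swap neighbouring bits, then bit pairs, then nibbles. The sketch below uses the hypothetical name C<sketch_bitrev8>; it illustrates the operation and is not taken from libecb.

```c
#include <stdint.h>

static uint8_t
sketch_bitrev8 (uint8_t x)
{
  /* swap neighbouring bits */
  x = (uint8_t)(((x >> 1) & 0x55) | ((x & 0x55) << 1));
  /* swap neighbouring pairs of bits */
  x = (uint8_t)(((x >> 2) & 0x33) | ((x & 0x33) << 2));
  /* swap the two nibbles */
  x = (uint8_t)((x >> 4) | (x << 4));

  return x;
}
```

Applying the function twice gives back the original value, which makes for an easy sanity check.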
517
518 =item uint32_t ecb_bswap16 (uint32_t x)
519
520 =item uint32_t ecb_bswap32 (uint32_t x)
521
522 =item uint64_t ecb_bswap64 (uint64_t x)
523
524 These functions return the value of the 16-bit (32-bit, 64-bit) value
525 C<x> after reversing the order of bytes (0x11223344 becomes 0x44332211 in
526 C<ecb_bswap32>).
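
The 32-bit case written out with shifts and masks, as a portable illustration. C<sketch_bswap32> is a hypothetical name; a real implementation would normally prefer a compiler built-in where one exists.

```c
#include <stdint.h>

/* Moves each of the four bytes to its mirrored position. */
static uint32_t
sketch_bswap32 (uint32_t x)
{
  return  (x >> 24)
       | ((x >>  8) & 0x0000ff00u)
       | ((x <<  8) & 0x00ff0000u)
       |  (x << 24);
}
```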
527
528 =item uint8_t ecb_rotl8 (uint8_t x, unsigned int count)
529
530 =item uint16_t ecb_rotl16 (uint16_t x, unsigned int count)
531
532 =item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
533
534 =item uint64_t ecb_rotl64 (uint64_t x, unsigned int count)
535
536 =item uint8_t ecb_rotr8 (uint8_t x, unsigned int count)
537
538 =item uint16_t ecb_rotr16 (uint16_t x, unsigned int count)
539
540 =item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
541
542 =item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)
543
544 These two families of functions return the value of C<x> after rotating
545 all the bits by C<count> positions to the right (C<ecb_rotr>) or left
546 (C<ecb_rotl>).
547
548 Current GCC versions understand these functions and usually compile them
549 to "optimal" code (e.g. a single C<rol> or a combination of C<shld> on
550 x86).
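
A rotation is usually spelled as two shifts combined with "or". In the sketch below (hypothetical C<sketch_*> names) the count is masked to 0..31 so that no shift by the full width can occur; how libecb treats counts of 32 or more is not stated in this manual, so that masking is an assumption of the example.

```c
#include <stdint.h>

/* Rotate left: bits leaving at the top re-enter at the bottom. */
static uint32_t
sketch_rotl32 (uint32_t x, unsigned int count)
{
  count &= 31;
  return (x << count) | (x >> ((32 - count) & 31));
}

/* Rotate right: bits leaving at the bottom re-enter at the top. */
static uint32_t
sketch_rotr32 (uint32_t x, unsigned int count)
{
  count &= 31;
  return (x >> count) | (x << ((32 - count) & 31));
}
```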
551
552 =back
553
554 =head2 ARITHMETIC
555
556 =over 4
557
558 =item x = ecb_mod (m, n)
559
560 Returns C<m> modulo C<n>, which is the same as the non-negative remainder
561 of the division operation between C<m> and C<n>, using floored
562 division. Unlike the C remainder operator C<%>, this function ensures that
563 the return value is never negative and that the two numbers I<m> and
564 I<m' = m + i * n> result in the same value modulo I<n> - in other words,
565 C<ecb_mod> implements the mathematical modulo operation, which is missing
566 in the language.
567
568 C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
569 negatable, that is, both C<m> and C<-m> must be representable in its
570 type (this typically excludes the minimum signed integer value, the same
571 limitation as for C</> and C<%> in C).
572
573 Current GCC versions compile this into an efficient branchless sequence on
574 almost all CPUs.
575
576 For example, when you want to rotate forward through the members of an
577 array for increasing C<m> (which might be negative), then you should use
578 C<ecb_mod>, as the C<%> operator might give either negative results, or
579 change direction for negative values:
580
581 for (m = -100; m <= 100; ++m)
582 elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
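
The difference from C<%> is easiest to see in a plain C version for C<int>. C<sketch_mod> is a hypothetical name, and the branch below is only for clarity; it is not the branchless sequence mentioned above.

```c
/* Mathematical modulo for n >= 1: the result is always in 0 .. n-1.
   In C99, % truncates towards zero, so the remainder can be
   negative when m is negative; shift it up by n in that case. */
static int
sketch_mod (int m, int n)
{
  int r = m % n;

  return r < 0 ? r + n : r;
}
```

For example, C<-1 % 5> is C<-1> in C99, while the modulo of -1 and 5 is 4.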
583
584 =item x = ecb_div_rd (val, div)
585
586 =item x = ecb_div_ru (val, div)
587
588 Returns C<val> divided by C<div> rounded down or up, respectively.
589 C<val> and C<div> must have integer types and C<div> must be strictly
590 positive. Note that these functions are implemented with macros in C
591 and with function templates in C++.
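
A sketch of both rounding modes for C<int>, under the hypothetical names C<sketch_div_rd> and C<sketch_div_ru>. It assumes C99 semantics, where C</> truncates towards zero and C<%> takes the sign of the dividend, and a strictly positive divisor.

```c
/* Rounds towards negative infinity (div must be >= 1). */
static int
sketch_div_rd (int val, int div)
{
  /* truncation rounded up for negative inexact results: step down */
  return val / div - (val % div < 0);
}

/* Rounds towards positive infinity (div must be >= 1). */
static int
sketch_div_ru (int val, int div)
{
  /* truncation rounded down for positive inexact results: step up */
  return val / div + (val % div > 0);
}
```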
592
593 =back
594
595 =head2 UTILITY
596
597 =over 4
598
599 =item element_count = ecb_array_length (name)
600
601 Returns the number of elements in the array C<name>. For example:
602
603 int primes[] = { 2, 3, 5, 7, 11 };
604 int sum = 0;
605
606 for (i = 0; i < ecb_array_length (primes); i++)
607 sum += primes [i];
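
Such a macro is conventionally built from C<sizeof>. The sketch below uses the hypothetical name C<sketch_array_length>; it only works on real arrays, not on pointers, and whether libecb adds further checks is not covered here.

```c
#include <stddef.h>

/* Total size of the array divided by the size of one element. */
#define sketch_array_length(name) (sizeof (name) / sizeof ((name) [0]))

/* Sums a small table without hard-coding its length. */
static int
sketch_sum_primes (void)
{
  static const int primes[] = { 2, 3, 5, 7, 11 };
  size_t i;
  int sum = 0;

  for (i = 0; i < sketch_array_length (primes); i++)
    sum += primes [i];

  return sum;
}
```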
608
609 =back
610
611 =head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF
612
613 These symbols need to be defined before including F<ecb.h> the first time.
614
615 =over 4
616
617 =item ECB_NO_THREADS
618
619 If F<ecb.h> is never used from multiple threads, then this symbol can
620 be defined, in which case memory fences (and similar constructs) are
621 completely removed, leading to more efficient code and fewer dependencies.
622
623 Setting this symbol to a true value implies C<ECB_NO_SMP>.
624
625 =item ECB_NO_SMP
626
627 The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from
628 multiple threads, but never concurrently (e.g. if the system the program
629 runs on has only a single CPU with a single core, no hyperthreading and so
630 on), then this symbol can be defined, leading to more efficient code and
631 fewer dependencies.
632
633 =back
634
635