/cvs/libecb/ecb.pod
Revision: 1.19
Committed: Fri May 27 00:04:05 2011 UTC (13 years ago) by sf-exg
Branch: MAIN
Changes since 1.18: +6 -6 lines
Log Message:
Fix typos.

1 root 1.14 =head1 LIBECB - e-C-Builtins
2 root 1.3
3 root 1.14 =head2 ABOUT LIBECB
4    
5     Libecb is currently a simple header file that doesn't require any
6     configuration to use or include in your project.
7    
8 sf-exg 1.16 It's part of the e-suite of libraries, other members of which include
9 root 1.14 libev and libeio.
10    
11     Its homepage can be found here:
12    
13     http://software.schmorp.de/pkg/libecb
14    
It mainly provides a number of wrappers around GCC built-ins, together
with replacement functions for other compilers. In addition to this,
it provides a number of other low-level C utilities, such as endianness
detection, byte swapping or bit rotations.

Or in other words, things that should be built into any standard C
system, but aren't.
22 root 1.17
23 root 1.14 More might come.
24 root 1.3
25     =head2 ABOUT THE HEADER
26    
27 root 1.14 At the moment, all you have to do is copy F<ecb.h> somewhere where your
28     compiler can find it and include it:
29    
30     #include <ecb.h>
31    
32     The header should work fine for both C and C++ compilation, and gives you
33     all of F<inttypes.h> in addition to the ECB symbols.
34    
35 sf-exg 1.16 There are currently no object files to link to - future versions might
36 root 1.14 come with an (optional) object code library to link against, to reduce
37     code size or gain access to additional features.
38    
39     It also currently includes everything from F<inttypes.h>.
40    
41     =head2 ABOUT THIS MANUAL / CONVENTIONS
42    
43     This manual mainly describes each (public) function available after
44     including the F<ecb.h> header. The header might define other symbols than
45     these, but these are not part of the public API, and not supported in any
46     way.
47    
When the manual mentions a "function", it could be defined either as
an inline function, a macro, or an external symbol.
50    
51     When functions use a concrete standard type, such as C<int> or
52     C<uint32_t>, then the corresponding function works only with that type. If
53     only a generic name is used (C<expr>, C<cond>, C<value> and so on), then
54     the corresponding function relies on C to implement the correct types, and
55     is usually implemented as a macro. Specifically, a "bool" in this manual
56     refers to any kind of boolean value, not a specific type.
57 root 1.1
58     =head2 GCC ATTRIBUTES
59    
60 root 1.3 blabla where to put, what others
61    
62 root 1.1 =over 4
63    
64 root 1.2 =item ecb_attribute ((attrs...))
65 root 1.1
66 root 1.15 A simple wrapper that expands to C<__attribute__((attrs))> on GCC, and to
67     nothing on other compilers, so the effect is that only GCC sees these.
68    
69     Example: use the C<deprecated> attribute on a function.
70    
71     ecb_attribute((__deprecated__)) void
72     do_not_use_me_anymore (void);
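
As a rough sketch of the mechanism described above, such a wrapper could
be written along the following lines. The names C<my_attribute> and
C<add_one> are hypothetical stand-ins for illustration and are not taken
from F<ecb.h>:

```c
/* Hypothetical stand-in for ecb_attribute; not the libecb definition. */
#if defined(__GNUC__)
  #define my_attribute(attrlist) __attribute__ (attrlist)
#else
  #define my_attribute(attrlist) /* expands to nothing elsewhere */
#endif

/* GCC sees the attribute, other compilers see a plain definition. */
my_attribute ((__noinline__)) static int
add_one (int x)
{
  return x + 1;
}
```

The double parentheses at the call site let the macro pass an arbitrary,
comma-separated attribute list through as a single argument.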
73 root 1.2
74 root 1.3 =item ecb_unused
75    
76     Marks a function or a variable as "unused", which simply suppresses a
77     warning by GCC when it detects it as unused. This is useful when you e.g.
78     declare a variable but do not always use it:
79    
80 root 1.15 {
81     int var ecb_unused;
82 root 1.3
83 root 1.15 #ifdef SOMECONDITION
84     var = ...;
85     return var;
86     #else
87     return 0;
88     #endif
89     }
90 root 1.3
91 root 1.2 =item ecb_noinline
92    
93 root 1.9 Prevent a function from being inlined - it might be optimised away, but
94 root 1.3 not inlined into other functions. This is useful if you know your function
95     is rarely called and large enough for inlining not to be helpful.
96    
97 root 1.2 =item ecb_noreturn
98    
99 root 1.17 Marks a function as "not returning, ever". Some typical functions that
100     don't return are C<exit> or C<abort> (which really works hard to not
101     return), and now you can make your own:
102    
103     ecb_noreturn void
104     my_abort (const char *errline)
105     {
106     puts (errline);
107     abort ();
108     }
109    
110 sf-exg 1.19 In this case, the compiler would probably be smart enough to deduce it on
111     its own, so this is mainly useful for declarations.
112 root 1.17
113 root 1.2 =item ecb_const
114    
115 sf-exg 1.19 Declares that the function only depends on the values of its arguments,
116 root 1.17 much like a mathematical function. It specifically does not read or write
117     any memory any arguments might point to, global variables, or call any
118     non-const functions. It also must not have any side effects.
119    
120     Such a function can be optimised much more aggressively by the compiler -
121     for example, multiple calls with the same arguments can be optimised into
122     a single call, which wouldn't be possible if the compiler would have to
123     expect any side effects.
124    
125     It is best suited for functions in the sense of mathematical functions,
126 sf-exg 1.19 such as a function returning the square root of its input argument.
127 root 1.17
128     Not suited would be a function that calculates the hash of some memory
129     area you pass in, prints some messages or looks at a global variable to
130     decide on rounding.
131    
132     See C<ecb_pure> for a slightly less restrictive class of functions.
133    
134 root 1.2 =item ecb_pure
135    
136 root 1.17 Similar to C<ecb_const>, declares a function that has no side
137     effects. Unlike C<ecb_const>, the function is allowed to examine global
138     variables and any other memory areas (such as the ones passed to it via
139     pointers).
140    
While these functions cannot be optimised as aggressively as C<ecb_const>
functions, they can still be optimised away on many occasions, and the
compiler has more freedom in moving calls to them around.
144    
145     Typical examples for such functions would be C<strlen> or C<memcmp>. A
146     function that calculates the MD5 sum of some input and updates some MD5
147     state passed as argument would I<NOT> be pure, however, as it would modify
148     some memory area that is not the return value.
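
To make the difference between the two classes concrete, here is a small
sketch. It assumes the GCC C<const> and C<pure> function attributes as
the underlying mechanism; C<my_const>, C<my_pure>, C<square> and
C<count_byte> are hypothetical names used only for this illustration:

```c
#include <stddef.h>

/* Hypothetical stand-ins for ecb_const and ecb_pure. */
#if defined(__GNUC__)
  #define my_const __attribute__ ((__const__))
  #define my_pure  __attribute__ ((__pure__))
#else
  #define my_const
  #define my_pure
#endif

/* const: the result depends only on the argument value */
my_const static unsigned int
square (unsigned int x)
{
  return x * x;
}

/* pure: reads memory through a pointer, but has no side effects */
my_pure static size_t
count_byte (const unsigned char *buf, size_t len, unsigned char byte)
{
  size_t n = 0;

  for (size_t i = 0; i < len; ++i)
    n += buf [i] == byte;

  return n;
}
```

C<count_byte> could not be declared const, because its result changes
when the buffer contents change even though the pointer value stays the
same.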
149    
150 root 1.2 =item ecb_hot
151    
152 root 1.17 This declares a function as "hot" with regards to the cache - the function
153     is used so often, that it is very beneficial to keep it in the cache if
154     possible.
155    
156     The compiler reacts by trying to place hot functions near to each other in
157     memory.
158    
159 sf-exg 1.19 Whether a function is hot or not often depends on the whole program,
160 root 1.17 and less on the function itself. C<ecb_cold> is likely more useful in
161     practise.
162    
163 root 1.2 =item ecb_cold
164    
165 root 1.17 The opposite of C<ecb_hot> - declares a function as "cold" with regards to
166     the cache, or in other words, this function is not called often, or not at
167     speed-critical times, and keeping it in the cache might be a waste of said
168     cache.
169    
170     In addition to placing cold functions together (or at least away from hot
171     functions), this knowledge can be used in other ways, for example, the
172     function will be optimised for size, as opposed to speed, and codepaths
173     leading to calls to those functions can automatically be marked as if
174 sf-exg 1.19 C<ecb_unlikely> had been used to reach them.
175 root 1.17
176     Good examples for such functions would be error reporting functions, or
177     functions only called in exceptional or rare cases.
178    
179 root 1.2 =item ecb_artificial
180    
Declares the function as "artificial", in this case meaning that this
function is not really meant to be a function, but more like an accessor
- many methods in C++ classes are mere accessor functions, and having a
crash reported in such a method, or single-stepping through them, is not
usually so helpful, especially when it's inlined to just a few instructions.
186    
187     Marking them as artificial will instruct the debugger about just this,
188     leading to happier debugging and thus happier lives.
189    
190     Example: in some kind of smart-pointer class, mark the pointer accessor as
191     artificial, so that the whole class acts more like a pointer and less like
192     some C++ abstraction monster.
193    
194     template<typename T>
195     struct my_smart_ptr
196     {
197     T *value;
198    
199     ecb_artificial
200     operator T *()
201     {
202     return value;
203     }
204     };
205    
206 root 1.2 =back
207 root 1.1
208     =head2 OPTIMISATION HINTS
209    
210     =over 4
211    
212 root 1.14 =item bool ecb_is_constant(expr)
213 root 1.1
214 root 1.3 Returns true iff the expression can be deduced to be a compile-time
215     constant, and false otherwise.
216    
217     For example, when you have a C<rndm16> function that returns a 16 bit
218     random number, and you have a function that maps this to a range from
219 root 1.5 0..n-1, then you could use this inline function in a header file:
220 root 1.3
221     ecb_inline uint32_t
222     rndm (uint32_t n)
223     {
224 root 1.6 return (n * (uint32_t)rndm16 ()) >> 16;
225 root 1.3 }
226    
However, for powers of two, you could use a normal mask, but that is only
worth it if, at compile time, you can detect this case. This is the case
when the passed number is a constant and also a power of two
(C<(n & (n - 1)) == 0>):

   ecb_inline uint32_t
   rndm (uint32_t n)
   {
     return ecb_is_constant (n) && !(n & (n - 1))
            ? rndm16 () & (n - 1)
            : (n * (uint32_t)rndm16 ()) >> 16;
   }
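
A self-contained variant of this idea might look like the sketch below.
It assumes GCC's C<__builtin_constant_p> as the detection mechanism;
C<my_is_constant> and the toy C<rndm16> generator are hypothetical
stand-ins, not the libecb implementation:

```c
#include <stdint.h>

/* Hypothetical stand-in for ecb_is_constant. */
#if defined(__GNUC__)
  #define my_is_constant(expr) __builtin_constant_p (expr)
#else
  #define my_is_constant(expr) 0 /* never claim constness elsewhere */
#endif

/* toy 16 bit generator, for illustration only */
static uint32_t rndm16_state = 12345;

static uint32_t
rndm16 (void)
{
  rndm16_state = rndm16_state * 1103515245u + 12345u;
  return (rndm16_state >> 16) & 0xffffu;
}

/* maps a 16 bit random number to the range 0..n-1, for n up to 65536 */
static inline uint32_t
rndm (uint32_t n)
{
  return my_is_constant (n) && !(n & (n - 1))
         ? rndm16 () & (n - 1)      /* constant power of two: mask */
         : (n * rndm16 ()) >> 16;   /* general case: scale */
}
```

Both branches produce a value below C<n>, so the result range is the
same whether or not the compiler can prove that C<n> is constant.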
239    
240 root 1.14 =item bool ecb_expect (expr, value)
241 root 1.1
242 root 1.7 Evaluates C<expr> and returns it. In addition, it tells the compiler that
243     the C<expr> evaluates to C<value> a lot, which can be used for static
244     branch optimisations.
245 root 1.1
246 root 1.7 Usually, you want to use the more intuitive C<ecb_likely> and
247     C<ecb_unlikely> functions instead.
248 root 1.1
249 root 1.15 =item bool ecb_likely (cond)
250 root 1.1
251 root 1.15 =item bool ecb_unlikely (cond)
252 root 1.1
These two functions expect an expression that is true or false and return
C<1> or C<0>, respectively, so when used in the condition of an C<if> or
other conditional statement, they will not change the program:
256    
257     /* these two do the same thing */
258     if (some_condition) ...;
259     if (ecb_likely (some_condition)) ...;
260    
261     However, by using C<ecb_likely>, you tell the compiler that the condition
262 sf-exg 1.11 is likely to be true (and for C<ecb_unlikely>, that it is unlikely to be
263 root 1.7 true).
264    
265 root 1.9 For example, when you check for a null pointer and expect this to be a
266     rare, exceptional, case, then use C<ecb_unlikely>:
267 root 1.7
268     void my_free (void *ptr)
269     {
270     if (ecb_unlikely (ptr == 0))
271     return;
272     }
273    
Consistent use of these functions to mark exceptional cases or to
tell the compiler what the hot path through a function is can increase
performance considerably.
277    
278     A very good example is in a function that reserves more space for some
279     memory block (for example, inside an implementation of a string stream) -
280 root 1.9 each time something is added, you have to check for a buffer overrun, but
281 root 1.7 you expect that most checks will turn out to be false:
282    
283     /* make sure we have "size" extra room in our buffer */
284     ecb_inline void
285     reserve (int size)
286     {
287     if (ecb_unlikely (current + size > end))
288     real_reserve_method (size); /* presumably noinline */
289     }
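
For readers curious how such hints can be built, the following sketch
layers likely/unlikely macros on top of GCC's C<__builtin_expect>. The
C<my_> names and the C<count_nonzero> helper are hypothetical and only
illustrate the technique:

```c
/* Hypothetical stand-ins built on GCC's __builtin_expect. */
#if defined(__GNUC__)
  #define my_expect(expr,value) __builtin_expect ((expr), (value))
#else
  #define my_expect(expr,value) (expr)
#endif

/* !! normalises any true value to 1, so the result is always 0 or 1 */
#define my_likely(cond)   my_expect (!!(cond), 1)
#define my_unlikely(cond) my_expect (!!(cond), 0)

/* counts non-zero entries, treating zero as the rare case */
static int
count_nonzero (const int *values, int n)
{
  int count = 0;

  for (int i = 0; i < n; ++i)
    {
      if (my_unlikely (values [i] == 0))
        continue;

      ++count;
    }

  return count;
}
```

The hint only affects code layout and branch prediction defaults; the
function returns the same result with or without it.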
290    
291 root 1.14 =item bool ecb_assume (cond)
292 root 1.7
293     Try to tell the compiler that some condition is true, even if it's not
294     obvious.
295    
This can be used to teach the compiler about invariants or other
conditions that might improve code generation, but which are impossible to
deduce from the code itself.
299    
300     For example, the example reservation function from the C<ecb_unlikely>
301     description could be written thus (only C<ecb_assume> was added):
302    
303     ecb_inline void
304     reserve (int size)
305     {
306     if (ecb_unlikely (current + size > end))
307     real_reserve_method (size); /* presumably noinline */
308    
309     ecb_assume (current + size <= end);
310     }
311    
312     If you then call this function twice, like this:
313    
314     reserve (10);
315     reserve (1);
316    
317     Then the compiler I<might> be able to optimise out the second call
318     completely, as it knows that C<< current + 1 > end >> is false and the
319     call will never be executed.
320    
321     =item bool ecb_unreachable ()
322    
323     This function does nothing itself, except tell the compiler that it will
324 root 1.9 never be executed. Apart from suppressing a warning in some cases, this
325 root 1.7 function can be used to implement C<ecb_assume> or similar functions.
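
One plausible way to build an assume-style macro on top of an
"unreachable" marker is sketched here. It assumes a GCC-compatible
compiler that provides C<__builtin_unreachable>; C<my_unreachable>,
C<my_assume> and C<scale_percent> are hypothetical names:

```c
/* Hypothetical stand-ins; needs a compiler with __builtin_unreachable. */
#if defined(__GNUC__)
  #define my_unreachable() __builtin_unreachable ()
#else
  #define my_unreachable() ((void)0)
#endif

/* a false cond leads to undefined behaviour, so the compiler may assume cond */
#define my_assume(cond) do { if (!(cond)) my_unreachable (); } while (0)

static unsigned int
scale_percent (unsigned int value, unsigned int percent)
{
  my_assume (percent <= 100);

  return value * percent / 100;
}
```

Note that the caller takes on the obligation: passing a C<percent> above
100 would make the behaviour undefined rather than merely slow.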
326    
327 root 1.14 =item bool ecb_prefetch (addr, rw, locality)
328 root 1.7
329     Tells the compiler to try to prefetch memory at the given C<addr>ess
330 root 1.10 for either reading (C<rw> = 0) or writing (C<rw> = 1). A C<locality> of
331 root 1.7 C<0> means that there will only be one access later, C<3> means that
332     the data will likely be accessed very often, and values in between mean
333     something... in between. The memory pointed to by the address does not
334     need to be accessible (it could be a null pointer for example), but C<rw>
335     and C<locality> must be compile-time constants.
336    
337     An obvious way to use this is to prefetch some data far away, in a big
338 root 1.9 array you loop over. This prefetches memory some 128 array elements later,
339 root 1.7 in the hope that it will be ready when the CPU arrives at that location.
340    
   int sum = 0;

   for (int i = 0; i < N; ++i)
     {
       sum += arr [i];
       ecb_prefetch (arr + i + 128, 0, 0);
     }
348    
349     It's hard to predict how far to prefetch, and most CPUs that can prefetch
350     are often good enough to predict this kind of behaviour themselves. It
351     gets more interesting with linked lists, especially when you do some fair
352     processing on each list element:
353    
354     for (node *n = start; n; n = n->next)
355     {
356     ecb_prefetch (n->next, 0, 0);
357     ... do medium amount of work with *n
358     }
359    
360     After processing the node, (part of) the next node might already be in
361     cache.
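
The linked-list case can be written as a complete function, as sketched
below. The sketch assumes GCC's C<__builtin_prefetch> as the underlying
primitive; C<my_prefetch>, C<struct node> and C<list_sum> are
hypothetical names for illustration:

```c
#include <stddef.h>

/* Hypothetical stand-in for ecb_prefetch. */
#if defined(__GNUC__)
  #define my_prefetch(addr,rw,locality) __builtin_prefetch ((addr), (rw), (locality))
#else
  #define my_prefetch(addr,rw,locality) ((void)0)
#endif

struct node
{
  int value;
  struct node *next;
};

/* sums all values, asking for the next node while working on this one */
static long
list_sum (const struct node *start)
{
  long sum = 0;

  for (const struct node *n = start; n; n = n->next)
    {
      my_prefetch (n->next, 0, 0); /* may be a null pointer at the end */
      sum += n->value;
    }

  return sum;
}
```

With so little work per node the prefetch is unlikely to pay off here;
the pattern becomes more interesting as the per-node work grows.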
362 root 1.1
363 root 1.2 =back
364 root 1.1
365     =head2 BIT FIDDLING / BITSTUFFS
366    
367 root 1.4 =over 4
368    
369 root 1.3 =item bool ecb_big_endian ()
370    
371     =item bool ecb_little_endian ()
372    
373 sf-exg 1.11 These two functions return true if the byte order is big endian
374     (most-significant byte first) or little endian (least-significant byte
375     first) respectively.
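
As an illustration of what these functions report, byte order can be
probed at run time by looking at the first byte of a known multi-byte
value. This is only a sketch with hypothetical names (C<my_first_byte>,
C<my_little_endian>, C<my_big_endian>) and says nothing about how
F<ecb.h> detects it:

```c
#include <stdint.h>
#include <string.h>

/* returns the byte stored at the lowest address of a known 32 bit value */
static unsigned char
my_first_byte (void)
{
  const uint32_t probe = 0x11223344;
  unsigned char first;

  memcpy (&first, &probe, 1);
  return first;
}

/* least-significant byte first */
static int
my_little_endian (void)
{
  return my_first_byte () == 0x44;
}

/* most-significant byte first */
static int
my_big_endian (void)
{
  return my_first_byte () == 0x11;
}
```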
376    
377 root 1.3 =item int ecb_ctz32 (uint32_t x)
378    
Returns the index of the least significant bit set in C<x> (or
equivalently the number of bits set to 0 before the least significant
bit set), starting from 0. If C<x> is 0 the result is undefined. When
C<x> is a power of two, the result is its binary logarithm, log2(x).
For example:
384    
385 root 1.15 ecb_ctz32 (3) = 0
386     ecb_ctz32 (6) = 1
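
On a compiler without a suitable built-in, the same result can be
computed with a simple loop. The following is a sketch of such a
fallback under the hypothetical name C<my_ctz32>, not the replacement
code used by F<ecb.h>:

```c
#include <stdint.h>

/* counts trailing zero bits; x must be non-zero or the loop never ends */
static int
my_ctz32 (uint32_t x)
{
  int n = 0;

  while (!(x & 1))
    {
      x >>= 1;
      ++n;
    }

  return n;
}
```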
387 sf-exg 1.11
388 root 1.3 =item int ecb_popcount32 (uint32_t x)
389    
390 sf-exg 1.11 Returns the number of bits set to 1 in C<x>. For example:
391    
392 root 1.15 ecb_popcount32 (7) = 3
393     ecb_popcount32 (255) = 8
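
A portable way to count set bits is to clear the lowest set bit
repeatedly until nothing is left. The sketch below uses the hypothetical
name C<my_popcount32> and is meant as an illustration, not as the
libecb fallback:

```c
#include <stdint.h>

/* each iteration clears the lowest set bit, so it loops once per 1 bit */
static int
my_popcount32 (uint32_t x)
{
  int n = 0;

  while (x)
    {
      x &= x - 1;
      ++n;
    }

  return n;
}
```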
394 sf-exg 1.11
395 root 1.8 =item uint32_t ecb_bswap16 (uint32_t x)
396    
397 root 1.3 =item uint32_t ecb_bswap32 (uint32_t x)
398    
399 sf-exg 1.13 These two functions return the value of the 16-bit (32-bit) variable
400     C<x> after reversing the order of bytes.
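
Byte reversal can be expressed with shifts and masks alone. The
following sketch mirrors the documented signatures under the
hypothetical names C<my_bswap16> and C<my_bswap32>; it is not taken from
F<ecb.h>:

```c
#include <stdint.h>

/* swaps the two low bytes; anything above bit 15 is discarded */
static uint32_t
my_bswap16 (uint32_t x)
{
  return ((x >> 8) & 0xff) | ((x << 8) & 0xff00);
}

/* reverses the order of all four bytes */
static uint32_t
my_bswap32 (uint32_t x)
{
  return  (x >> 24)
       | ((x >>  8) & 0x0000ff00)
       | ((x <<  8) & 0x00ff0000)
       |  (x << 24);
}
```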
401    
402 root 1.3 =item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
403    
404     =item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
405    
These two functions return the value of C<x> after rotating all the bits
by C<count> positions to the right or left respectively. Bits rotated out
at one end re-enter at the other.
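
A common portable formulation combines two shifts, as sketched below
under the hypothetical names C<my_rotl32> and C<my_rotr32>. Reducing
C<count> modulo 32 is an assumption made here to keep every shift well
defined; this manual does not state how F<ecb.h> treats larger counts:

```c
#include <stdint.h>

/* rotate left: bits leaving at the top re-enter at the bottom */
static uint32_t
my_rotl32 (uint32_t x, unsigned int count)
{
  count &= 31;
  return (x << count) | (x >> ((32 - count) & 31));
}

/* rotate right: bits leaving at the bottom re-enter at the top */
static uint32_t
my_rotr32 (uint32_t x, unsigned int count)
{
  count &= 31;
  return (x >> count) | (x << ((32 - count) & 31));
}
```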
408    
409 root 1.3 =back
410 root 1.1
411     =head2 ARITHMETIC
412    
413 root 1.3 =over 4
414    
415 root 1.14 =item x = ecb_mod (m, n)
416 root 1.3
Returns the non-negative remainder of the modulo operation between C<m> and
C<n>. Unlike the C remainder operator C<%>, whose result takes the sign of
C<m>, this function ensures that the return value is never negative.

C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
422     negatable, that is, both C<m> and C<-m> must be representable in its
423     type.
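
The difference from C<%> shows up with negative operands. The sketch
below implements the described behaviour for C<int> under the
hypothetical name C<my_mod>; it is an illustration, not the definition
from F<ecb.h>:

```c
/* n must be >= 1; the result is always in the range 0..n-1 */
static int
my_mod (int m, int n)
{
  int r = m % n; /* in C99 and later this takes the sign of m */

  return r < 0 ? r + n : r;
}
```

For instance, C<-1 % 5> is C<-1> in C99 and later, while the function
above yields C<4>.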
424 sf-exg 1.11
425 root 1.3 =back
426 root 1.1
427     =head2 UTILITY
428    
429 root 1.3 =over 4
430    
431 root 1.8 =item element_count = ecb_array_length (name) [MACRO]
432 root 1.3
433 sf-exg 1.13 Returns the number of elements in the array C<name>. For example:
434    
435     int primes[] = { 2, 3, 5, 7, 11 };
436     int sum = 0;
437    
438     for (i = 0; i < ecb_array_length (primes); i++)
439     sum += primes [i];
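
The usual way to write such a macro divides the size of the whole array
by the size of one element. The sketch below uses the hypothetical names
C<my_array_length> and C<sum_primes>:

```c
#include <stddef.h>

/* only valid for true arrays, not for pointers to their first element */
#define my_array_length(name) (sizeof (name) / sizeof (name [0]))

static int
sum_primes (void)
{
  int primes[] = { 2, 3, 5, 7, 11 };
  int sum = 0;

  for (size_t i = 0; i < my_array_length (primes); i++)
    sum += primes [i];

  return sum;
}
```

Once an array has decayed to a pointer (for example when passed as a
function argument), C<sizeof> measures the pointer and the macro
silently gives the wrong answer.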
440    
441 root 1.3 =back
442 root 1.1
443