/cvs/libecb/ecb.pod
Revision: 1.107
Committed: Fri Mar 25 15:28:08 2022 UTC (2 years, 1 month ago) by root
Branch: MAIN
CVS Tags: HEAD
Changes since 1.106: +8 -8 lines

File Contents

# Content
1 =head1 LIBECB - e-C-Builtins
2
3 =head2 ABOUT LIBECB
4
5 Libecb is currently a simple header file that doesn't require any
6 configuration to use or include in your project.
7
8 It's part of the e-suite of libraries, other members of which include
9 libev and libeio.
10
11 Its homepage can be found here:
12
13 http://software.schmorp.de/pkg/libecb
14
15 It mainly provides a number of wrappers around many compiler built-ins,
16 together with replacement functions for other compilers. In addition
17 to this, it provides a number of other low-level C utilities, such as
18 endianness detection, byte swapping or bit rotations.
19
20 Or in other words, things that should be built into any standard C
21 system, but aren't, implemented as efficiently as possible with GCC (clang,
22 MSVC...), and still correct with other compilers.
23
24 More might come.
25
26 =head2 ABOUT THE HEADER
27
28 At the moment, all you have to do is copy F<ecb.h> somewhere where your
29 compiler can find it and include it:
30
31 #include <ecb.h>
32
33 The header should work fine for both C and C++ compilation, and gives you
34 all of F<inttypes.h> in addition to the ECB symbols.
35
36 There are currently no object files to link to - future versions might
37 come with an (optional) object code library to link against, to reduce
38 code size or gain access to additional features.
39
40 It also currently includes everything from F<inttypes.h>.
41
42 =head2 ABOUT THIS MANUAL / CONVENTIONS
43
44 This manual mainly describes each (public) function available after
45 including the F<ecb.h> header. The header might define other symbols than
46 these, but these are not part of the public API, and not supported in any
47 way.
48
49 When the manual mentions a "function" then this could be defined either as
50 an inline function, a macro, or an external symbol.
51
52 When functions use a concrete standard type, such as C<int> or
53 C<uint32_t>, then the corresponding function works only with that type. If
54 only a generic name is used (C<expr>, C<cond>, C<value> and so on), then
55 the corresponding function relies on C to implement the correct types, and
56 is usually implemented as a macro. Specifically, a "bool" in this manual
57 refers to any kind of boolean value, not a specific type.
58
59 =head2 TYPES / TYPE SUPPORT
60
61 F<ecb.h> makes sure that the following types are defined (in the expected way):
62
63 int8_t uint8_t
64 int16_t uint16_t
65 int32_t uint32_t
66 int64_t uint64_t
67 int_fast8_t uint_fast8_t
68 int_fast16_t uint_fast16_t
69 int_fast32_t uint_fast32_t
70 int_fast64_t uint_fast64_t
71 intptr_t uintptr_t
72
73 The macro C<ECB_PTRSIZE> is defined to the size of a pointer on this
74 platform (currently C<4> or C<8>) and can be used in preprocessor
75 expressions.
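
As a rough illustration of a pointer-size constant that can be tested by the preprocessor, the following sketch derives one from C<UINTPTR_MAX>. The names C<MY_PTRSIZE> and C<my_word> are hypothetical, and this is not necessarily how F<ecb.h> computes C<ECB_PTRSIZE>: it simply assumes that C<uintptr_t> is exactly as wide as a pointer, which holds on common platforms.

```c
#include <assert.h>
#include <stdint.h>

/* MY_PTRSIZE is a hypothetical stand-in for ECB_PTRSIZE */
#if UINTPTR_MAX > 0xffffffffU
  #define MY_PTRSIZE 8
#else
  #define MY_PTRSIZE 4
#endif

/* unlike sizeof, the constant can select code in the preprocessor */
#if MY_PTRSIZE == 8
typedef uint64_t my_word;
#else
typedef uint32_t my_word;
#endif

/* returns 1 if the preprocessor constant agrees with sizeof (void *) */
static int
my_ptrsize_matches (void)
{
  return MY_PTRSIZE == sizeof (void *)
         && sizeof (my_word) == sizeof (void *);
}
```

The point of such a constant is the C<#if> in the middle: C<sizeof> cannot be used there.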
76
77 For C<ptrdiff_t> and C<size_t> use C<stddef.h>/C<cstddef>.
78
79 =head2 LANGUAGE/ENVIRONMENT/COMPILER VERSIONS
80
81 All the following symbols expand to an expression that can be tested in
82 preprocessor instructions as well as treated as a boolean (use C<!!> to
83 ensure it's either C<0> or C<1> if you need that).
84
85 =over
86
87 =item ECB_C
88
89 True if the implementation defines the C<__STDC__> macro to a true value,
90 while not claiming to be C++, i.e. C, but not C++.
91
92 =item ECB_C99
93
94 True if the implementation claims to be compliant to C99 (ISO/IEC
95 9899:1999) or any later version, while not claiming to be C++.
96
97 Note that later versions (ECB_C11) make some core features optional again
98 (for example, variable length arrays).
99
100 =item ECB_C11, ECB_C17
101
102 True if the implementation claims to be compliant to C11/C17 (ISO/IEC
103 9899:2011, :2018) or any later version, while not claiming to be C++.
104
105 =item ECB_CPP
106
107 True if the implementation defines the C<__cplusplus> macro to a true
108 value, which is typically true for C++ compilers.
109
110 =item ECB_CPP11, ECB_CPP14, ECB_CPP17
111
112 True if the implementation claims to be compliant to C++11/C++14/C++17
113 (ISO/IEC 14882:2011, :2014, :2017) or any later version.
114
115 Note that many C++20 features will likely have their own feature test
116 macros (see e.g. L<http://eel.is/c++draft/cpp.predefined#1.8>).
117
118 =item ECB_OPTIMIZE_SIZE
119
120 Is C<1> when the compiler optimizes for size, C<0> otherwise. This symbol
121 can also be defined before including F<ecb.h>, in which case it will be
122 unchanged.
123
124 =item ECB_GCC_VERSION (major, minor)
125
126 Expands to a true value (suitable for testing by the preprocessor) if the
127 compiler used is GNU C and the version is the given version, or higher.
128
129 This macro tries to return false on compilers that claim to be GCC
130 compatible but aren't.
131
132 =item ECB_EXTERN_C
133
134 Expands to C<extern "C"> in C++, and a simple C<extern> in C.
135
136 This can be used to declare a single external C function:
137
138 ECB_EXTERN_C int printf (const char *format, ...);
139
140 =item ECB_EXTERN_C_BEG / ECB_EXTERN_C_END
141
142 These two macros can be used to wrap multiple C<extern "C"> definitions -
143 they expand to nothing in C.
144
145 They are most useful in header files:
146
147 ECB_EXTERN_C_BEG
148
149 int mycfun1 (int x);
150 int mycfun2 (int x);
151
152 ECB_EXTERN_C_END
153
154 =item ECB_STDFP
155
156 If this evaluates to a true value (suitable for testing by the
157 preprocessor), then C<float> and C<double> use IEEE 754 single/binary32
158 and double/binary64 representations internally I<and> the endianness of
159 both types match the endianness of C<uint32_t> and C<uint64_t>.
160
161 This means you can just copy the bits of a C<float> (or C<double>) to an
162 C<uint32_t> (or C<uint64_t>) and get the raw IEEE 754 bit representation
163 without having to think about format or endianness.
164
165 This is true for basically all modern platforms, although F<ecb.h> might
166 not be able to deduce this correctly everywhere and might err on the safe
167 side.
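
A minimal sketch of the bit copy described above, assuming the platform really does use binary32 for C<float> with matching endianness (the situation in which C<ECB_STDFP> would be true). The helper name C<my_float_bits> is made up for this example and is not part of libecb.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* hypothetical helper: raw bit pattern of a float, assuming binary32 */
static uint32_t
my_float_bits (float f)
{
  uint32_t u;

  assert (sizeof u == sizeof f);
  memcpy (&u, &f, sizeof u); /* well-defined, unlike a pointer cast */

  return u;
}
```

Under that assumption, C<my_float_bits (1.0f)> yields C<0x3f800000>, i.e. sign 0, biased exponent 127 and an all-zero mantissa.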
168
169 =item ECB_64BIT_NATIVE
170
171 Evaluates to a true value (suitable for both preprocessor and C code
172 testing) if 64 bit integer types on this architecture are evaluated
173 "natively", that is, with similar speeds as 32 bit integers. While 64 bit
174 integer support is very common (and in fact required by libecb), 32 bit
175 CPUs have to emulate operations on them, so you might want to avoid them.
176
177 =item ECB_AMD64, ECB_AMD64_X32
178
179 These two macros are defined to C<1> on the x86_64/amd64 ABI and the X32
180 ABI, respectively, and undefined elsewhere.
181
182 The designers of the new X32 ABI for some inexplicable reason decided to
183 make it look exactly like amd64, even though it's completely incompatible
184 to that ABI, breaking about every piece of software that assumed that
185 C<__x86_64> stands for, well, the x86-64 ABI, making these macros
186 necessary.
187
188 =back
189
190 =head2 MACRO TRICKERY
191
192 =over
193
194 =item ECB_CONCAT (a, b)
195
196 Expands any macros in C<a> and C<b>, then concatenates the result to form
197 a single token. This is mainly useful to form identifiers from components,
198 e.g.:
199
200 #define S1 str
201 #define S2 cpy
202
203 ECB_CONCAT (S1, S2)(dst, src); // == strcpy (dst, src);
204
205 =item ECB_STRINGIFY (arg)
206
207 Expands any macros in C<arg> and returns the stringified version of
208 it. This is mainly useful to get the contents of a macro in string form,
209 e.g.:
210
211 #define SQL_LIMIT 100
212 sql_exec ("select * from table limit " ECB_STRINGIFY (SQL_LIMIT));
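
The "expands any macros first" behaviour of both macros is usually obtained with a second macro level. The following is a sketch of that general technique under hypothetical C<MY_*> names, not a copy of the F<ecb.h> definitions:

```c
#include <assert.h>
#include <string.h>

/* the inner macros paste/stringify, the outer ones force expansion first */
#define MY_CONCAT_(a, b) a ## b
#define MY_CONCAT(a, b) MY_CONCAT_ (a, b)
#define MY_STRINGIFY_(a) # a
#define MY_STRINGIFY(a) MY_STRINGIFY_ (a)

#define S1 str
#define S2 len
#define SQL_LIMIT 100

/* MY_CONCAT (S1, S2) forms the identifier strlen */
static size_t
my_concat_demo (void)
{
  return MY_CONCAT (S1, S2) ("abc");
}

/* adjacent string literals are merged by the compiler */
static const char *
my_stringify_demo (void)
{
  return "limit " MY_STRINGIFY (SQL_LIMIT);
}
```

Using the inner macros directly would instead yield the identifier C<S1S2> and the string C<"SQL_LIMIT">, because C<##> and C<#> suppress expansion of their operands.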
213
214 =item ECB_STRINGIFY_EXPR (expr)
215
216 Like C<ECB_STRINGIFY>, but additionally evaluates C<expr> to make sure it
217 is a valid expression. This is useful to catch typos or cases where the
218 macro isn't available:
219
220 #include <errno.h>
221
222 ECB_STRINGIFY (EDOM); // "33" (on my system at least)
223 ECB_STRINGIFY_EXPR (EDOM); // "33"
224
225 // now imagine we had a typo:
226
227 ECB_STRINGIFY (EDAM); // "EDAM"
228 ECB_STRINGIFY_EXPR (EDAM); // error: EDAM undefined
229
230 =back
231
232 =head2 ATTRIBUTES
233
234 A major part of libecb deals with additional attributes that can be
235 assigned to functions, variables and sometimes even types - much like
236 C<const> or C<volatile> in C. They are implemented using either GCC
237 attributes or other compiler/language specific features. Attribute
238 declarations must be put before the whole declaration:
239
240 ecb_const int mysqrt (int a);
241 ecb_unused int i;
242
243 =over
244
245 =item ecb_unused
246
247 Marks a function or a variable as "unused", which simply suppresses a
248 warning by the compiler when it detects it as unused. This is useful when
249 you e.g. declare a variable but do not always use it:
250
251 {
252 ecb_unused int var;
253
254 #ifdef SOMECONDITION
255 var = ...;
256 return var;
257 #else
258 return 0;
259 #endif
260 }
261
262 =item ecb_deprecated
263
264 Similar to C<ecb_unused>, but marks a function, variable or type as
265 deprecated. This makes some compilers warn when the marked object is used.
266
267 =item ecb_deprecated_message (message)
268
269 Same as C<ecb_deprecated>, but if possible, the specified diagnostic is
270 used instead of a generic deprecation message when the object is being
271 used.
272
273 =item ecb_inline
274
275 Expands either to (a compiler-specific equivalent of) C<static inline> or
276 to just C<static>, if inline isn't supported. It should be used to declare
277 functions that should be inlined, for code size or speed reasons.
278
279 Example: inline this function, it surely will reduce code size.
280
281 ecb_inline int
282 negmul (int a, int b)
283 {
284 return - (a * b);
285 }
286
287 =item ecb_noinline
288
289 Prevents a function from being inlined - it might be optimised away, but
290 not inlined into other functions. This is useful if you know your function
291 is rarely called and large enough for inlining not to be helpful.
292
293 =item ecb_noreturn
294
295 Marks a function as "not returning, ever". Some typical functions that
296 don't return are C<exit> or C<abort> (which really works hard to not
297 return), and now you can make your own:
298
299 ecb_noreturn void
300 my_abort (const char *errline)
301 {
302 puts (errline);
303 abort ();
304 }
305
306 In this case, the compiler would probably be smart enough to deduce it on
307 its own, so this is mainly useful for declarations.
308
309 =item ecb_restrict
310
311 Expands to the C<restrict> keyword or equivalent on compilers that support
312 them, and to nothing on others. Must be specified on a pointer type or
313 an array index to indicate that the memory doesn't alias with any other
314 restricted pointer in the same scope.
315
316 Example: multiply a vector, and allow the compiler to parallelise the
317 loop, because it knows it doesn't overwrite input values.
318
319 void
320 multiply (float *ecb_restrict src,
321 float *ecb_restrict dst,
322 int len, float factor)
323 {
324 int i;
325
326 for (i = 0; i < len; ++i)
327 dst [i] = src [i] * factor;
328 }
329
330 =item ecb_const
331
332 Declares that the function only depends on the values of its arguments,
333 much like a mathematical function. It specifically does not read or write
334 any memory any arguments might point to, global variables, or call any
335 non-const functions. It also must not have any side effects.
336
337 Such a function can be optimised much more aggressively by the compiler -
338 for example, multiple calls with the same arguments can be optimised into
339 a single call, which wouldn't be possible if the compiler would have to
340 expect any side effects.
341
342 It is best suited for functions in the sense of mathematical functions,
343 such as a function returning the square root of its input argument.
344
345 Not suited would be a function that calculates the hash of some memory
346 area you pass in, prints some messages or looks at a global variable to
347 decide on rounding.
348
349 See C<ecb_pure> for a slightly less restrictive class of functions.
350
351 =item ecb_pure
352
353 Similar to C<ecb_const>, declares a function that has no side
354 effects. Unlike C<ecb_const>, the function is allowed to examine global
355 variables and any other memory areas (such as the ones passed to it via
356 pointers).
357
358 While these functions cannot be optimised as aggressively as C<ecb_const>
359 functions, they can still be optimised away in many occasions, and the
360 compiler has more freedom in moving calls to them around.
361
362 Typical examples for such functions would be C<strlen> or C<memcmp>. A
363 function that calculates the MD5 sum of some input and updates some MD5
364 state passed as argument would I<NOT> be pure, however, as it would modify
365 some memory area that is not the return value.
366
367 =item ecb_hot
368
369 This declares a function as "hot" with regards to the cache - the function
370 is used so often, that it is very beneficial to keep it in the cache if
371 possible.
372
373 The compiler reacts by trying to place hot functions near to each other in
374 memory.
375
376 Whether a function is hot or not often depends on the whole program,
377 and less on the function itself. C<ecb_cold> is likely more useful in
378 practise.
379
380 =item ecb_cold
381
382 The opposite of C<ecb_hot> - declares a function as "cold" with regards to
383 the cache, or in other words, this function is not called often, or not at
384 speed-critical times, and keeping it in the cache might be a waste of said
385 cache.
386
387 In addition to placing cold functions together (or at least away from hot
388 functions), this knowledge can be used in other ways, for example, the
389 function will be optimised for size, as opposed to speed, and code paths
390 leading to calls to those functions can automatically be marked as if
391 C<ecb_expect_false> had been used to reach them.
392
393 Good examples for such functions would be error reporting functions, or
394 functions only called in exceptional or rare cases.
395
396 =item ecb_artificial
397
398 Declares the function as "artificial", in this case meaning that this
399 function is not really meant to be a function, but more like an accessor
400 - many methods in C++ classes are mere accessor functions, and having a
401 crash reported in such a method, or single-stepping through them, is not
402 usually so helpful, especially when it's inlined to just a few instructions.
403
404 Marking them as artificial will instruct the debugger about just this,
405 leading to happier debugging and thus happier lives.
406
407 Example: in some kind of smart-pointer class, mark the pointer accessor as
408 artificial, so that the whole class acts more like a pointer and less like
409 some C++ abstraction monster.
410
411 template<typename T>
412 struct my_smart_ptr
413 {
414 T *value;
415
416 ecb_artificial
417 operator T *()
418 {
419 return value;
420 }
421 };
422
423 =back
424
425 =head2 OPTIMISATION HINTS
426
427 =over
428
429 =item bool ecb_is_constant (expr)
430
431 Returns true iff the expression can be deduced to be a compile-time
432 constant, and false otherwise.
433
434 For example, when you have a C<rndm16> function that returns a 16 bit
435 random number, and you have a function that maps this to a range from
436 0..n-1, then you could use this inline function in a header file:
437
438 ecb_inline uint32_t
439 rndm (uint32_t n)
440 {
441 return (n * (uint32_t)rndm16 ()) >> 16;
442 }
443
444 However, for powers of two, you could use a normal mask, but that is only
445 worth it if, at compile time, you can detect this case. This is the case
446 when the passed number is a constant and also a power of two (C<(n & (n -
447 1)) == 0>):
448
449 ecb_inline uint32_t
450 rndm (uint32_t n)
451 {
452 return ecb_is_constant (n) && !(n & (n - 1))
453 ? rndm16 () & (n - 1)
454 : (n * (uint32_t)rndm16 ()) >> 16;
455 }
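
For experimenting without F<ecb.h>, the same idea can be sketched in a self-contained way. Here C<my_is_constant> is a hypothetical stand-in that assumes GCC-compatible compilers offer C<__builtin_constant_p> and simply answers 0 elsewhere, and the random value is passed in as C<r> so that the result is reproducible:

```c
#include <assert.h>
#include <stdint.h>

/* my_is_constant is a hypothetical stand-in for ecb_is_constant */
#if defined(__GNUC__)
  #define my_is_constant(expr) __builtin_constant_p (expr)
#else
  #define my_is_constant(expr) 0 /* always correct, never optimal */
#endif

/* map a 16 bit value r to the range 0..n-1, for 0 < n <= 65536 */
static uint32_t
my_scale (uint32_t n, uint32_t r)
{
  assert (n > 0 && n <= 65536 && r <= 0xffff);

  return my_is_constant (n) && !(n & (n - 1))
         ? r & (n - 1)
         : (n * r) >> 16;
}
```

Both branches stay within C<0..n-1>, but they generally map the same C<r> to different values, so the shortcut is only appropriate when any uniform mapping will do.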
456
457 =item ecb_expect (expr, value)
458
459 Evaluates C<expr> and returns it. In addition, it tells the compiler that
460 the C<expr> evaluates to C<value> a lot, which can be used for static
461 branch optimisations.
462
463 Usually, you want to use the more intuitive C<ecb_expect_true> and
464 C<ecb_expect_false> functions instead.
465
466 =item bool ecb_expect_true (cond)
467
468 =item bool ecb_expect_false (cond)
469
470 These two functions expect an expression that is true or false and return
471 C<1> or C<0>, respectively, so when used in the condition of an C<if> or
472 other conditional statement, it will not change the program:
473
474 /* these two do the same thing */
475 if (some_condition) ...;
476 if (ecb_expect_true (some_condition)) ...;
477
478 However, by using C<ecb_expect_true>, you tell the compiler that the
479 condition is likely to be true (and for C<ecb_expect_false>, that it is
480 unlikely to be true).
481
482 For example, when you check for a null pointer and expect this to be a
483 rare, exceptional, case, then use C<ecb_expect_false>:
484
485 void my_free (void *ptr)
486 {
487 if (ecb_expect_false (ptr == 0))
488 return;
489 }
490
491 Consistent use of these functions to mark away exceptional cases or to
492 tell the compiler what the hot path through a function is can increase
493 performance considerably.
494
495 You might know these functions under the name C<likely> and C<unlikely>
496 - while these are common aliases, we find that the expect name is easier
497 to understand when quickly skimming code. If you wish, you can use
498 C<ecb_likely> instead of C<ecb_expect_true> and C<ecb_unlikely> instead of
499 C<ecb_expect_false> - these are simply aliases.
500
501 A very good example is in a function that reserves more space for some
502 memory block (for example, inside an implementation of a string stream) -
503 each time something is added, you have to check for a buffer overrun, but
504 you expect that most checks will turn out to be false:
505
506 /* make sure we have "size" extra room in our buffer */
507 ecb_inline void
508 reserve (int size)
509 {
510 if (ecb_expect_false (current + size > end))
511 real_reserve_method (size); /* presumably noinline */
512 }
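
On GCC-compatible compilers, hints of this kind are commonly built on C<__builtin_expect>. The following sketch shows the pattern with hypothetical C<my_*> names. It assumes the builtin is available whenever C<__GNUC__> is defined and falls back to the plain expression otherwise:

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
  #define my_expect(expr, value) __builtin_expect ((expr), (value))
#else
  #define my_expect(expr, value) (expr)
#endif

/* !! normalises any true value to 1 before comparing with the hint */
#define my_expect_true(cond)  my_expect (!!(cond), 1)
#define my_expect_false(cond) my_expect (!!(cond), 0)

/* sum an array, treating a null pointer as the rare case */
static long
my_sum (const int *values, size_t count)
{
  long sum = 0;
  size_t i;

  if (my_expect_false (values == 0))
    return 0;

  for (i = 0; i < count; ++i)
    sum += values [i];

  return sum;
}
```

The hint never changes the result of the condition, only how the compiler lays out the two branches.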
513
514 =item ecb_assume (cond)
515
516 Tries to tell the compiler that some condition is true, even if it's not
517 obvious. This is not a function, but a statement: it cannot be used in
518 another expression.
519
520 This can be used to teach the compiler about invariants or other
521 conditions that might improve code generation, but which are impossible to
522 deduce from the code itself.
523
524 For example, the example reservation function from the C<ecb_expect_false>
525 description could be written thus (only C<ecb_assume> was added):
526
527 ecb_inline void
528 reserve (int size)
529 {
530 if (ecb_expect_false (current + size > end))
531 real_reserve_method (size); /* presumably noinline */
532
533 ecb_assume (current + size <= end);
534 }
535
536 If you then call this function twice, like this:
537
538 reserve (10);
539 reserve (1);
540
541 Then the compiler I<might> be able to optimise out the second call
542 completely, as it knows that C<< current + 1 > end >> is false and the
543 call will never be executed.
544
545 =item ecb_unreachable ()
546
547 This function does nothing itself, except tell the compiler that it will
548 never be executed. Apart from suppressing a warning in some cases, this
549 function can be used to implement C<ecb_assume> or similar functionality.
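
One possible way to build an assume statement on top of an unreachable hint is sketched below. The C<my_*> names are hypothetical, and C<__builtin_unreachable> is assumed to exist on GCC-compatible compilers. Elsewhere the sketch degrades to a no-op.

```c
#include <assert.h>

#if defined(__GNUC__)
  #define my_unreachable() __builtin_unreachable ()
#else
  #define my_unreachable() ((void)0) /* no hint available */
#endif

/* a statement, not an expression */
#define my_assume(cond) do { if (!(cond)) my_unreachable (); } while (0)

/* divide by a divisor the caller guarantees to be positive */
static int
my_div_positive (int value, int divisor)
{
  my_assume (divisor > 0); /* undefined behaviour if the caller lies */

  return value / divisor;
}
```

Note that such a hint is a promise, not a check: if the condition is ever false at runtime, the behaviour is undefined.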
550
551 =item ecb_prefetch (addr, rw, locality)
552
553 Tells the compiler to try to prefetch memory at the given I<addr>ess
554 for either reading (I<rw> = 0) or writing (I<rw> = 1). A I<locality> of
555 C<0> means that there will only be one access later, C<3> means that
556 the data will likely be accessed very often, and values in between mean
557 something... in between. The memory pointed to by the address does not
558 need to be accessible (it could be a null pointer for example), but C<rw>
559 and C<locality> must be compile-time constants.
560
561 This is a statement, not a function: you cannot use it as part of an
562 expression.
563
564 An obvious way to use this is to prefetch some data far away, in a big
565 array you loop over. This prefetches memory some 128 array elements later,
566 in the hope that it will be ready when the CPU arrives at that location.
567
568 int sum = 0;
569
570 for (i = 0; i < N; ++i)
571 {
572 sum += arr [i];
573 ecb_prefetch (arr + i + 128, 0, 0);
574 }
575
576 It's hard to predict how far to prefetch, and most CPUs that can prefetch
577 are often good enough to predict this kind of behaviour themselves. It
578 gets more interesting with linked lists, especially when you do some fair
579 processing on each list element:
580
581 for (node *n = start; n; n = n->next)
582 {
583 ecb_prefetch (n->next, 0, 0);
584 ... do medium amount of work with *n
585 }
586
587 After processing the node, (part of) the next node might already be in
588 cache.
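
A compilable variant of the linked-list loop, as a sketch: C<my_prefetch> is a hypothetical wrapper that assumes C<__builtin_prefetch> on GCC-compatible compilers and does nothing elsewhere, and C<struct my_node> is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
  #define my_prefetch(addr, rw, locality) \
    __builtin_prefetch ((addr), (rw), (locality))
#else
  #define my_prefetch(addr, rw, locality) ((void)0)
#endif

struct my_node
{
  struct my_node *next;
  int value;
};

static long
my_list_sum (const struct my_node *start)
{
  const struct my_node *n;
  long sum = 0;

  for (n = start; n; n = n->next)
    {
      my_prefetch (n->next, 0, 0); /* may be a null pointer, which is fine */
      sum += n->value;
    }

  return sum;
}

/* builds a three element list on the stack and sums it */
static long
my_list_demo (void)
{
  struct my_node c = { 0, 3 };
  struct my_node b = { &c, 2 };
  struct my_node a = { &b, 1 };

  return my_list_sum (&a);
}
```

With this little work per node the prefetch is unlikely to help. It only illustrates where the call goes.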
589
590 =back
591
592 =head2 BIT FIDDLING / BIT WIZARDRY
593
594 =over
595
596 =item bool ecb_big_endian ()
597
598 =item bool ecb_little_endian ()
599
600 These two functions return true if the byte order is big endian
601 (most-significant byte first) or little endian (least-significant byte
602 first) respectively.
603
604 On systems that are neither, their return values are unspecified.
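
To make "byte order" concrete, this sketch inspects the first byte in memory of a known 32 bit value at runtime. The C<my_*> functions are hypothetical and are not how libecb decides the question, which it may well do at compile time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* first byte in memory of the 32 bit value 0x11223344 */
static unsigned char
my_first_byte (void)
{
  uint32_t v = 0x11223344;
  unsigned char b [sizeof v];

  memcpy (b, &v, sizeof v);

  return b [0];
}

/* most-significant byte first */
static int
my_big_endian (void)
{
  return my_first_byte () == 0x11;
}

/* least-significant byte first */
static int
my_little_endian (void)
{
  return my_first_byte () == 0x44;
}
```

On a mixed-endian system both functions of this sketch would return 0.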
605
606 =item int ecb_ctz32 (uint32_t x)
607
608 =item int ecb_ctz64 (uint64_t x)
609
610 =item int ecb_ctz (T x) [C++]
611
612 Returns the index of the least significant bit set in C<x> (or
613 equivalently the number of bits set to 0 before the least significant bit
614 set), starting from 0. If C<x> is 0 the result is undefined.
615
616 For smaller types than C<uint32_t> you can safely use C<ecb_ctz32>.
617
618 The overloaded C++ C<ecb_ctz> function supports C<uint8_t>, C<uint16_t>,
619 C<uint32_t> and C<uint64_t> types.
620
621 For example:
622
623 ecb_ctz32 (3) = 0
624 ecb_ctz32 (6) = 1
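
A portable reference loop, as a sketch of what "count trailing zeroes" computes. The name C<my_ctz32> is hypothetical, and the loop is far slower than a builtin or a dedicated CPU instruction.

```c
#include <assert.h>
#include <stdint.h>

/* index of the least significant set bit, by shifting right */
static int
my_ctz32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result for 0 is undefined, so reject it here */

  while (!(x & 1))
    {
      x >>= 1;
      ++r;
    }

  return r;
}
```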
625
626 =item int ecb_clz32 (uint32_t x)
627
628 =item int ecb_clz64 (uint64_t x)
629
630 Counts the number of leading zero bits in C<x>. If C<x> is 0 the result is
631 undefined.
632
633 It is often simpler to use one of the C<ecb_ld*> functions instead, whose
634 result only depends on the value and not the size of the type. This is
635 also the reason why there is no C++ overload.
636
637 For example:
638
639 ecb_clz32 (3) = 30
640 ecb_clz32 (6) = 29
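
The same kind of reference loop for leading zeroes, again as a sketch under a hypothetical name. It also shows why the result depends on the width of the type: the loop counts from bit 31 downwards.

```c
#include <assert.h>
#include <stdint.h>

/* number of zero bits above the most significant set bit */
static int
my_clz32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result for 0 is undefined, so reject it here */

  while (!(x & 0x80000000U))
    {
      x <<= 1;
      ++r;
    }

  return r;
}
```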
641
642 =item bool ecb_is_pot32 (uint32_t x)
643
644 =item bool ecb_is_pot64 (uint64_t x)
645
646 =item bool ecb_is_pot (T x) [C++]
647
648 Returns true iff C<x> is a power of two or C<x == 0>.
649
650 For smaller types than C<uint32_t> you can safely use C<ecb_is_pot32>.
651
652 The overloaded C++ C<ecb_is_pot> function supports C<uint8_t>, C<uint16_t>,
653 C<uint32_t> and C<uint64_t> types.
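
The usual bit trick behind such a test, sketched under the hypothetical name C<my_is_pot32>: clearing the lowest set bit with C<x & (x - 1)> leaves zero exactly when at most one bit was set, which is also why C<0> counts as true.

```c
#include <assert.h>
#include <stdint.h>

/* true iff x has at most one bit set */
static int
my_is_pot32 (uint32_t x)
{
  return !(x & (x - 1));
}
```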
654
655 =item int ecb_ld32 (uint32_t x)
656
657 =item int ecb_ld64 (uint64_t x)
658
659 =item int ecb_ld (T x) [C++]
660
661 Returns the index of the most significant bit set in C<x>, i.e. one less
662 than the number of digits the number requires in binary (so that C<< 2**ld <= x <
663 2**(ld+1) >>). If C<x> is 0 the result is undefined. A common use case is
664 to compute the integer binary logarithm, i.e. C<floor (log2 (n))>, for
665 example to see how many bits a certain number requires to be encoded.
666
667 This function is similar to the "count leading zero bits" function, except
668 that that one returns how many zero bits are "in front" of the number (in
669 the given data type), while C<ecb_ld> depends only on how many bits the
670 number itself requires.
671
672 For smaller types than C<uint32_t> you can safely use C<ecb_ld32>.
673
674 The overloaded C++ C<ecb_ld> function supports C<uint8_t>, C<uint16_t>,
675 C<uint32_t> and C<uint64_t> types.
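
A reference loop for the integer binary logarithm, as a sketch under the hypothetical name C<my_ld32>. For 32 bit values the result equals 31 minus the number of leading zero bits.

```c
#include <assert.h>
#include <stdint.h>

/* index of the most significant set bit, i.e. floor (log2 (x)) */
static int
my_ld32 (uint32_t x)
{
  int r = 0;

  assert (x != 0); /* the result for 0 is undefined, so reject it here */

  while (x >>= 1)
    ++r;

  return r;
}
```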
676
677 =item int ecb_popcount32 (uint32_t x)
678
679 =item int ecb_popcount64 (uint64_t x)
680
681 =item int ecb_popcount (T x) [C++]
682
683 Returns the number of bits set to 1 in C<x>.
684
685 For smaller types than C<uint32_t> you can safely use C<ecb_popcount32>.
686
687 The overloaded C++ C<ecb_popcount> function supports C<uint8_t>, C<uint16_t>,
688 C<uint32_t> and C<uint64_t> types.
689
690 For example:
691
692 ecb_popcount32 (7) = 3
693 ecb_popcount32 (255) = 8
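
A simple portable way to count set bits, sketched under the hypothetical name C<my_popcount32>: each iteration clears the lowest set bit, so the loop runs once per set bit. This is not claimed to be the libecb implementation.

```c
#include <assert.h>
#include <stdint.h>

/* number of bits set to 1 */
static int
my_popcount32 (uint32_t x)
{
  int r = 0;

  while (x)
    {
      x &= x - 1; /* clears the lowest set bit */
      ++r;
    }

  return r;
}
```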
694
695 =item uint8_t ecb_bitrev8 (uint8_t x)
696
697 =item uint16_t ecb_bitrev16 (uint16_t x)
698
699 =item uint32_t ecb_bitrev32 (uint32_t x)
700
701 =item T ecb_bitrev (T x) [C++]
702
703 Reverses the bits in x, i.e. the MSB becomes the LSB, MSB-1 becomes LSB+1
704 and so on.
705
706 The overloaded C++ C<ecb_bitrev> function supports C<uint8_t>, C<uint16_t> and C<uint32_t> types.
707
708 Example:
709
710 ecb_bitrev8 (0xa7) = 0xe5
711 ecb_bitrev32 (0xffcc4411) = 0x882233ff
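
A bit-at-a-time reference for checking such values by hand, as a sketch with hypothetical C<my_*> names. Real implementations typically swap groups of bits in a few steps instead of looping.

```c
#include <assert.h>
#include <stdint.h>

/* move bits out of x at the bottom and into r from the bottom */
static uint8_t
my_bitrev8 (uint8_t x)
{
  uint8_t r = 0;
  int i;

  for (i = 0; i < 8; ++i)
    {
      r = (uint8_t)((r << 1) | (x & 1));
      x >>= 1;
    }

  return r;
}

static uint32_t
my_bitrev32 (uint32_t x)
{
  uint32_t r = 0;
  int i;

  for (i = 0; i < 32; ++i)
    {
      r = (r << 1) | (x & 1);
      x >>= 1;
    }

  return r;
}
```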
712
713 =item T ecb_bitrev (T x) [C++]
714
715 Overloaded C++ bitrev function.
716
717 C<T> must be one of C<uint8_t>, C<uint16_t> or C<uint32_t>.
718
719 =item uint32_t ecb_bswap16 (uint32_t x)
720
721 =item uint32_t ecb_bswap32 (uint32_t x)
722
723 =item uint64_t ecb_bswap64 (uint64_t x)
724
725 =item T ecb_bswap (T x) [C++]
726
727 These functions return the value of the 16-bit (32-bit, 64-bit) value
728 C<x> after reversing the order of bytes (0x11223344 becomes 0x44332211 in
729 C<ecb_bswap32>).
730
731 The overloaded C++ C<ecb_bswap> function supports C<uint8_t>, C<uint16_t>,
732 C<uint32_t> and C<uint64_t> types.
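
The byte reversal written out with shifts and masks, as a portable sketch under hypothetical C<my_*> names:

```c
#include <assert.h>
#include <stdint.h>

static uint16_t
my_bswap16 (uint16_t x)
{
  return (uint16_t)((x >> 8) | (x << 8));
}

/* each of the four bytes moves to the mirrored position */
static uint32_t
my_bswap32 (uint32_t x)
{
  return  (x >> 24)
       | ((x >>  8) & 0x0000ff00U)
       | ((x <<  8) & 0x00ff0000U)
       |  (x << 24);
}
```

A typical use is converting between a fixed wire byte order and the host byte order, where the swap is applied only if the two differ.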
733
734 =item uint8_t ecb_rotl8 (uint8_t x, unsigned int count)
735
736 =item uint16_t ecb_rotl16 (uint16_t x, unsigned int count)
737
738 =item uint32_t ecb_rotl32 (uint32_t x, unsigned int count)
739
740 =item uint64_t ecb_rotl64 (uint64_t x, unsigned int count)
741
742 =item uint8_t ecb_rotr8 (uint8_t x, unsigned int count)
743
744 =item uint16_t ecb_rotr16 (uint16_t x, unsigned int count)
745
746 =item uint32_t ecb_rotr32 (uint32_t x, unsigned int count)
747
748 =item uint64_t ecb_rotr64 (uint64_t x, unsigned int count)
749
750 These two families of functions return the value of C<x> after rotating
751 all the bits by C<count> positions to the right (C<ecb_rotr>) or left
752 (C<ecb_rotl>). There are no restrictions on the value C<count>, i.e. both
753 zero and values equal or larger than the word width work correctly. Also,
754 notwithstanding C<count> being unsigned, negative numbers work and shift
755 to the opposite direction.
756
757 Current GCC/clang versions understand these functions and usually compile
758 them to "optimal" code (e.g. a single C<rol> or a combination of C<shld>
759 on x86).
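
A sketch of how a rotation can be written so that any C<count> works, under hypothetical C<my_*> names. Masking both shift amounts with 31 avoids the undefined behaviour of shifting a 32 bit value by 32 or more, and C<0u - count> is the negated count modulo the width of C<unsigned int>.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t
my_rotl32 (uint32_t x, unsigned int count)
{
  return (x << (count & 31)) | (x >> ((0u - count) & 31));
}

static uint32_t
my_rotr32 (uint32_t x, unsigned int count)
{
  return (x >> (count & 31)) | (x << ((0u - count) & 31));
}
```

With this formulation a count of C<(unsigned int)-4> in C<my_rotl32> behaves like a right rotation by 4, matching the behaviour described above.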
760
761 =item T ecb_rotl (T x, unsigned int count) [C++]
762
763 =item T ecb_rotr (T x, unsigned int count) [C++]
764
765 Overloaded C++ rotl/rotr functions.
766
767 C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
768
769 =item uint_fast8_t ecb_gray_encode8 (uint_fast8_t b)
770
771 =item uint_fast16_t ecb_gray_encode16 (uint_fast16_t b)
772
773 =item uint_fast32_t ecb_gray_encode32 (uint_fast32_t b)
774
775 =item uint_fast64_t ecb_gray_encode64 (uint_fast64_t b)
776
777 Encode an unsigned into its corresponding (reflective) gray code - the
778 kind of gray code meant when just talking about "gray code". These
779 functions are very fast and all have identical implementation, so there is
780 no need to use a smaller type, as long as your CPU can handle it natively.
781
782 =item T ecb_gray_encode (T b) [C++]
783
784 Overloaded C++ version of the above, for C<uint{8,16,32,64}_t>.
785
786 =item uint_fast8_t ecb_gray_decode8 (uint_fast8_t b)
787
788 =item uint_fast16_t ecb_gray_decode16 (uint_fast16_t b)
789
790 =item uint_fast32_t ecb_gray_decode32 (uint_fast32_t b)
791
792 =item uint_fast64_t ecb_gray_decode64 (uint_fast64_t b)
793
794 Decode a gray code back into linear index form (the reverse of
795 C<ecb_gray_encode*>). Unlike the encode functions, the decode functions
796 have higher time complexity for larger types, so it can pay off to use a
797 smaller type here.
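
The textbook formulas for the reflected gray code, as a sketch under hypothetical C<my_*> names: encoding is a single xor with the value shifted right by one, while decoding is a prefix xor over all higher bits, done here in five doubling steps for 32 bits. That difference is consistent with the remark about decoding being costlier for larger types.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t
my_gray_encode32 (uint32_t b)
{
  return b ^ (b >> 1);
}

/* each step doubles the number of higher bits folded into each bit */
static uint32_t
my_gray_decode32 (uint32_t g)
{
  g ^= g >>  1;
  g ^= g >>  2;
  g ^= g >>  4;
  g ^= g >>  8;
  g ^= g >> 16;

  return g;
}

/* checks round trips and the one-bit-change property for 0..limit-1 */
static int
my_gray_check (uint32_t limit)
{
  uint32_t i;

  for (i = 0; i < limit; ++i)
    {
      uint32_t d = my_gray_encode32 (i) ^ my_gray_encode32 (i + 1);

      if (my_gray_decode32 (my_gray_encode32 (i)) != i)
        return 0;

      if (d == 0 || (d & (d - 1)))
        return 0; /* neighbouring codes must differ in exactly one bit */
    }

  return 1;
}
```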
798
799 =item T ecb_gray_decode (T b) [C++]
800
801 Overloaded C++ version of the above, for C<uint{8,16,32,64}_t>.
802
803 =back
804
805 =head2 HILBERT CURVES
806
807 These functions deal with (square, pseudo) Hilbert curves. The parameter
808 I<order> indicates the size of the square and is specified in bits, that
809 means for order C<8>, the coordinates range from C<0>..C<255>, and the
810 curve index ranges from C<0>..C<65535>.
811
812 The 32 bit variants of these functions map a 32 bit index to two 16 bit
813 coordinates, stored in a 32 bit variable, where the high order bits are
814 the x-coordinate, and the low order bits are the y-coordinate, thus,
815 these functions map a 32 bit linear index on the curve to a 32 bit packed
816 coordinate pair, and vice versa.
817
818 The 64 bit variants work similarly.
819
820 The I<order> can go from C<1> to C<16> for the 32 bit curve, and C<1> to
821 C<32> for the 64 bit curve.
822
823 When going from one order to the next higher order, these functions
824 replace the curve segments by smaller versions of the generating shape,
825 while doubling the size (since they use integer coordinates), which is
826 what you would expect mathematically. This means that the curve will be
827 mirrored at the diagonal. If your goal is to simply cover more area while
828 retaining existing point coordinates you should increase or decrease the
829 I<order> by C<2> or, in the case of C<ecb_hilbert2d_index_to_coord>,
830 simply specify the maximum I<order> of C<16> or C<32>, respectively, as
831 these are constant-time.
832
833 =over
834
835 =item uint32_t ecb_hilbert2d_index_to_coord32 (int order, uint32_t index)
836
837 =item uint64_t ecb_hilbert2d_index_to_coord64 (int order, uint64_t index)
838
839 Map a point on a pseudo Hilbert curve from its linear distance from the
840 origin on the curve to an x|y coordinate pair. The result is a packed
841 coordinate pair; to get the actual x and y coordinates, you could do
842 something like this:
843
844 uint32_t xy = ecb_hilbert2d_index_to_coord32 (16, 255);
845 uint16_t x = xy >> 16;
846 uint16_t y = xy & 0xffffU;
847
848 uint64_t xy = ecb_hilbert2d_index_to_coord64 (32, 255);
849 uint32_t x = xy >> 32;
850 uint32_t y = xy & 0xffffffffU;
851
852 These functions work in constant time, so for many applications it is
853 preferable to simply hard-code the order to the maximum (C<16> or C<32>).
854
855 This (production-ready, i.e. never run) example generates an SVG image of
856 an order 6 pseudo Hilbert curve:
857
858 printf ("<svg xmlns='http://www.w3.org/2000/svg' width='%d' height='%d'>\n", 64 * 8, 64 * 8);
859 printf ("<g transform='translate(4) scale(8)' stroke-width='0.25' stroke='black'>\n");
860 for (uint32_t i = 0; i < 64*64 - 1; ++i)
861 {
862 uint32_t p1 = ecb_hilbert2d_index_to_coord32 (6, i );
863 uint32_t p2 = ecb_hilbert2d_index_to_coord32 (6, i + 1);
864 printf ("<line x1='%d' y1='%d' x2='%d' y2='%d'/>\n",
865 p1 >> 16, p1 & 0xffff,
866 p2 >> 16, p2 & 0xffff);
867 }
868 printf ("</g>\n");
869 printf ("</svg>\n");
870
871 =item uint32_t ecb_hilbert2d_coord_to_index32 (int order, uint32_t xy)
872
873 =item uint64_t ecb_hilbert2d_coord_to_index64 (int order, uint64_t xy)
874
875 The reverse of C<ecb_hilbert2d_index_to_coord> - map a packed pair of
876 coordinates to their linear index on the pseudo Hilbert curve of order
877 I<order>.
878
879 They are an exact inverse of the C<ecb_hilbert2d_index_to_coord> functions
880 for the same I<order>:
881
882 assert (
883 u == ecb_hilbert2d_coord_to_index32 (16,
884 ecb_hilbert2d_index_to_coord32 (16,
885 u)));
886
887 Packing coordinates is done the same way, as well, from I<x> and I<y>:
888
889 uint32_t xy = ((uint32_t)x << 16) | y; // for ecb_hilbert2d_coord_to_index32
890 uint64_t xy = ((uint64_t)x << 32) | y; // for ecb_hilbert2d_coord_to_index64
891
892 Unlike C<ecb_hilbert2d_index_to_coord>, these functions are O(I<order>),
893 so it is preferable to use the lowest possible order.
894
895 =back
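The packing convention described above (I<x> in the upper half, I<y> in
the lower half) can be wrapped in small helpers. The following is a
minimal sketch for the 32 bit variants; the names C<pack_xy32>,
C<unpack_x32> and C<unpack_y32> are hypothetical and not part of F<ecb.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of ecb.h: pack two 16 bit coordinates
   into the layout used by the 32 bit Hilbert functions. */
uint32_t pack_xy32 (uint16_t x, uint16_t y)
{
  return ((uint32_t)x << 16) | y;
}

/* Extract the upper half (x) and the lower half (y) again. */
uint16_t unpack_x32 (uint32_t xy) { return (uint16_t)(xy >> 16); }
uint16_t unpack_y32 (uint32_t xy) { return (uint16_t)(xy & 0xffffU); }

/* Round trip check: unpacking returns what was packed. */
void pack_xy32_selftest (void)
{
  uint32_t xy = pack_xy32 (0x1234, 0xabcd);

  assert (xy == 0x1234abcdU);
  assert (unpack_x32 (xy) == 0x1234);
  assert (unpack_y32 (xy) == 0xabcd);
}
```

The 64 bit variants would follow the same pattern with a shift of 32 and
C<uint32_t> coordinates.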
896
897 =head2 BIT MIXING, HASHING
898
899 Sometimes you have an integer and want to distribute its bits well, for
900 example, to use it as a hash in a hash table. A common example is pointer
901 values, which often only have a limited range (e.g. low and high bits are
902 often zero).
903
904 The following functions try to mix the bits to get a good bias-free
905 distribution. They were mainly made for pointers, but the underlying
906 integer functions are exposed as well.
907
908 As an added benefit, the functions are reversible, so if you find it
909 convenient to store only the hash value, you can recover the original
910 pointer from the hash ("unmix"), as long as your pointers are 32 or 64 bit
911 (if this isn't the case on your platform, drop us a note and we will add
912 functions for other bit widths).
913
914 The unmix functions are very slightly slower than the mix functions, so
915 it is very slightly preferable to store the original values when
916 convenient.
917
918 The underlying algorithm is subject to change, so currently these
919 functions are not suitable for persistent hash tables, as their result
920 value can change between different versions of libecb.
921
922 =over
923
924 =item uintptr_t ecb_ptrmix (void *ptr)
925
926 Mixes the bits of a pointer so the result is suitable for hash table
927 lookups. In other words, this hashes the pointer value.
928
929 =item uintptr_t ecb_ptrmix (T *ptr) [C++]
930
931 Overload the C<ecb_ptrmix> function to work for any pointer in C++.
932
933 =item void *ecb_ptrunmix (uintptr_t v)
934
935 Unmix the hash value into the original pointer. This only works as long
936 as the hash value is not truncated, i.e. you used C<uintptr_t> (or
937 equivalent) throughout to store it.
938
939 =item T *ecb_ptrunmix<T> (uintptr_t v) [C++]
940
941 The somewhat less useful template version of C<ecb_ptrunmix> for
942 C++. Example:
943
944 sometype *myptr;
945 uintptr_t hash = ecb_ptrmix (myptr);
946 sometype *orig = ecb_ptrunmix<sometype> (hash);
947
948 =item uint32_t ecb_mix32 (uint32_t v)
949
950 =item uint64_t ecb_mix64 (uint64_t v)
951
952 Sometimes you don't have a pointer but an integer whose values are very
953 badly distributed. In this case you can use these integer versions of the
954 mixing function. No C++ template is provided currently.
955
956 =item uint32_t ecb_unmix32 (uint32_t v)
957
958 =item uint64_t ecb_unmix64 (uint64_t v)
959
960 The reverse of the C<ecb_mix> functions - they take a mixed/hashed value
961 and recover the original value.
962
963 =back
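To illustrate why such a mixing function can be undone, here is a minimal
sketch of a reversible 32 bit mixer. It is I<not> the algorithm used by
libecb (which is subject to change); the function names and the multiplier
are assumptions chosen for illustration. Each step (xor with a right shift
by half the width, multiplication by an odd constant) is a bijection on
32 bit values, so applying the inverse steps in reverse order recovers the
input.

```c
#include <assert.h>
#include <stdint.h>

/* Arbitrary odd multiplier chosen for this sketch (an assumption, not
   taken from ecb.h); any odd constant is invertible modulo 2**32. */
#define SKETCH_MUL 0x45d9f3bU

/* Multiplicative inverse of an odd number modulo 2**32, via Newton
   iteration: each round doubles the number of correct low bits. */
uint32_t sketch_inverse32 (uint32_t a)
{
  uint32_t inv = a; /* correct to 3 bits, since a * a == 1 (mod 8) */

  for (int i = 0; i < 5; ++i)
    inv *= 2U - a * inv;

  return inv;
}

uint32_t sketch_mix32 (uint32_t v)
{
  v ^= v >> 16;
  v *= SKETCH_MUL;
  v ^= v >> 16;
  return v;
}

uint32_t sketch_unmix32 (uint32_t v)
{
  v ^= v >> 16;                       /* xor-shift by 16 is its own inverse */
  v *= sketch_inverse32 (SKETCH_MUL); /* undo the multiplication */
  v ^= v >> 16;
  return v;
}
```

A real mixer would typically use more rounds for better distribution; the
point here is only that every step has an exact inverse.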
964
965 =head2 HOST ENDIANNESS CONVERSION
966
967 =over
968
969 =item uint_fast16_t ecb_be_u16_to_host (uint_fast16_t v)
970
971 =item uint_fast32_t ecb_be_u32_to_host (uint_fast32_t v)
972
973 =item uint_fast64_t ecb_be_u64_to_host (uint_fast64_t v)
974
975 =item uint_fast16_t ecb_le_u16_to_host (uint_fast16_t v)
976
977 =item uint_fast32_t ecb_le_u32_to_host (uint_fast32_t v)
978
979 =item uint_fast64_t ecb_le_u64_to_host (uint_fast64_t v)
980
981 Convert an unsigned 16, 32 or 64 bit value from big or little endian to host byte order.
982
983 The naming convention is C<ecb_>(C<be>|C<le>)C<_u>C<16|32|64>C<_to_host>,
984 where C<be> and C<le> stand for big endian and little endian, respectively.
985
986 =item uint_fast16_t ecb_host_to_be_u16 (uint_fast16_t v)
987
988 =item uint_fast32_t ecb_host_to_be_u32 (uint_fast32_t v)
989
990 =item uint_fast64_t ecb_host_to_be_u64 (uint_fast64_t v)
991
992 =item uint_fast16_t ecb_host_to_le_u16 (uint_fast16_t v)
993
994 =item uint_fast32_t ecb_host_to_le_u32 (uint_fast32_t v)
995
996 =item uint_fast64_t ecb_host_to_le_u64 (uint_fast64_t v)
997
998 Like above, but converts I<from> host byte order to the specified
999 endianness.
1000
1001 =back
1002
1003 In C++ the following additional template functions are supported:
1004
1005 =over
1006
1007 =item T ecb_be_to_host (T v)
1008
1009 =item T ecb_le_to_host (T v)
1010
1011 =item T ecb_host_to_be (T v)
1012
1013 =item T ecb_host_to_le (T v)
1014
1015 =back
1016
1017 These functions work like their C counterparts, above, but use templates,
1018 which make them useful in generic code.
1019
1020 C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>
1021 (so unlike their C counterparts, there is a version for C<uint8_t>, which
1022 again can be useful in generic code).
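As an illustration of what these conversions do, the following sketch
converts a big endian 32 bit value to host byte order without F<ecb.h>. It
assumes the host is either little or big endian (no mixed-endian layouts);
all names are hypothetical, and this is not necessarily how F<ecb.h>
implements the conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the least significant byte is stored first. */
int sketch_host_is_little_endian (void)
{
  const uint16_t probe = 1;
  unsigned char first;

  memcpy (&first, &probe, 1);
  return first == 1;
}

/* Reverse the byte order of a 32 bit value. */
uint32_t sketch_bswap32 (uint32_t v)
{
  return  (v >> 24)
       | ((v >>  8) & 0x0000ff00U)
       | ((v <<  8) & 0x00ff0000U)
       |  (v << 24);
}

/* Big endian to host: swap on little endian hosts, no-op otherwise. */
uint32_t sketch_be_u32_to_host (uint32_t v)
{
  return sketch_host_is_little_endian () ? sketch_bswap32 (v) : v;
}
```

Since swapping is its own inverse, the same function also serves as a
sketch for the host to big endian direction.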
1023
1024 =head2 UNALIGNED LOAD/STORE
1025
1026 These functions load or store unaligned multi-byte values.
1027
1028 =over
1029
1030 =item uint_fast16_t ecb_peek_u16_u (const void *ptr)
1031
1032 =item uint_fast32_t ecb_peek_u32_u (const void *ptr)
1033
1034 =item uint_fast64_t ecb_peek_u64_u (const void *ptr)
1035
1036 These functions load an unaligned, unsigned 16, 32 or 64 bit value from
1037 memory.
1038
1039 =item uint_fast16_t ecb_peek_be_u16_u (const void *ptr)
1040
1041 =item uint_fast32_t ecb_peek_be_u32_u (const void *ptr)
1042
1043 =item uint_fast64_t ecb_peek_be_u64_u (const void *ptr)
1044
1045 =item uint_fast16_t ecb_peek_le_u16_u (const void *ptr)
1046
1047 =item uint_fast32_t ecb_peek_le_u32_u (const void *ptr)
1048
1049 =item uint_fast64_t ecb_peek_le_u64_u (const void *ptr)
1050
1051 Like above, but additionally convert from big endian (C<be>) or little
1052 endian (C<le>) byte order to host byte order while doing so.
1053
1054 =item ecb_poke_u16_u (void *ptr, uint16_t v)
1055
1056 =item ecb_poke_u32_u (void *ptr, uint32_t v)
1057
1058 =item ecb_poke_u64_u (void *ptr, uint64_t v)
1059
1060 These functions store an unaligned, unsigned 16, 32 or 64 bit value to
1061 memory.
1062
1063 =item ecb_poke_be_u16_u (void *ptr, uint_fast16_t v)
1064
1065 =item ecb_poke_be_u32_u (void *ptr, uint_fast32_t v)
1066
1067 =item ecb_poke_be_u64_u (void *ptr, uint_fast64_t v)
1068
1069 =item ecb_poke_le_u16_u (void *ptr, uint_fast16_t v)
1070
1071 =item ecb_poke_le_u32_u (void *ptr, uint_fast32_t v)
1072
1073 =item ecb_poke_le_u64_u (void *ptr, uint_fast64_t v)
1074
1075 Like above, but additionally convert from host byte order to big endian
1076 (C<be>) or little endian (C<le>) byte order while doing so.
1077
1078 =back
1079
1080 In C++ the following additional template functions are supported:
1081
1082 =over
1083
1084 =item T ecb_peek<T> (const void *ptr)
1085
1086 =item T ecb_peek_be<T> (const void *ptr)
1087
1088 =item T ecb_peek_le<T> (const void *ptr)
1089
1090 =item T ecb_peek_u<T> (const void *ptr)
1091
1092 =item T ecb_peek_be_u<T> (const void *ptr)
1093
1094 =item T ecb_peek_le_u<T> (const void *ptr)
1095
1096 Similarly to their C counterparts, these functions load an unsigned 8, 16,
1097 32 or 64 bit value from memory, with optional conversion from big/little
1098 endian.
1099
1100 Since the type cannot be deduced, it has to be specified explicitly, e.g.
1101
1102 uint_fast16_t v = ecb_peek<uint16_t> (ptr);
1103
1104 C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
1105
1106 Unlike their C counterparts, these functions support 8 bit quantities
1107 (C<uint8_t>) and also have an aligned version (without the C<_u> suffix),
1108 all of which hopefully makes them more useful in generic code.
1109
1110 =item ecb_poke (void *ptr, T v)
1111
1112 =item ecb_poke_be (void *ptr, T v)
1113
1114 =item ecb_poke_le (void *ptr, T v)
1115
1116 =item ecb_poke_u (void *ptr, T v)
1117
1118 =item ecb_poke_be_u (void *ptr, T v)
1119
1120 =item ecb_poke_le_u (void *ptr, T v)
1121
1122 Again, similarly to their C counterparts, these functions store an
1123 unsigned 8, 16, 32 or 64 bit value to memory, with optional conversion to
1124 big/little endian.
1125
1126 C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
1127
1128 Unlike their C counterparts, these functions support 8 bit quantities
1129 (C<uint8_t>) and also have an aligned version (without the C<_u> suffix),
1130 all of which hopefully makes them more useful in generic code.
1131
1132 =back
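A common portable way to write such accessors is C<memcpy> for host order
access and byte-wise assembly for a fixed byte order. The sketch below
shows both; the function names are hypothetical, and this is not
necessarily how F<ecb.h> implements them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32 bit value in host byte order from a possibly unaligned
   address. memcpy avoids the undefined behaviour of dereferencing a
   misaligned pointer; compilers usually turn it into a plain load. */
uint32_t sketch_peek_u32_u (const void *ptr)
{
  uint32_t v;

  memcpy (&v, ptr, sizeof v);
  return v;
}

/* Load a big endian 32 bit value, independent of host byte order. */
uint32_t sketch_peek_be_u32_u (const void *ptr)
{
  const unsigned char *p = (const unsigned char *)ptr;

  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
       | ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Store a 16 bit value in little endian byte order. */
void sketch_poke_le_u16_u (void *ptr, uint_fast16_t v)
{
  unsigned char *p = (unsigned char *)ptr;

  p[0] = (unsigned char)( v       & 0xff);
  p[1] = (unsigned char)((v >> 8) & 0xff);
}
```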
1133
1134 =head2 FAST INTEGER TO STRING
1135
1136 Libecb defines a set of very fast integer to decimal string (or integer
1137 to ASCII, short C<i2a>) functions. These work by converting the integer
1138 to a fixed point representation and then successively multiplying out
1139 the topmost digits. Unlike some other, also very fast, libraries, ecb's
1140 algorithm should be completely branchless per digit, and does not rely on
1141 the presence of special CPU functions (such as C<clz>).
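The fixed point idea can be sketched for the four digit case as follows.
This illustrates the general technique only; the function name, the 4.28
fixed point format and the scaling constant are assumptions made for this
sketch and are not taken from F<ecb.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Format value (0..9999) as exactly four decimal digits, returning a
   pointer to just after the last digit. No terminating 0 is written. */
char *sketch_i2a_04 (char *ptr, uint32_t value)
{
  /* 268436 is ceil (2**28 / 1000): this scales value to value / 1000
     in 4.28 fixed point, rounded up slightly so that exact digits do
     not fall just short of the next integer. */
  uint32_t fixed = value * 268436U;

  for (int i = 0; i < 4; ++i)
    {
      *ptr++ = (char)('0' + (fixed >> 28)); /* integer part is the digit */
      fixed = (fixed & 0x0fffffffU) * 10U;  /* move next digit to the top */
    }

  return ptr;
}
```

Each digit costs one shift, one mask and one multiplication, with no
division and no data-dependent branch.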
1142
1143 There is a high level API that takes an C<int32_t>, C<uint32_t>,
1144 C<int64_t> or C<uint64_t> as argument, and a low-level API, which is
1145 harder to use but supports slightly more formatting options.
1146
1147 =head3 HIGH LEVEL API
1148
1149 The high level API consists of four functions, one each for C<int32_t>,
1150 C<uint32_t>, C<int64_t> and C<uint64_t>:
1151
1152 Example:
1153
1154 char buf[ECB_I2A_MAX_DIGITS + 1];
1155 char *end = ecb_i2a_i32 (buf, 17262);
1156 *end = 0;
1157 // buf now contains "17262"
1158
1159 =over
1160
1161 =item ECB_I2A_U32_DIGITS (=10)
1162
1163 =item char *ecb_i2a_u32 (char *ptr, uint32_t value)
1164
1165 Takes a C<uint32_t> I<value> and formats it as a decimal number starting
1166 at I<ptr>, using at most C<ECB_I2A_U32_DIGITS> characters. Returns a
1167 pointer to just after the generated string, where you would normally put
1168 the terminating C<0> character. This function outputs the minimum number
1169 of digits.
1170
1171 =item ECB_I2A_I32_DIGITS (=11)
1172
1173 =item char *ecb_i2a_i32 (char *ptr, int32_t value)
1174
1175 Same as C<ecb_i2a_u32>, but formats an C<int32_t> value, including a minus
1176 sign if needed.
1177
1178 =item ECB_I2A_U64_DIGITS (=21)
1179
1180 =item char *ecb_i2a_u64 (char *ptr, uint64_t value)
1181
1182 =item ECB_I2A_I64_DIGITS (=20)
1183
1184 =item char *ecb_i2a_i64 (char *ptr, int64_t value)
1185
1186 Similar to their 32 bit counterparts, these take a 64 bit argument.
1187
1188 =item ECB_I2A_MAX_DIGITS (=21)
1189
1190 Instead of using a type specific length macro, you can just use
1191 C<ECB_I2A_MAX_DIGITS>, which is good enough for any C<ecb_i2a> function.
1192
1193 =back
1194
1195 =head3 LOW-LEVEL API
1196
1197 The functions above use a number of low-level APIs which have some strict
1198 limitations, but can be used as building blocks (studying C<ecb_i2a_i32>
1199 and related functions is recommended).
1200
1201 There are three families of functions: functions that convert a number
1202 to a fixed number of digits with leading zeroes (C<ecb_i2a_0N>, C<0>
1203 for "leading zeroes"), functions that generate up to N digits, skipping
1204 leading zeroes (C<_N>), and functions that can generate more digits, but
1205 the leading digit has limited range (C<_xN>).
1206
1207 None of the functions deal with negative numbers.
1208
1209 Example: convert an IP address in an C<uint32_t> into dotted-quad:
1210
1211 uint32_t ip = 0x0a000164; // 10.0.1.100
1212 char ips[3 * 4 + 3 + 1];
1213 char *ptr = ips;
1214 ptr = ecb_i2a_3 (ptr, ip >> 24 ); *ptr++ = '.';
1215 ptr = ecb_i2a_3 (ptr, (ip >> 16) & 0xff); *ptr++ = '.';
1216 ptr = ecb_i2a_3 (ptr, (ip >> 8) & 0xff); *ptr++ = '.';
1217 ptr = ecb_i2a_3 (ptr, ip & 0xff); *ptr++ = 0;
1218 printf ("ip: %s\n", ips); // prints "ip: 10.0.1.100"
1219
1220 =over
1221
1222 =item char *ecb_i2a_02 (char *ptr, uint32_t value) // 32 bit
1223
1224 =item char *ecb_i2a_03 (char *ptr, uint32_t value) // 32 bit
1225
1226 =item char *ecb_i2a_04 (char *ptr, uint32_t value) // 32 bit
1227
1228 =item char *ecb_i2a_05 (char *ptr, uint32_t value) // 64 bit
1229
1230 =item char *ecb_i2a_06 (char *ptr, uint32_t value) // 64 bit
1231
1232 =item char *ecb_i2a_07 (char *ptr, uint32_t value) // 64 bit
1233
1234 =item char *ecb_i2a_08 (char *ptr, uint32_t value) // 64 bit
1235
1236 =item char *ecb_i2a_09 (char *ptr, uint32_t value) // 64 bit
1237
1238 The C<< ecb_i2a_0I<N> >> functions take an unsigned I<value> and convert
1239 it to exactly I<N> digits, returning a pointer to the first character
1240 after the digits. The I<value> must be in range. The functions marked with
1241 I<32 bit> do their calculations internally in 32 bit, the ones marked with
1242 I<64 bit> internally use 64 bit integers, which might be slow on 32 bit
1243 architectures (the high level API decides on 32 vs. 64 bit versions using
1244 C<ECB_64BIT_NATIVE>).
1245
1246 =item char *ecb_i2a_2 (char *ptr, uint32_t value) // 32 bit
1247
1248 =item char *ecb_i2a_3 (char *ptr, uint32_t value) // 32 bit
1249
1250 =item char *ecb_i2a_4 (char *ptr, uint32_t value) // 32 bit
1251
1252 =item char *ecb_i2a_5 (char *ptr, uint32_t value) // 64 bit
1253
1254 =item char *ecb_i2a_6 (char *ptr, uint32_t value) // 64 bit
1255
1256 =item char *ecb_i2a_7 (char *ptr, uint32_t value) // 64 bit
1257
1258 =item char *ecb_i2a_8 (char *ptr, uint32_t value) // 64 bit
1259
1260 =item char *ecb_i2a_9 (char *ptr, uint32_t value) // 64 bit
1261
1262 Similarly, the C<< ecb_i2a_I<N> >> functions take an unsigned I<value>
1263 and convert it to at most I<N> digits, suppressing leading zeroes, and
1264 returning a pointer to the first character after the digits.
1265
1266 =item ECB_I2A_MAX_X5 (=59074)
1267
1268 =item char *ecb_i2a_x5 (char *ptr, uint32_t value) // 32 bit
1269
1270 =item ECB_I2A_MAX_X10 (=2932500665)
1271
1272 =item char *ecb_i2a_x10 (char *ptr, uint32_t value) // 64 bit
1273
1274 The C<< ecb_i2a_xI<N> >> functions are similar to the C<< ecb_i2a_I<N> >>
1275 functions, but they can generate one digit more, as long as the number
1276 is within range, which is given by the symbols C<ECB_I2A_MAX_X5> (almost
1277 16 bit range) and C<ECB_I2A_MAX_X10> (a bit more than 31 bit range),
1278 respectively.
1279
1280 For example, the digit part of a 32 bit signed integer just fits into the
1281 C<ECB_I2A_MAX_X10> range, so while C<ecb_i2a_x10> cannot convert every 10
1282 digit number, it can convert all 32 bit signed numbers. Sadly, it's not
1283 good enough for 32 bit unsigned numbers.
1284
1285 =back
1286
1287 =head2 FLOATING POINT FIDDLING
1288
1289 =over
1290
1291 =item ECB_INFINITY [-UECB_NO_LIBM]
1292
1293 Evaluates to positive infinity if supported by the platform, otherwise to
1294 a truly huge number.
1295
1296 =item ECB_NAN [-UECB_NO_LIBM]
1297
1298 Evaluates to a quiet NAN if supported by the platform, otherwise to
1299 C<ECB_INFINITY>.
1300
1301 =item float ecb_ldexpf (float x, int exp) [-UECB_NO_LIBM]
1302
1303 Same as C<ldexpf>, but always available.
1304
1305 =item uint32_t ecb_float_to_binary16 (float x) [-UECB_NO_LIBM]
1306
1307 =item uint32_t ecb_float_to_binary32 (float x) [-UECB_NO_LIBM]
1308
1309 =item uint64_t ecb_double_to_binary64 (double x) [-UECB_NO_LIBM]
1310
1311 These functions each take an argument in the native C<float> or C<double>
1312 type and return the IEEE 754 bit representation of it (binary16/half,
1313 binary32/single or binary64/double precision).
1314
1315 The bit representation is just as IEEE 754 defines it, i.e. the sign bit
1316 will be the most significant bit, followed by exponent and mantissa.
1317
1318 This function should work even when the native floating point format isn't
1319 IEEE compliant, of course at a speed and code size penalty, and of course
1320 also within reasonable limits (it tries to convert NaNs, infinities and
1321 denormals, but will likely convert negative zero to positive zero).
1322
1323 On all modern platforms (where C<ECB_STDFP> is true), the compiler should
1324 be able to completely optimise away the 32 and 64 bit functions.
1325
1326 These functions can be helpful when serialising floats to the network - you
1327 can serialise the return value like a normal uint16_t/uint32_t/uint64_t.
1328
1329 Another use for these functions is to manipulate floating point values
1330 directly.
1331
1332 Silly example: toggle the sign bit of a float.
1333
1334 /* On gcc-4.7 on amd64, */
1335 /* this results in a single add instruction to toggle the bit, and 4 extra */
1336 /* instructions to move the float value to an integer register and back. */
1337
1338 x = ecb_binary32_to_float (ecb_float_to_binary32 (x) ^ 0x80000000U);
1339
1340 =item float ecb_binary16_to_float (uint16_t x) [-UECB_NO_LIBM]
1341
1342 =item float ecb_binary32_to_float (uint32_t x) [-UECB_NO_LIBM]
1343
1344 =item double ecb_binary64_to_double (uint64_t x) [-UECB_NO_LIBM]
1345
1346 The reverse operation of the previous functions - these take the bit
1347 representation of an IEEE binary16, binary32 or binary64 number (half,
1348 single or double precision) and convert it to the native C<float> or
1349 C<double> format.
1350
1351 This function should work even when the native floating point format isn't
1352 IEEE compliant, of course at a speed and code size penalty, and of course
1353 also within reasonable limits (it tries to convert normals and denormals,
1354 and might be lucky for infinities, and with extraordinary luck, also for
1355 negative zero).
1356
1357 On all modern platforms (where C<ECB_STDFP> is true), the compiler should
1358 be able to optimise away this function completely.
1359
1360 =item uint16_t ecb_binary32_to_binary16 (uint32_t x)
1361
1362 =item uint32_t ecb_binary16_to_binary32 (uint16_t x)
1363
1364 Convert an IEEE binary32/single precision value to binary16/half format, and
1365 vice versa, handling all details (round-to-nearest-even, subnormals, infinity
1366 and NaNs) correctly.
1367
1368 These functions are available even under C<-DECB_NO_LIBM>, since
1369 they do not rely on the platform floating point format. The
1370 C<ecb_float_to_binary16> and C<ecb_binary16_to_float> functions are
1371 usually what you want.
1372
1373 =back
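On platforms where the native C<float> already uses the IEEE 754 binary32
layout, converting to and from the bit representation amounts to
reinterpreting the bytes. The sketch below shows this with C<memcpy>. It
assumes an IEEE 754 C<float> of the same size as C<uint32_t>, uses
hypothetical function names, and does not cover the fallback path for
other floating point formats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compile time check of the size assumption made by this sketch. */
typedef char sketch_float_is_32bit[sizeof (float) == sizeof (uint32_t) ? 1 : -1];

/* Return the bit pattern of a float (assumed to be IEEE 754 binary32). */
uint32_t sketch_float_to_binary32 (float x)
{
  uint32_t bits;

  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* Reinterpret a binary32 bit pattern as a float. */
float sketch_binary32_to_float (uint32_t bits)
{
  float x;

  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Toggle the sign bit, which is the most significant bit. */
float sketch_negate (float x)
{
  return sketch_binary32_to_float (sketch_float_to_binary32 (x) ^ 0x80000000U);
}
```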
1374
1375 =head2 ARITHMETIC
1376
1377 =over
1378
1379 =item x = ecb_mod (m, n)
1380
1381 Returns C<m> modulo C<n>, which is the same as the non-negative remainder
1382 of the division operation between C<m> and C<n>, using floored
1383 division. Unlike the C remainder operator C<%>, this function ensures that
1384 the return value is never negative and that the two numbers I<m> and
1385 I<m' = m + i * n> result in the same value modulo I<n> - in other words,
1386 C<ecb_mod> implements the mathematical modulo operation, which is missing
1387 in the language.
1388
1389 C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be
1390 negatable, that is, both C<m> and C<-m> must be representable in its
1391 type (this typically excludes the minimum signed integer value, the same
1392 limitation as for C</> and C<%> in C).
1393
1394 Current GCC/clang versions compile this into an efficient branchless
1395 sequence on almost all CPUs.
1396
1397 For example, when you want to rotate forward through the members of an
1398 array for increasing C<m> (which might be negative), then you should use
1399 C<ecb_mod>, as the C<%> operator might give either negative results, or
1400 change direction for negative values:
1401
1402 for (m = -100; m <= 100; ++m)
1403 elem = myarray [ecb_mod (m, ecb_array_length (myarray))];
1404
1405 =item x = ecb_div_rd (val, div)
1406
1407 =item x = ecb_div_ru (val, div)
1408
1409 Returns C<val> divided by C<div> rounded down or up, respectively.
1410 C<val> and C<div> must have integer types and C<div> must be strictly
1411 positive. Note that these functions are implemented with macros in C
1412 and with function templates in C++.
1413
1414 =back
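The difference to the plain C operators can be made concrete with the
following sketch for C<int> arguments. The function names are
hypothetical; F<ecb.h> provides these operations generically, while this
sketch only mirrors the documented results and, like the originals,
requires a strictly positive divisor.

```c
#include <assert.h>

/* Mathematical modulo for n > 0: the result is always in 0 .. n - 1. */
int sketch_mod (int m, int n)
{
  int r = m % n; /* C remainder: zero or the same sign as m */

  return r < 0 ? r + n : r;
}

/* Division rounding towards negative infinity, for d > 0. */
int sketch_div_rd (int val, int d)
{
  return val / d - (val % d < 0);
}

/* Division rounding towards positive infinity, for d > 0. */
int sketch_div_ru (int val, int d)
{
  return val / d + (val % d > 0);
}
```

For instance, C<-1 % 5> evaluates to C<-1> in C, while the modulo sketch
returns C<4>, which is what stepping backwards through a five element
array requires.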
1415
1416 =head2 UTILITY
1417
1418 =over
1419
1420 =item element_count = ecb_array_length (name)
1421
1422 Returns the number of elements in the array C<name>. For example:
1423
1424 int primes[] = { 2, 3, 5, 7, 11 };
1425 int sum = 0;
1426
1427 for (i = 0; i < ecb_array_length (primes); i++)
1428 sum += primes [i];
1429
1430 =back
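For reference, the classic C idiom for counting array elements is a
C<sizeof> ratio, sketched below under a hypothetical macro name. Like any
such macro, it only works on true arrays, not on pointers (for example
array parameters of functions); whether F<ecb.h> adds further checks is
not covered here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: number of elements of a true array. */
#define SKETCH_ARRAY_LENGTH(name) (sizeof (name) / sizeof ((name)[0]))

/* Sum a fixed table of primes, iterating by element count. */
int sketch_sum_primes (void)
{
  static const int primes[] = { 2, 3, 5, 7, 11 };
  int sum = 0;

  for (size_t i = 0; i < SKETCH_ARRAY_LENGTH (primes); i++)
    sum += primes[i];

  return sum;
}
```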
1431
1432 =head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF
1433
1434 These symbols need to be defined before including F<ecb.h> the first time.
1435
1436 =over
1437
1438 =item ECB_NO_THREADS
1439
1440 If F<ecb.h> is never used from multiple threads, then this symbol can
1441 be defined, in which case memory fences (and similar constructs) are
1442 completely removed, leading to more efficient code and fewer dependencies.
1443
1444 Setting this symbol to a true value implies C<ECB_NO_SMP>.
1445
1446 =item ECB_NO_SMP
1447
1448 The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from
1449 multiple threads, but never concurrently (e.g. if the system the program
1450 runs on has only a single CPU with a single core, no hyper-threading and so
1451 on), then this symbol can be defined, leading to more efficient code and
1452 fewer dependencies.
1453
1454 =item ECB_NO_LIBM
1455
1456 When defined to C<1>, do not export any functions that might introduce
1457 dependencies on the math library (usually called F<-lm>) - these are
1458 marked with [-UECB_NO_LIBM].
1459
1460 =back
1461
1462 =head1 UNDOCUMENTED FUNCTIONALITY
1463
1464 F<ecb.h> is full of undocumented functionality as well, some of which is
1465 intended to be internal-use only, some of which we forgot to document, and
1466 some of which we hide because we are not sure we will keep the interface
1467 stable.
1468
1469 While you are welcome to rummage around and use whatever you find useful
1470 (we don't want to stop you), keep in mind that we will change undocumented
1471 functionality in incompatible ways without thinking twice, while we are
1472 considerably more conservative with documented things.
1473
1474 =head1 AUTHORS
1475
1476 C<libecb> is designed and maintained by:
1477
1478 Emanuele Giaquinta <e.giaquinta@glauco.it>
1479 Marc Alexander Lehmann <schmorp@schmorp.de>