ViewVC Help
View File | Revision Log | Show Annotations | Download File
/cvs/libecb/ecb.pod
(Generate patch)

Comparing libecb/ecb.pod (file contents):
Revision 1.103 by root, Wed Mar 23 09:59:49 2022 UTC vs.
Revision 1.104 by root, Fri Mar 25 08:44:14 2022 UTC

12 12
13 http://software.schmorp.de/pkg/libecb 13 http://software.schmorp.de/pkg/libecb
14 14
15It mainly provides a number of wrappers around many compiler built-ins, 15It mainly provides a number of wrappers around many compiler built-ins,
16together with replacement functions for other compilers. In addition 16together with replacement functions for other compilers. In addition
17to this, it provides a number of other lowlevel C utilities, such as 17to this, it provides a number of other low-level C utilities, such as
18endianness detection, byte swapping or bit rotations. 18endianness detection, byte swapping or bit rotations.
19 19
20Or in other words, things that should be built into any standard C 20Or in other words, things that should be built into any standard C
21system, but aren't, implemented as efficient as possible with GCC (clang, 21system, but aren't, implemented as efficient as possible with GCC (clang,
22msvc...), and still correct with other compilers. 22MSVC...), and still correct with other compilers.
23 23
24More might come. 24More might come.
25 25
26=head2 ABOUT THE HEADER 26=head2 ABOUT THE HEADER
27 27
56is usually implemented as a macro. Specifically, a "bool" in this manual 56is usually implemented as a macro. Specifically, a "bool" in this manual
57refers to any kind of boolean value, not a specific type. 57refers to any kind of boolean value, not a specific type.
58 58
59=head2 TYPES / TYPE SUPPORT 59=head2 TYPES / TYPE SUPPORT
60 60
61ecb.h makes sure that the following types are defined (in the expected way): 61F<ecb.h> makes sure that the following types are defined (in the expected way):
62 62
63 int8_t uint8_ 63 int8_t uint8_
64 int16_t uint16_t 64 int16_t uint16_t
65 int32_t uint32_ 65 int32_t uint32_
66 int64_t uint64_t 66 int64_t uint64_t
170 170
171Evaluates to a true value (suitable for both preprocessor and C code 171Evaluates to a true value (suitable for both preprocessor and C code
172testing) if 64 bit integer types on this architecture are evaluated 172testing) if 64 bit integer types on this architecture are evaluated
173"natively", that is, with similar speeds as 32 bit integers. While 64 bit 173"natively", that is, with similar speeds as 32 bit integers. While 64 bit
174integer support is very common (and in fact required by libecb), 32 bit 174integer support is very common (and in fact required by libecb), 32 bit
175cpus have to emulate operations on them, so you might want to avoid them. 175CPUs have to emulate operations on them, so you might want to avoid them.
176 176
177=item ECB_AMD64, ECB_AMD64_X32 177=item ECB_AMD64, ECB_AMD64_X32
178 178
179These two macros are defined to C<1> on the x86_64/amd64 ABI and the X32 179These two macros are defined to C<1> on the x86_64/amd64 ABI and the X32
180ABI, respectively, and undefined elsewhere. 180ABI, respectively, and undefined elsewhere.
274 274
275Expands either to (a compiler-specific equivalent of) C<static inline> or 275Expands either to (a compiler-specific equivalent of) C<static inline> or
276to just C<static>, if inline isn't supported. It should be used to declare 276to just C<static>, if inline isn't supported. It should be used to declare
277functions that should be inlined, for code size or speed reasons. 277functions that should be inlined, for code size or speed reasons.
278 278
279Example: inline this function, it surely will reduce codesize. 279Example: inline this function, it surely will reduce code size.
280 280
281 ecb_inline int 281 ecb_inline int
282 negmul (int a, int b) 282 negmul (int a, int b)
283 { 283 {
284 return - (a * b); 284 return - (a * b);
384speed-critical times, and keeping it in the cache might be a waste of said 384speed-critical times, and keeping it in the cache might be a waste of said
385cache. 385cache.
386 386
387In addition to placing cold functions together (or at least away from hot 387In addition to placing cold functions together (or at least away from hot
388functions), this knowledge can be used in other ways, for example, the 388functions), this knowledge can be used in other ways, for example, the
389function will be optimised for size, as opposed to speed, and codepaths 389function will be optimised for size, as opposed to speed, and code paths
390leading to calls to those functions can automatically be marked as if 390leading to calls to those functions can automatically be marked as if
391C<ecb_expect_false> had been used to reach them. 391C<ecb_expect_false> had been used to reach them.
392 392
393Good examples for such functions would be error reporting functions, or 393Good examples for such functions would be error reporting functions, or
394functions only called in exceptional or rare cases. 394functions only called in exceptional or rare cases.
548never be executed. Apart from suppressing a warning in some cases, this 548never be executed. Apart from suppressing a warning in some cases, this
549function can be used to implement C<ecb_assume> or similar functionality. 549function can be used to implement C<ecb_assume> or similar functionality.
550 550
551=item ecb_prefetch (addr, rw, locality) 551=item ecb_prefetch (addr, rw, locality)
552 552
553Tells the compiler to try to prefetch memory at the given C<addr>ess 553Tells the compiler to try to prefetch memory at the given I<addr>ess
554for either reading (C<rw> = 0) or writing (C<rw> = 1). A C<locality> of 554for either reading (I<rw> = 0) or writing (I<rw> = 1). A I<locality> of
555C<0> means that there will only be one access later, C<3> means that 555C<0> means that there will only be one access later, C<3> means that
556the data will likely be accessed very often, and values in between mean 556the data will likely be accessed very often, and values in between mean
557something... in between. The memory pointed to by the address does not 557something... in between. The memory pointed to by the address does not
558need to be accessible (it could be a null pointer for example), but C<rw> 558need to be accessible (it could be a null pointer for example), but C<rw>
559and C<locality> must be compile-time constants. 559and C<locality> must be compile-time constants.
784 784
785Overloaded C++ version of the above, for C<uint{8,16,32,64}_t>. 785Overloaded C++ version of the above, for C<uint{8,16,32,64}_t>.
786 786
787=back 787=back
788 788
789=head2 HILBERT CURVES
790
791These functions deal with (square, pseudo) Hilbert curves. The parameter
792I<order> indicates the size of the square and is specified in bits, that
793means for order C<8>, the coordinates range from C<0>..C<255>, and the
794curve index ranges from C<0>..C<65535>.
795
796The 32 bit variants of these functions map a 32 bit index to two 16 bit
797coordinates, stored in a 32 bit variable, where the high order bits are
798the x-coordinate, and the low order bits are the y-coordinate, thus,
799these functions map 32 bit linear index on the curve to a 32 bit packed
800coordinate pair, and vice versa.
801
802The 64 bit variants work similarly.
803
804The I<order> can go from C<1> to C<16> for the 32 bit curve, and C<1> to
805C<32> for the 64 bit curve.
806
807When going from one order to the next higher order, these functions
808replace the curve segments by smaller versions of the generating shape,
809while doubling the size (since they use integer coordinates), which is
810what you would expect mathematically. This means that the curve will be
811mirrored at the diagonal. If your goal is to simply cover more area while
812retaining existing point coordinates you should increase or decrease the
813I<order> by C<2> or, in the case of C<ecb_hilbert2d_index_to_coord>,
814simply specify the maximum I<order> of C<16> or C<32>, respectively, as
815these are constant-time.
816
817=over
818
819=item uint32_t ecb_hilbert2d_index_to_coord32 (int order, uint32_t index)
820
821=item uint64_t ecb_hilbert2d_index_to_coord64 (int order, uint64_t index)
822
823Map a point on a pseudo Hilbert curve from its linear distance from the
824origin on the curve to a x|y coordinate pair. The result is a packed
825coordinate pair, to get the actual x and < coordinates, you could do
826something like this:
827
828 uint32_t xy = ecb_hilbert2d_index_to_coord32 (16, 255);
829 uint16_t x = xy >> 16;
830 uint16_t y = xy & 0xffffU;
831
832 uint64_t xy = ecb_hilbert2d_index_to_coord64 (32, 255);
833 uint32_t x = xy >> 32;
834 uint32_t y = xy & 0xffffffffU;
835
836These functions work in constant time, so for many applications it is
837preferable to simply hard-code the order to the maximum (C<16> or C<32>).
838
839This (production-ready, i.e. never run) example generates an SVG image of
840an order 8 pseudo Hilbert curve:
841
842 printf ("<svg xmlns='http://www.w3.org/2000/svg' width='%d' height='%d'>\n", 64 * 8, 64 * 8);
843 printf ("<g transform='translate(4) scale(8)' stroke-width='0.25' stroke='black'>\n");
844 for (uint32_t i = 0; i < 64*64 - 1; ++i)
845 {
846 uint32_t p1 = ecb_hilbert2d_index_to_coord32 (6, i );
847 uint32_t p2 = ecb_hilbert2d_index_to_coord32 (6, i + 1);
848 printf ("<line x1='%d' y1='%d' x2='%d' y2='%d'/>\n",
849 p1 >> 16, p1 & 0xffff,
850 p2 >> 16, p2 & 0xffff);
851 }
852 printf ("</g>\n");
853 printf ("</svg>\n");
854
855=item uint32_t ecb_hilbert2d_coord_to_index32 (int order, uint32_t xy)
856
857=item uint64_t ecb_hilbert2d_coord_to_index64 (int order, uint64_t xy)
858
859The reverse of C<ecb_hilbert2d_index_to_coord> - map a packed pair of
860coordinates to their linear index on the pseudo Hilbert curve of order
861I<order>.
862
863They are an exact inverse of the C<ecb_hilbert2d_coord_to_index> functions
864for the same I<order>:
865
866 assert (
867 u == ecb_hilbert2d_coord_to_index (32,
868 ecb_hilbert2d_index_to_coord32 (32,
869 u)));
870
871Packing coordinates is done the same way, as well, from I<x> and I<y>:
872
873 uint32_t xy = ((uint32_t)x << 16) | y; // for ecb_hilbert2d_coord_to_index32
874 uint64_t xy = ((uint64_t)x << 32) | y; // for ecb_hilbert2d_coord_to_index64
875
876Unlike C<ecb_hilbert2d_coord_to_index>, these functions are O(I<order>),
877so it is preferable to use the lowest possible order.
878
879=back
880
789=head2 BIT MIXING, HASHING 881=head2 BIT MIXING, HASHING
790 882
791Sometimes you have an integer and want to distribute its bits well, for 883Sometimes you have an integer and want to distribute its bits well, for
792example, to use it as a hash in a hashtable. A common example is pointer 884example, to use it as a hash in a hash table. A common example is pointer
793values, which often only have a limited range (e.g. low and high bits are 885values, which often only have a limited range (e.g. low and high bits are
794often zero). 886often zero).
795 887
796The following functions try to mix the bits to get a good bias-free 888The following functions try to mix the bits to get a good bias-free
797distribution. They were mainly made for pointers, but the underlying 889distribution. They were mainly made for pointers, but the underlying
798integer functions are exposed as well. 890integer functions are exposed as well.
799 891
800As an added benefit, the functions are reversible, so if you find it 892As an added benefit, the functions are reversible, so if you find it
801convenient to store only the hash value, you can recover the original 893convenient to store only the hash value, you can recover the original
802pointer from the hash ("unmix"), as long as your pinters are 32 or 64 bit 894pointer from the hash ("unmix"), as long as your pointers are 32 or 64 bit
803(if this isn't the case on your platform, drop us a note and we will add 895(if this isn't the case on your platform, drop us a note and we will add
804functions for other bit widths). 896functions for other bit widths).
805 897
806The unmix functions are very slightly slower than the mix functions, so 898The unmix functions are very slightly slower than the mix functions, so
807it is equally very slightly preferable to store the original values wehen 899it is equally very slightly preferable to store the original values wehen
808convenient. 900convenient.
809 901
810The underlying algorithm if subject to change, so currently these 902The underlying algorithm if subject to change, so currently these
811functions are not suitable for persistent hash tables, as their result 903functions are not suitable for persistent hash tables, as their result
812value can change between diferent versions of libecb. 904value can change between different versions of libecb.
813 905
814=over 906=over
815 907
816=item uintptr_t ecb_ptrmix (void *ptr) 908=item uintptr_t ecb_ptrmix (void *ptr)
817 909
840=item uint32_t ecb_mix32 (uint32_t v) 932=item uint32_t ecb_mix32 (uint32_t v)
841 933
842=item uint64_t ecb_mix64 (uint64_t v) 934=item uint64_t ecb_mix64 (uint64_t v)
843 935
844Sometimes you don't have a pointer but an integer whose values are very 936Sometimes you don't have a pointer but an integer whose values are very
845badly distributed. In this case you cna sue these integer versions of the 937badly distributed. In this case you can use these integer versions of the
846mixing function. No C++ template is provided currently. 938mixing function. No C++ template is provided currently.
847 939
848=item uint32_t ecb_unmix32 (uint32_t v) 940=item uint32_t ecb_unmix32 (uint32_t v)
849 941
850=item uint64_t ecb_unmix64 (uint64_t v) 942=item uint64_t ecb_unmix64 (uint64_t v)
1010=item ecb_poke_be_u (void *ptr, T v) 1102=item ecb_poke_be_u (void *ptr, T v)
1011 1103
1012=item ecb_poke_le_u (void *ptr, T v) 1104=item ecb_poke_le_u (void *ptr, T v)
1013 1105
1014Again, similarly to their C counterparts, these functions store an 1106Again, similarly to their C counterparts, these functions store an
1015unsigned 8, 16, 32 or z64 bit value to memory, with optional conversion to 1107unsigned 8, 16, 32 or 64 bit value to memory, with optional conversion to
1016big/little endian. 1108big/little endian.
1017 1109
1018C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>. 1110C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t>.
1019 1111
1020Unlike their C counterparts, these functions support 8 bit quantities 1112Unlike their C counterparts, these functions support 8 bit quantities
1024=back 1116=back
1025 1117
1026=head2 FAST INTEGER TO STRING 1118=head2 FAST INTEGER TO STRING
1027 1119
1028Libecb defines a set of very fast integer to decimal string (or integer 1120Libecb defines a set of very fast integer to decimal string (or integer
1029to ascii, short C<i2a>) functions. These work by converting the integer 1121to ASCII, short C<i2a>) functions. These work by converting the integer
1030to a fixed point representation and then successively multiplying out 1122to a fixed point representation and then successively multiplying out
1031the topmost digits. Unlike some other, also very fast, libraries, ecb's 1123the topmost digits. Unlike some other, also very fast, libraries, ecb's
1032algorithm should be completely branchless per digit, and does not rely on 1124algorithm should be completely branchless per digit, and does not rely on
1033the presence of special cpu functions (such as clz). 1125the presence of special CPU functions (such as C<clz>).
1034 1126
1035There is a high level API that takes an C<int32_t>, C<uint32_t>, 1127There is a high level API that takes an C<int32_t>, C<uint32_t>,
1036C<int64_t> or C<uint64_t> as argument, and a low-level API, which is 1128C<int64_t> or C<uint64_t> as argument, and a low-level API, which is
1037harder to use but supports slightly more formatting options. 1129harder to use but supports slightly more formatting options.
1038 1130
1096leading zeroes (C<_N>), and functions that can generate more digits, but 1188leading zeroes (C<_N>), and functions that can generate more digits, but
1097the leading digit has limited range (C<_xN>). 1189the leading digit has limited range (C<_xN>).
1098 1190
1099None of the functions deal with negative numbers. 1191None of the functions deal with negative numbers.
1100 1192
1101Example: convert an IP address in an u32 into dotted-quad: 1193Example: convert an IP address in an C<uint32_t> into dotted-quad:
1102 1194
1103 uint32_t ip = 0x0a000164; // 10.0.1.100 1195 uint32_t ip = 0x0a000164; // 10.0.1.100
1104 char ips[3 * 4 + 3 + 1]; 1196 char ips[3 * 4 + 3 + 1];
1105 char *ptr = ips; 1197 char *ptr = ips;
1106 ptr = ecb_i2a_3 (ptr, ip >> 24 ); *ptr++ = '.'; 1198 ptr = ecb_i2a_3 (ptr, ip >> 24 ); *ptr++ = '.';
1337 1429
1338=item ECB_NO_SMP 1430=item ECB_NO_SMP
1339 1431
1340The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from 1432The weaker version of C<ECB_NO_THREADS> - if F<ecb.h> is used from
1341multiple threads, but never concurrently (e.g. if the system the program 1433multiple threads, but never concurrently (e.g. if the system the program
1342runs on has only a single CPU with a single core, no hyperthreading and so 1434runs on has only a single CPU with a single core, no hyper-threading and so
1343on), then this symbol can be defined, leading to more efficient code and 1435on), then this symbol can be defined, leading to more efficient code and
1344fewer dependencies. 1436fewer dependencies.
1345 1437
1346=item ECB_NO_LIBM 1438=item ECB_NO_LIBM
1347 1439

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines