… | |
… | |
10 | |
10 | |
11 | Its homepage can be found here: |
11 | Its homepage can be found here: |
12 | |
12 | |
13 | http://software.schmorp.de/pkg/libecb |
13 | http://software.schmorp.de/pkg/libecb |
14 | |
14 | |
15 | It mainly provides a number of wrappers around GCC built-ins, together |
15 | It mainly provides a number of wrappers around many compiler built-ins, |
16 | with replacement functions for other compilers. In addition to this, |
16 | together with replacement functions for other compilers. In addition |
17 | it provides a number of other low-level C utilities, such as endianness |
17 | to this, it provides a number of other low-level C utilities, such as |
18 | detection, byte swapping or bit rotations. |
18 | endianness detection, byte swapping or bit rotations. |
19 | |
19 | |
20 | Or in other words, things that should be built into any standard C system, |
20 | Or in other words, things that should be built into any standard C |
21 | but aren't, implemented as efficiently as possible with GCC, and still |
21 | system, but aren't, implemented as efficiently as possible with GCC (clang, |
22 | correct with other compilers. |
22 | msvc...), and still correct with other compilers. |
23 | |
23 | |
24 | More might come. |
24 | More might come. |
25 | |
25 | |
26 | =head2 ABOUT THE HEADER |
26 | =head2 ABOUT THE HEADER |
27 | |
27 | |
… | |
… | |
80 | |
80 | |
81 | All the following symbols expand to an expression that can be tested in |
81 | All the following symbols expand to an expression that can be tested in |
82 | preprocessor instructions as well as treated as a boolean (use C<!!> to |
82 | preprocessor instructions as well as treated as a boolean (use C<!!> to |
83 | ensure it's either C<0> or C<1> if you need that). |
83 | ensure it's either C<0> or C<1> if you need that). |
84 | |
84 | |
85 | =over 4 |
85 | =over |
86 | |
86 | |
87 | =item ECB_C |
87 | =item ECB_C |
88 | |
88 | |
89 | True if the implementation defines the C<__STDC__> macro to a true value, |
89 | True if the implementation defines the C<__STDC__> macro to a true value, |
90 | while not claiming to be C++, i.e. C, but not C++. |
90 | while not claiming to be C++, i.e. C, but not C++. |
… | |
… | |
163 | without having to think about format or endianness. |
163 | without having to think about format or endianness. |
164 | |
164 | |
165 | This is true for basically all modern platforms, although F<ecb.h> might |
165 | This is true for basically all modern platforms, although F<ecb.h> might |
166 | not be able to deduce this correctly everywhere and might err on the safe |
166 | not be able to deduce this correctly everywhere and might err on the safe |
167 | side. |
167 | side. |
|
|
168 | |
|
|
169 | =item ECB_64BIT_NATIVE |
|
|
170 | |
|
|
171 | Evaluates to a true value (suitable for both preprocessor and C code |
|
|
172 | testing) if 64 bit integer types on this architecture are evaluated |
|
|
173 | "natively", that is, with similar speeds as 32 bit integers. While 64 bit |
|
|
174 | integer support is very common (and in fact required by libecb), 32 bit |
|
|
175 | cpus have to emulate operations on them, so you might want to avoid them. |
168 | |
176 | |
169 | =item ECB_AMD64, ECB_AMD64_X32 |
177 | =item ECB_AMD64, ECB_AMD64_X32 |
170 | |
178 | |
171 | These two macros are defined to C<1> on the x86_64/amd64 ABI and the X32 |
179 | These two macros are defined to C<1> on the x86_64/amd64 ABI and the X32 |
172 | ABI, respectively, and undefined elsewhere. |
180 | ABI, respectively, and undefined elsewhere. |
… | |
… | |
179 | |
187 | |
180 | =back |
188 | =back |
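As an illustration of how a feature symbol such as C<ECB_64BIT_NATIVE> might be used, the following sketch picks the word size for a simple byte-XOR loop at compile time. It is only a sketch under stated assumptions: C<SKETCH_64BIT_NATIVE>, C<sketch_word> and C<sketch_xor_bytes> are hypothetical names, and the pointer-size test is a stand-in so that the example compiles without F<ecb.h>. With libecb you would include F<ecb.h> and test C<ECB_64BIT_NATIVE> instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for ECB_64BIT_NATIVE (assumption for this sketch only):
   treat 64 bit integers as native when pointers are wider than 32 bit. */
#if UINTPTR_MAX > 0xffffffffU
# define SKETCH_64BIT_NATIVE 1
#else
# define SKETCH_64BIT_NATIVE 0
#endif

#if SKETCH_64BIT_NATIVE
typedef uint64_t sketch_word; /* 64 bit operations are cheap here */
#else
typedef uint32_t sketch_word; /* avoid emulated 64 bit arithmetic */
#endif

/* XOR all bytes of a buffer together, one native word at a time. */
static uint8_t
sketch_xor_bytes (const void *buf, size_t len)
{
  const unsigned char *p = buf;
  sketch_word acc = 0;
  uint8_t result = 0;
  size_t i;

  while (len >= sizeof (sketch_word))
    {
      sketch_word w;
      memcpy (&w, p, sizeof (w)); /* load that is safe for unaligned data */
      acc ^= w;
      p   += sizeof (w);
      len -= sizeof (w);
    }

  /* fold the accumulated word into a single byte */
  for (i = 0; i < sizeof (sketch_word); ++i)
    result ^= (uint8_t)(acc >> (8 * i));

  /* leftover tail bytes that did not fill a whole word */
  while (len--)
    result ^= *p++;

  return result;
}
```

The result does not depend on which word size was selected, so only the speed differs between the two configurations.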
181 | |
189 | |
182 | =head2 MACRO TRICKERY |
190 | =head2 MACRO TRICKERY |
183 | |
191 | |
184 | =over 4 |
192 | =over |
185 | |
193 | |
186 | =item ECB_CONCAT (a, b) |
194 | =item ECB_CONCAT (a, b) |
187 | |
195 | |
188 | Expands any macros in C<a> and C<b>, then concatenates the result to form |
196 | Expands any macros in C<a> and C<b>, then concatenates the result to form |
189 | a single token. This is mainly useful to form identifiers from components, |
197 | a single token. This is mainly useful to form identifiers from components, |
… | |
… | |
230 | declarations must be put before the whole declaration: |
238 | declarations must be put before the whole declaration: |
231 | |
239 | |
232 | ecb_const int mysqrt (int a); |
240 | ecb_const int mysqrt (int a); |
233 | ecb_unused int i; |
241 | ecb_unused int i; |
234 | |
242 | |
235 | =over 4 |
243 | =over |
236 | |
244 | |
237 | =item ecb_unused |
245 | =item ecb_unused |
238 | |
246 | |
239 | Marks a function or a variable as "unused", which simply suppresses a |
247 | Marks a function or a variable as "unused", which simply suppresses a |
240 | warning by GCC when it detects it as unused. This is useful when you e.g. |
248 | warning by the compiler when it detects it as unused. This is useful when |
241 | declare a variable but do not always use it: |
249 | you e.g. declare a variable but do not always use it: |
242 | |
250 | |
243 | { |
251 | { |
244 | ecb_unused int var; |
252 | ecb_unused int var; |
245 | |
253 | |
246 | #ifdef SOMECONDITION |
254 | #ifdef SOMECONDITION |
… | |
… | |
414 | |
422 | |
415 | =back |
423 | =back |
416 | |
424 | |
417 | =head2 OPTIMISATION HINTS |
425 | =head2 OPTIMISATION HINTS |
418 | |
426 | |
419 | =over 4 |
427 | =over |
420 | |
428 | |
421 | =item bool ecb_is_constant (expr) |
429 | =item bool ecb_is_constant (expr) |
422 | |
430 | |
423 | Returns true iff the expression can be deduced to be a compile-time |
431 | Returns true iff the expression can be deduced to be a compile-time |
424 | constant, and false otherwise. |
432 | constant, and false otherwise. |
… | |
… | |
581 | |
589 | |
582 | =back |
590 | =back |
583 | |
591 | |
584 | =head2 BIT FIDDLING / BIT WIZARDRY |
592 | =head2 BIT FIDDLING / BIT WIZARDRY |
585 | |
593 | |
586 | =over 4 |
594 | =over |
587 | |
595 | |
588 | =item bool ecb_big_endian () |
596 | =item bool ecb_big_endian () |
589 | |
597 | |
590 | =item bool ecb_little_endian () |
598 | =item bool ecb_little_endian () |
591 | |
599 | |
… | |
… | |
725 | |
733 | |
726 | These two families of functions return the value of C<x> after rotating |
734 | These two families of functions return the value of C<x> after rotating |
727 | all the bits by C<count> positions to the right (C<ecb_rotr>) or left |
735 | all the bits by C<count> positions to the right (C<ecb_rotr>) or left |
728 | (C<ecb_rotl>). |
736 | (C<ecb_rotl>). |
729 | |
737 | |
730 | Current GCC versions understand these functions and usually compile them |
738 | Current GCC/clang versions understand these functions and usually compile |
731 | to "optimal" code (e.g. a single C<rol> or a combination of C<shld> on |
739 | them to "optimal" code (e.g. a single C<rol> or a combination of C<shld> |
732 | x86). |
740 | on x86). |
733 | |
741 | |
734 | =item T ecb_rotl (T x, unsigned int count) [C++] |
742 | =item T ecb_rotl (T x, unsigned int count) [C++] |
735 | |
743 | |
736 | =item T ecb_rotr (T x, unsigned int count) [C++] |
744 | =item T ecb_rotr (T x, unsigned int count) [C++] |
737 | |
745 | |
… | |
… | |
741 | |
749 | |
742 | =back |
750 | =back |
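For readers unfamiliar with rotations, the following sketch shows one common portable way to write 32 bit rotations in C. The names C<sketch_rotl32> and C<sketch_rotr32> are hypothetical and this is not necessarily how F<ecb.h> defines C<ecb_rotl32> and C<ecb_rotr32>; it merely illustrates the operation that the functions above perform.

```c
#include <stdint.h>

/* Rotate x left by count bits (hypothetical stand-in, not libecb code).
   Masking keeps both shift counts in 0..31, so count == 0 and
   count >= 32 never shift by the full width of the type. */
static uint32_t
sketch_rotl32 (uint32_t x, unsigned int count)
{
  return (x << (count & 31)) | (x >> (-count & 31));
}

/* Rotate x right by count bits, the mirror image of the above. */
static uint32_t
sketch_rotr32 (uint32_t x, unsigned int count)
{
  return (x >> (count & 31)) | (x << (-count & 31));
}
```

Bits shifted out at one end reappear at the other, so rotating left and then right by the same count returns the original value.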
743 | |
751 | |
744 | =head2 HOST ENDIANNESS CONVERSION |
752 | =head2 HOST ENDIANNESS CONVERSION |
745 | |
753 | |
746 | =over 4 |
754 | =over |
747 | |
755 | |
748 | =item uint_fast16_t ecb_be_u16_to_host (uint_fast16_t v) |
756 | =item uint_fast16_t ecb_be_u16_to_host (uint_fast16_t v) |
749 | |
757 | |
750 | =item uint_fast32_t ecb_be_u32_to_host (uint_fast32_t v) |
758 | =item uint_fast32_t ecb_be_u32_to_host (uint_fast32_t v) |
751 | |
759 | |
… | |
… | |
779 | |
787 | |
780 | =back |
788 | =back |
781 | |
789 | |
782 | In C++ the following additional template functions are supported: |
790 | In C++ the following additional template functions are supported: |
783 | |
791 | |
784 | =over 4 |
792 | =over |
785 | |
793 | |
786 | =item T ecb_be_to_host (T v) |
794 | =item T ecb_be_to_host (T v) |
787 | |
795 | |
788 | =item T ecb_le_to_host (T v) |
796 | =item T ecb_le_to_host (T v) |
789 | |
797 | |
790 | =item T ecb_host_to_be (T v) |
798 | =item T ecb_host_to_be (T v) |
791 | |
799 | |
792 | =item T ecb_host_to_le (T v) |
800 | =item T ecb_host_to_le (T v) |
|
|
801 | |
|
|
802 | =back |
793 | |
803 | |
794 | These functions work like their C counterparts, above, but use templates, |
804 | These functions work like their C counterparts, above, but use templates, |
795 | which make them useful in generic code. |
805 | which make them useful in generic code. |
796 | |
806 | |
797 | C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t> |
807 | C<T> must be one of C<uint8_t>, C<uint16_t>, C<uint32_t> or C<uint64_t> |
… | |
… | |
800 | |
810 | |
801 | =head2 UNALIGNED LOAD/STORE |
811 | =head2 UNALIGNED LOAD/STORE |
802 | |
812 | |
803 | These functions load or store unaligned multi-byte values. |
813 | These functions load or store unaligned multi-byte values. |
804 | |
814 | |
805 | =over 4 |
815 | =over |
806 | |
816 | |
807 | =item uint_fast16_t ecb_peek_u16_u (const void *ptr) |
817 | =item uint_fast16_t ecb_peek_u16_u (const void *ptr) |
808 | |
818 | |
809 | =item uint_fast32_t ecb_peek_u32_u (const void *ptr) |
819 | =item uint_fast32_t ecb_peek_u32_u (const void *ptr) |
810 | |
820 | |
… | |
… | |
854 | |
864 | |
855 | =back |
865 | =back |
856 | |
866 | |
857 | In C++ the following additional template functions are supported: |
867 | In C++ the following additional template functions are supported: |
858 | |
868 | |
859 | =over 4 |
869 | =over |
860 | |
870 | |
861 | =item T ecb_peek<T> (const void *ptr) |
871 | =item T ecb_peek<T> (const void *ptr) |
862 | |
872 | |
863 | =item T ecb_peek_be<T> (const void *ptr) |
873 | =item T ecb_peek_be<T> (const void *ptr) |
864 | |
874 | |
… | |
… | |
906 | (C<uint8_t>) and also have an aligned version (without the C<_u> suffix), |
916 | (C<uint8_t>) and also have an aligned version (without the C<_u> suffix), |
907 | all of which hopefully makes them more useful in generic code. |
917 | all of which hopefully makes them more useful in generic code. |
908 | |
918 | |
909 | =back |
919 | =back |
910 | |
920 | |
|
|
921 | =head2 FAST INTEGER TO STRING |
|
|
922 | |
|
|
923 | Libecb defines a set of very fast integer to decimal string (or integer |
|
|
924 | to ascii, short C<i2a>) functions. These work by converting the integer |
|
|
925 | to a fixed point representation and then successively multiplying out |
|
|
926 | the topmost digits. Unlike some other, also very fast, libraries, ecb's |
|
|
927 | algorithm should be completely branchless per digit, and does not rely on |
|
|
928 | the presence of special cpu functions (such as clz). |
|
|
929 | |
|
|
930 | There is a high level API that takes an C<int32_t>, C<uint32_t>, |
|
|
931 | C<int64_t> or C<uint64_t> as argument, and a low-level API, which is |
|
|
932 | harder to use but supports slightly more formatting options. |
|
|
933 | |
|
|
934 | =head3 HIGH LEVEL API |
|
|
935 | |
|
|
936 | The high level API consists of four functions, one each for C<int32_t>, |
|
|
937 | C<uint32_t>, C<int64_t> and C<uint64_t>: |
|
|
938 | |
|
|
939 | =over |
|
|
940 | |
|
|
941 | =item ECB_I2A_I32_DIGITS (=11) |
|
|
942 | |
|
|
943 | =item char *ecb_i2a_u32 (char *ptr, uint32_t value) |
|
|
944 | |
|
|
945 | Takes an C<uint32_t> I<value> and formats it as a decimal number starting |
|
|
946 | at I<ptr>, using at most C<ECB_I2A_I32_DIGITS> characters. Returns a |
|
|
947 | pointer to just after the generated string, where you would normally put |
|
|
948 | the terminating C<0> character. This function outputs the minimum number |
|
|
949 | of digits. |
|
|
950 | |
|
|
951 | =item ECB_I2A_U32_DIGITS (=10) |
|
|
952 | |
|
|
953 | =item char *ecb_i2a_i32 (char *ptr, int32_t value) |
|
|
954 | |
|
|
955 | Same as C<ecb_i2a_u32>, but formats a C<int32_t> value, including a minus |
|
|
956 | sign if needed. |
|
|
957 | |
|
|
958 | =item ECB_I2A_I64_DIGITS (=20) |
|
|
959 | |
|
|
960 | =item char *ecb_i2a_u64 (char *ptr, uint64_t value) |
|
|
961 | |
|
|
962 | =item ECB_I2A_U64_DIGITS (=21) |
|
|
963 | |
|
|
964 | =item char *ecb_i2a_i64 (char *ptr, int64_t value) |
|
|
965 | |
|
|
966 | Similar to their 32 bit counterparts, these take a 64 bit argument. |
|
|
967 | |
|
|
968 | =item ECB_I2A_DIGITS (=21) |
|
|
969 | |
|
|
970 | Instead of using a type specific length macro, you can just use |
|
|
971 | C<ECB_I2A_DIGITS>, which is good enough for any C<ecb_i2a> function. |
|
|
972 | |
|
|
973 | =back |
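The calling convention of the high level functions (write digits at I<ptr>, return a pointer just past them) makes it easy to chain calls. The following sketch illustrates this with a hypothetical stand-in, C<sketch_i2a_u32>, which follows the same convention but uses plain division instead of libecb's algorithm, so that the example compiles without F<ecb.h>. C<SKETCH_I2A_U32_DIGITS> and C<sketch_format_size> are likewise assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical buffer size constant: 4294967295 has ten digits. */
#define SKETCH_I2A_U32_DIGITS 10

/* Stand-in with the same calling convention as ecb_i2a_u32, implemented
   with plain division rather than libecb's fixed-point method. */
static char *
sketch_i2a_u32 (char *ptr, uint32_t value)
{
  char tmp[SKETCH_I2A_U32_DIGITS];
  int n = 0;

  do
    tmp[n++] = (char)('0' + value % 10); /* least significant digit first */
  while (value /= 10);

  while (n)
    *ptr++ = tmp[--n];                   /* copy out in reading order */

  return ptr;                            /* caller stores the terminator here */
}

/* Build "<width>x<height>" by chaining the returned pointers.
   buf must hold at least 2 * SKETCH_I2A_U32_DIGITS + 2 characters. */
static void
sketch_format_size (char *buf, uint32_t width, uint32_t height)
{
  char *p = sketch_i2a_u32 (buf, width);

  *p++ = 'x';
  p = sketch_i2a_u32 (p, height);
  *p = 0;                                /* terminating 0 character */
}
```

Because each call returns the position after its last digit, no separate length calculation is needed between the pieces.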
|
|
974 | |
|
|
975 | =head3 LOW-LEVEL API |
|
|
976 | |
|
|
977 | The functions above use a number of low-level APIs which have some strict |
|
|
978 | limitations, but can be used as building blocks (study of C<ecb_i2a_i32> |
|
|
979 | and related functions is recommended). |
|
|
980 | |
|
|
981 | There are three families of functions: functions that convert a number |
|
|
982 | to a fixed number of digits with leading zeroes (C<ecb_i2a_0N>, C<0> |
|
|
983 | for "leading zeroes"), functions that generate up to N digits, skipping |
|
|
984 | leading zeroes (C<_N>), and functions that can generate more digits, but |
|
|
985 | the leading digit has limited range (C<_xN>). |
|
|
986 | |
|
|
987 | None of the functions deal with negative numbers. |
|
|
988 | |
|
|
989 | =over |
|
|
990 | |
|
|
991 | =item char *ecb_i2a_02 (char *ptr, uint32_t value) // 32 bit |
|
|
992 | |
|
|
993 | =item char *ecb_i2a_03 (char *ptr, uint32_t value) // 32 bit |
|
|
994 | |
|
|
995 | =item char *ecb_i2a_04 (char *ptr, uint32_t value) // 32 bit |
|
|
996 | |
|
|
997 | =item char *ecb_i2a_05 (char *ptr, uint32_t value) // 64 bit |
|
|
998 | |
|
|
999 | =item char *ecb_i2a_06 (char *ptr, uint32_t value) // 64 bit |
|
|
1000 | |
|
|
1001 | =item char *ecb_i2a_07 (char *ptr, uint32_t value) // 64 bit |
|
|
1002 | |
|
|
1003 | =item char *ecb_i2a_08 (char *ptr, uint32_t value) // 64 bit |
|
|
1004 | |
|
|
1005 | =item char *ecb_i2a_09 (char *ptr, uint32_t value) // 64 bit |
|
|
1006 | |
|
|
1007 | The C<< ecb_i2a_0I<N> >> functions take an unsigned I<value> and convert |
|
|
1008 | it to exactly I<N> digits, returning a pointer to the first character |
|
|
1009 | after the digits. The I<value> must be in range. The functions marked with |
|
|
1010 | I<32 bit> do their calculations internally in 32 bit, the ones marked with |
|
|
1011 | I<64 bit> internally use 64 bit integers, which might be slow on 32 bit |
|
|
1012 | architectures (the high level API decides on 32 vs. 64 bit versions using |
|
|
1013 | C<ECB_64BIT_NATIVE>). |
|
|
1014 | |
|
|
1015 | =item char *ecb_i2a_2 (char *ptr, uint32_t value) // 32 bit |
|
|
1016 | |
|
|
1017 | =item char *ecb_i2a_3 (char *ptr, uint32_t value) // 32 bit |
|
|
1018 | |
|
|
1019 | =item char *ecb_i2a_4 (char *ptr, uint32_t value) // 32 bit |
|
|
1020 | |
|
|
1021 | =item char *ecb_i2a_5 (char *ptr, uint32_t value) // 64 bit |
|
|
1022 | |
|
|
1023 | =item char *ecb_i2a_6 (char *ptr, uint32_t value) // 64 bit |
|
|
1024 | |
|
|
1025 | =item char *ecb_i2a_7 (char *ptr, uint32_t value) // 64 bit |
|
|
1026 | |
|
|
1027 | =item char *ecb_i2a_8 (char *ptr, uint32_t value) // 64 bit |
|
|
1028 | |
|
|
1029 | =item char *ecb_i2a_9 (char *ptr, uint32_t value) // 64 bit |
|
|
1030 | |
|
|
1031 | Similarly, the C<< ecb_i2a_I<N> >> functions take an unsigned I<value> |
|
|
1032 | and convert it to at most I<N> digits, suppressing leading zeroes, and |
|
|
1033 | returning a pointer to the first character after the digits. |
|
|
1034 | |
|
|
1035 | =item ECB_I2A_MAX_X5 (=59074) |
|
|
1036 | |
|
|
1037 | =item char *ecb_i2a_x5 (char *ptr, uint32_t value) // 32 bit |
|
|
1038 | |
|
|
1039 | =item ECB_I2A_MAX_X10 (=2932500665) |
|
|
1040 | |
|
|
1041 | =item char *ecb_i2a_x10 (char *ptr, uint32_t value) // 64 bit |
|
|
1042 | |
|
|
1043 | The C<< ecb_i2a_xI<N> >> functions are similar to the C<< ecb_i2a_I<N> >> |
|
|
1044 | functions, but they can generate one digit more, as long as the number |
|
|
1045 | is within range, which is given by the symbols C<ECB_I2A_MAX_X5> (almost |
|
|
1046 | 16 bit range) and C<ECB_I2A_MAX_X10> (a bit more than 31 bit range), |
|
|
1047 | respectively. |
|
|
1048 | |
|
|
1049 | For example, the digit part of a 32 bit signed integer just fits into the |
|
|
1050 | C<ECB_I2A_MAX_X10> range, so while C<ecb_i2a_x10> cannot convert a 10 |
|
|
1051 | digit number, it can convert all 32 bit signed numbers. Sadly, it's not |
|
|
1052 | good enough for 32 bit unsigned numbers. |
|
|
1053 | |
|
|
1054 | =back |
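The general idea described for these functions (convert the integer to a fixed point representation, then successively multiply out the topmost digit) can be sketched for the four digit, leading zero case. This is a minimal sketch, not libecb's actual code: C<sketch_i2a_04> is a hypothetical name, and the choice of 26 fraction bits with a rounded-up scale factor is an assumption that happens to be exact for all values from 0 to 9999.

```c
#include <stdint.h>

/* Write exactly four decimal digits of value (0..9999) at ptr and return
   a pointer just past them. Hypothetical helper, not part of libecb. */
static char *
sketch_i2a_04 (char *ptr, uint32_t value)
{
  enum { BITS = 26 };                      /* fraction bits of the fixed point format */
  const uint32_t mask = (UINT32_C (1) << BITS) - 1;

  /* value / 1000 as a fixed point number; the scale factor
     ceil (2^26 / 1000) = 67109 is rounded up so digits never come out low */
  uint32_t x = value * ((mask + 1000) / 1000);
  int i;

  for (i = 0; i < 4; ++i)                  /* fixed trip count, no data dependent branch */
    {
      *ptr++ = (char)('0' + (x >> BITS)); /* integer part is the next digit */
      x = (x & mask) * 10;                 /* drop it and move the next digit up */
    }

  return ptr;
}
```

The largest intermediate value is below 2^32, so 32 bit arithmetic suffices here; more digits need more bits, which matches the 32 bit and 64 bit split noted for the functions above.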
|
|
1055 | |
911 | =head2 FLOATING POINT FIDDLING |
1056 | =head2 FLOATING POINT FIDDLING |
912 | |
1057 | |
913 | =over 4 |
1058 | =over |
914 | |
1059 | |
915 | =item ECB_INFINITY [-UECB_NO_LIBM] |
1060 | =item ECB_INFINITY [-UECB_NO_LIBM] |
916 | |
1061 | |
917 | Evaluates to positive infinity if supported by the platform, otherwise to |
1062 | Evaluates to positive infinity if supported by the platform, otherwise to |
918 | a truly huge number. |
1063 | a truly huge number. |
… | |
… | |
996 | |
1141 | |
997 | =back |
1142 | =back |
998 | |
1143 | |
999 | =head2 ARITHMETIC |
1144 | =head2 ARITHMETIC |
1000 | |
1145 | |
1001 | =over 4 |
1146 | =over |
1002 | |
1147 | |
1003 | =item x = ecb_mod (m, n) |
1148 | =item x = ecb_mod (m, n) |
1004 | |
1149 | |
1005 | Returns C<m> modulo C<n>, which is the same as the positive remainder |
1150 | Returns C<m> modulo C<n>, which is the same as the positive remainder |
1006 | of the division operation between C<m> and C<n>, using floored |
1151 | of the division operation between C<m> and C<n>, using floored |
… | |
… | |
1013 | C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be |
1158 | C<n> must be strictly positive (i.e. C<< >= 1 >>), while C<m> must be |
1014 | negatable, that is, both C<m> and C<-m> must be representable in its |
1159 | negatable, that is, both C<m> and C<-m> must be representable in its |
1015 | type (this typically excludes the minimum signed integer value, the same |
1160 | type (this typically excludes the minimum signed integer value, the same |
1016 | limitation as for C</> and C<%> in C). |
1161 | limitation as for C</> and C<%> in C). |
1017 | |
1162 | |
1018 | Current GCC versions compile this into an efficient branchless sequence on |
1163 | Current GCC/clang versions compile this into an efficient branchless |
1019 | almost all CPUs. |
1164 | sequence on almost all CPUs. |
1020 | |
1165 | |
1021 | For example, when you want to rotate forward through the members of an |
1166 | For example, when you want to rotate forward through the members of an |
1022 | array for increasing C<m> (which might be negative), then you should use |
1167 | array for increasing C<m> (which might be negative), then you should use |
1023 | C<ecb_mod>, as the C<%> operator might give either negative results, or |
1168 | C<ecb_mod>, as the C<%> operator might give either negative results, or |
1024 | change direction for negative values: |
1169 | change direction for negative values: |
… | |
… | |
1037 | |
1182 | |
1038 | =back |
1183 | =back |
1039 | |
1184 | |
1040 | =head2 UTILITY |
1185 | =head2 UTILITY |
1041 | |
1186 | |
1042 | =over 4 |
1187 | =over |
1043 | |
1188 | |
1044 | =item element_count = ecb_array_length (name) |
1189 | =item element_count = ecb_array_length (name) |
1045 | |
1190 | |
1046 | Returns the number of elements in the array C<name>. For example: |
1191 | Returns the number of elements in the array C<name>. For example: |
1047 | |
1192 | |
… | |
… | |
1055 | |
1200 | |
1056 | =head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF |
1201 | =head2 SYMBOLS GOVERNING COMPILATION OF ECB.H ITSELF |
1057 | |
1202 | |
1058 | These symbols need to be defined before including F<ecb.h> the first time. |
1203 | These symbols need to be defined before including F<ecb.h> the first time. |
1059 | |
1204 | |
1060 | =over 4 |
1205 | =over |
1061 | |
1206 | |
1062 | =item ECB_NO_THREADS |
1207 | =item ECB_NO_THREADS |
1063 | |
1208 | |
1064 | If F<ecb.h> is never used from multiple threads, then this symbol can |
1209 | If F<ecb.h> is never used from multiple threads, then this symbol can |
1065 | be defined, in which case memory fences (and similar constructs) are |
1210 | be defined, in which case memory fences (and similar constructs) are |