… | |
… | |
78 | |
78 | |
79 | Ports allow you to register C<rcv> handlers that can match all or just |
79 | Ports allow you to register C<rcv> handlers that can match all or just |
80 | some messages. Messages sent to ports will not be queued, regardless of |
80 | some messages. Messages sent to ports will not be queued, regardless of |
81 | whether anything was listening for them or not. |
81 | whether anything was listening for them or not. |
82 | |
82 | |
|
|
83 | Ports are represented by (printable) strings called "port IDs". |
|
|
84 | |
83 | =item port ID - C<nodeid#portname> |
85 | =item port ID - C<nodeid#portname> |
84 | |
86 | |
85 | A port ID is the concatenation of a node ID, a hash-mark (C<#>) as |
87 | A port ID is the concatenation of a node ID, a hash-mark (C<#>) as |
86 | separator, and a port name (a printable string of unspecified format). |
88 | separator, and a port name (a printable string of unspecified format). |
87 | |
89 | |
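The port ID format described above can be illustrated with a small sketch. `split_port_id` is a hypothetical helper, not part of AnyEvent::MP; it only assumes the documented `nodeid#portname` layout and uses the same `split /#/, ..., 2` idiom that `spawn` uses further down in this diff.

```perl
use strict;
use warnings;

# Hypothetical helper (not an AnyEvent::MP function): split a port ID of the
# form "nodeid#portname" at the first hash-mark. The limit of 2 keeps any
# further hash-marks inside the port name.
sub split_port_id {
    my ($port_id) = @_;
    my ($node_id, $port_name) = split /#/, $port_id, 2;
    return ($node_id, $port_name);
}

my ($node, $name) = split_port_id("host.example:4040#worker.1");
print "node=$node port=$name\n";
```

Splitting at the first hash-mark is safe under the documented node ID character class, which does not include `#`.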
… | |
… | |
91 | which enables nodes to manage each other remotely, and to create new |
93 | which enables nodes to manage each other remotely, and to create new |
92 | ports. |
94 | ports. |
93 | |
95 | |
94 | Nodes are either public (have one or more listening ports) or private |
96 | Nodes are either public (have one or more listening ports) or private |
95 | (no listening ports). Private nodes cannot talk to other private nodes |
97 | (no listening ports). Private nodes cannot talk to other private nodes |
96 | currently. |
98 | currently, but all nodes can talk to public nodes. |
|
|
99 | |
|
|
100 | Nodes are represented by (printable) strings called "node IDs". |
97 | |
101 | |
98 | =item node ID - C<[A-Za-z0-9_\-.:]*> |
102 | =item node ID - C<[A-Za-z0-9_\-.:]*> |
99 | |
103 | |
100 | A node ID is a string that uniquely identifies the node within a |
104 | A node ID is a string that uniquely identifies the node within a |
101 | network. Depending on the configuration used, node IDs can look like a |
105 | network. Depending on the configuration used, node IDs can look like a |
102 | hostname, a hostname and a port, or a random string. AnyEvent::MP itself |
106 | hostname, a hostname and a port, or a random string. AnyEvent::MP itself |
103 | doesn't interpret node IDs in any way. |
107 | doesn't interpret node IDs in any way except to uniquely identify a node. |
104 | |
108 | |
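As a rough illustration, the character class from the item heading can be turned into a check. This is a sketch that takes the heading pattern literally; `is_node_id` is a hypothetical name, and the module itself may accept IDs this check rejects (for example, IDs containing a slash such as `anon/`, which appear elsewhere in this document).

```perl
use strict;
use warnings;

# Hypothetical check (not an AnyEvent::MP function): test a string against
# the character class given in the heading, anchored at both ends.
# Note that the "*" quantifier also lets the empty string pass.
sub is_node_id {
    my ($id) = @_;
    return $id =~ /\A[A-Za-z0-9_\-.:]*\z/ ? 1 : 0;
}

for my $candidate ("host.example:4040", "node_1", "bad id", "a#b") {
    printf "%-20s %s\n", "'$candidate'", is_node_id($candidate) ? "ok" : "rejected";
}
```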
105 | =item binds - C<ip:port> |
109 | =item binds - C<ip:port> |
106 | |
110 | |
107 | Nodes can only talk to each other by creating some kind of connection to |
111 | Nodes can only talk to each other by creating some kind of connection to |
108 | each other. To do this, nodes should listen on one or more local transport |
112 | each other. To do this, nodes should listen on one or more local transport |
|
|
113 | endpoints - binds. |
|
|
114 | |
109 | endpoints - binds. Currently, only standard C<ip:port> specifications can |
115 | Currently, only standard C<ip:port> specifications can be used, which |
110 | be used, which specify TCP ports to listen on. |
116 | specify TCP ports to listen on. So a bind is basically just a TCP socket |
|
|
117 | in listening mode that accepts connections from other nodes. |
111 | |
118 | |
112 | =item seed nodes |
119 | =item seed nodes |
113 | |
120 | |
114 | When a node starts, it knows nothing about the network. To teach the node |
121 | When a node starts, it knows nothing about the network it is in - it |
115 | about the network it first has to contact some other node within the |
122 | needs to connect to at least one other node that is already in the |
116 | network. This node is called a seed. |
123 | network. These other nodes are called "seed nodes". |
117 | |
124 | |
118 | Apart from the fact that other nodes know them as seed nodes and they have |
125 | Seed nodes themselves are not special - they are seed nodes only because |
119 | to have fixed listening addresses, seed nodes are perfectly normal nodes - |
126 | some other node I<uses> them as such, but any node can be used as seed |
120 | any node can function as a seed node for others. |
127 | node for other nodes, and each node can use a different set of seed nodes. |
121 | |
128 | |
122 | In addition to discovering the network, seed nodes are also used to |
129 | In addition to discovering the network, seed nodes are also used to |
123 | maintain the network and to connect nodes that otherwise would have |
130 | maintain the network - all nodes using the same seed node are part of |
124 | trouble connecting. They form the backbone of an AnyEvent::MP network. |
131 | the same network. If a network is split into multiple subnets because e.g. |
|
|
132 | the network link between the parts goes down, then using the same seed |
|
|
133 | nodes for all nodes ensures that eventually the subnets get merged again. |
125 | |
134 | |
126 | Seed nodes are expected to be long-running, and at least one seed node |
135 | Seed nodes are expected to be long-running, and at least one seed node |
127 | should always be available. They should also be relatively responsive - a |
136 | should always be available. They should also be relatively responsive - a |
128 | seed node that blocks for long periods will slow down everybody else. |
137 | seed node that blocks for long periods will slow down everybody else. |
129 | |
138 | |
|
|
139 | For small networks, it's best if every node uses the same set of seed |
|
|
140 | nodes. For large networks, it can be useful to specify "regional" seed |
|
|
141 | nodes for most nodes in an area, and use all seed nodes as seed nodes for |
|
|
142 | each other. What's important is that all seed node connections form a |
|
|
143 | complete graph, so that the network cannot split into separate subnets |
|
|
144 | forever. |
|
|
145 | |
|
|
146 | Seed nodes are represented by seed IDs. |
|
|
147 | |
130 | =item seeds - C<host:port> |
148 | =item seed IDs - C<host:port> |
131 | |
149 | |
132 | Seeds are transport endpoint(s) (usually a hostname/IP address and a |
150 | Seed IDs are transport endpoint(s) (usually a hostname/IP address and a |
133 | TCP port) of nodes that should be used as seed nodes. |
151 | TCP port) of nodes that should be used as seed nodes. |
134 | |
152 | |
135 | The nodes listening on those endpoints are expected to be long-running, |
153 | =item global nodes |
136 | and at least one of those should always be available. When nodes run out |
154 | |
137 | of connections (e.g. due to a network error), they try to re-establish |
155 | An AEMP network needs a discovery service - nodes need to know how to |
138 | connections to some seednodes again to join the network. |
156 | connect to other nodes they only know by name. In addition, AEMP offers a |
|
|
157 | distributed "group database", which maps group names to a list of strings |
|
|
158 | - for example, to register worker ports. |
|
|
159 | |
|
|
160 | A network needs at least one global node to work, and allows every node to |
|
|
161 | be a global node. |
|
|
162 | |
|
|
163 | Any node that loads the L<AnyEvent::MP::Global> module becomes a global |
|
|
164 | node and tries to keep connections to all other nodes. So while it can |
|
|
165 | make sense to make every node "global" in small networks, it usually makes |
|
|
166 | sense to only make seed nodes into global nodes in large networks (nodes |
|
|
167 | keep connections to seed nodes and global nodes, so making them the same |
|
|
168 | reduces overhead). |
139 | |
169 | |
140 | =back |
170 | =back |
141 | |
171 | |
142 | =head1 VARIABLES/FUNCTIONS |
172 | =head1 VARIABLES/FUNCTIONS |
143 | |
173 | |
… | |
… | |
145 | |
175 | |
146 | =cut |
176 | =cut |
147 | |
177 | |
148 | package AnyEvent::MP; |
178 | package AnyEvent::MP; |
149 | |
179 | |
|
|
180 | use AnyEvent::MP::Config (); |
150 | use AnyEvent::MP::Kernel; |
181 | use AnyEvent::MP::Kernel; |
|
|
182 | use AnyEvent::MP::Kernel qw(%NODE %PORT %PORT_DATA $UNIQ $RUNIQ $ID); |
151 | |
183 | |
152 | use common::sense; |
184 | use common::sense; |
153 | |
185 | |
154 | use Carp (); |
186 | use Carp (); |
155 | |
187 | |
156 | use AE (); |
188 | use AE (); |
157 | |
189 | |
158 | use base "Exporter"; |
190 | use base "Exporter"; |
159 | |
191 | |
160 | our $VERSION = '1.30'; |
192 | our $VERSION = $AnyEvent::MP::Config::VERSION; |
161 | |
193 | |
162 | our @EXPORT = qw( |
194 | our @EXPORT = qw( |
163 | NODE $NODE *SELF node_of after |
195 | NODE $NODE *SELF node_of after |
164 | configure |
196 | configure |
165 | snd rcv mon mon_guard kil psub peval spawn cal |
197 | snd rcv mon mon_guard kil psub peval spawn cal |
… | |
… | |
191 | Before a node can talk to other nodes on the network (i.e. enter |
223 | Before a node can talk to other nodes on the network (i.e. enter |
192 | "distributed mode") it has to configure itself - the minimum a node needs |
224 | "distributed mode") it has to configure itself - the minimum a node needs |
193 | to know is its own name, and optionally it should know the addresses of |
225 | to know is its own name, and optionally it should know the addresses of |
194 | some other nodes in the network to discover other nodes. |
226 | some other nodes in the network to discover other nodes. |
195 | |
227 | |
196 | The key/value pairs are basically the same ones as documented for the |
|
|
197 | F<aemp> command line utility (sans the set/del prefix). |
|
|
198 | |
|
|
199 | This function configures a node - it must be called exactly once (or |
228 | This function configures a node - it must be called exactly once (or |
200 | never) before calling other AnyEvent::MP functions. |
229 | never) before calling other AnyEvent::MP functions. |
|
|
230 | |
|
|
231 | The key/value pairs are basically the same ones as documented for the |
|
|
232 | F<aemp> command line utility (sans the set/del prefix), with two additions: |
|
|
233 | |
|
|
234 | =over 4 |
|
|
235 | |
|
|
236 | =item norc => $boolean (default false) |
|
|
237 | |
|
|
238 | If true, then the rc file (e.g. F<~/.perl-anyevent-mp>) will I<not> |
|
|
239 | be consulted - all configuration options must be specified in the |
|
|
240 | C<configure> call. |
|
|
241 | |
|
|
242 | =item force => $boolean (default false) |
|
|
243 | |
|
|
244 | If true, then the values specified in the C<configure> call will take |
|
|
245 | precedence over any values configured via the rc file. The default is for |
|
|
246 | the rc file to override any options specified in the program. |
|
|
247 | |
|
|
248 | =back |
201 | |
249 | |
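The precedence rules for `norc` and `force` described above can be modelled with plain hashes. This is only a sketch of the documented behaviour, not the module's implementation: `merge_config` is a hypothetical function, and the rc file contents are passed in as an already-parsed hash rather than read from disk.

```perl
use strict;
use warnings;

# Hypothetical model (not AnyEvent::MP code) of the documented precedence:
#   norc  => true : the rc values are ignored entirely
#   force => true : configure arguments override rc values
#   default       : rc values override configure arguments
sub merge_config {
    my ($args, $rc) = @_;          # two hash references
    my %args  = %$args;
    my $norc  = delete $args{norc};
    my $force = delete $args{force};

    return %args if $norc;
    # in a flattened key/value list, later pairs win on hash assignment
    return $force ? (%$rc, %args) : (%args, %$rc);
}

my %cfg = merge_config({ binds => "*:4040", force => 1 }, { binds => "*:5050" });
print "binds = $cfg{binds}\n";
```

Without `force`, the same call would yield the rc value for `binds`, which matches the statement that values given to `configure` only act as defaults.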
202 | =over 4 |
250 | =over 4 |
203 | |
251 | |
204 | =item step 1, gathering configuration from profiles |
252 | =item step 1, gathering configuration from profiles |
205 | |
253 | |
… | |
… | |
219 | That means that the values specified in the profile have highest priority |
267 | That means that the values specified in the profile have highest priority |
220 | and the values specified directly via C<configure> have lowest priority, |
268 | and the values specified directly via C<configure> have lowest priority, |
221 | and can only be used to specify defaults. |
269 | and can only be used to specify defaults. |
222 | |
270 | |
223 | If the profile specifies a node ID, then this will become the node ID of |
271 | If the profile specifies a node ID, then this will become the node ID of |
224 | this process. If not, then the profile name will be used as node ID. The |
272 | this process. If not, then the profile name will be used as node ID, with |
225 | special node ID of C<anon/> will be replaced by a random node ID. |
273 | a slash (C</>) attached. |
|
|
274 | |
|
|
275 | If the node ID (or profile name) ends with a slash (C</>), then a random |
|
|
276 | string is appended to make it unique. |
226 | |
277 | |
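The node ID rules in step 1 can be sketched as follows. `resolve_node_id` is a hypothetical function; the suffix alphabet and its length of 12 characters are assumptions made for illustration, since the text only says that "a random string" is appended.

```perl
use strict;
use warnings;

# Hypothetical sketch (not AnyEvent::MP code) of the documented rules:
#  - no node ID in the profile: use the profile name with a slash attached
#  - node ID ending in a slash: append a random string to make it unique
sub resolve_node_id {
    my ($nodeid, $profile) = @_;

    $nodeid = "$profile/" unless defined $nodeid && length $nodeid;

    if ($nodeid =~ m{/\z}) {
        # alphabet and length are assumptions, not taken from the module
        my @chars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9');
        $nodeid .= join "", map { $chars[rand @chars] } 1 .. 12;
    }

    return $nodeid;
}

print resolve_node_id("anon/", "seed"), "\n";   # anon/ plus a random suffix
print resolve_node_id(undef,   "seed"), "\n";   # seed/ plus a random suffix
print resolve_node_id("fixed", "seed"), "\n";   # unchanged
```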
227 | =item step 2, bind listener sockets |
278 | =item step 2, bind listener sockets |
228 | |
279 | |
229 | The next step is to look up the binds in the profile, followed by binding |
280 | The next step is to look up the binds in the profile, followed by binding |
230 | aemp protocol listeners on all binds specified (it is possible and valid |
281 | aemp protocol listeners on all binds specified (it is possible and valid |
… | |
… | |
236 | used, meaning the node will bind on a dynamically-assigned port on every |
287 | used, meaning the node will bind on a dynamically-assigned port on every |
237 | local IP address it finds. |
288 | local IP address it finds. |
238 | |
289 | |
239 | =item step 3, connect to seed nodes |
290 | =item step 3, connect to seed nodes |
240 | |
291 | |
241 | As the last step, the seeds list from the profile is passed to the |
292 | As the last step, the seed ID list from the profile is passed to the |
242 | L<AnyEvent::MP::Global> module, which will then use it to keep |
293 | L<AnyEvent::MP::Global> module, which will then use it to keep |
243 | connectivity with at least one node at any point in time. |
294 | connectivity with at least one node at any point in time. |
244 | |
295 | |
245 | =back |
296 | =back |
246 | |
297 | |
… | |
… | |
252 | Example: become an anonymous node. This form is often used for commandline |
303 | Example: become an anonymous node. This form is often used for commandline |
253 | clients. |
304 | clients. |
254 | |
305 | |
255 | configure nodeid => "anon/"; |
306 | configure nodeid => "anon/"; |
256 | |
307 | |
257 | Example: configure a node using a profile called seed, which si suitable |
308 | Example: configure a node using a profile called seed, which is suitable |
258 | for a seed node as it binds on all local addresses on a fixed port (4040, |
309 | for a seed node as it binds on all local addresses on a fixed port (4040, |
259 | customary for aemp). |
310 | customary for aemp). |
260 | |
311 | |
261 | # use the aemp commandline utility |
312 | # use the aemp commandline utility |
262 | # aemp profile seed nodeid anon/ binds '*:4040' |
313 | # aemp profile seed binds '*:4040' |
263 | |
314 | |
264 | # then use it |
315 | # then use it |
265 | configure profile => "seed"; |
316 | configure profile => "seed"; |
266 | |
317 | |
267 | # or simply use aemp from the shell again: |
318 | # or simply use aemp from the shell again: |
… | |
… | |
337 | sub _kilme { |
388 | sub _kilme { |
338 | die "received message on port without callback"; |
389 | die "received message on port without callback"; |
339 | } |
390 | } |
340 | |
391 | |
341 | sub port(;&) { |
392 | sub port(;&) { |
342 | my $id = "$UNIQ." . $ID++; |
393 | my $id = "$UNIQ." . ++$ID; |
343 | my $port = "$NODE#$id"; |
394 | my $port = "$NODE#$id"; |
344 | |
395 | |
345 | rcv $port, shift || \&_kilme; |
396 | rcv $port, shift || \&_kilme; |
346 | |
397 | |
347 | $port |
398 | $port |
… | |
… | |
734 | } |
785 | } |
735 | |
786 | |
736 | sub spawn(@) { |
787 | sub spawn(@) { |
737 | my ($nodeid, undef) = split /#/, shift, 2; |
788 | my ($nodeid, undef) = split /#/, shift, 2; |
738 | |
789 | |
739 | my $id = "$RUNIQ." . $ID++; |
790 | my $id = "$RUNIQ." . ++$ID; |
740 | |
791 | |
741 | $_[0] =~ /::/ |
792 | $_[0] =~ /::/ |
742 | or Carp::croak "spawn init function must be a fully-qualified name, caught"; |
793 | or Carp::croak "spawn init function must be a fully-qualified name, caught"; |
743 | |
794 | |
744 | snd_to_func $nodeid, "AnyEvent::MP::_spawn" => $id, @_; |
795 | snd_to_func $nodeid, "AnyEvent::MP::_spawn" => $id, @_; |
745 | |
796 | |
746 | "$nodeid#$id" |
797 | "$nodeid#$id" |
747 | } |
798 | } |
|
|
799 | |
748 | |
800 | |
749 | =item after $timeout, @msg |
801 | =item after $timeout, @msg |
750 | |
802 | |
751 | =item after $timeout, $callback |
803 | =item after $timeout, $callback |
752 | |
804 | |
… | |
… | |
862 | ports being the special case/exception, where transport errors cannot |
914 | ports being the special case/exception, where transport errors cannot |
863 | occur. |
915 | occur. |
864 | |
916 | |
865 | =item * Erlang uses processes and a mailbox, AEMP does not queue. |
917 | =item * Erlang uses processes and a mailbox, AEMP does not queue. |
866 | |
918 | |
867 | Erlang uses processes that selectively receive messages, and therefore |
919 | Erlang uses processes that selectively receive messages out of order, and |
868 | needs a queue. AEMP is event based, queuing messages would serve no |
920 | therefore needs a queue. AEMP is event based, queuing messages would serve |
869 | useful purpose. For the same reason the pattern-matching abilities of |
921 | no useful purpose. For the same reason the pattern-matching abilities |
870 | AnyEvent::MP are more limited, as there is little need to be able to |
922 | of AnyEvent::MP are more limited, as there is little need to be able to |
871 | filter messages without dequeuing them. |
923 | filter messages without dequeuing them. |
872 | |
924 | |
873 | (But see L<Coro::MP> for a more Erlang-like process model on top of AEMP). |
925 | This is not a philosophical difference, but simply stems from AnyEvent::MP |
|
|
926 | being event-based, while Erlang is process-based. |
|
|
927 | |
|
|
928 | You can have a look at L<Coro::MP> for a more Erlang-like process model on |
|
|
929 | top of AEMP and Coro threads. |
874 | |
930 | |
875 | =item * Erlang sends are synchronous, AEMP sends are asynchronous. |
931 | =item * Erlang sends are synchronous, AEMP sends are asynchronous. |
876 | |
932 | |
877 | Sending messages in Erlang is synchronous and blocks the process (and |
933 | Sending messages in Erlang is synchronous and blocks the process until |
|
|
934 | a connection has been established and the message sent (and so does not |
878 | so does not need a queue that can overflow). AEMP sends are immediate, |
935 | need a queue that can overflow). AEMP sends return immediately, connection |
879 | connection establishment is handled in the background. |
936 | establishment is handled in the background. |
880 | |
937 | |
881 | =item * Erlang suffers from silent message loss, AEMP does not. |
938 | =item * Erlang suffers from silent message loss, AEMP does not. |
882 | |
939 | |
883 | Erlang implements few guarantees on message delivery - messages can get |
940 | Erlang implements few guarantees on message delivery - messages can get |
884 | lost without any of the processes realising it (i.e. you send messages a, |
941 | lost without any of the processes realising it (i.e. you send messages a, |
… | |
… | |
887 | AEMP guarantees (modulo hardware errors) correct ordering, and the |
944 | AEMP guarantees (modulo hardware errors) correct ordering, and the |
888 | guarantee that after one message is lost, all following ones sent to the |
945 | guarantee that after one message is lost, all following ones sent to the |
889 | same port are lost as well, until monitoring raises an error, so there are |
946 | same port are lost as well, until monitoring raises an error, so there are |
890 | no silent "holes" in the message sequence. |
947 | no silent "holes" in the message sequence. |
891 | |
948 | |
|
|
949 | If you want your software to be very reliable, you have to cope with |
|
|
950 | corrupted and even out-of-order messages in both Erlang and AEMP. AEMP |
|
|
951 | simply tries to work better in common error cases, such as when a network |
|
|
952 | link goes down. |
|
|
953 | |
892 | =item * Erlang can send messages to the wrong port, AEMP does not. |
954 | =item * Erlang can send messages to the wrong port, AEMP does not. |
893 | |
955 | |
894 | In Erlang it is quite likely that a node that restarts reuses a process ID |
956 | In Erlang it is quite likely that a node that restarts reuses an Erlang |
895 | known to other nodes for a completely different process, causing messages |
957 | process ID known to other nodes for a completely different process, |
896 | destined for that process to end up in an unrelated process. |
958 | causing messages destined for that process to end up in an unrelated |
|
|
959 | process. |
897 | |
960 | |
898 | AEMP never reuses port IDs, so old messages or old port IDs floating |
961 | AEMP does not reuse port IDs, so old messages or old port IDs floating |
899 | around in the network will not be sent to an unrelated port. |
962 | around in the network will not be sent to an unrelated port. |
900 | |
963 | |
901 | =item * Erlang uses unprotected connections, AEMP uses secure |
964 | =item * Erlang uses unprotected connections, AEMP uses secure |
902 | authentication and can use TLS. |
965 | authentication and can use TLS. |
903 | |
966 | |
… | |
… | |
906 | |
969 | |
907 | =item * The AEMP protocol is optimised for both text-based and binary |
970 | =item * The AEMP protocol is optimised for both text-based and binary |
908 | communications. |
971 | communications. |
909 | |
972 | |
910 | The AEMP protocol, unlike the Erlang protocol, supports both programming |
973 | The AEMP protocol, unlike the Erlang protocol, supports both programming |
911 | language independent text-only protocols (good for debugging) and binary, |
974 | language independent text-only protocols (good for debugging), and binary, |
912 | language-specific serialisers (e.g. Storable). By default, unless TLS is |
975 | language-specific serialisers (e.g. Storable). By default, unless TLS is |
913 | used, the protocol is actually completely text-based. |
976 | used, the protocol is actually completely text-based. |
914 | |
977 | |
915 | It has also been carefully designed to be implementable in other languages |
978 | It has also been carefully designed to be implementable in other languages |
916 | with a minimum of work while gracefully degrading functionality to make the |
979 | with a minimum of work while gracefully degrading functionality to make the |
917 | protocol simple. |
980 | protocol simple. |
918 | |
981 | |
919 | =item * AEMP has more flexible monitoring options than Erlang. |
982 | =item * AEMP has more flexible monitoring options than Erlang. |
920 | |
983 | |
921 | In Erlang, you can choose to receive I<all> exit signals as messages |
984 | In Erlang, you can choose to receive I<all> exit signals as messages or |
922 | or I<none>, there is no in-between, so monitoring single processes is |
985 | I<none>, there is no in-between, so monitoring single Erlang processes is |
923 | difficult to implement. Monitoring in AEMP is more flexible than in |
986 | difficult to implement. |
924 | Erlang, as one can choose between automatic kill, exit message or callback |
987 | |
925 | on a per-process basis. |
988 | Monitoring in AEMP is more flexible than in Erlang, as one can choose |
|
|
989 | between automatic kill, exit message or callback on a per-port basis. |
926 | |
990 | |
927 | =item * Erlang tries to hide remote/local connections, AEMP does not. |
991 | =item * Erlang tries to hide remote/local connections, AEMP does not. |
928 | |
992 | |
929 | Monitoring in Erlang is not an indicator of process death/crashes, in the |
993 | Monitoring in Erlang is not an indicator of process death/crashes, in the |
930 | same way as linking is (except linking is unreliable in Erlang). |
994 | same way as linking is (except linking is unreliable in Erlang). |