Information Technology Division – JGroups
What is JGroups?
JGroups is a toolkit for reliable multicast communication. Note that this doesn’t necessarily mean IP Multicast; JGroups can also use transports such as TCP.
It can be used to create groups of processes whose members can send messages to each other. The main features include:
- Group creation and deletion. Group members can be spread across LANs or WANs.
- Joining and leaving of groups.
- Membership detection and notification about joined/left/crashed members.
- Detection and removal of crashed members.
- Sending and receiving of member-to-group messages (point-to-multipoint).
- Sending and receiving of member-to-member messages (point-to-point).
Flexible Protocol Stack
The most powerful feature of JGroups is its flexible protocol stack, which allows developers to adapt it to exactly match their application requirements and network characteristics.
The benefit of this is that you only pay for what you use. By mixing and matching protocols, various differing application requirements can be satisfied.
JGroups comes with a number of protocols (but anyone can write their own), for example:
- Transport protocols: UDP (IP Multicast), TCP, JMS.
- Fragmentation of large messages.
- Reliable unicast and multicast message transmission. Lost messages are retransmitted.
- Failure detection: crashed members are excluded from the membership.
- Ordering protocols: Atomic (all-or-none message delivery), FIFO, Causal, Total Order (sequencer or token based).
- Membership.
- Encryption.
Overview
JGroups is a reliable group communication toolkit written entirely in Java. It is based on IP multicast, but extends it with reliability and group membership.
Reliability includes (among other things):
- lossless transmission of a message to all recipients (with retransmission of missing messages),
- fragmentation of large messages into smaller ones and reassembly at the receiver’s side,
- ordering of messages, e.g. messages m1 and m2 sent by P will be received by all receivers in the same order, and not as m2, m1 (FIFO order),
- atomicity: a message will be received by all receivers, or none.
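The FIFO guarantee in the list above can be illustrated with sequence numbers. The following is a minimal sketch for a single sender, not JGroups code: the class `FifoReceiverSketch` and its methods are hypothetical names chosen for this illustration. The receiver holds back messages that arrive ahead of a gap and delivers them once the missing message shows up, for example after a retransmission.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of FIFO delivery for one sender; not part of JGroups.
final class FifoReceiverSketch {
    private long nextSeqno = 1;                                  // next sequence number to deliver
    private final Map<Long, String> heldBack = new HashMap<>();  // messages received ahead of a gap
    private final List<String> delivered = new ArrayList<>();

    // Called for every arriving message, in whatever order the network hands it over.
    void receive(long seqno, String payload) {
        if (seqno < nextSeqno || heldBack.containsKey(seqno)) {
            return;                                              // duplicate: drop it
        }
        heldBack.put(seqno, payload);
        // Deliver for as long as the next expected message is available.
        while (heldBack.containsKey(nextSeqno)) {
            delivered.add(heldBack.remove(nextSeqno));
            nextSeqno++;
        }
    }

    List<String> delivered() {
        return delivered;
    }

    public static void main(String[] args) {
        FifoReceiverSketch r = new FifoReceiverSketch();
        r.receive(2, "m2");   // arrives early and is held back
        r.receive(1, "m1");   // fills the gap, so both are delivered
        r.receive(1, "m1");   // duplicate, ignored
        System.out.println(r.delivered());   // prints [m1, m2]
    }
}
```

A real protocol would also have to ask the sender to retransmit the missing message; the sketch only shows the hold-back and in-order delivery step.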
Group Membership includes:
- knowledge of who the members of a group are and
- notification when a new member joins, an existing member leaves, or an existing member has crashed.
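As a rough illustration of membership notification, join and leave events can be derived by comparing two consecutive member lists. This is a sketch under stated assumptions, not the JGroups API: `MembershipDiffSketch` is a hypothetical helper, and members are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not JGroups API: derives join/leave events from two member lists.
final class MembershipDiffSketch {
    // Members present in newView but not in oldView.
    static List<String> joined(List<String> oldView, List<String> newView) {
        List<String> result = new ArrayList<>(newView);
        result.removeAll(oldView);
        return result;
    }

    // Members present in oldView but not in newView (they left or crashed).
    static List<String> left(List<String> oldView, List<String> newView) {
        List<String> result = new ArrayList<>(oldView);
        result.removeAll(newView);
        return result;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("A", "B", "C");
        List<String> after = Arrays.asList("A", "C", "D");
        System.out.println("joined: " + joined(before, after));   // prints joined: [D]
        System.out.println("left: " + left(before, after));       // prints left: [B]
    }
}
```

The comparison alone cannot tell a member that left from one that crashed; that distinction needs a failure detector.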
JGroups fits in as follows: UDP is unreliable and unicast, TCP is reliable and unicast, IP Multicast is unreliable and multicast, and JGroups is reliable and multicast.
In unicast communication, where one sender sends a message to one receiver, there are UDP and TCP. UDP is unreliable: packets may get lost or duplicated, may arrive out of order, and are subject to a maximum packet size. TCP is also unicast, but it retransmits missing messages, weeds out duplicates, fragments packets that are too big, and presents messages to the application in the order in which they were sent. In the multicast case, where one sender sends a message to many receivers, IP Multicast extends UDP: a sender sends messages to a multicast address, and the receivers have to join that multicast address to receive them. As in UDP, message transmission is still unreliable, and there is no notion of membership (who has currently joined the multicast address).
JGroups extends reliable unicast message transmission (like in TCP) to multicast settings. As described above it provides reliability and group membership on top of IP Multicast. Since every application has different reliability needs, JGroups provides a flexible protocol stack architecture that allows users to put together custom-tailored stacks, ranging from unreliable but fast to highly reliable but slower stacks.
As an example, a user might start with a stack only containing IP Multicast. To add lossless transmission, they might add the NAKACK protocol (which also weeds out duplicates). Now messages sent from a sender are always received by the recipients, but the order in which they will be received is undefined. Therefore, the user might choose to add the FIFO layer to impose per-sender ordering. If ordering should be imposed over all the senders, then the TOTAL protocol may be added. The Group Membership Service (GMS) and FLUSH protocols provide for group membership: they allow the user to register a callback function that will be invoked whenever the membership changes, e.g. a member joins, leaves or crashes. To handle crashes, a failure detector (FD) protocol is needed by the GMS to announce crashed members. If new members want to obtain the current state of existing members, then the STATE_TRANSFER protocol has to be present in this custom-made stack. Finally, the user may want to encrypt messages and so adds the CRYPT protocol (which encrypts/decrypts messages and takes care of re-keying using a key distribution toolkit).
Using composition of protocols (each taking care of a different aspect) to build reliable multicast communication has the benefit that:
- users of a stack only pay for the protocols they use and
- since protocols are independent of each other, they can be modified, replaced or new ones can be added, improving modularity and maintainability.
Application Programming Interface
The API of JGroups is very simple (similar to a UDP socket) and is always the same, regardless of how the underlying protocol stack is composed. To be able to send/receive messages, a channel has to be created. The reliability of a channel is specified as a string, which then causes the creation of the underlying protocol stack. The example below creates a channel, then sends and receives one message:
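import org.jgroups.Channel;
import org.jgroups.JChannel;
import org.jgroups.Message;

public class ChannelExample {
    public static void main(String[] args) throws Exception {
        String props="UDP:PING:FD:STABLE:NAKACK:UNICAST:" +
                     "FRAG:FLUSH:GMS:VIEW_ENFORCER:STATE_TRANSFER:QUEUE";
        Message send_msg;
        Object recv_msg;
        Channel channel=new JChannel(props);              // creates the protocol stack
        channel.connect("MyGroup");                       // join (or create) the group
        send_msg=new Message(null, null, "Hello world");  // null destination = all members
        channel.send(send_msg);
        recv_msg=channel.receive(0);                      // timeout 0: block until something arrives
        System.out.println("Received " + recv_msg);
        channel.disconnect();
        channel.close();
    }
}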
The channel’s properties are specified when creating it. They include IP Multicast (UDP), failure detection (FD), distributed message garbage collection (STABLE), multicast (NAKACK) and unicast (UNICAST) loss-less and FIFO delivery, fragmentation (FRAG), group membership (GMS, FLUSH, VIEW_ENFORCER, QUEUE) and state transfer (STATE_TRANSFER). Each protocol "X" results in the creation of an instance of a Java class X. All protocol instances are kept in a linked list (the protocol stack), which messages traverse up/down.
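The linked-list structure described above can be sketched as a toy model. This is an illustration only and not how JGroups is implemented: `StackSketch` and `Layer` are hypothetical names, the layers do no protocol work, and the sketch assumes that the first name in the string is the transport at the bottom of the stack.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of a protocol stack; real JGroups protocols do far more.
final class StackSketch {
    // One layer of the stack, linked to its neighbours.
    static final class Layer {
        final String name;
        Layer above;   // next layer towards the application
        Layer below;   // next layer towards the network
        Layer(String name) { this.name = name; }
    }

    private Layer bottom;
    private Layer top;

    // Builds a doubly linked list with one layer per colon-separated name.
    StackSketch(String props) {
        for (String name : props.split(":")) {
            Layer layer = new Layer(name);
            if (bottom == null) {
                bottom = layer;
            } else {
                top.above = layer;
                layer.below = top;
            }
            top = layer;
        }
    }

    // Path of a sent message: from the top layer down to the transport.
    List<String> down() {
        List<String> path = new ArrayList<>();
        for (Layer l = top; l != null; l = l.below) path.add(l.name);
        return path;
    }

    // Path of a received message: from the transport up to the application.
    List<String> up() {
        List<String> path = new ArrayList<>();
        for (Layer l = bottom; l != null; l = l.above) path.add(l.name);
        return path;
    }

    public static void main(String[] args) {
        StackSketch stack = new StackSketch("UDP:NAKACK:GMS");
        System.out.println(stack.down());   // prints [GMS, NAKACK, UDP]
        System.out.println(stack.up());     // prints [UDP, NAKACK, GMS]
    }
}
```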
To join a group, connect() is called. It returns when the member has successfully joined the group named "MyGroup", or when it has created a new group (if it is the first member).
Then a message is sent using the send() method. A message contains the receiver’s address (‘null’ = all group members), the sender’s address (‘null’, will be filled in by the protocol stack before sending the message) and a byte buffer. In the example, the String object "Hello world" is set to be the message’s contents. It is serialized into the message’s byte buffer.
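To make the message layout concrete, here is a minimal sketch of a message with a destination, a source and a byte buffer. It is not the JGroups Message class: `MessageSketch` is a hypothetical stand-in, addresses are plain strings, and standard Java serialization is assumed for turning the payload into bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a message; this is not the JGroups Message class.
final class MessageSketch {
    final String dest;     // null = all group members
    final String src;      // null = to be filled in before sending
    final byte[] buffer;   // serialized payload

    MessageSketch(String dest, String src, Serializable payload) throws IOException {
        this.dest = dest;
        this.src = src;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);   // Java serialization into the byte buffer
        }
        this.buffer = bytes.toByteArray();
    }

    // Deserializes the payload on the receiving side.
    Object getObject() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        MessageSketch msg = new MessageSketch(null, null, "Hello world");
        System.out.println(msg.getObject());   // prints Hello world
    }
}
```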
Since the message is sent to all members, the sender will also receive it (unless the socket option ‘receive own multicasts’ is set to ‘false’). The message is retrieved using the receive() method.
Finally, the member disconnects from the channel and then closes it. This results in a notification being sent to all members who are registered for membership change notifications.
Building Blocks
Channels are deliberately simple so that all possible applications can use them. However, JGroups also provides high-level abstractions, so-called building blocks. They can be put between the application and the channel. The application would then use the building blocks instead of channels. An example is RpcDispatcher, which allows applications to make remote group method calls:
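import org.jgroups.Channel;
import org.jgroups.JChannel;
import org.jgroups.blocks.GroupRequest;
import org.jgroups.blocks.RpcDispatcher;
import org.jgroups.util.RspList;

public class RpcDispatcherTest {
    Channel channel;
    RpcDispatcher disp;
    RspList rsp_list;
    String props="UDP:PING:FD:STABLE:NAKACK:UNICAST:FLUSH:" +
                 "GMS:VIEW_ENFORCER:QUEUE";

    // Invoked remotely through the RpcDispatcher
    public int print(int number) {
        System.out.println("print(" + number + ")");
        return number * 2;
    }

    public void start() throws Exception {
        channel=new JChannel(props);
        // 'this' is the object whose methods are invoked by incoming calls
        disp=new RpcDispatcher(channel, null, null, this);
        channel.connect("RpcDispatcherTestGroup");
        // The iteration count is an assumed, illustrative value
        for(int i=0; i < 10; i++) {
            rsp_list=disp.callRemoteMethods(null, "print", new Integer(i),
                                            GroupRequest.GET_ALL, 0);
            System.out.println("Responses: " + rsp_list);
        }
        channel.close();
    }
}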
As before, the example creates a channel specifying the properties. It also defines a method print() which will be called by the RpcDispatcher later on. Then an instance of RpcDispatcher is created on top of the channel and the channel is connected (this joins the new member to the group). Now, messages can be sent and received. But instead of sending/receiving messages using the channel, the application invokes a remote method call using RpcDispatcher’s callRemoteMethods().
The first argument ‘null’ means send to all group members, "print" is the name of the method to be invoked, ‘new Integer(i)’ is the argument to the print() method, GET_ALL means wait until the responses from all group members have been received and ‘0’ specifies the timeout (in this case, it means wait forever). RpcDispatcher sends a multicast message (containing the method call) to all members (e.g. 4 members, including itself) and waits for 4 replies. If one or more of the members crash in the meantime, the call nevertheless returns and has those replies marked as ‘suspected’ in the response list. The response list contains an entry for each expected reply, which has the address of the replier, the value (if any, in our case it is an integer), and a flag (received, not received (in case of timeouts) or suspected). If this member is the only group member, then the method call will call its own print() method.
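The bookkeeping behind such a response list can be sketched as follows. This is a simplified model and not the JGroups RspList class: `ResponseListSketch`, its `Flag` enum and its methods are hypothetical names, and members are identified by plain strings. It shows why a call that waits for all replies can still return once a crashed member has been marked as suspected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a response list; not the JGroups RspList class.
final class ResponseListSketch {
    enum Flag { RECEIVED, NOT_RECEIVED, SUSPECTED }

    // One entry per expected reply: the reply value (if any) and its flag.
    static final class Entry {
        Object value;
        Flag flag = Flag.NOT_RECEIVED;
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();

    ResponseListSketch(String... expectedMembers) {
        for (String member : expectedMembers) entries.put(member, new Entry());
    }

    void reply(String member, Object value) {
        Entry e = entries.get(member);
        if (e != null) { e.value = value; e.flag = Flag.RECEIVED; }
    }

    // A crashed member will never reply, so stop waiting for it.
    void suspect(String member) {
        Entry e = entries.get(member);
        if (e != null && e.flag != Flag.RECEIVED) e.flag = Flag.SUSPECTED;
    }

    // With "wait for all" semantics the call completes once no entry is still outstanding.
    boolean isComplete() {
        for (Entry e : entries.values()) if (e.flag == Flag.NOT_RECEIVED) return false;
        return true;
    }

    Flag flagOf(String member) { return entries.get(member).flag; }

    public static void main(String[] args) {
        ResponseListSketch rsps = new ResponseListSketch("A", "B", "C");
        rsps.reply("A", 4);
        rsps.reply("B", 4);
        System.out.println(rsps.isComplete());   // prints false
        rsps.suspect("C");
        System.out.println(rsps.isComplete());   // prints true
    }
}
```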
The current set of protocols shipped with JGroups provides Virtual Synchrony properties: messages are sent/received in views. Each member has a view, which is the set of members that constitute the current membership. When the membership changes, a new view will be installed by all members. Views are installed in the same order at all members. The set of messages between two consecutive views will be the same at all receivers. Messages sent in view V1 will be received by all non-faulty members in view V1.
Papers
- Filip Hanik: In Memory Session Replication in Tomcat 4. Discussion of session replication in Tomcat 4. Step-by-step instructions on how to set up session replication in two Tomcat instances.
- Vladimir Blagojevic: Implementing Totem’s total ordering protocol in JavaGroups reliable group communication toolkit. Detailed discussion of JGroups’ TOTAL_TOKEN protocol as implemented by Vladimir. Discussion of performance measurements.
- Ananda Bollu: The JavaGroups FLOW_CONTROL Protocol. Overview and configuration of the FLOW_CONTROL protocol.
- High-impact Web tier clustering, Part 1: Scaling Web services and applications with JavaGroups (IBM developerWorks). Detailed discussion of JGroups’ concepts using session replication as an example.
- Ken Birman: Building Secure and Reliable Network Applications. Excellent book on group communication. In-depth description of various protocols (atomic bcast, causal, total order, probabilistic broadcast). Very technical content.
Sources
- Version 2.2 (2003-09-15)
- Version 2.1.1 final (2003-09-08)
- Version 2.1 (2003-07-05)
- Version 2.0.6 (2003-01-25)
- Version 2.0.3 (2002-09-20)