Protocol Sketch


Definitions
User: User of service provided by this reliable multicast protocol, in our case the ISABEL control manager. It can play two roles: sender and receiver.
Provider: Entity providing the reliable transfer service for the ISABEL control manager. Typical deployments involve a provider by host.
Information flow: Information transfer from a sender to a destination. It is defined by two parameters: sender identifier and destination group.

Description

This reliable multicast protocol tries to solve the main problems of the receiver-initiated approaches: NACK-implosion, the end-to-end latency and indefinite cache data for retransmissions. The way this protocol carries it out, is described below.

The reliable multicast providers detect lost packets within an information flow by looking for gaps in the numbering sequence of that flow.


A failure in a branch of the underlaying multicast tree, and subsequent packet looses, may be the cause of a NACK-implosion. To prevent this phenomenon, this protocol implements a NACK-supression method similar to the one implemented by SRM, which consists on:
NACKs are not only sent to senders but also multicast to the destination group.
NACKs are randomly delayed off and cancelled in case NACKS for the same missing packets (or the packets themselves) are received.
A single NACK is used to request a set of sequence numbers instead of a NACK for every lost packet. Although this makes more complex the NACK-suppression implementation, it improves the efficiency of the recovery mechanism when loosing bursts of packets.

As a result of a NACK reception, the reliable multicast provider connected to the sender retransmits the requested packets to the whole group. Retransmissions are not sent immediately, the reliable multicast providers wait a period of random delay to get possible replicated NACKs. The aggregation of NACKs is very important because it improves performance, avoiding repeated multicast retransmissions in a very short time. Hence the reliable multicast providers which are connected to senders must save a cache of sent packets to recover transmissions failures. Proper timers adjustment, required to avoid replicated NACKs in the reception side and to improve aggregation of NACKs in the emission side, may become a daunting task.

When a lost packet is the last one within a given traffic burst, it may become hard to detect the loss in a timely manner. For this reason, this protocol provides a heartbeat for every information flow and it is triggered when the flow is idle. Each time the heartbeat times out, the provider sends an idle packet informing the receivers which was the last sequence number used in this information flow. So providers in reception may detect losses in a period no longer than the heartbeat defined for an information flow. The heartbeat interval is reset to a minimum value after a packed is sent to a flow information and increases until a maximum value. Idle packets are not sequenced packets because they are periodically sent. This is the same technique LBRM uses. Group membership is fully dynamic. So any user can join or leave any group at any moment, there is not a registration entity managing the memberships. For this reason, receivers remain anonymous to the senders. Sending to a group is not bound to group membership nor to any other administrative requirement. The key idea introduced by our protocol is having two kind of data packets:
Data packet strict sense (from now only named data packets): data packets carry information from a sender to a destination group. They are strictly orderly delivered to receivers. They are normally used to transmit incremental application state information.
Synchronism packets: Their use marks a milestone within an information flow (synchronism points): this means this packet and subsequent ones supersede any previous information exchange.
This two kind of messages are directly related to the communication service model for coordination of cooperative aplications. Global states are mapped to synchronism packets and global state deltas are handled through data packets.

Synchronism packets divide an information flow in unrelated flow contexts. New users of a group can not build a flow context only from its data packets because they are inseparably related to a synchronism data packet. At least, new users need a synchronism packet and all packets sent after that. Normally, users want to build the latest context of a flow and they only need the last synchronism packet and all data packets after that. So providers connected to senders only keep a copy of each packet sent from the last synchronism packet. If a user requests a number of sequence minor than the last synchronism packet, the provider connected to the sender of this flow will retransmit all cached packets for this flow in order to let the user build the context of it. So, it is possible to assist every retransmission request if all packets sent from the last synchronism for every flow are cached. When a new synchronism packet is sent, the correspondent provider will free the cache of its flow and will store the new synchronism packet as the first packet in the cache.

When a user joins a group, it selects one of three ways to catch with the rest of the group:
Forced retransmission join. On behalf of a joining user, its provider requests the retransmission of all cached information from the last synchronism point. This mechanism is disregarded for most applications, given the fact that is neither efficient nor scalable and may lead to implosion states.
Advisory retransmission join. On behalf of a joining user, its provider requests a new synchronism point from all senders of this group. Although this request is not an enforcing one, senders should try to identify a synchronism point as soon as possible. This solution allows applications to select a suitable bootstrap procedure for newly joined users. Opposed to partial benefits in efficiency in contrast to the custom bootstrap, there is an increase in complexity for higher level protocols which should answer the synchronism requests.
Silent join. On behalf of a joining user, its provider does not take any special action during the join. The provider will skip any packets for an information flow until a synchronism packet for this flow is received. But, getting in synchronism with a source of a group is independent of the synchronism of any other source which is sending to the same group. When a receiver is in synchronism with all senders of a group then the receiver is in synchronism with the group. The main advantage of this procedure is the minimal message interchange. As a major drawback is the slowness of this join procedure, it only depends on the frequency every sender of a group are sending synchronism packets.

One of the main problems of receiver-initiated approach is the undefined data cache for retransmissions. There are two bounded solutions: first one, cache all packets and second one, discard packets when the cache is full. The first solution is based on an unlimited storing capacity. The second one does not provide the reliability service. This protocol uses synchronism packets to control the cache length of each information flow. Although, this approach only works if senders often use synchronism packets, if not, it is the same case as the unlimited storing capacity. To avoid this problem some mechanisms have been designed: sender (re)definable cache size and a sender specified watermark value.

Cache length indicates the maximum number of packets that the provider connected to a sender must store for potential retransmissions. Within this cache a watermark can be specified by the sender. When a cache grows over the specified watermark, the provider warns the sender the cache is filling and the sender should send a synchronism packet to flush cache.

This feedback mechanism that our protocol provides to inform the senders of possible cache overflow, is purely advisory. If senders choose to ignore this warning, caches would be completely filled and old packets would be replaced by new ones. In this situation, it is impossible to retransmit a consistent state to new users in the group. In the case a provider receives a NACK for an overflowed cache, it will warn, again, to the responsible sender to transmit urgently a synchronism packet.

Packets exchanged between providers have a maximum size (imposed by the underlying UDP transport protocol), a message from a sender may span several network packets. That means providers are required to be able to fragment data from senders and reassemble them on receipt from network.



home home Author: Eva M. Castro
eva@gsyc.escet.urjc.es