RESOLVED ISSUES

R.1 No support for transactions

We will not provide any support for transactions. Instead, we will follow a "best-effort" model, to make sure that error conditions can always be detected and reported as such.

  • ERIK (08/01/07): Maybe this is too strongly stated, since I'm not sure that we can always detect these error conditions. Unless we update the complete tree (SAS, protocol sessions, dialogues) with each replication, which is not what we want, we can lose some replication message (e.g., for the SAS) and never be able to detect this when the data is accessed via the SIP session. So the detection of error conditions is probably also best effort. But maybe we should move the consistent versioning to the open issues; I've added O.6.
  • JAN (08/01/07): Agreed. We should accept that in some cases, a best effort may not be good enough.
  • LARRY (08/01/07): In the case of frequent errors, we may want to react to them if possible, i.e., can we assume that a particular instance is no longer replicating, and should no longer be in the cluster in the first place?

R.2 No ServletContext replication

We will not be replicating ServletContext instances, as per the Servlet spec, see SRV.14.2.8 ("ServletContext"):

In the case of a web application marked "distributed" in its deployment descriptor, there will be one context instance for each virtual machine. In this situation, the context cannot be used as a location to share global information (because the information won't be truly global). Use an external resource like a database instead.
  • ERIK (08/01/07): Agreed
  • ERIK (08/02/07): I discussed this with Kristoffer as well. The idea is that we might not want to replicate this data, but instead use the shared hashmap functionality of SHOAL to accomplish this shared context. I will check with the current applications if anybody depends on this non-standard behaviour of the ServletContext. I think it is a closed item that we do NOT want to use replication for the ServletContext. I've added a new open item (O.10) to discuss the shared servlet context via SHOAL.
  • ALL (08/03/07): One thing to keep in mind: Shoal's shared hashtable facility is very lightweight, and has not been thoroughly tested. It is intended for global state that does not change often, and therefore lacks concurrency support: In case of concurrent updates, the first or last update may win.
    The Presence Group Management (PGM), which is supposed to run on Sailfin, may have a dependency on this feature.
    If there is a strong requirement for this feature, we may add it to a future release, since both IBM and BEA support it.

R.3 Session migration after recovery

We are able to support the LB requirement that sessions that failed over will migrate back to the instance from which they originally failed over once that instance has rejoined the cluster.

However, the migration will occur lazily (or "on demand", as requests are being routed back to the recovered instance), instead of eagerly (as part of the instance's recovery).

Session migration must not be confused with session invalidation: During migration, the container will invoke sessionWillPassivate() on the SipSessionActivationListener instances attached to the "old" session, and sessionDidActivate() on the SipSessionActivationListener instances of the "new" session.

Any ongoing transactions will be lost during migration.

  • ERIK (08/01/07): Agree, mostly. Two remarks though:
    • I thought that the LB team would investigate whether it could find a strategy that keeps the stickiness established during failover even after recovery. This is probably not simple, and it will require the LB to be aware of ongoing and invalidated sessions.
    • The second remark is that I still want to investigate the concurrency issues here. Maybe we can postpone the migration until the transactions are finished?
  • JAN (08/01/07):
    • Yes, we're still hoping that the LB team may come up with a solution that will support stickiness of failed-over sessions (so they remain sticky to the instance to which they failed over). I believe that would require an additional mapping layer in the LB, from consistent hashkeys to session ids to instances.
    • Getting back to an older question: Can we assume that SIP clients cooperate when it comes to submitting versioning information, the same way as HTTP clients do? Both Binod and Erik confirmed that this assumption is accurate, that is, any versioning information returned to a SIP client will be included in a subsequent request by that client.
    • If we decide to not make any changes to the current replication framework, and a session that originated on I1 fails over and is activated on I2, it will have been replicated to I3, so when I1 comes back alive, I3 will respond to I1's broadcast, and remove the session from its replica cache. In this case, no active session is being migrated. Active migration would occur only if we decided to (also) have I2 respond to I1's broadcast. In this case, we must decide when it is safe to migrate. We must wait for any doXXX() method to return, but what about any ongoing transactions associated with any of the SAS's SipSession children (for which SipSession.isInOngoingTransaction() returns TRUE)? Are those the kind of transactions Joel was referring to? Those transactions will be lost, and that's probably acceptable.
  • LARRY (08/01/07): Assume a cluster of 4 instances I1->I2->I3->I4->I1, with I1 going down and coming back up:
    • If a session that originated on I1 and was replicated to I2 was never activated on I2 (since there were no requests attempting to resume it on I2), there will not be any issues with having this session activated on I1 after I1's recovery: In this case, the session will be restored to I1's active cache from I2's replica cache. We have an issue only if the session was accessed on I2 while I1 was down.
    • If we wait for the session to migrate to I1 lazily (i.e., only when a request to the recovered I1 attempts to resume it), we're leaving an availability hole here (see O.8), since I2 may go down next, and if the session was never activated on I2, it will exist only in I2's replica cache (and therefore won't have been replicated to I3). This is probably the reason why the LB wants to migrate the session back to I1 as soon as possible.
    • There has been a proposal that when I1 comes back up, we should repair it with the sessions that originated on it and were replicated to I2. However, when I1 goes down, I4 will start replicating into I2, meaning I2's replica cache will contain a mix of sessions from both I1 and I4. It will be impossible to distinguish which sessions originated from I1 (and therefore should migrate back to I1 as soon as its recovery has been completed), and which ones originated from I4.
    • Need to drill down on locking semantics: We currently lock a session for foreground activity in one of the request processing valves, before we reach any application code. This is a very coarse-grained locking strategy. An active session must not migrate while it is being locked.
  • ERIK (08/02/07): The SIP transaction state is not replicated, so ongoing transactions cannot be migrated or continued after a failover. Therefore, migration will lose any ongoing transactions. This should be no big problem, since the client will retransmit in that case. However, it can still cause problems; e.g., side effects can occur during the handling of the request, unnecessary retransmissions can happen, etc. So it might still be beneficial to do the foreground locking during the handling of the message (a rough sketch of such locking follows at the end of this item).
  • ERIK (08/02/07): On the subject of cookies: I'm not so sure anymore. I have to look deeper into the specs on the semantics of target updates. There might be problems with record-route, where the client does not use the last received but the first received record-route information... The issue with multiple responses might not be a problem, since SIP transactions are not replicated anyway. Anyway, this has to be considered. I've added O.11.
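
As a rough illustration of the coarse-grained foreground locking mentioned above, the sketch below assumes a Tomcat-style Valve API and a hypothetical SessionLockRegistry (neither of these comes from the actual SailFin/GlassFish code): the entire request is processed under a per-session lock, and a session must not be migrated while its lock is held.

    import java.io.IOException;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantLock;
    import javax.servlet.ServletException;
    import org.apache.catalina.connector.Request;
    import org.apache.catalina.connector.Response;
    import org.apache.catalina.valves.ValveBase;

    // Hypothetical registry of per-session foreground locks.
    class SessionLockRegistry {
        private static final ConcurrentHashMap<String, ReentrantLock> LOCKS = new ConcurrentHashMap<>();

        static ReentrantLock lockFor(String sessionId) {
            return LOCKS.computeIfAbsent(sessionId, id -> new ReentrantLock());
        }

        // Migration of an active session must be refused while its foreground lock is held.
        static boolean safeToMigrate(String sessionId) {
            ReentrantLock lock = LOCKS.get(sessionId);
            return lock == null || !lock.isLocked();
        }
    }

    // Coarse-grained locking valve: the whole request is handled under the session lock,
    // so no doXXX() method can still be running once the lock has been released.
    public class SessionLockingValve extends ValveBase {
        @Override
        public void invoke(Request request, Response response) throws IOException, ServletException {
            String sessionId = request.getRequestedSessionId(); // a SIP request would resolve the SAS id instead
            if (sessionId == null) {
                getNext().invoke(request, response);
                return;
            }
            ReentrantLock lock = SessionLockRegistry.lockFor(sessionId);
            lock.lock();                  // taken before any application code runs
            try {
                getNext().invoke(request, response);
            } finally {
                lock.unlock();            // released only after the application has returned
            }
        }
    }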

R.4 Dynamic reconfiguration of replication

EAS allows for dynamic enabling and disabling of replication. GlassFish does not (i.e., the app needs to be redeployed).
We will follow the GlassFish strategy, not that of EAS 4.1.

  • ERIK (08/01/07): I hope that maybe Kristoffer can give an answer on this. I would like to move this to resolved as well.
  • ERIK (08/02/07): For MMAS the priority is set to low for this requirement at the moment. The requirements have not been reviewed yet, but let us move it to resolved (we can always move it back to open).
  • ALL (08/03/07): Is this a realistic customer requirement?
    GlassFish has a healthchecker built in, which has the capability to turn off replication, but not globally, only as a transient phenomenon, e.g., during a cluster shape change, when replication is suspended.
    Even though GlassFish currently does not have the capability to enable or disable replication globally, it might be possible to add support for this feature, but never on a per application basis.

OPEN ISSUES

O.1 Data modeling and replication triggers

May want to closely align with cache trees (with the exception of ServletContext, see R.2) and trigger events defined in EAS (4.1?).

  • EVDV (08/01/07): That is the intention. I've started with this.

O.2 Coordination of timer firings across instances

O.2.1 During normal replication

When a session is replicated, it remains in the replica cache of the "replica buddy" instance until a request fails over, in which case it is moved from the buddy's replica to its active cache. However, its timers must be monitored, and may need to be fired while in the replica cache.

  • ERIK (08/01/07): The lazy re-activation strategy has other disadvantages as well. I've added a separate item on that, see O.8. However, I think that maybe both could have the same solution. Here I want to propose a sort of "reverse-repair", where the buddy activates the session after its counter-clockwise neighbor is gone.
  • ERIK (08/02/07): Discussed this with Kristoffer and we came up with an alternative. Each replica will contain a 'reactivationtime' field (in addition to an 'expires' field). The reactivationtime will be the time of the next timer expiry + an additional delay. After each timer expiry (of the SIP application session timer or any of the application-defined SIP timers), this will be updated in the replica. A thread, analogous to the reaper thread, will scan the replica cache; if there are replicas with a reactivationtime that has expired, these replicas will be activated on the buddy.
    The additional delay is to give the active copy the chance to fire the timer and update the reactivationtime.
    The advantage of this solution would be that it reactivates the replicas 'on-demand', i.e., either when an external event occurs, or when a timer expires.
    The disadvantage is that this offers no way of preventing a stale replica from becoming active at timer expiry (maybe add a check where the replica actively checks if it is stale before activating itself: i.e., check if there is an active copy somewhere in the cluster. If the active copy is not on its counter-clockwise neighbor, it could remove itself. If there is an active copy on its counter-clockwise neighbor, then the firing of the timer takes longer than the delay, in which case we could add another backoff delay?)
  • ALL (08/03/07): There seem to be 3 options for activating a session replica: lazy activation (i.e., at the time a request fails over to the replica instance), eager activation (i.e., immediately following a failover), and some kind of middle way (i.e., as the result of a timer trigger event).
    As for timer firings, EJB follows a different philosophy: It does not mandate that a timer wake up; instead, its focus is on when the timer wakes up. The question to ask is: If a repeating timer should have fired 4 times during a given interval, should it fire 4 times or once when it wakes up?
    Concern is that external events could be quite rare, whereas timer events (triggers) could be much more frequent. Example: Presence application, publishes presence state. Based on the presence state, different functions may be available. One example presence state would be "I'm at work", and if you haven't spoken for a while (or after a certain number of hours), you'll be moved from the "I'm at work" to the "I'm at home" state. With delayed timer firing, you may receive phone calls in the middle of the night.
    We've agreed to use the "reactivationtime" approach outlined above, but instead of replicating an entire SAS (with an updated "reactivationtime") whenever any one of its contained timers has fired, we want to minimize replication to the Timer that's fired: We will replicate the timer that just fired, with an updated "nextFiringTime" attribute.
    The timer's "nextFiringTime" will be replicated as an externally visible timer attribute (the same way as a session's "lastAccessedTime"), so that it is available without having to deserialize the Timer. May need to change replication SPI to make room for new "nextFiringTime" slot.
    A new thread on the replica instance will enumerate all Timer replicas and decide which ones to fire, based on their "nextFiringTime" attribute (see the sketch after this list).
    Before firing a timer on a replica instance, we need to perform some kind of healthchecks to avoid false firings, e.g., if a "nextFiringTime" update message was lost. In other words, we need to check a timer replica for staleness before firing it.
  • ALL (08/06/07): Any check for staleness would require a non-destructive load ("peek"), which is a new requirement, i.e., not something that's supported right now in GlassFish.
    What should we do if we detected that a timer with a higher version exists: destroy ourselves, and let the timer with the higher version fire? TBD.
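
A minimal sketch of the "reactivationtime"/"nextFiringTime" reaper discussed above, using hypothetical TimerReplica and ReplicaCache types (none of these names come from the existing replication SPI); the staleness check is only a placeholder, since it depends on the non-destructive "peek" noted on 08/06.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Hypothetical view of a replicated timer; the real SPI would expose "nextFiringTime"
    // as an externally visible attribute (like "lastAccessedTime"), so it can be read
    // without deserializing the Timer.
    interface TimerReplica {
        String timerId();
        long nextFiringTime();      // absolute time in ms
        long version();
        void activateAndFire();     // promote to the active cache and fire locally
    }

    interface ReplicaCache {
        Iterable<TimerReplica> timerReplicas();
    }

    public class TimerReplicaReaper {
        // Extra delay that gives the active copy a chance to fire first and push an
        // updated nextFiringTime before the replica side takes over.
        private static final long ACTIVATION_DELAY_MS = 30_000;

        private final ReplicaCache replicaCache;
        private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        public TimerReplicaReaper(ReplicaCache replicaCache) {
            this.replicaCache = replicaCache;
        }

        public void start() {
            scheduler.scheduleWithFixedDelay(this::scan, 10, 10, TimeUnit.SECONDS);
        }

        private void scan() {
            long now = System.currentTimeMillis();
            for (TimerReplica replica : replicaCache.timerReplicas()) {
                if (now > replica.nextFiringTime() + ACTIVATION_DELAY_MS && !isStale(replica)) {
                    // The active copy apparently did not fire (instance gone, or the update
                    // message was lost): activate the replica and fire the timer here.
                    replica.activateAndFire();
                }
            }
        }

        private boolean isStale(TimerReplica replica) {
            // Placeholder: would require a non-destructive load ("peek") across the cluster
            // to see whether a higher-versioned copy exists (see the 08/06 note).
            return false;
        }
    }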

O.2.2 After migration of active session

According to R.3, a migrated session will be available, at least temporarily, in two active caches: That of the instance to which the session failed over, and that of the recovered instance (once the session has been resumed there). We must prevent the session's timers from firing on the instance from which the session migrated.

  • ERIK (08/01/07): My suggested solution (tentative) is to remove any version (whether active or replica) during the broadcast as a result of the load.
  • ALL (08/06/07): New requirement: When responding to a broadcast, also consider the active (instead of just the replica) cache, and remove any match from the active cache. This should also deactivate any timers associated with the active session.
    "load" must become a 3-way protocol (request - response - ack) so that an active session may be safely removed and migrated.
    Migration of an active session must preserve its version number, i.e., must make an exact copy of the session.
    To facilitate migration of an active session, we may have to serialize the entire graph, which is something we would like to avoid under normal circumstances. (A rough sketch of the 3-way load protocol follows below.)
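
A rough sketch of the proposed 3-way "load" protocol (request - response - ack), using hypothetical types that are not part of the current replication framework; the point is only the ordering: the responding instance must not drop its active copy, or cancel its timers, before the requester's ack arrives, and the version number travels along unchanged.

    interface SessionCopy {
        String id();
        long version();                // must be preserved exactly when the session migrates
        byte[] serializeGraph();       // full SAS graph; serialized only for this migration case
        void cancelTimers();
    }

    interface SessionCache {           // stands in for both the active and the replica cache
        SessionCopy peek(String id);   // non-destructive load ("peek")
        SessionCopy remove(String id);
    }

    class LoadResponse {
        final String id; final long version; final byte[] graph; final boolean fromActiveCache;
        LoadResponse(String id, long version, byte[] graph, boolean fromActiveCache) {
            this.id = id; this.version = version; this.graph = graph; this.fromActiveCache = fromActiveCache;
        }
    }

    public class LoadResponder {
        private final SessionCache activeCache;
        private final SessionCache replicaCache;

        public LoadResponder(SessionCache activeCache, SessionCache replicaCache) {
            this.activeCache = activeCache;
            this.replicaCache = replicaCache;
        }

        // Step 2 of 3: answer a broadcast load request; the active cache is now consulted as well.
        public LoadResponse onLoadRequest(String sessionId) {
            SessionCopy copy = activeCache.peek(sessionId);
            boolean fromActive = (copy != null);
            if (copy == null) {
                copy = replicaCache.peek(sessionId);
            }
            return (copy == null) ? null
                    : new LoadResponse(copy.id(), copy.version(), copy.serializeGraph(), fromActive);
        }

        // Step 3 of 3: only on the requester's ack may the local copy be removed and its timers cancelled.
        public void onLoadAck(String sessionId) {
            SessionCopy copy = activeCache.remove(sessionId);
            if (copy != null) {
                copy.cancelTimers();   // prevents the migrated session's timers from firing on this instance
            } else {
                replicaCache.remove(sessionId);
            }
        }
    }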

O.2.3 After a network segmentation

If a network segmentation occurs, it can happen that two active versions of the same session are established during the segmentation. This condition can remain even after the segments have merged again.

  • ERIK (08/02/07): We might need a reconciliation after a merge is detected; e.g., every instance broadcasts the ids and versions of its sessions, and nodes that hold stale objects detect this and remove them.
  • ALL (08/06/07): Example: 10 instance cluster, broken into two 5-instance clusters (GMS groups), instances in one of the clusters don't know anything about the instances in the other cluster.
    During partitioning, each isolated part will fire its timers. Cannot really prevent this.
    A merge after network partitioning will result in 2 active sessions. Can we treat it the same way as O.2.2?
    A node that was disabled during the partitioning and reenabled after the merge still has an active cache, but its versions are stale. How can we prevent its timers from firing? In O.2.1, we're doing the staleness check only on the replica, but we may have to include active cache as well.
    Right now, GMS has no notion of network segmentation. JGroups has merge support with reconciliation.
    How frequent are network segmentations? What degree of support do we provide for network partitioning situations in GMS? This affects LB, replication, and GMS; requirements will ripple through all groups.
  • ALL (08/07/07): Detecting network partitioning is very complicated. GMS knows when an instance can no longer be reached, but that may be for a number of different reasons: the instance may have crashed, may have been brought down (in a controlled fashion), or may reside in a part of the network that is no longer reachable (following some network partitioning). GMS cannot detect the latter, unless there was a DB tier reachable from all parts of the (partitioned) network.
  • ERIK (08/10/07): However, detecting a merge should be possible. The GMS already has a join concept. If some server instance joins the cluster AND the instance itself already (or still) contains replication data, then we have a potential merge situation. So even if we cannot avoid or even detect network partitioning, we might still ensure that the inconsistent situation that might be caused by the partitioning (e.g., multiple active versions) does not continue after the segments have merged (see the reconciliation sketch after this list).
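
A sketch of the reconciliation idea from the 08/02 and 08/10 notes, again with hypothetical types: after a suspected merge (an instance joining while it still holds replication data), every instance advertises the (session id, version) pairs it holds, and any instance holding a lower version of the same session drops its stale copy.

    import java.util.Map;

    public class MergeReconciler {

        // Hypothetical cache view exposing per-session versions.
        public interface VersionedCache {
            Long versionOf(String sessionId);              // null if the session is not present
            void removeAndCancelTimers(String sessionId);
        }

        private final VersionedCache activeCache;          // stale *active* copies are the real danger here
        private final VersionedCache replicaCache;

        public MergeReconciler(VersionedCache activeCache, VersionedCache replicaCache) {
            this.activeCache = activeCache;
            this.replicaCache = replicaCache;
        }

        // Called with the (session id -> version) advertisement broadcast by another instance after a merge.
        public void onAdvertisement(Map<String, Long> remoteVersions) {
            remoteVersions.forEach((sessionId, remoteVersion) -> {
                dropIfStale(activeCache, sessionId, remoteVersion);
                dropIfStale(replicaCache, sessionId, remoteVersion);
            });
        }

        private void dropIfStale(VersionedCache cache, String sessionId, long remoteVersion) {
            Long localVersion = cache.versionOf(sessionId);
            if (localVersion != null && localVersion < remoteVersion) {
                // A newer copy exists elsewhere in the merged cluster: drop ours so that
                // neither its timers nor its state can win over the newer copy.
                cache.removeAndCancelTimers(sessionId);
            }
        }
    }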

O.3 "Out-of-band" migration

Should SipSessionsUtil.getApplicationSession() cause the requested session to migrate to the caller's instance?

PRO: If no migration, the returned session must be read-only, which may not be spec compliant.

CON: LB will not be aware of this "out-of-band" migration and will continue to apply consistent hashing to determine target instance, so the requested session will migrate back and forth.

  • ERIK (08/01/07): An alternative could be to offer remote, but read/write access to the session. I do not really see how yet, though. Another issue here is the concurrency, similar to R.3. Migration should (preferably) be postponed during ongoing transactions.
  • ERIK (08/02/07): This needs more thought. Kristoffer came up with some suggestions. Migration of the active session in this case might be unavoidable. But to avoid migrating back and forth, we might want to reroute the SIP and HTTP traffic to match the new location. The idea is that we might not be able to control the routing of JCA requests etc. with the out-of-band sasid in their protocol, but we do have some control over the routing of SIP and HTTP. This could be based on a shared hashmap, and maybe SIP and/or HTTP redirect, or rerouting by the LB. Anyway, similar issues as discussed under R.3.
  • ALL (08/06/07): Having all sessions retrieved via SipSessionsUtil.getApplicationSession() gravitate to the same (i.e., the caller's) instance will cause imbalance and defeat the purpose of a cluster, which is to provide greater scalability.
    What is the planned utilization of this API in the apps that Ericsson is planning to deploy?
    Whatever approach we agree on must be consistent with R.3, i.e., we either always remain sticky to the instance that we migrated to (which will require an override mechanism of the LB's consistent hashing algorithm), or we don't.

  • ERIK (08/10/07): GMS is going to use the @SipApplicationKey mechanism. They also are going to use the getApplicationSession() method (either with a key or an id; see O.10). Unclear in which context this is used, although they claim that it will never happen on an instance that does not have the data already when using UC routing. Question to PGM still open.

O.4 Uniformity of @SipApplicationKey across apps

Is it acceptable to require that all deployed apps share the same @SipApplicationKey implementation? (Currently, this is a requirement for the LB's consistent hashing algorithm to work reliably, since the LB has no knowledge of which apps have been deployed and which app(s) will be picked by the ApplicationRouter for a given request.)
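
For reference, a JSR 289 key method looks roughly like the sketch below (the choice of the From URI is illustrative only); the requirement above effectively means that every deployed application would have to derive the key from the same part of the request, so that the LB's consistent hashing and the container's SAS selection agree.

    import javax.servlet.sip.SipServletRequest;
    import javax.servlet.sip.annotation.SipApplicationKey;

    public class KeyProvider {

        // JSR 289 key method: public static, returns String, takes the initial request.
        // If every deployed app derives the key from the same part of the request (the From
        // URI here is an illustrative choice only), the LB's consistent hashing and the
        // container's SAS selection will agree on the target instance.
        @SipApplicationKey
        public static String sessionKey(SipServletRequest req) {
            return req.getFrom().getURI().toString();
        }
    }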

  • ERIK (08/02/07): Using redirect might be a solution here as well. Redirect would give the AR the opportunity to do its application selection and the @SipApplicationKey method to do its work. Then the sasid is known and can be used in the redirect information to reroute the request to the correct server instance.
  • ALL (08/07/07): The LB team is thinking of replacing the "bekey" (whose value corresponds to the initial request's consistent hashkey, is calculated only once, on the initial request that establishes a dialog, and makes it possible for any requests carrying a particular "bekey" to migrate back to their original instance after a failover recovery) with a "beroute", which would contain the address (id) of an instance and would allow requests to remain sticky to the instance to which they failed over.
    The same request may end up invoking multiple apps, so we really would need to go back to the LB before invoking the 2nd app. Needs more investigation by LB team. Original design goal has been to keep LB as "stupid" as possible, and to not keep any state in the LB, but in order to support @SipApplicationKey, LB must consider application logic to some degree.
  • ERIK (08/10/07): In practice this might not be a big problem, since the session case will typically determine which fields are to be used. If we really leave this fully open to the application, then there is an additional problem that different applications, in the same chain as set up by the AR, have different SAS allocations. One solution to this would be to locate the AR and the @SipApplicationKey functionality in the FE layer and invoke the BE layer on a per-application basis only. In a combined FE/BE scenario, this would not have that big an impact. In a scenario where the FE and BE are in different clusters, application chaining would involve remote invocations and round trips (even in the simple case where both apps have the same @SipApplicationKey implementation). I'm not sure about the implications of this, but I think that for the moment we should continue working under the assumption that this is not an issue. I'll double-check with Joel when he is back.

O.5 Dynamic reconfiguration of replication

CLOSED -> R.4

EAS allows for dynamic enabling and disabling of replication. GlassFish does not (i.e., the app needs to be redeployed). (I would like to move this to the "Resolved Issues" section, with the agreement that this feature will not be supported in SailFin.)

  • ERIK (08/01/07): I hope that maybe Kristoffer can give an answer on this. I would like to move this to resolved as well.
  • ERIK (08/02/07): For MMAS the priority is set to low for this requirement at the moment. The requirements have not been reviewed yet, but let us move it to resolved (we can always move it back to open).

O.6 Consistency detection in the tree

The data structures consist of several related, but separate, objects: SipApplicationSession, SipSession, HTTPSession, and SIP dialogues. The objects have different triggers for replication and different lifecycles. However, the objects do contain references to each other. How can we detect, after a failover, whether all the objects have been replicated correctly? Using version numbers in the references is not an option.

  • ALL (08/07/07): Assumption: it is unlikely that different parts of a SAS tree end up in different instances. We must try to make sure that a SAS object graph stays together, i.e., avoid fragmentation of object graphs.
    Each object in a SAS graph (tree) shall have its own version, but we don't want every reference to every versioned object to be versioned as well.
    When we update an HTTP session child, we would like to increment only its version, and not that of the SAS parent. However, according to the spec, we would have to update the SAS parent's lastAccessedTime as well in this case, since a SAS parent's lastAccessedTime reflects the lastAccessedTime of its children! (A rough sketch of this versioning model follows after this list.)
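
A rough, purely illustrative sketch of the versioning model discussed above (none of these classes exist in the replication framework): each node in the SAS graph carries its own version, references are not versioned, and touching a child bumps only the child's version while still refreshing the parent SAS's lastAccessedTime.

    import java.util.ArrayList;
    import java.util.List;

    // Every replicable node carries its own version, incremented on each replication.
    class ReplicableNode {
        private long version;
        long version() { return version; }
        void markDirty() { version++; }
    }

    // Stand-in for a SipApplicationSession: parent of SipSessions, HTTP sessions, dialogues.
    class SipApplicationSessionState extends ReplicableNode {
        private long lastAccessedTime;                          // reflects the lastAccessedTime of its children
        private final List<ReplicableNode> children = new ArrayList<>();

        void addChild(ReplicableNode child) { children.add(child); }

        // Called when any child is accessed or updated.
        void childAccessed(ReplicableNode child, long now) {
            child.markDirty();        // only the child's version is incremented...
            lastAccessedTime = now;   // ...but per the spec the parent's lastAccessedTime must still move,
                                      // which is what makes "replicate the child only" awkward.
        }

        long lastAccessedTime() { return lastAccessedTime; }
    }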

O.7 Incremental updates

Should we use CompositeMetaData or SimpleMetaData? CompositeMetaData means that replication should perform better. However, it also means that if we lose an incremental update (e.g., due to a lost replication message), the replica will never be in sync again.

  • ERIK (08/02/07): Since each object contains a version which is incremented on every replication, the replica should be able to detect that a replication message was missed (or can we receive out-of-order messages as well?). In these cases you would like to trigger a partial repair-under-load, i.e., of one specific object. Would this be difficult to achieve? (A sketch of such gap detection follows after this list.)
  • ALL (08/07/07): We may want to avoid the use of CompositeMetaData. The original motivation for it was to update a parent and its children in a single replication message, and to enable modified-attribute persistence of HTTP sessions. It was supposed to be more performant, but has not fully lived up to its promise in practice. We may want to use SimpleMetaData wherever possible.
    For each SIP transaction, we want to increment (and replicate, as an attribute) its sequence counter only. But that's orthogonal to the Composite- vs. SimpleMetaData discussion. For example, we update a session's timestamp using SimpleMetaData.
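
A sketch of the gap detection suggested on 08/02, with hypothetical types: since each object carries a version that is incremented on every replication, the replica side can notice a missing or out-of-order update and request a partial repair-under-load of just that object.

    public class VersionGapDetector {

        // Hypothetical channel for requesting a "repair-under-load" of one specific object.
        public interface RepairChannel {
            void requestRepair(String objectId);
        }

        private final RepairChannel repairChannel;

        public VersionGapDetector(RepairChannel repairChannel) {
            this.repairChannel = repairChannel;
        }

        /**
         * @param lastSeenVersion version currently stored in the replica cache (-1 if none)
         * @param incomingVersion version carried by the replication message just received
         * @return true if the incoming update may be applied, false if it is dropped
         */
        public boolean accept(String objectId, long lastSeenVersion, long incomingVersion) {
            if (lastSeenVersion < 0 || incomingVersion == lastSeenVersion + 1) {
                return true;                          // first update, or the expected next one
            }
            if (incomingVersion <= lastSeenVersion) {
                return false;                         // out-of-order duplicate: ignore it
            }
            // Gap detected: at least one incremental update was lost. With SimpleMetaData the next
            // full update would resynchronize anyway; with incremental updates we have to ask the
            // active instance to repair this one object.
            repairChannel.requestRepair(objectId);
            return false;
        }
    }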

O.8 Lazy re-activation and consecutive failures

The current solution to re-activation of a session after failover is to wait until a request is received on the session. Until that time there is only ONE replica copy of the session data. The advantage of this is that the reactivation is driven by the load balancer, i.e., the object is activated on the server instance that the LB selected. However, the lazy re-activation means that the system is vulnerable to consecutive failures. In SIP, the requests sent in a session can be rather infrequent. This becomes more apparent in a rolling upgrade scenario, which, from a session replication point of view, may be treated as several consecutive failover cases.

This may call for forced instead of lazy re-activation. However, if the forced re-activation does not follow the same distribution as the LB's traffic distribution, this leads to yet more migration of active objects.

  • ERIK (08/02/07): Suggestion from Kristoffer: this can also be solved by configuring multiple buddies. Is this currently supported or not? Or we could solve this in the same way as suggested for the timers (O.2.1): we could always set a reactivationtime chosen in some random interval and, together with some staleness detection, remove the replica, activate it, or update the reactivationtime.
  • ALL (08/07/07): Unless a session in a replica cache is resumed (and therefore moved into the active cache) or reactivated by virtue of one of its timers firing, the session will remain in the replica cache, and may be lost if the instance on which the replica resides goes down. This vulnerability could be avoided by replicating to multiple buddies, or by reactivating a session even in the absence of any triggering events (such as session resumption or timer firings), but when?
    Treat controlled, deliberate shutdown separate from failure; two different issues. GMS provides planned shutdown notification.
    Pipes replicate only in one direction, not backwards.
    During an instance's planned shutdown, replicate its replica cache (without activating any of its sessions). Do not replicate to more than one buddy ever (but tentatively, we could look into replicating to multiple buddies, but with a low priority).
    Rolling upgrade example: Assume cluster I1->I2->I3->I4->I1. I1 goes down, I4 connects to I2, I2 pushes its replica cache to I3, I1 comes up, I4 will reconnect to I1 and repair it, I4 replicates to I1, I1 replicates to I2, now safe to bring I2 down.
    Rolling upgrade has to be done in quiescing manner, which means that right before I1 is brought down, the LB must not route to it any new requests, only requests pertaining to ongoing sessions. May not work well with LB's user-centric approach, needs investigation.
    Rolling upgrade: we don't support application versioning, sessions have to be backwards compatible.
  • EVDV (08/10/07): We have two issues: vulnerability after a planned shutdown/restore of an instance (e.g., in case of upgrade) and vulnerability to consecutive failures. From an ISP point of view both should probably be addressed. I'm still not convinced that an extra replica is a better alternative than a 'forced reactivation' or 'reverse repair' solution. This needs more discussion.

O.9 Optimization during upgrade

From a session replication point of view, a rolling upgrade may be regarded as a series of consecutive failures. However, we can use the knowledge that this is an upgrade scenario to limit the unnecessary replication of data. This needs more investigation (maybe treat it the same as a temporary failure?).

  • ERIK (08/02/07): This was already discussed quite extensively. I only need to extract the solution from some people that are/have been on holiday. I hope they still remember or wrote it down.

O.10 Shared ServletContext via SHOAL

We could use the shared hashmap functionality from Shoal to implement the ServletContext as shared data.
It is unclear whether this is required at all (Erik will check with the PGM application).
If needed, it is still unclear how concurrent accesses/updates are handled on such a shared hashmap.

  • ERIK (08/10/07): Whether PGM needs it or not depends on the discussions in JSR 289. There are two issues: the application can decide its own key for incoming sessions using the @SipApplicationKey mechanism. However, it cannot create SipApplicationSessions out of the blue with their own key, nor can it access the created SipApplicationSessions using the key; only the id can be used for this. If the application cannot do both of these things, then the PGM needs to keep track of the relation between the key used internally (e.g., the To field of the requests) and the SASid that corresponds to it. For this they need global shared storage, and they currently use the ServletContext for this. Both issues are currently being addressed. The first already seems to be accepted. The second issue can be solved in two ways: either a direct way to access the session by key, or a way to associate additional keys with SASes. The latter has some other advantages, e.g., it can be used to tie other ids, such as a Diameter session id, to the SAS. However, the latter also implies that the container does need some shared storage, i.e., we move the problem from the application to the container (a sketch of such a container-level shared map follows below).

More information can be found here: SipServlet discussion thread
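
A sketch only: the cluster-wide map is modelled by a hypothetical ClusterSharedMap interface rather than Shoal's actual API, since the exact Shoal calls and their concurrency semantics (see the 08/03 note under R.2) are precisely what still needs to be verified. It shows the second use case from the 08/10 note: keeping track of the relation between an application-chosen key (e.g., the To field) and the corresponding SAS id.

    import java.io.Serializable;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical cluster-wide map; a Shoal-backed implementation would go behind this interface.
    interface ClusterSharedMap {
        Serializable get(Serializable key);
        void put(Serializable key, Serializable value);   // note: last (or first) writer may win on conflict
        void remove(Serializable key);
    }

    // PGM-style bookkeeping: which SAS belongs to which application-level key (e.g., the To field).
    public class SasKeyDirectory {

        private final ClusterSharedMap shared;

        public SasKeyDirectory(ClusterSharedMap shared) {
            this.shared = shared;
        }

        public void register(String applicationKey, String sasId) {
            shared.put(applicationKey, sasId);
        }

        public String lookupSasId(String applicationKey) {
            return (String) shared.get(applicationKey);
        }

        public void unregister(String applicationKey) {
            shared.remove(applicationKey);
        }

        // In-JVM stand-in, useful only for local testing.
        public static ClusterSharedMap inMemory() {
            final ConcurrentHashMap<Serializable, Serializable> map = new ConcurrentHashMap<>();
            return new ClusterSharedMap() {
                public Serializable get(Serializable key) { return map.get(key); }
                public void put(Serializable key, Serializable value) { map.put(key, value); }
                public void remove(Serializable key) { map.remove(key); }
            };
        }
    }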

O.11 Version information in SIP

Can we assume that SIP clients cooperate when it comes to submitting versioning information, the same way as HTTP clients do? There might be some caveats in the specification with respect to updating cookie information. Also the inherent concurrency (both sides can simultaneously access the same SIP session) might complicate the version handling.

  • ERIK (08/13/07): We could store a request's CSeq in the SipSession, so when a request comes in, we compare its CSeq to the one that is stored in the SipSession, and if the one in the request "matches" the one in the SipSession (i.e., is equal to it, or greater by one, in case it has been incremented by the client), we know the SipSession is current.
    Whenever a SipSession is updated with the latest CSeq, its version number is incremented.
    When evaluating the result set from a broadcast, we continue to pick the SipSession with the greatest version number, and check whether the CSeq stored in it "matches" (see above for the definition of "match") the one in the request (a sketch of this check follows after this list).
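
A minimal sketch of the CSeq "match" rule described above, with hypothetical accessors: a stored copy is considered current if the request's CSeq equals the stored one or is exactly one greater, and the broadcast result set is still resolved by greatest version first.

    import java.util.Collection;

    public class CseqFreshnessCheck {

        // Hypothetical view of a SipSession copy returned by the broadcast.
        public interface SipSessionCopy {
            long version();
            long storedCseq();
        }

        // A request "matches" the stored CSeq if it is equal to it, or greater by exactly one.
        public static boolean matches(long requestCseq, long storedCseq) {
            return requestCseq == storedCseq || requestCseq == storedCseq + 1;
        }

        // Evaluating a broadcast result set: still pick the copy with the greatest version,
        // then use the CSeq check to decide whether that copy is actually current.
        public static SipSessionCopy pickCurrent(Collection<? extends SipSessionCopy> candidates,
                                                 long requestCseq) {
            SipSessionCopy best = null;
            for (SipSessionCopy c : candidates) {
                if (best == null || c.version() > best.version()) {
                    best = c;
                }
            }
            return (best != null && matches(requestCseq, best.storedCseq())) ? best : null;
        }
    }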

O.12 Session toggling after a failure

Since not all the instances of the LB will detect the failure of an instance at the same time, it can happen that different LB instances route requests in the same SipApplicationSession to different BackEnds. This might result in back-and-forth migration of the SipApplicationSession.
This is similar to O.3, so we might not want to treat it separately.
It is just included as a separate issue, since the solution can be different (e.g., a sort of transactional cluster reshape notification?)

  • (08/13/07): Issue is slightly misstated: It is not the failure that causes any problems, but the recovery of a previously failed instance. Some LBs may learn about the recovery in a delayed fashion, so they will continue to route requests to the failover instance during the delay, whereas others will route requests to the recovered instance.

O.13 Memory full error

During failures, data has to be copied to other instances. What happens if this exceeds the available memory?
Are there limits on the cache sizes? Will replication be switched off in these cases? Will we get alarms when the cache is almost full?

  • (08/13/07): Set a limit on the replication cache; get an alarm if it starts filling up.
    This boils down to a sizing / capacity-planning issue. We're expecting customers to do correct sizing, based on capacity planning.
    What do you expect to happen if the limit is reached? You've blown the capacity; you've mis-sized the capacity of your system.
    Can we try not to push this requirement into the replication framework, but push it into load regulation?