GlassFish Server Open Source Edition 3.1 - In-memory Session Replication: High Availability
The in-memory replication feature provides session availability by replicating session state to other instances in the cluster. When the instance that was serving a request fails, the session can be restored by retrieving the session data from the replica instance.

Instead of letting the containers interact directly with the in-memory replication layer, a BackingStore SPI has been defined. The advantages of using this SPI are:

In GlassFish Server Open Source Edition 3.1, we will implement the in-memory replication module as a service provider of the BackingStore SPI.

High Level Features
The general approach is to replicate session state from each instance in a cluster to a backup (replica) instance. Unlike V2, where HTTP and EJB sessions were replicated to the neighboring instance (the buddy), the replication module will rely on an external object called ReplicaSelector to pick a replica. The input to the ReplicaSelector will be the SessionKey, and the ReplicaSelector will return the name of a live instance to which the data must be replicated. Each backup instance will store the replicated data in memory. Upon a failure, the instance now servicing the request (after failover) will either already have the necessary data or will use the ReplicaSelector to locate the replica, obtain the data, and take ownership of it.

Availability configuration will continue to work as it has for previous HA-enabled releases. The existing persistence-type ("replicated") will continue to be supported. This will allow QA and performance tests to run as they do with V2.x.

The replication layer itself will use the GMS communication APIs to transport the data. We will use GMS's send() API to replicate data to a replica instance. The current plan is to leverage GMS for cluster group membership services, including initial bootstrapping/startup and cluster shape changes such as instances being stopped and restarted, instances failing, and new instances being added to the cluster. It is assumed that the ReplicaSelector will register itself with GMS to learn the current 'alive' members so that it can pick an 'alive' replica.

Changes to Various Containers:

1. EJB Container: The EJB Container uses the store to (a) passivate EJBs when the cache overflows and (b) checkpoint EJBs at the end of a business method (or at the end of a transaction). The EJB Container currently does not use the SPI to talk to a store; it needs to use the BackingStore SPI to talk to the in-memory replication layer. Since the interface used by the EJB Container is very similar to the BackingStore SPI, switching to the SPI should not be a major change for the container.
2. Web Container: The Web Container will continue to use the same BackingStore SPI to save session state. The Web Container persists session data using one of two approaches: (a) Full session: the entire session is saved to the BackingStore, and (b) Modified attributes: only those HTTP session attributes that were deleted, added, or updated are saved. We will continue to support both approaches.
3. Metro: Metro will use the BackingStore SPI. It will save the following data in the BackingStore: (a) Message, (b) SAAJMessage, and (c) ProtocolSourceMessage. These three messages will be serialized, and the resulting byte[] will be persisted into the BackingStore.

Feature Overview
Task | start date | end date | QA handoff | Target Milestone | Owner(s) | Feature ID | Status / Comments |
---|---|---|---|---|---|---|---|
Convert BackingStore SPI into an OSGi module | MS1 | NO | MS1 | Mahesh | HA-1 | DONE. | |
Provide no-op BackingStore and BackingStoreFactory for Metro | MS2 | NO | MS2 | Mahesh | HA-6 | Done. The persistence type for this BackingStoreFactory is 'noop' | |
Retain v2.1 HA configuration elements and attributes in domain.xml | 6/21 | NO | MS2 | Mahesh | HA-7 | Done. No configuration changes are expected between v2.1 and v3.1 | |
Implement HA BackingStore SPI using shoal Replication store | MS2 | NO | MS2 | Mahesh | HA-5 | Done. shoal-backing-store module acts as the adapter. | |
Add file-based SPI implementation | MS2 | NO | MS2 | Mahesh | Done. EJB Container uses the file store through the HA SPI | ||
Provide store and remove operations in SHOAL replication store | MS2 | NO | MS2 | Mahesh | HA-2 | DONE. The shoal replication store already supports these operations. | |
Allow ReplicaSelector to listen for failure / join events | MS2 | NO | MS2 | Mahesh Kannan | HA-2 & HA-4 | Done. Reacts to failure and join and ready events. | |
Implement a consistent hash algorithm to locate replica instance | MS3 | NO | MS3 | Mahesh | HA-2 & HA-4 | Done, but blocked by 12730 (which is an intermittent bug) | |
Implement load operation in SHOAL replication store | MS3 | NO | MS3 | Mahesh | HA-2 | Done | |
Implement load() operation to retrieve state from store | MS3 | NO | MS3 | Mahesh | HA-3 | Done. Again blocked by 12730 | |
Implement updateTimeStamp method in BackingStore | MS4 | NO | MS4 | ?? | HA-2 | Done. | |
Implement local caching of state in shoal replication store | MS4 | NO | MS4 | Mahesh | HA-12 | Done | |
Support local caching of passivated EJBs. Ensure passivated EJB state is available after server crash. | MS4 | NO | MS4 | Mahesh | HA-12 | Started | |
Implement BatchBackingStore for EJB | MS4 | NO | MS4 | Mahesh | HA-11 | Not started. | |
Work with Web container team & LB team to support replicaLocation cookie. Ensure better performance after server crash | MS4 | Yes | MS4 | Mahesh | HA-11 | This is really a web container task. Working with Rajiv and Kshitiz | |
Ensure that updates / save occur in the correct order in replica cache | MS5 | NO | MS5 | Joe Fialli | HA-2 | Not started. | |
Add support for batching replication messages using interceptors | MS5 | NO | MS5 | Mahesh | HA-10 | Not Started. Moving this from MS4 to MS5 as this is really a perf related feature | |
Allow ReplicaSelector to listen for rejoin events | MS5 | NO | MS5 | Joe Fialli | HA-2 & HA-4 | Not yet started. | |
Improve log messages for easy debugging | MS5 | NO | MS5 | Mahesh & Joe Fialli | HA-8 | Started. This is really an internal feature. Other modules will not directly depend on any of the log messages |
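
The ReplicaSelector described under High Level Features, together with the consistent hash task listed in the table, can be illustrated with a minimal sketch. This is not the actual GlassFish or Shoal implementation: the class name `ConsistentHashReplicaSelector`, its methods `addInstance`, `removeInstance`, and `selectReplica`, the use of MD5, and the virtual-node count are all assumptions made for illustration. In a real deployment the add/remove calls would presumably be driven by GMS join, ready, and failure notifications; no GMS API is used here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.TreeMap;

/** Hypothetical sketch of a consistent-hash replica selector (not the real ReplicaSelector). */
final class ConsistentHashReplicaSelector {
    // Assumed value: more virtual nodes spread keys more evenly across instances.
    private static final int VIRTUAL_NODES = 64;

    // Hash ring: position on the ring -> name of the alive instance owning that position.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Would be called when an instance joins or becomes ready. */
    synchronized void addInstance(String instanceName) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(instanceName + "#" + i), instanceName);
        }
    }

    /** Would be called when an instance fails or is stopped. */
    synchronized void removeInstance(String instanceName) {
        ring.values().removeIf(instanceName::equals);
    }

    /**
     * Maps a session key to an alive instance other than the local one.
     * Returns null when no other alive instance is known.
     */
    synchronized String selectReplica(String sessionKey, String localInstance) {
        long position = hash(sessionKey);
        // Walk clockwise from the key's position, then wrap around to the start of the ring.
        for (String candidate : ring.tailMap(position, true).values()) {
            if (!candidate.equals(localInstance)) {
                return candidate;
            }
        }
        for (String candidate : ring.headMap(position, false).values()) {
            if (!candidate.equals(localInstance)) {
                return candidate;
            }
        }
        return null;
    }

    /** Uses the first 8 bytes of an MD5 digest as a 64-bit ring position. */
    private static long hash(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            long result = 0;
            for (int i = 0; i < 8; i++) {
                result = (result << 8) | (digest[i] & 0xFF);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }
}
```

With this approach, any instance that knows the same set of alive members computes the same replica for a given session key, so the instance handling a request after failover can locate the replica without a lookup table. When an instance is removed, only the keys that mapped to it move to another instance.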