GlassFish 3.1 Load Balancer Plugin One Pager

1. Introduction

1.1. Project/Component Working Name

GlassFish 3.1/Load Balancer Plugin

1.2. Name(s) and e-mail address of Document Author(s)/Supplier

Kshitiz Saxena : kshitiz.saxena@sun.com

1.3. Date of This Document

Date        Revision  Comments                                                                                                              Author
2010-05-20  1.0       Initial draft                                                                                                         Kshitiz Saxena
2010-06-09  1.1       Incorporated review comments from Joe and Mahesh. Also added an option based on a consistent hash algorithm. All changes are marked in blue.   Kshitiz Saxena
2010-06-22  1.2       Incorporated suggestions from Nazrul for upgrade and the default values of lb-enabled. All changes are marked in a shade of blue.              Kshitiz Saxena
2010-07-05  1.3       Incorporated ASARCH review comments. All changes are marked in a shade of blue.                                        Kshitiz Saxena
2010-08-31  1.4       Incorporated CCC review comments. All changes are marked in a shade of blue.                                           Kshitiz Saxena

2. Project Summary

2.1. Project Description

In GlassFish 3.1, support for clustering of application server instances is being introduced; thus there is an evident need for a load-balancer to front-end the cluster of instances. GlassFish 2.1.1 already had a load-balancer, and the same will be leveraged for GlassFish 3.1. The load-balancer for GlassFish is a native plugin which needs to be installed on a web server. After the load-balancer plugin is installed on the web server and configured, the web server will distribute requests across the cluster of GlassFish instances and handle fail-over, among many other features. A variety of web servers are supported by the load-balancer plugin. The list includes Sun Java System Web Server, Apache HTTP Server and Internet Information Services (IIS). Being an external component installed on a web server, it requires no functional changes to work with GlassFish 3.1.

The load-balancer plugin gathers information on the back-end cluster using an XML file usually referred to as the load-balancer xml. The load-balancer xml needs to be generated by the GlassFish admin framework. The commands existing in GlassFish 2.1.1 will be supported in GlassFish 3.1 to provide the same user experience.

The load-balancer plugin needs to be installed on the web server, and the web server then needs a certain amount of configuration. The process is tedious and error-prone if done manually. An installer which automates most of the work required for load-balancer plugin installation and configuration will be provided to the user.
The session replication framework in GlassFish is based on a replica partner. So, in case an instance handling a session goes down, it is preferable to fail over the request to the instance which acts as its replica. This feature will enable higher throughput even in case of instance failure with session replication enabled.

2.2. Risks and Assumptions

The load-balancer plugin pushes certain information as proxy headers along with the request. The GlassFish implementation must be able to interpret these headers and populate the request object appropriately. Also, the GlassFish web container is required to stamp the sticky information provided by the load-balancer plugin in a proxy header, either as a cookie or via url-rewriting, for correct functioning of the load-balancer plugin.
The new feature of preferred fail-over instance will require sharing information about the replica partner for a particular session.

3. Problem Summary

3.1. Problem Area

With the introduction of the clustering feature in GlassFish 3.1, there is a basic requirement for a load-balancer to front-end the cluster. A load-balancer is required to distribute incoming requests to the instances in the cluster to provide higher throughput. It needs to detect instance health and route requests only to healthy instances, thereby providing high availability.
In most production environments, a web-server tier front-ends the application-server tier. Such deployments can serve static content from the web-server tier itself, sending only requests for dynamic content to the application-server tier, thus increasing the system throughput. So a load-balancer which can be installed and configured on a web-server tier will be an ideal choice.
To provide high availability in the true sense, session data needs to be replicated to another instance to handle instance failure scenarios. The replica will hold the session information in case the instance currently handling the session goes down. This information can be retrieved by any instance in the cluster; however, as a further optimization, requests for that session can be failed over to the replica instance itself. This will provide higher throughput even in case of instance failure.

3.2. Justification

High Availability (HA) is one of the release drivers for GlassFish 3.1, and the load-balancer is one of its core components.

4. Technical Description

4.1. Details

4.1.1. Load-balancer plugin configuration

The load-balancer acts as a front-end to GlassFish clusters/instances. It distributes load to the configured GlassFish instances for the applications deployed on them. It gathers information on the back-end GlassFish instances from a file known as the load-balancer xml. The load-balancer xml provides the following information :

    1. The clusters front-ended by this load-balancer. It can front-end multiple clusters having heterogeneous deployment of applications.
    2. Instances in each cluster with following details :
      1. HTTP/HTTPS listeners for the instance
      2. State of the instance - enabled/disabled
      3. Weight associated with each instance. This is useful in case of deployment over heterogeneous systems.
    3. Applications deployed in each cluster with following details :
      1. Context root of the web application
      2. State of the application - enabled/disabled
    4. Health checker used to detect instance recovery
    5. HTTP request-uri used to ping failed instance
    6. Interval between health check pings
    7. Response time-out for health check ping

To generate a load-balancer xml, a load-balancer view needs to be defined in domain.xml. It is used to create an association between the load-balancer and clusters/standalone instances, among other details. Below is a snippet from a GlassFish 2.1.1 domain.xml. The same elements and structure will exist in GlassFish 3.1 as well.
 
The load-balancer configuration element for cluster and standalone instances is provided below :
 <lb-configs>
    <lb-config https-routing="false" monitoring-enabled="false" name="cluster-http-lb-config" reload-poll-interval-in-seconds="60" response-timeout-in-seconds="60" >
      <cluster-ref lb-policy="round-robin" ref="cluster2">
        <health-checker interval-in-seconds="30" timeout-in-seconds="10" url="/"/>
      </cluster-ref>
      <cluster-ref lb-policy="round-robin" ref="cluster1">
        <health-checker interval-in-seconds="30" timeout-in-seconds="10" url="/"/>
      </cluster-ref>
    </lb-config>
    <lb-config https-routing="false" monitoring-enabled="false" name="standalone-http-lb-config" reload-poll-interval-in-seconds="60" response-timeout-in-seconds="60" >
      <server-ref disable-timeout-in-minutes="30" enabled="true" lb-enabled="true" ref="standalone-instance">
        <health-checker interval-in-seconds="30" timeout-in-seconds="10" url="/"/>
      </server-ref>
    </lb-config>
  </lb-configs>
Note : A load-balancer config can have either server references or cluster references; it cannot have a mix of both. It is assumed that information on the instances within a cluster and the deployed applications can be retrieved by referencing the cluster configuration.

The load-balancer element, having a reference to the associated lb-config, is provided below :
 <load-balancers>
     <load-balancer lb-config-name="cluster-http-lb-config" name="cluster-http-lb" device-host="test.com" device-port="8080"/>
  </load-balancers>

This will be required for automatically pushing the load-balancer xml over the wire to the web server. The above element is not needed if the apply-http-lb-changes command is not supported.
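If supported, the load-balancer element above would typically be created with the create-http-lb command described later in this section. A minimal sketch is shown below; the option names follow the GlassFish 2.1.1 syntax and should be treated as assumptions until the GlassFish 3.1 command is finalized.

  asadmin create-http-lb --devicehost test.com --deviceport 8080 --target cluster2 cluster-http-lb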

Also, each server-ref will have an lb-enabled attribute to indicate whether the instance is enabled or disabled from the load-balancer perspective.
<server-ref disable-timeout-in-minutes="30" enabled="true" lb-enabled="true" ref="instance1"/>

Similarly, each application-ref will also have an lb-enabled attribute to indicate whether the application is enabled or disabled from the load-balancer perspective.
<application-ref disable-timeout-in-minutes="30" enabled="true" lb-enabled="true" ref="test-application"/>
 
The user can also assign a weight to a particular instance. This will be stored as an attribute on the server element.
<server config-ref="cluster1-config" lb-weight="400" name="instance1" node-agent-ref="nodeagent1"/>

Below is the list of commands to create the load-balancer view in domain.xml (an illustrative command sequence follows the table) :

Commands                      Details
create-http-lb-config         Creates the lb-config element with the provided values.
create-http-lb-ref            Creates a cluster-ref or server-ref under the lb-config element with the provided values.
create-http-health-checker    Creates a health-checker element with the provided values.
enable-http-lb-server         Sets the lb-enabled flag to true for the given instance. If a cluster name is used, the lb-enabled flag is set to true for all instances in the cluster.
enable-http-lb-application    Sets the lb-enabled flag to true for the given application.
delete-http-lb-config         Deletes the given lb-config.
delete-http-lb-ref            Deletes the given cluster-ref or server-ref. All instances need to be disabled for this command to execute successfully.
delete-http-health-checker    Deletes the given health-checker.
disable-http-lb-server        Sets the lb-enabled flag to false for the given instance. If a cluster name is used, the lb-enabled flag is set to false for all instances in the cluster.
disable-http-lb-application   Sets the lb-enabled flag to false for the given application.
configure-lb-weight           Configures the weight for a particular instance.
list-http-lb-configs          Lists all the lb-config elements.
create-http-lb                Creates the load-balancer element. It can create the lb-config and cluster-ref/server-ref and enable instances/applications in a single command. This command will be needed only if apply-http-lb-changes is supported.
delete-http-lb                Deletes the given load-balancer. This command will be needed only if apply-http-lb-changes is supported.
list-http-lbs                 Lists all the load-balancer elements. This command will be needed only if apply-http-lb-changes is supported.
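As an illustration, a typical sequence for front-ending cluster1 might look like the sketch below. It follows the GlassFish 2.1.1 option names, which should be verified against the command man pages.

  # create the lb-config and reference the cluster from it
  asadmin create-http-lb-config --responsetimeout 60 --httpsrouting=false cluster-http-lb-config
  asadmin create-http-lb-ref --config cluster-http-lb-config cluster1

  # configure health checking for the cluster
  asadmin create-http-health-checker --url "/" --interval 30 --timeout 10 --config cluster-http-lb-config cluster1

  # enable instances and an application for load balancing, and assign instance weights
  asadmin enable-http-lb-server cluster1
  asadmin enable-http-lb-application --name test-application cluster1
  asadmin configure-lb-weight --cluster cluster1 instance1=100:instance2=400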

The main purpose of creating the load-balancer view in domain.xml is to facilitate generation of the load-balancer xml consumed by the load-balancer plugin. Below is the list of commands to achieve this (a usage sketch follows the table) :

Commands                  Details
export-http-lb-config     Generates the load-balancer xml corresponding to the given lb-config. The user can provide either an lb-config name or a load-balancer name, and the xml is exported to the provided file name. This command will be further overloaded with a new target parameter, enabling the user to generate a load-balancer xml for a set of clusters/standalone instances without creating an lb-config or load-balancer element in domain.xml.
apply-http-lb-changes     Generates the load-balancer xml corresponding to the given load-balancer and pushes it over the wire to the configured web-server host name and port number. The web server requires some specific configuration for this feature to work. This command may not be supported in GlassFish 3.1.
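A usage sketch for export-http-lb-config, based on the GlassFish 2.1.1 syntax (the option name for the proposed target parameter is an assumption):

  # export the load-balancer xml for an existing lb-config
  asadmin export-http-lb-config --config cluster-http-lb-config /tmp/loadbalancer.xml

  # with the proposed target parameter, generate it directly for a cluster
  asadmin export-http-lb-config --target cluster1 /tmp/loadbalancer.xml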

The generated load-balancer xml must conform to glassfish-loadbalancer_1_2.dtd.
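For reference, a generated load-balancer xml has roughly the following shape. This is only an illustrative fragment modeled on the GlassFish 2.1.1 format; the DTD remains the authoritative definition.

  <loadbalancer>
    <cluster name="cluster1" policy="round-robin">
      <instance name="instance1" enabled="true" disable-timeout-in-minutes="30"
                listeners="http://host1:28080 https://host1:28181"/>
      <web-module context-root="/test-application" enabled="true"
                  disable-timeout-in-minutes="30" error-url=""/>
      <health-checker url="/" interval-in-seconds="30" timeout-in-seconds="10"/>
    </cluster>
    <property name="response-timeout-in-seconds" value="60"/>
    <property name="reload-poll-interval-in-seconds" value="60"/>
    <property name="https-routing" value="false"/>
  </loadbalancer>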

For more details on these commands, refer to the GlassFish 2.1.1 documentation.

In addition to supporting the above commands, there will be the following changes with respect to GlassFish 2.1.1 :

    1. The default value of the lb-enabled attribute for a newly created instance will be true. In GlassFish 2.1.1, the default value was false.
    2. The default value of the lb-enabled attribute for a newly deployed application will be true. In GlassFish 2.1.1, the default value was false.
    3. An additional parameter will be added to the asadmin create-instance command to enable the user to set the lb-enabled attribute to the desired value.
    4. An additional parameter will be added to the asadmin deploy command to enable the user to set the lb-enabled attribute to the desired value (see the illustration after this list).
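For illustration only, assuming the new parameter is named --lbenabled (the final option name is not specified in this document), the commands might look like:

  asadmin create-instance --cluster cluster1 --lbenabled=false instance3
  asadmin deploy --target cluster1 --lbenabled=false test-application.war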

Additional changes for backward compatibility are as follows :

    1. The auto-apply feature will not be supported in GlassFish 3.1.
      1. The asadmin command create-http-lb will still accept the autoapplyenabled argument. It will ignore its value and print a warning message.
      2. During upgrade, the autoapplyenabled attribute will be removed from the load-balancer element. A warning message will be logged to indicate the same.
    2. Properties upgraded to attributes for the load-balancer element
      1. The properties device-host and device-port have been upgraded to attributes. The upgrade tool must take care of these changes.
    3. Controlling lb-enabled default values
      1. Since the default value of the lb-enabled attribute for instances and applications has been changed from false to true, the user is provided with an option to control the default value using a system property. The system property used for this purpose is org.glassfish.lb-enabled-default. It needs to be defined on the DAS (see the sketch after this list). If defined, the provided value will be used as the default value for the lb-enabled attribute; if not defined, the default value of true will be used.
      2. An exception to the above behavior is that if all instances in a cluster have the lb-enabled attribute set to false, then a newly created instance will have its lb-enabled attribute set to false as well.
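A minimal sketch of defining the property on the DAS as a JVM option; whether the property is read as a JVM option or supplied some other way is an assumption here:

  asadmin create-jvm-options -Dorg.glassfish.lb-enabled-default=false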

4.1.2. Installer for load-balancer plugin

The load-balancer plugin needs to be installed on the web server, and the web server then needs to be configured. If configured correctly, the web server will use the plugin to handle requests. The installation and configuration is a tedious and error-prone task if done manually. To ease this process, a tool will be provided to enable users to install and configure the load-balancer plugin on the web server.
In GlassFish 2.1.1, a tool called GlassFish LoadBalancer Configurator was developed to provide the above-mentioned capability. The same tool will be leveraged for GlassFish 3.1 as well. It is an IzPack-based installer and requires Java to execute. It accepts user inputs and performs installation and configuration tasks based on them. The tool performs installation and configuration in a two-step process :

  1. In the first step, the load-balancer plugin is exploded and installed
  2. In the second step, the web server is configured to use the load-balancer plugin

The tool also provides post-installation steps, if any. The user can also generate an automation script which can be used for a silent install at a later point in time. It also provides an uninstall script, which can be used to remove the load-balancer configuration and plugin from the web server.
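As an example, an IzPack-based installer of this kind is typically launched interactively and later replayed silently using the generated automation script; the jar and script names below are placeholders, not the actual deliverable names.

  # interactive install and configuration
  java -jar glassfish-lb-configurator.jar

  # silent install replaying a previously generated automation script
  java -jar glassfish-lb-configurator.jar lb-auto-install.xml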


The tool will also provide support for upgrade. The user will be able to perform an upgrade from GlassFish 2.x to GlassFish 3.1 using this tool. It will detect that a load-balancer configuration already exists in the web server, and will only update the load-balancer plugin binary. In future, it can also be used to distribute new load-balancer plugin binaries containing bug fixes or new features.

4.1.3. Preferred fail-over instance

The load-balancer detects instance failure and fails over requests being serviced by that instance to another healthy instance, thus providing high availability. This newly selected instance is called the fail-over instance. In the current implementation of the load-balancer plugin, the selection of the fail-over instance is done using the round-robin algorithm. Since the round-robin algorithm is in general stateless, there is no preferred fail-over instance.

The session replication framework in GlassFish replicates a session to a partner to provide high availability of the session. The partner is known as the replica. In case of instance failure, the session can be restored on any instance from the replica. When a request is failed over to another instance, that instance needs to figure out the replica in order to load the session. Session replication now uses a consistent hash algorithm to identify the replica. In case the identified replica does not hold the session, it resorts to the broadcast mechanism to identify the replica. This will happen for all sessions being handled by the failed instance, resulting in a lot of network traffic and loss of throughput. However, if the load-balancer plugin takes an intelligent decision, based on some information available in the request, it can directly route the request to the replica instance. The replica instance can then load the session from its local cache without needing the broadcast mechanism. This will provide better performance and throughput even in case of instance failure.

Option 1 : A contract between the session replication framework and the load-balancer to identify the replica for a session
The replica information must be available in the incoming request to enable the load-balancer plugin to select that instance for handling session fail-over. This information can be present either as a cookie or as a parameter in the request-uri. The load-balancer plugin depends on the web container to stamp session stickiness information on the response, so that it is available in subsequent requests. It will now further depend on the web container to stamp the replica information on the response as well.

One important point to note here is that the information about the instance currently handling the session is not stamped in clear text. The load-balancer plugin actually generates a unique identification for each instance and uses that value instead of the instance name in clear text. Thus it will expect the replica instance information being stamped to be its unique identification, and that value must be the same as the one generated by the load-balancer plugin for that instance. Due to this constraint, there are two approaches to implement this feature.

Approach 1 - Load-balancer plugin selects the replica partner : When the load-balancer gets a new request (a request not belonging to any session), it will select an instance to service the request. It will also select another instance, using the same round-robin algorithm, to act as the replica partner. The replica instance name, both in clear text and as its unique identification, will be added as proxy headers on the request. The request will then be forwarded to the GlassFish instance. The session replication framework can extract the replica instance name from the proxy headers and use it as the replica partner. Also, the web container can extract the unique identification of the replica instance from the proxy header, and stamp that information either as a cookie or as a parameter on the uri (url-rewriting). Upon instance failure, the load-balancer will select the instance corresponding to the replica instance information available in the request to act as the fail-over instance. It will also select a new replica partner and add it as proxy headers on the request. In this approach, a replica partner will be selected for all new requests, even those which do not create any session.
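To make Approach 1 concrete, the request forwarded to the instance might carry the replica details roughly as sketched below. The header names are purely hypothetical placeholders; the actual proxy header names are defined by the load-balancer plugin implementation.

  GET /test-application/index.jsp HTTP/1.1
  Host: test.com:8080
  proxy-replica-instance: instance2    (hypothetical header: replica instance name in clear text)
  proxy-replica-id: a1b2c3d4           (hypothetical header: the plugin-generated unique identification)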

Approach 2 - Session replication framework selects the replica partner : The load-balancer plugin will not modify a new request in any manner. In case a session is created by the new request, the session replication framework will select an instance to act as the replica. This information is made available to the web container. The web container then generates a unique identification for that instance using the same mechanism used by the load-balancer plugin. This requires that the logic to generate the instance identification is duplicated in the web container as well, and it needs to be ensured that both implementations remain identical in the future. Upon instance failure, the load-balancer will select the instance corresponding to the replica instance information available in the request to act as the fail-over instance. It will perform a check whether the identified replica instance is within the cluster boundary. This will guard against malicious requests trying to move a session to another cluster, which would result in loss of the session. The session replication framework will then select a new replica partner.

Option 2 : Using a consistent hash algorithm in both the session replication framework and the load-balancer
Using a consistent hash algorithm across the load-balancer and the session replication framework is another option to handle this scenario. There will be no contract between the load-balancer and the session replication framework, and they can work independently of each other. However, both of them need to use identical implementations of the consistent hash algorithm.

This approach was used in SailFin and can be used here as well. The load-balancer uses the consistent hash algorithm to distribute incoming traffic. The consistent hash algorithm is stateless in nature and thus yields the same result for a given key. This implies that, for a given key, the load-balancer and the session replication framework will select the same instance. The session replication framework will use the same algorithm to select the replica partner.

The main drawback of this approach is that the distribution will not be as fair as with the round-robin mechanism. However, in SailFin it provided close to round-robin distribution.

4.1.4 Supported platforms


The platforms supported by the load-balancer plugin will be a subset of the platforms supported by GlassFish 3.1. It will continue to support the platforms already supported in GlassFish 2.1.1. As of now there is no plan to support any new platform. Below is the list of supported platforms.

  • Solaris 9/10
  • RHEL 3/4/5
  • Windows 2003 Advanced Server Edition

4.1.5 Supported web-servers


The load-balancer plugin will continue to support the web servers which were supported in GlassFish 2.1.1. Below is the list of supported web servers :

  • Sun Java System Web Server 6.1SPx/7.x
  • Apache HTTP server 2.0.x/2.2.x
  • Internet Information Services(IIS) 5.0/6.0

4.2. Bug/RFE Number(s)

N.A.

4.3. In Scope

All features described in this one pager are within the scope of this document.

4.4. Out of Scope

N.A.

4.5. Interfaces.

4.5.1 Public Interfaces

Interface                   Comments
asadmin commands            Newly added asadmin commands to create the load-balancer view in domain.xml and to generate the load-balancer xml
load-balancer xml           Load-balancer xml to be consumed by the load-balancer plugin
sun-loadbalancer_1_2.dtd    DTD for the load-balancer xml

4.5.2 Private interfaces

None

4.5.3 Deprecated/Removed Interfaces

None

4.6. Doc Impact

Load-balancer documentation will be part of the High Availability guide. Existing documentation from GlassFish 2.1.1 can be reused. Man pages will be required for the admin commands (CLI as well as GUI). Additional documentation will be needed for the preferred fail-over instance feature.

4.7. Admin/Config Impact

A set of new commands is being added to GlassFish 3.1 for creating the http load-balancer elements in domain.xml and for generating the load-balancer xml based on them. These commands will be available in both the CLI and the GUI.

4.8. HA Impact

Load-balancer is a core feature of high-availability.

4.9. I18N/L10N Impact

The i18n/l10n impact consists of making sure that the output from the new sub commands follows the patterns that are already established for administrative commands.

4.10. Packaging, Delivery & Upgrade

4.10.1 Packaging

The load-balancer plugin will be packaged as a jar file generated using IzPack.

4.10.2 Delivery

IzPack based bundle will be delivered as an add-on feature.

4.10.3 Upgrade and Migration

Since load-balancer is a standalone component outside GlassFish, there is no upgrade or migration requirement for it.

4.11. Security Impact

The load-balancer plugin is installed on the web server and thus utilizes the security framework of the web server without impacting it adversely.

4.12. Compatibility Impact

This feature is compatible with older versions of GlassFish. All features except the newly introduced preferred fail-over instance feature will continue to work.

4.13. Dependencies

4.13.1 Internal Dependencies

  1. The GlassFish 3.1 administration framework will be used to write all admin commands.
  2. Dependency on the web container to parse proxy headers and populate the request object appropriately.
  3. The web container is also responsible for stamping sticky information to maintain session stickiness. This needs to be enhanced to stamp the replica information as well.
  4. For the preferred fail-over instance feature to work correctly, a contract needs to be maintained between the load-balancer plugin and the replication framework.

4.13.2 External Dependencies

GlassFish LoadBalancer Configurator is an IzPack-based installer and configurator; thus it has a dependency on IzPack.

4.14. Testing Impact

The feature needs to be tested in a fashion similar to GlassFish 2.1.1. All functional test cases from GlassFish 2.1.1 need to be executed.
The GlassFish LoadBalancer Configurator provided to install and configure the load-balancer plugin needs to be exhaustively tested.
The preferred fail-over instance feature introduced in GlassFish 3.1 will require writing and executing a new set of test cases.

5. Reference Documents

1. GlassFish 2.1.1 documentation on load-balancer plugin
2. GlassFish 2.1.1 documentation on load-balancer administration
3. GlassFish 2.1.1 documentation on load-balancer admin commands
4. White paper on GlassFish load-balancer
5. Blog on GlassFish load-balancer plugin

6. Schedule

6.1. Projected Availability

Item Date/Milestone Feature-ID Description QA/Docs Handover Status / Comments
1 MS1 N.A. Load-balancer one pager describing features and implementation details No  
2 MS3 LBREC-001 Admin command to create load-balancer elements in domain xml Yes  
3 MS3 LBREC-002 Admin command to generate load-balancer xml Yes  
4 MS4 LBREC-004 Preferred fail-over instance Yes  
5 MS5 LBREC-003 Installer support for Load-balancer plugin Yes  
6 ? LBREC-005 Pushing load-balancer xml over the wire to web-server Yes This feature will be implemented if time permits.