GlassFish User-Managed Clusters Design Spec

This is the design specification for the user-managed clusters feature for GlassFish.

Authors
Note: This is the second major revision of this design. See this page for the first revision.

Introduction

In the GlassFish 3.1 release, clusters and clustered instances are managed by a domain administration server (DAS). The DAS is required to create, list, start, stop, and delete a cluster and the instances that are in a cluster. Configuration information is synchronized with members of the cluster by the DAS.

The user-managed cluster feature provides the ability to create a cluster of instances without using a separate DAS to manage them. Responsibility for managing the cluster and its instances rests with the user, either by manually updating the configuration of each instance or by providing an external software system that does this management. Since the user provides the management for the cluster, the feature is called user-managed clusters.

The purpose of this document is to describe the design for the feature. This includes the following sections:
Additional high-level information about the requirements and design, such as packaging and i18n/l10n impact, is available in the one-pager/project page for this feature. That information is not duplicated in this document.

Requirements

The requirements for this feature are specified on the one-pager/project page for this feature.

General Design

The basic idea for the feature is to allow a DAS to join a cluster as a core member of the cluster. Each member of the cluster is its own domain, i.e., it has an entry in the "domains" directory and is started with the start-domain command. To simplify the implementation, the domain.xml does not have a <cluster> element for the cluster containing the DAS. Rather, information is added to other elements such as <server> and <group-management-service> to provide what the DAS needs to join a cluster. The DAS joins the GMS group as a core member rather than as an observer.

Domains, Clusters, and Instances

When using the user-managed cluster feature, the DAS for a domain can be a member of exactly one cluster. The DAS can still manage other clusters of which it is not a member. In 3.1, the DAS could not be a member of a cluster; this feature changes that.

To use the user-managed cluster feature, a user creates the DAS using the create-domain command. To make the DAS a member of a cluster, the cluster-member-name and cluster-name properties are set on the server.

    asadmin create-domain domain1
    asadmin start-domain
    asadmin set servers.server.server.property.cluster-member-name=domain1-instance1
    asadmin set servers.server.server.property.cluster-name=domain1-cluster
    // GMS configuration commands follow

The RuntimeType of an instance in a user-managed cluster will be DAS.
This means that all places that call isDas or isInstance to determine behavior that affects cluster membership will need to be changed so that they instead check whether the instance is a member of a cluster. A new Server.isClusteredDas method is available to make this check (see details in the isDas section below).

As with any other DAS, the files for a member of a user-managed cluster reside within a "domains" directory. Since all members of the cluster should have the same domain name, supporting multiple instances in a user-managed cluster requires multiple domains directories. The directory is specified on the create-domain command using the --domaindir option.

This feature does not guarantee that the configuration of the instances in a cluster stays in sync. Keeping it in sync is the responsibility of the user or of software provided by the user. When GlassFish instances communicate with one another, the software must handle potential errors caused by configuration data, including the list of deployed applications, that is not in sync.

Static vs. Dynamic Cluster Configuration

Some modules in GlassFish depend on knowing which instances are in a cluster. There are two ways of getting this information. The Cluster config bean contains static cluster configuration information, i.e., the information that is in the <cluster> element of the domain.xml. This includes instances that may be up or down. The GMS subsystem provides dynamic cluster configuration information, specifically the list of members that are up, or that have been up since this instance was started.

All places that need a list of cluster members will be made to use the dynamic cluster configuration information. The following classes, which call getCluster()/getClusterForInstance(..), depend on the static information from the <cluster> element and may need modification to get the information from the dynamic cluster configuration or from new properties of the Server config bean.

EJB Container
GMS
JMS
Load Balancer
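To make the distinction above concrete, here is a minimal sketch of a dynamic membership view: it only knows about members that have been reported as up since it was created, unlike the static <cluster> element. The class DynamicClusterView and its methods are hypothetical names chosen for illustration; they are not GlassFish or GMS APIs, and the way join and leave notifications arrive is assumed.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; not a GlassFish or GMS class.
final class DynamicClusterView {
    // Member name -> true while the member is up, false once it has left.
    private final ConcurrentHashMap<String, Boolean> members = new ConcurrentHashMap<>();

    // Assumed callback: the group service reports that a member joined.
    void memberJoined(String name) {
        members.put(name, Boolean.TRUE);
    }

    // Assumed callback: a member shut down or failed.
    // Members that were never seen are ignored.
    void memberLeft(String name) {
        members.replace(name, Boolean.FALSE);
    }

    // Members that are up now or have been up since this view was created.
    Set<String> knownMembers() {
        return new TreeSet<>(members.keySet());
    }

    // Members that are currently up.
    Set<String> liveMembers() {
        Set<String> live = new TreeSet<>();
        members.forEach((name, up) -> {
            if (up) {
                live.add(name);
            }
        });
        return live;
    }

    public static void main(String[] args) {
        DynamicClusterView view = new DynamicClusterView();
        view.memberJoined("domain1-instance1");
        view.memberJoined("domain1-instance2");
        view.memberLeft("domain1-instance2");
        view.memberLeft("domain1-instance3"); // never seen, so it stays unknown
        System.out.println("known: " + view.knownMembers());
        System.out.println("live:  " + view.liveMembers());
    }
}
```

A consumer that previously iterated over the static instance list would, under this assumption, iterate over knownMembers() or liveMembers() instead and tolerate members it has never heard of.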
Configuration changes for isDas() calls

There are numerous calls to isDas() throughout the code, but most of these do not need to be modified for a user-managed instance/clustered DAS. In the places where modifications are needed, new duck-typed methods on the Server config bean are required. To identify whether an instance is a clustered DAS, a new duck-typed method, isClusteredDas(), on the Server config bean is required. A new Server.getClusterMemberName method is used to obtain the cluster member name for an instance in a user-managed cluster.

    public static boolean isClusteredDas(Server server) {    // <----- CHANGE REQUIRED
        boolean isClusteredDas = false;
        if (isDas(server)) {
            // A DAS is clustered when the cluster-member-name property is set
            isClusteredDas = server.getProperty("cluster-member-name") != null;
        }
        return isClusteredDas;
    }

    public static String getClusterMemberName(Server server) {    // <----- CHANGE REQUIRED
        // Falls back to the server name when the property is not set
        return server.getPropertyValue("cluster-member-name", server.getName());
    }

The remainder of this section analyzes the calls to the isDas method throughout the server. No changes are needed in any of these areas except for those marked "CHANGE REQUIRED".

Admin

CommandRunnerImpl uses env.isDas(), together with whether the number of servers is greater than 1 or the cluster size is greater than 0, to ask 'Does this server require replication?'. There is only 1 server in a user-managed cluster, and the cluster element will not exist, so no change is required.

Application

ApplicationConfigListener uses server.isDas() to help determine if the current instance matches the target. In a user-managed cluster, the application is deployed directly to the instance, so the ApplicationRef will have Server as its parent. No change should be needed.
    private boolean isCurrentInstanceMatchingTarget(Object parent) {
        // DAS receive all the events, so we need to figure out
        // whether we should take action on DAS depending on the event
        if (parent instanceof ApplicationRef) {
            Object grandparent = ((ApplicationRef) parent).getParent();
            if (grandparent instanceof Server) {
                Server gpServer = (Server) grandparent;
                if (!server.getName().equals(gpServer.getName())) {
                    return false;
                }
            } else if (grandparent instanceof Cluster) {
                if (server.isDas()) {
                    return false;
                }
            }
        }
        return true;
    }

ApplicationLifecycle.getVirtualServers(String target) returns the virtual server "server" if the target is "domain" and we are running in DAS.

    private String getVirtualServers(String target) {
        if (env.isDas() && DeploymentUtils.isDomainTarget(target)) {
            target = "server";
        }

ApplicationLifecycle uses env.isDas() to load system applications on DAS. Currently, the following system applications are available under lib\install\applications: __admingui, __cp_jdbc_ra, __dm_jdbc_ra, __ds_jdbc_ra, __xa_jdbc_ra, jaxr-ra, jmsra, metro, ejb-timer-service-app.war, mejb.jar. Only __admingui is registered in the domain.xml. If loading system applications on a user-managed instance is acceptable, then no change is required.

If a system application needs to be loaded on start-up, ApplicationLoaderService will load the system app if the server is DAS or the system app is enabled. If loading a system app on a user-managed instance is acceptable, then no change is required.

If a system application, stand-alone resource adapter, or application is enabled, then the ApplicationLoaderService will load the app on the instance and also (partially) load it on DAS so the application information is available on DAS. The user-managed instance should have the app loaded when the app is enabled, so no change should be required.

Admin Console

AdminConsoleAdapter uses !env.isDas() to return early if it is not running on DAS. Admin Console would be able to run on a clustered DAS.
If we want to prevent Admin Console from running on a clustered DAS, a change may be required here.

Config API

Server.isDas() - If the server name remains as "server", then no change is required.

ConfigRefValidator.isValid(..) - Currently, the config-ref of DAS cannot be changed from "server-config". If the server name remains as "server", then no change is required.

ResourceUtil.getTargetsReferringResourceRef(String refName) - If the server isDas(), then SystemPropertyConstants.DAS_SERVER_NAME is added to the target list.

SystemPropertyConstants.DAS_SERVER_NAME/DEFAULT_SERVER_INSTANCE_NAME and DAS_SERVER_CONFIG, which equal "server" and "server-config", do not need to change if the server name remains as "server".

Cluster

Some commands (CreateInstanceCommand, ListInstancesCommand, RestartInstanceCommand, StartClusterCommand, StopInstanceCommand) use !env.isDas() to allow the command to run only on DAS. StartInstanceCommand and StopClusterCommand also use env.isDas() to run only on DAS. These commands are not relevant for a clustered DAS. If the commands do not need to be prevented from running on a clustered DAS, no change is required.

Deployment

The EnableCommand/DisableCommand on DAS will, if the target is a clustered instance, replicate the command to all instances in the cluster so they can update their configs. For a clustered DAS, ClusterOperationUtil.replicateCommand skips this replication for DAS, so no change is required.

EJB Container

EjbContainerUtilImpl uses !isDas() to set _doDBReadBeforeTimeout = true. In this case, !isDas() is asking 'Is _doDBReadBeforeTimeout default true?' On a clustered instance, the default is true, so the default for a user-managed instance should also be true. A change is required. Here the isDas() method on EjbContainerUtilImpl is modified to return false for a clustered DAS.
    if (!isDas()) {
        // On a clustered instance default is true
        _doDBReadBeforeTimeout = true;
    }

    public boolean isDas() {
        return (env.isDas() && !server.isClusteredDas()) || env.isEmbedded();    // <----- CHANGE REQUIRED
    }
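The effect of the modified isDas() expression can be checked case by case. The sketch below is a stand-alone model, not GlassFish code: the class EjbIsDasSketch is hypothetical, and its three boolean parameters stand in for env.isDas(), server.isClusteredDas(), and env.isEmbedded().

```java
// Hypothetical stand-alone model of the modified EjbContainerUtilImpl.isDas().
final class EjbIsDasSketch {
    // Same boolean expression as the modified isDas() in the text.
    static boolean isDas(boolean envIsDas, boolean clusteredDas, boolean embedded) {
        return (envIsDas && !clusteredDas) || embedded;
    }

    // _doDBReadBeforeTimeout defaults to true exactly when isDas() is false.
    static boolean dbReadBeforeTimeoutDefault(boolean envIsDas, boolean clusteredDas, boolean embedded) {
        return !isDas(envIsDas, clusteredDas, embedded);
    }

    public static void main(String[] args) {
        System.out.println("plain DAS:         " + dbReadBeforeTimeoutDefault(true, false, false));
        System.out.println("clustered DAS:     " + dbReadBeforeTimeoutDefault(true, true, false));
        System.out.println("clustered instance:" + dbReadBeforeTimeoutDefault(false, false, false));
        System.out.println("embedded:          " + dbReadBeforeTimeoutDefault(true, false, true));
    }
}
```

Under this model a clustered DAS gets the same default as a DAS-managed clustered instance, while a plain DAS and an embedded server keep the previous behavior.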
DistributedEJBTimerServiceImpl uses !isDas() to ask 'Is the server required to 1) register for the Planned Shutdown event, 2) set DB read before timeout to true, 3) register for transaction recovery events?'.

    public void postConstruct() {
        if (!ejbContainerUtil.isDas() || server.isClusteredDas()) {    // <----- CHANGE REQUIRED
            if (gmsAdapterService != null) {
                GMSAdapter gmsAdapter = gmsAdapterService.getGMSAdapter();
                if (gmsAdapter != null) {
                    // We only register interest in the Planned Shutdown event here.
                    // Because of the dependency between transaction recovery and
                    // timer migration, the timer migration operation during an
                    // unexpected failure is initiated by the transaction recovery
                    // subsystem.
                    gmsAdapter.registerPlannedShutdownListener(this);
                }
            }
            // Do DB read before timeout in a cluster
            setPerformDBReadBeforeTimeout(true);
            // Register for transaction recovery events
            recoveryResourceRegistry.addEventListener(this);
        }
    }

ReadOnlyBeanMessageCallBack uses !ejbContainerUtil.isDas() to ask 'Is this server required to 1) register as a GMS adapter Message Listener, 2) be set as the DistributedReadOnlyBeanNotifier?'.

    public void postConstruct() {
        if (!ejbContainerUtil.isDas() || server.isClusteredDas()) {    // <----- CHANGE REQUIRED
            if (gmsAdapterService != null) {
                GMSAdapter gmsAdapter = gmsAdapterService.getGMSAdapter();
                if (gmsAdapter != null) {
                    gms = gmsAdapter.getModule();
                    gmsAdapter.registerMessageListener(GMS_READ_ONLY_COMPONENT_NAME, this);
                    _readOnlyBeanService.setDistributedReadOnlyBeanNotifier(this);
                }
            }
        }
    }

GMS

GMSAdapterImpl uses isDas to determine the member type: spectator for DAS or core for non-DAS. A change is required to set the clustered DAS as a core member, not a spectator. (Technical Requirement High Availability 3.6 P1 GMS clusters: Support all methods that GMS provides for forming the cluster, i.e., multicast, non-multicast, etc.)

    private void readGMSConfigProps(Properties configProps) {
        configProps.put(MEMBERTYPE_STRING, (isDas && !server.isClusteredDas()) ? SPECTATOR : CORE);    // <--- CHANGE REQUIRED

GMSAdapterImpl uses isDas to determine whether it is a bootstrapping node, where DAS is a bootstrapping node. A bootstrapping node is a node that is used to bootstrap finding the cluster when multicast is not enabled. This is currently not being used in 3.x. The isDas call may not be the only way to determine if a self-managed member is considered a bootstrap node. But at this time we are not using the concept at all, so it is okay to just comment that info out and we will work on it when implementing non-multicast support. (from Joe F.)

    private void readGMSConfigProps(Properties configProps) {
        ...................................................
        case IS_BOOTSTRAPPING_NODE:
            configProps.put(keyName, isDas ? Boolean.TRUE.toString() : Boolean.FALSE.toString());
            break;

GMSAdapterImpl has the following, which may apply for a clustered DAS. The new check is parenthesized so that the handler is still registered only when testFailureRecoveryHandler is set.

    //fix gf it 12905
    if (testFailureRecoveryHandler && (!env.isDas() || server.isClusteredDas())) {    // <----- CHANGE REQUIRED
        // this must be here or appointed recovery server notification is not printed out for automated testing.
        registerFailureRecoveryListener("GlassfishFailureRecoveryHandlerTest", this);
    }

(Technical Requirement Administration 4.6 P2 GMS get-health: Support the get-health command on any instance in a user-managed cluster.) The HealthHistory constructor currently takes a Cluster object to iterate over the instances. We'll add another constructor for the clustered DAS case so that the instance name is passed in instead, and move the creation of the concurrent map out of the constructors. The current isDas() call can remain as-is.
    public HealthHistory(Cluster cluster) {
        // move this constructor to static initializer and leave out the size
        healthMap = new ConcurrentHashMap<String, InstanceHealth>(cluster.getInstances().size());
        for (Server server : cluster.getInstances()) {
            if (server.isDas()) {    // <----- LEAVE AS-IS
                continue;
            }
            // etc
        }
    }

    public HealthHistory(String instanceName) {
        // add instance to health table
    }

This will require a change in GMSAdapterImpl to create the HealthHistory object differently in the case of a user-managed cluster.

    if (cluster == null) {
        if (server.isClusteredDas()) {    // <----- ADD THIS CHECK AND NEW METHOD
            initializeHealthHistory(server.getClusterMemberName());
        } else {
            logger.log(Level.WARNING, "gmsservice.nocluster.warning");
            return false; //don't enable GMS
        }
    } else if (isDas) {
        // only want to do this in the case of the DAS
        initializeHealthHistory(cluster);
    }

The HealthHistory object is a GMS client and so is notified dynamically of any changes to the cluster. So it should be OK for the nth instance to come up with only itself in the table, but this will need to be tested. The HealthHistory object is also a listener for changes on the Cluster object when present; this won't happen in the clustered DAS case.

MBeanServer

DynamicInterceptor uses MbeanService.getInstance().isDas() to add "server" as a target. No change required.

JMS

JMSConfigListener uses thisServer.isDas() to ask 'Does this server not need to update the cluster broker list on the active JMS resource adapter when there is a config change event on the Server config?'. A change is required to allow the same behavior for a user-managed instance. The new isClusteredDas() check is parenthesized together with !isDas() so that the null checks on the server refs still apply to a clustered DAS.

    public UnprocessedChangeEvents changed(PropertyChangeEvent[] events) {
        ..........
        if (eventName.equals(ServerTags.SERVER_REF)) {
            String oldServerRef = oldValue != null ? oldValue.toString() : null;
            String newServerRef = newValue != null ? newValue.toString() : null;
            // instance has been deleted
            if (oldServerRef != null && newServerRef == null
                    && (!thisServer.isDas() || thisServer.isClusteredDas())) {    // <--- CHANGE REQUIRED
                _logger.log(Level.FINE, "Got Cluster change event for server_ref" + event.getSource() + " " + eventName + " " + oldServerRef + " " + newServerRef);
                String url = getBrokerList();
                aresourceAdapter.setClusterBrokerList(url);
                break;
            }//
        } // else skip
        if (event.getSource() instanceof Server) {
            _logger.log(Level.FINE, "In JMSConfigListener - recieved cluster event " + event.getSource());
            Server changedServer = (Server) event.getSource();
            if (thisServer.isDas() && !thisServer.isClusteredDas()) return null;    // <----- CHANGE REQUIRED
            if (jmsProviderPort != null) {
                String nodeName = changedServer.getNodeRef();
                String nodeHost = null;
                if (nodeName != null) nodeHost = domain.getNodeNamed(nodeName).getNodeHost();
                String url = getBrokerList();
                url = url + ",mq://" + nodeHost + ":" + jmsProviderPort;
                aresourceAdapter.setClusterBrokerList(url);
                break;
            }
        }

Security

AdminAccessController authenticate(GrizzlyRequest req) - With more recent changes to Grizzly, this check is going to be removed by another project. No change required.

    if (authenticator != null) {
        /*
         * If an admin request includes a large payload and secure admin is
         * enabled and the request does NOT include a client cert, then
         * the getUserPrincipal invocation can cause problems. So normally
         * the DAS will not look for a client cert. To override this, the user can
         * set org.glassfish.admin.DASCheckAdminCert=true but s/he should realize
         * that this can cause problems with large uploads if secure admin
         * is enabled and no client cert is present.
         */
        final Principal sslPrincipal = !env.isDas() || Boolean.getBoolean(DAS_LOOK_FOR_CERT_PROPERTY_NAME)
                ? req.getUserPrincipal() : null;
        return authenticator.loginAsAdmin(user, password, as.getAuthRealmName(), req.getRemoteHost(), authRelatedHeaders(req), sslPrincipal);
    }

Web Services

MetroContainer.isCluster() returns true if this is not DAS, not embedded, and GMS is enabled. MetroContainer will initialize the HA environment for a cluster if availability is enabled. Since a user-managed instance is a clustered DAS, a change is required for this initialization to happen for a user-managed instance. (Technical Requirement 3.3 High Availability P1 HA Messaging: Metro HA aspect must work.)

    public void postConstruct() {
        ................
        if (isCluster() && isHaEnabled()) {
            final String clusterName = gmsAdapterService.getGMSAdapter().getClusterName();
            final String instanceName = gmsAdapterService.getGMSAdapter().getModule().getInstanceName();
            HighAvailabilityProvider.INSTANCE.initHaEnvironment(clusterName, instanceName);
            logger.info("metro.ha.environemt.initialized");
        }
        ...............
    }

    private boolean isCluster() {
        return (!env.isDas() || server.isClusteredDas()) && !env.isEmbedded() && gmsAdapterService.isGmsEnabled();    // <----- CHANGE
    }

Command Changes

No command changes are needed to support creation and lifecycle for user-managed cluster instances, since all configuration for this is done through properties that can be set with the set command.
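Because membership is configured only through server properties, the membership checks described in this document reduce to property lookups. The following is a minimal sketch of that idea, under the assumption that the server's properties can be modeled as a plain Map; ClusterMembershipSketch and its methods are hypothetical and do not use the real Server config bean.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: server properties are represented as a plain Map.
final class ClusterMembershipSketch {
    static final String MEMBER_NAME = "cluster-member-name";
    static final String CLUSTER_NAME = "cluster-name";

    // A DAS counts as clustered once the cluster-member-name property is set.
    static boolean isClusteredDas(boolean isDas, Map<String, String> props) {
        return isDas && props.get(MEMBER_NAME) != null;
    }

    // Falls back to the server name when the property is not set.
    static String clusterMemberName(String serverName, Map<String, String> props) {
        String value = props.get(MEMBER_NAME);
        return value != null ? value : serverName;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        System.out.println("before set: clustered=" + isClusteredDas(true, props)
                + ", member=" + clusterMemberName("server", props));

        // Models the effect of the two "asadmin set" commands on the properties.
        props.put(MEMBER_NAME, "domain1-instance1");
        props.put(CLUSTER_NAME, "domain1-cluster");
        System.out.println("after set:  clustered=" + isClusteredDas(true, props)
                + ", member=" + clusterMemberName("server", props));
    }
}
```

In this model, removing the cluster-member-name property is enough to turn the server back into a plain DAS, which is consistent with the statement that no dedicated commands are needed.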