1.6

GlassFish Server Open Source Edition 3.1

SSH Provisioning Design Specification

1.0 Overview

SSH Provisioning attempts to accomplish two things:

  1. Support the provisioning of a remote system (node) so that the user can use that system to host server instances for a cluster.
  2. Support the remote execution of cluster and instance lifecycle asadmin commands on provisioned nodes.

In simple terms SSH Provisioning attempts to provide some of the capability that was provided by the node agent in GF v2. The provisioning of a system is somewhat analogous to creating a node agent on a system. Once a system is provisioned it can participate in a cluster.

In the long term we would like the operation of provisioning a system to include the capability of installing GlassFish on the system (rfe 4372). But for 3.1 we will likely require users to install GlassFish on remote systems directly and the provisioning step will simply make the remote system known to the DAS. See the create-node command below for details.

From a technical standpoint the approach we are taking is to use SSH as a mechanism to execute local admin commands on remote nodes. So instead of the user logging into the remote node and executing the local command, we let them execute the remote version of the command via the DAS. Using the target of the command we can determine what node the target is running on and execute the command on the remote system using SSH.

This means that all the heavy lifting for these commands is performed by the local version of the command that is run (locally) on the remote system. Since SSH is an optional capability the user must always have the capability of running the remote command against the DAS, and if SSH is not operational subsequently run the local version of the command manually on the remote system.
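
The dispatch rule described above can be sketched as follows. This is an illustrative model only, written in Python for brevity rather than the Java the DAS would use. The function names choose_execution and describe_plan and the three mode labels are hypothetical, and treating the literal host name "localhost" as the test for a local node is a simplifying assumption.

```python
def choose_execution(node_host, ssh_operational):
    """Pick how a command aimed at a node is carried out.

    Hypothetical helper: models the rule in this section, not real DAS code.
    """
    if node_host == "localhost":
        # Node is co-located with the DAS, so no SSH is needed.
        return "local"
    if ssh_operational:
        # The DAS runs the local command on the remote node over SSH.
        return "ssh"
    # SSH is optional: the user runs the local command by hand.
    return "manual"


def describe_plan(local_command, node_host, ssh_operational):
    """Return a one-line description of what would happen."""
    mode = choose_execution(node_host, ssh_operational)
    if mode == "local":
        return f"DAS runs '{local_command}' locally"
    if mode == "ssh":
        return f"DAS runs '{local_command}' on {node_host} over SSH"
    return f"User must run '{local_command}' manually on {node_host}"
```

The "manual" branch is what keeps SSH optional: the remote command against the DAS still succeeds, and only the step on the node is left to the user.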

GlassFish v2 Features Not Supported by SSH Provisioning

As stated above the SSH capability in 3.1 is not a direct feature replacement for the nodeagent feature in v2. Specifically the following capabilities will not be supported by the SSH based solution in the 3.1 timeframe:

  1. Watchdog restart of failed server instances. Note that this capability can be achieved in many cases through the use of the GlassFish native service integration (create-service).
  2. Viewing of failed server log files. Note that the Logging subproject is considering a feature to collect remote instance log files for analysis if the SSH transport is available.

GlassFish 3.1 Features Only Available if SSH is Available

The following is the list of features that will be available if a GlassFish 3.1 deployment is configured to use SSH. Conversely if an administrator does not configure SSH then the following features will not be available:

  1. start-cluster for clusters consisting of instances remote to the DAS.
    • The non-SSH alternative is to run start-local-instance manually on each instance system – or rely on the native service integration to control the instance life cycle.
  2. start-instance for instances remote to the DAS.
    • The non-SSH alternative is to run start-local-instance manually on the instance system – or rely on the native service integration to control the instance life cycle.
  3. create-instance for instances remote to the DAS
    • The non-SSH alternative is to run create-local-instance manually on the instance system.

2.0 SSH Client Library

In order to use SSH in GlassFish we need a Java SSH client library. The core SSH client support will be provided by the Trilead-ssh2 library. Specifically the version from the Hudson branch:

http://github.com/hudson/trilead-ssh2

The Trilead-ssh2 library implements SSH Protocol Version 2. It supports SSH sessions (remote command execution and shell access), local and remote port forwarding, local stream forwarding, X11 forwarding, SCP and SFTP. According to the Hudson project the trilead-ssh2 implementation has been very solid. The trilead-ssh2 library will be downloaded as a binary from the hudson branch. Trilead-ssh2 is available under the BSD license.

On top of the trilead-ssh2 library we may need a thin API to provide convenience methods and isolate the trilead-ssh2 dependencies from the rest of GlassFish. This API would provide a way to conveniently invoke remote commands and get progress and status from them. It also would provide a method to determine if SSH is available for use so that the CLI and Console can provide appropriate user feedback and error messages when SSH is not available/configured.
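
As a rough sketch of what such a thin API might look like, the following Python model shows a facade with an availability check and a command runner. Every name here (RemoteLauncher, CommandResult, UnconfiguredLauncher, run_or_explain) is hypothetical. It does not show the trilead-ssh2 API, and a real implementation would be written in Java and would delegate to that library.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class CommandResult:
    exit_status: int
    output: str


class RemoteLauncher(ABC):
    """Hypothetical facade that hides the SSH client library."""

    @abstractmethod
    def is_available(self) -> bool:
        """True if SSH is configured and usable for this node."""

    @abstractmethod
    def run(self, command: str) -> CommandResult:
        """Run a command on the node and report its status and output."""


class UnconfiguredLauncher(RemoteLauncher):
    """Stand-in for a node where SSH has not been set up."""

    def is_available(self) -> bool:
        return False

    def run(self, command: str) -> CommandResult:
        raise RuntimeError("SSH is not configured for this node")


def run_or_explain(launcher: RemoteLauncher, command: str) -> str:
    """Give the caller (CLI or Console) either output or a clear message."""
    if not launcher.is_available():
        return f"SSH is not available; run '{command}' manually on the node."
    result = launcher.run(command)
    return f"exit status {result.exit_status}: {result.output}"
```

Keeping callers on the facade means only one class would need to change if the underlying SSH library were replaced.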

3.0 Configuring SSH

SSH must be configured by the user before attempting to configure a cluster. GlassFish will not provide any tools for assisting the user in configuring SSH.

If SSH is not configured then the ability to remotely provision a node and create/control remote server instances (including start-cluster/stop-cluster) will not be available. The user will need to perform these operations manually on the remote systems.

4.0 SSH Authentication

When using SSH to execute GlassFish commands on a remote system the DAS must authenticate to SSHD on that remote system. SSH supports several authentication schemes. At a minimum we plan on supporting SSH public key authentication with unencrypted key files. If time permits we will add support for encrypted key files as well as username/password authentication.

In order to authenticate to SSHD the following needs to be passed to the SSH client library:

  1. username
  2. key file location if using key authentication
  3. key file passphrase if key file is encrypted with a passphrase
  4. user's SSH password if using password authentication

Supporting simple public key authentication is fairly straightforward. After the user has set up SSH, generated their key pair (with no passphrase protection), and installed their public key on the servers they specify the username and keyfile location when they create the Node objects that represent the servers (see create-node-ssh below). This information is stored in domain.xml and can then be used by the DAS to authenticate to the SSH service.

For encrypted key files the DAS needs access to the passphrase that was used when the SSH key was generated. This passphrase needs to be in the clear when passed to the SSH client library. The plan is to leverage the GlassFish keystore and password aliases. Basically the user must create a password alias for the keyfile passphrase, and provide that alias when the node is created. The DAS can then retrieve the passphrase from its keystore.

To support password authentication a similar approach would be used. The user creates a password alias for their SSH password and provides this alias when the node is created. The DAS can then retrieve the SSH password from its keystore when it needs to authenticate to SSH on that node.

We do not plan on supporting interactive prompting of the user's password when an asadmin command is run.
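
The alias lookup described in this section can be modeled roughly as below. This is a Python sketch under stated assumptions: SshAuthConfig and resolve_credentials are hypothetical names, and a plain dictionary stands in for the DAS keystore that holds password aliases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SshAuthConfig:
    """Values stored with the node (hypothetical shape)."""
    username: str
    keyfile: Optional[str] = None
    passphrase_alias: Optional[str] = None
    password_alias: Optional[str] = None


def resolve_credentials(config, aliases):
    """Return the clear-text values to hand to the SSH client library.

    `aliases` is a dict standing in for the DAS keystore; a missing
    alias raises KeyError.
    """
    creds = {"username": config.username}
    if config.keyfile:
        creds["keyfile"] = config.keyfile
        if config.passphrase_alias:
            # Encrypted key file: look the passphrase up by its alias.
            creds["passphrase"] = aliases[config.passphrase_alias]
    if config.password_alias:
        # Password authentication: look the password up by its alias.
        creds["password"] = aliases[config.password_alias]
    return creds
```

Only aliases are kept with the node; the clear-text secrets exist only in the keystore and in the value returned to the caller.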

5.0 GlassFish asadmin Authentication

After authenticating to SSH we will be executing asadmin commands on the node. The assumption is that we will only need to execute "local only" commands on the node such as _create-instance-filesystem and start-local-instance. Since these are local only commands they do not require admin authentication, therefore the assumption is that we do not need to pass admin credentials to these invocations of asadmin.

If it turns out this assumption is wrong, then we would likely use the asadmin --passwordfile option to pass admin credentials to the asadmin command executed on the node. In order to do that we need to know the admin user and admin password in clear text. Currently the DAS does not have this information. The Dynamic Reconfiguration subproject has a similar requirement and is working on a solution. The hope is, if needed, we could use their solution.

6.0 Locating Remote Systems

In addition to authentication information we need the following whenever we want to execute a command on a remote system using SSH:

  • The host to execute the command on
  • The GlassFish install location to execute the command from

This information will be provided by the user when they provision the system using create-node and remembered in the domain configuration for each node. Since each server instance is associated with a node when the instance is created, we can always determine this info if we know the instance name.
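
A minimal sketch of that lookup chain, in Python for illustration: the instance name leads to its node, and the node supplies the host and install location. The function name locate and the dictionary shapes are hypothetical, and placing asadmin under bin/ in the install location is an assumption.

```python
import posixpath


def locate(instance_name, instance_nodes, nodes):
    """Map an instance name to (host, asadmin path) via its node.

    instance_nodes: instance name -> node name
    nodes: node name -> {"host": ..., "install_dir": ...}
    """
    node = nodes[instance_nodes[instance_name]]
    # Assumption: asadmin lives in bin/ under the install location.
    asadmin = posixpath.join(node["install_dir"], "bin", "asadmin")
    return node["host"], asadmin
```

For example, an instance on a node with host gf1.sfbay.sun.com and install location /export/gf would resolve to that host and /export/gf/bin/asadmin.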

7.0 Nodes

In GlassFish v2 there was a concept of a node agent, and all server instances were created as a child of a node agent.

In 3.1 the concept of a node agent has been generalized to that of a "node". A node represents a host + GlassFish installation. A node can be accessed via a connector. In 3.1 we will support one connector: the ssh connector. The ssh node also supports a degenerate case for a "local node" – that is the same node that the DAS is running on. In this case the ssh connector is not used as commands can be run locally by the DAS.

8.0 CLI

SSH support impacts a number of CLI operations. For now I've included both the impacted commands and other related lifecycle management commands here, but this section of the spec should be incorporated into the Basic Clustering spec.

create-node-ssh

This command is somewhat analogous to create-node-agent in v2. It provisions a system for use in a cluster that can be accessed using SSH. For 3.1 that provisioning is essentially creating a node entry in the DAS config. In the future the provisioning could include installing the GlassFish software, and creating a legacy style node agent if those are supported.

By leaving off the SSH options and giving localhost as the nodehost, this command can be used to create a node that is assumed to be co-located on the same host as the DAS. SSH is not required in that case since commands can be executed locally.

create-node-ssh
    --nodehost remote_host_name
    --installdir glassfish_home_path
    [--nodedir path_to_node_directory_on_nodehost]
    [--sshnodehost ssh_host_name]
    [--sshport port_number_for_ssh_on_nodehost]
    [--sshuser ssh_username]
    [--sshkeyfile ssh_key_file_path]
    [--sshpassphrase passphrase_alias]
    [--sshpassword password_alias]
    [--force={true|false}]
    node_name

--nodehost      Host name where instances will reside.

--installdir    Location of GlassFish installation on nodehost. 
                This is the upper glassfish installation directory.
                For example: /export/home/glassfish3

--nodedir       Path to a directory to contain the server instances that are
                created on the node. Defaults to install_dir/nodes. If a
                relative path is given it is considered relative to install_dir.

--sshport       Port number to connect to ssh server on. Defaults to 22.
                Ignored if nodehost is "localhost".

--sshuser       SSH username to use to access nodehost.
                Defaults to the user that the DAS process is running as and
                it is recommended that this is the same as the user running
                the DAS process.
                Ignored if nodehost is "localhost".

--sshkeyfile    SSH private key file for username.
                Defaults to ${user.home}/.ssh/id_[rd]sa
                (or appropriate location depending on platform).
                The path specified must be absolute and reachable by the DAS.
                The path may contain Java properties of the form ${prop.name}.
                The key file must be readable by the DAS. For this reason it
                usually works best if the DAS is running as the same user as
                that specified via "--sshuser".

--sshpassphrase The alias for the passphrase that was used to protect the
                user's key file. This is only required if the key file is
                protected with a passphrase. This alias must be created
                first using the "asadmin create-password-alias" command.
                This command lets the user save the actual SSH passphrase
                in the DAS keystore and give it an alias name that is used
                here.

--sshpassword   The alias for the user's SSH password if password authentication
                will be used. This alias must be created first using the
                "asadmin create-password-alias" command. This command lets
                the user save the actual SSH password in the DAS keystore
                and give it an alias name that is used here.

--force         If true force the creation of the node even if nodehost can't
                be accessed via SSH using the parameters provided. Usually if
                the host can't be contacted via SSH the command will fail with
                an error.

node_name       Name of node. Must be unique in domain. No other node, cluster
                or instance may use this name.
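
The ${prop.name} substitution mentioned for --sshkeyfile can be illustrated with the short Python sketch below. The function name expand_properties is hypothetical, a dictionary stands in for the DAS's Java system properties, and leaving unknown properties untouched is an assumption rather than specified behavior.

```python
import re

# Matches Java-style property references such as ${user.home}.
_PROPERTY = re.compile(r"\$\{([^}]+)\}")


def expand_properties(path, props):
    """Replace each ${name} in `path` with props[name].

    Unknown properties are left as-is (an assumption; the spec does not
    say what should happen in that case).
    """
    def lookup(match):
        return props.get(match.group(1), match.group(0))

    return _PROPERTY.sub(lookup, path)
```

With user.home set to /home/gfuser, the path ${user.home}/.ssh/id_rsa expands to /home/gfuser/.ssh/id_rsa.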

This will create a node in domain.xml that can be referenced by subsequent create-instance commands and can be accessed via SSH.

If the nodehost can't be accessed via SSH using the parameters supplied then the command will fail with a helpful error message. If --force=true then the command will give a warning and create the node instead of failing.
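
The effect of --force on node creation might be modeled as in the following Python sketch. The function create_node, its parameters, and the message text are hypothetical. The callable can_connect stands in for a real SSH connection test, and the literal "localhost" check for a local node is a simplification.

```python
def create_node(name, nodehost, existing_names, can_connect, force=False):
    """Return (node, warnings) or raise if the node cannot be created.

    existing_names: names already used by nodes, clusters or instances.
    can_connect: callable(host) -> bool, a stand-in for an SSH check.
    """
    if name in existing_names:
        raise ValueError(f"name '{name}' is already in use in the domain")

    warnings = []
    if nodehost != "localhost" and not can_connect(nodehost):
        message = f"could not reach {nodehost} via SSH"
        if not force:
            # Default behavior: fail and leave the configuration unchanged.
            raise ConnectionError(message)
        # --force=true: warn, but create the node anyway.
        warnings.append(message)

    return {"name": name, "node-host": nodehost}, warnings
```

This keeps the failure path and the forced path on the same check, so the only difference --force makes is whether an unreachable host is an error or a warning.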

--sshpassphrase and --sshpassword will only be supported as time allows. Note that the SSH client will attempt all forms of authentication before failing a connection. XXX Is this true?

Examples:

create-node-ssh mynode

This creates a local node – one that represents the same host and GlassFish installation used by the DAS. In this case SSH will not be required nor used since all commands can be performed locally.

create-node-ssh
    --nodehost glassfish1.sfbay.sun.com
    --installdir /export/apps/glassfish3
    glassfish1

Creates a node named "glassfish1". The host for this node is glassfish1.sfbay.sun.com and GlassFish is installed in /export/apps/glassfish3 on that host. SSH will be used to communicate with glassfish1.sfbay.sun.com using the username of the user that is running the DAS and using key authentication with the key in ~username/.ssh/id_[rd]sa. If glassfish1.sfbay.sun.com cannot be contacted by the DAS using SSH then the command will fail with a helpful error message.

create-node-ssh
    --nodehost host-1.glassfish.org 
    --sshuser otheruser 
    --sshkeyfile ~/.ssh/id_dsa 
    --installdir /scratch/otheruser/gf/glassfish3 
    newHost

Creates a node named "newHost" on the remote system host-1.glassfish.org on which you use the username "otheruser" to connect using ssh. You need to specify --sshkeyfile; otherwise GlassFish will try to find the key file under the directory for otheruser which might not exist on the current system where you execute the create-node-ssh command.

update-node-ssh

Updates a config node to an SSH node, or updates the values in an SSH node. Usage is the same as create-node-ssh except that --nodehost and --installdir are optional. If any server instances have been created on the node then you cannot update installdir or nodedir.

update-node-ssh
    [--nodehost remote_host_name]
    [--installdir glassfish_home_path]
    [--nodedir path_to_node_directory_on_nodehost]
    [--sshnodehost ssh_host_name]
    [--sshport port_number_for_ssh_on_nodehost]
    [--sshuser ssh_username]
    [--sshkeyfile ssh_key_file_path]
    [--sshpassphrase passphrase_alias]
    [--sshpassword password_alias]
    [--force={true|false}]
    node_name

create-node-config

create-node-config
    [--nodehost remote_host_name]
    [--installdir glassfish_home_path]
    [--nodedir path_to_node_directory_on_nodehost]
    node_name

Creates a node placeholder in the DAS for use by create-instance. It is assumed that the user will subsequently run create-local-instance directly on the node, which will complete the population of the node with the appropriate nodehost and installdir values. This is for cases where the user chooses not to make SSH available for communicating with remote nodes.

Does not depend on SSH.

update-node-config

Updates the values in a config node, or updates an SSH node to be a basic config node. Usage is the same as create-node-config. If any server instances have been created on the node then you cannot update installdir or nodedir.

update-node-config
    [--nodehost remote_host_name]
    [--installdir glassfish_home_path]
    [--nodedir path_to_node_directory_on_nodehost]
    node_name

delete-node-ssh

delete-node-ssh
    node_name

Removes node from domain.xml. Requires that no server instances exist that reference this node (i.e. you must remove the server instances first before using delete-node-ssh).

Does not depend on SSH.

delete-node-config

delete-node-config
    node_name

Removes node from domain.xml. Requires that no server instances exist that reference this node (i.e. you must remove the server instances first before using delete-node-config).

Does not depend on SSH.

list-nodes

list-nodes [--verbose] [target]

--verbose      Provide more detailed human readable output

Where target is one of:

  • domain: Lists all nodes in the domain. This is the default.
  • cluster_name: Lists all nodes associated with the cluster
  • instance_name: Lists node associated with the instance
  • node_name: Lists the named node

Does not depend on SSH.

XXX Output is TBD. Need to look at list-instances for target usage and verbose output format.

XXX Should alias to list-nodes-ssh?

create-instance

create-instance
   --node node_name | --nodeagent node_name
   [--config config_name | --cluster cluster_name]
   instance_name

Creates server entry in the DAS. If the node is localhost or if SSH is operational also executes _create-instance-filesystem on the node specified by "node_name".

If SSH is not operational and the node is remote only the local DAS configuration will be updated and the command will inform the user that they must execute the create-local-instance command on the remote system manually (giving them the full text of the command to execute and the system to execute it on).

instance_name must be unique within the domain and may not be used by any other instance or cluster or node.

For backwards compatibility with v2, --nodeagent will be supported as an alias for --node.

Command called on node via SSH: _create-instance-filesystem

delete-instance

delete-instance
   instance_name

Removes server instance config from the DAS and if SSH is operational (or node is local to the DAS) it removes the local server instance on the node by executing _delete-instance-filesystem on the node.

If the node is remote and SSH is not operational only the local DAS configuration will be updated and the command will inform the user that they must execute the local delete-local-instance command on the remote system (giving them the full text of the command to execute).

Command called on node via SSH: _delete-instance-filesystem

start-instance

start-instance
   instance_name

Performs no modification to the DAS config. Looks up coordinates of instance (via the instance's associated node) and if node is remote uses ssh to run start-local-instance on the remote node. If node is local uses Runtime.exec().

If SSH is not operational and node is remote the command will inform the user that they must execute the start-local-instance command on the remote system (giving them the full text of the command to execute).

Command called on node via SSH: start-local-instance

stop-instance

stop-instance
   instance_name

Does not use SSH.

start-cluster

start-cluster
    cluster_name

Execute start-instance on each instance of the cluster. This depends on SSH via the use of start-instance. Returns clear status from each start-instance execution.
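
The per-instance status reporting could look something like the Python sketch below. The names start_cluster and start_instance and the status strings are hypothetical. The start_instance callable stands in for the real command and is assumed to raise an exception on failure.

```python
def start_cluster(cluster_name, clusters, start_instance):
    """Start every instance in a cluster and report each outcome.

    clusters: cluster name -> list of instance names
    start_instance: callable(instance_name); raises on failure.
    """
    results = {}
    for instance in clusters[cluster_name]:
        try:
            start_instance(instance)
            results[instance] = "started"
        except Exception as exc:
            # One failure must not stop the remaining instances.
            results[instance] = f"failed: {exc}"
    return results
```

Collecting a result per instance, rather than stopping at the first error, is one way to give the user a clear status for each start-instance execution.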

stop-cluster

stop-cluster
    cluster_name

Does not use SSH.

9.0 Configuration

We need this information to access a node via SSH:

  • node_name
  • host_name
  • ssh_port
  • ssh_username
  • ssh_keyfile
  • glassfish_home_path

The proposal is to create a new element, the generic "node". A node can have zero or more connectors. In 3.1 we would support zero connectors or an ssh connector. In the future other connectors (like one to support legacy style node agents) could be supported.

Here is a domain.xml snippet

<server . . . name="myinstance1"  node-ref="ssh_node">
. . .
</server>

<nodes>

  <!-- An SSH node -->
  <node name="ssh_node" node-host="gf1.sfbay.sun.com" install-dir="/export/gf">
    <ssh-connector port="22">
      <ssh-auth type="key" username="${user.name}" keyfile="${user.home}/.ssh/id_dsa" />
    </ssh-connector>
  </node>

  <!-- A localhost node. No connector needed. -->
  <node name="localhost" node-host="localhost" install-dir="${com.sun.aas.installRoot}">
  </node>

  <!-- A localhost node from a different install location and agentdir. -->
  <node name="mylocalnode2" node-host="localhost" install-dir="/export/apps/gf" node-dir="/export/myagentdir">
  </node>

  <!-- Placeholder node after create-node-config -->
  <node name="myconfignode" node-host="" install-dir="">
  </node>

  <!-- Node after create-local-instance run directly on node -->
  <node name="mynode3" node-host="gf1.sfbay.sun.com" install-dir="/export/gf1">
  </node>

  <!-- Node converted from v2 node-agent node -->
  <node name="v2node" node-host="gf2.sfbay.sun.com" install-dir="${com.sun.aas.installRoot}">
    <ssh-connector port="22">
      <ssh-auth type="key" username="${user.name}" />
    </ssh-connector>
  </node>

</nodes>

Attribute                        Required/Optional  Default                             Comment
node.name                        Required           --                                  Name of node
node.host                        Required           --                                  Node's host. If empty string then node is a placeholder
node.installdir                  Optional           --                                  GlassFish install location on node. This is the upper
                                                                                        GlassFish installation directory like
                                                                                        /export/home/glassfish3. Needed for ssh nodes.
ssh-connector                    Optional           --
ssh-connector.port               Optional           22
ssh-connector.ssh-auth.type      Required           --
ssh-connector.ssh-auth.username  Optional           user the DAS process is running as
ssh-connector.ssh-auth.keyfile   Optional           ~username/.ssh/id_[rd]sa
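
To illustrate how these defaults might be applied when reading a node element, here is a small Python sketch using the standard xml.etree.ElementTree module. The function read_node and the shape of its return value are hypothetical. It follows the element and attribute names used in the domain.xml snippet, does not expand ${...} properties, and leaves keyfile as None to mean "use the default location".

```python
import xml.etree.ElementTree as ET


def read_node(xml_text, das_user):
    """Parse one <node> element and fill in the documented defaults."""
    node = ET.fromstring(xml_text)
    info = {
        "name": node.get("name"),
        "host": node.get("node-host"),
        "install_dir": node.get("install-dir"),
        "ssh": None,  # no connector: local or placeholder node
    }
    connector = node.find("ssh-connector")
    if connector is not None:
        auth = connector.find("ssh-auth")
        if auth is None or auth.get("type") is None:
            raise ValueError("ssh-auth with a type is required")
        info["ssh"] = {
            "port": int(connector.get("port", "22")),
            "type": auth.get("type"),
            # Default: the user the DAS process is running as.
            "username": auth.get("username", das_user),
            # None means the default key file location applies.
            "keyfile": auth.get("keyfile"),
        }
    return info
```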

10.0 domain.xml Upgrade

When upgrading a v2 domain.xml the following transformations take place:

  • node-agent-ref on server instance becomes node-ref
  • node-agents becomes nodes
  • node-agent becomes node
    • name is kept
    • client-hostname property on jmx-connector becomes node-host attribute on node
    • install-dir attribute on node is set to com.sun.aas.installRoot. I.e. the same install location as the DAS uses.
    • agent-dir becomes node-dir.
    • If the client-hostname is the same host as the DAS host then the node is a normal config node, otherwise the nodeagent was remote and the ssh-connector element is added with standard default SSH values.
    • all other node-agent attributes and sub-elements are discarded

This transforms a v2 node agent to a node of the form:

  <node name="v2node" node-host="gf2.sfbay.sun.com" install-dir="${com.sun.aas.installRoot}">
    <ssh-connector port="22">
      <ssh-auth type="key" username="${user.name}" />
    </ssh-connector>
  </node>

So we attempt to have a functioning SSH node. If these values are not correct then they will need to update them with update-node-ssh.
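
The transformation rules above can be summarized in a short Python sketch. It works on plain dictionaries rather than real domain.xml, so the input keys ("name", "client-hostname", "agent-dir") are a hypothetical flattening of the v2 node-agent element. The function name upgrade_node_agent is also hypothetical, and comparing host names by string equality is a simplification of "the same host as the DAS host".

```python
def upgrade_node_agent(agent, das_host):
    """Turn a flattened v2 node-agent description into a 3.1 node."""
    node = {
        "name": agent["name"],  # name is kept
        "node-host": agent["client-hostname"],
        # Same install location as the DAS uses.
        "install-dir": "${com.sun.aas.installRoot}",
    }
    if "agent-dir" in agent:
        node["node-dir"] = agent["agent-dir"]
    if agent["client-hostname"] != das_host:
        # The node agent was remote: add default SSH settings.
        node["ssh-connector"] = {"port": "22", "username": "${user.name}"}
    # All other node-agent attributes and sub-elements are discarded.
    return node
```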

11.0 CLI Compatibility with v2

create-instance

The 3.1 create-instance will accept "--nodeagent" as an alias for "--node".

create-node-agent-config

Mapped to "create-node-config"

delete-node-agent-config

Mapped to "delete-node-config"

create-node-agent

This command will fail because it is no longer supported.

delete-node-agent

This command will fail because it is no longer supported.

start-node-agent

This command will fail because it is no longer supported.

stop-node-agent

This command will fail because it is no longer supported.

list-node-agents

This command will fail because it is no longer supported.

12.0 Usage Scenarios

3.1 SSH Scenarios describes some CLI usage scenarios.

3.1 SSH Authentication Options describes some CLI scenarios for the different authentication options.

13.0 Reference

Hudson Project: http://hudson-ci.org/