Discovery Service Options

The goal of the service is to act as a map of groups to masters. A group is identified by a group name and namespace, and the returned data describes the GMS member that is master, including its location (IP address, port, protocol). When an instance starts, it asks the discovery service whether the group already has a master. If not, the member puts its own information into the service and becomes group master. When another instance comes up, it retrieves the master's location and connects to the group.
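
The startup sequence above can be sketched as follows. This is only an illustration: InMemoryDiscovery is a hypothetical in-memory stand-in for the real service, the "namespace/groupName" key and the location strings are made up for the example, and the handling of two members racing to register is an assumption rather than settled GMS behavior.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for the discovery service (not GMS code).
class InMemoryDiscovery {
    // Key: "namespace/groupName"; value: location of the group's master.
    private final ConcurrentMap<String, String> masters = new ConcurrentHashMap<>();

    String getMaster(String groupKey) {
        return masters.get(groupKey);
    }

    // Registers a master only if the group has none; returns true on success.
    boolean putMaster(String groupKey, String location) {
        return masters.putIfAbsent(groupKey, location) == null;
    }
}

class StartupSketch {
    // Returns the location of the group's master once this member has started.
    static String findOrBecomeMaster(InMemoryDiscovery service, String groupKey, String myLocation) {
        String master = service.getMaster(groupKey);
        if (master != null) {
            return master;                      // a master exists: connect to it
        }
        if (service.putMaster(groupKey, myLocation)) {
            return myLocation;                  // no master yet: this member takes the role
        }
        return service.getMaster(groupKey);     // assumed: another member registered first
    }

    public static void main(String[] args) {
        InMemoryDiscovery service = new InMemoryDiscovery();
        System.out.println(findOrBecomeMaster(service, "ns/group1", "tcp://10.0.0.1:9090"));
        System.out.println(findOrBecomeMaster(service, "ns/group1", "tcp://10.0.0.2:9090"));
    }
}
```

Both calls in main return the first member's location, since the second member finds a master already registered.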

We currently have two options for this service:

  1. Our own REST-based service, which will need to live at a well-known address (WKA)
  2. Amazon S3, which can be used from anywhere with web access (it does not have to be used within AWS, though as I understand it that would be cheaper)

In either case, the bulk of GMS code will not need to know how the master information is stored. That logic is kept in a simple SPI used by the GMS runtime. However, there will be some slightly different requirements for the user as discussed below.

GF REST Discovery Service

The service is implemented as a RESTful web service running in a GlassFish instance. The actual implementation of the service can change (currently a Java web app using Jersey and requiring a full Java EE container), but the REST interface defined should not change.

It is expected that there is at least one instance of the service running, and later we may implement replication of data so that more than one instance can be used. To the GMS code, the instance is simply defined by a URI. The URI(s) of the discovery services are the only well-known addresses that we need. An implementation of the service could be running anywhere that is accessible through HTTP; it doesn't have to be in the same environment as the GMS group.

TBD:

  • Where this service will live within an installation. It could be a standalone process, but it would be nice if there were some existing process that this code could live in.
  • Currently we have no security built into the REST service. This is a TODO for later.
  • Replication from one instance of service to another (GMS already accepts more than one service URI to handle this case).

User Requirements:

  • Service must be deployed at a well-known address.
  • Location of the service is set in the GMS_DISCOVERY_REGISTRY_URI_LIST cluster property.
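
As an illustration of how the value of GMS_DISCOVERY_REGISTRY_URI_LIST might be turned into a list of service URIs, here is a minimal sketch. The class name RegistryUriList and the host names are made up for the example. Trimming whitespace and skipping empty entries are assumptions about how lenient the parsing should be, not a description of the current code.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name and the lenient parsing rules are assumptions.
class RegistryUriList {

    // Splits a comma-separated property value into URIs, preserving order.
    static List<URI> parse(String baseUrisString) {
        if (baseUrisString == null || baseUrisString.trim().isEmpty()) {
            throw new IllegalArgumentException("URI list must not be null or empty");
        }
        List<URI> uris = new ArrayList<>();
        for (String part : baseUrisString.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                // URI.create throws IllegalArgumentException on malformed input.
                uris.add(URI.create(trimmed));
            }
        }
        if (uris.isEmpty()) {
            throw new IllegalArgumentException("URI list contains no entries");
        }
        return uris;
    }

    public static void main(String[] args) {
        List<URI> uris = parse("http://host1:8080/discovery, http://host2:8080/discovery");
        for (URI uri : uris) {
            System.out.println(uri.getHost() + " on port " + uri.getPort());
        }
    }
}
```

Order is preserved because the client tries the services in the order they were listed.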

Amazon S3 as Discovery Service

Amazon S3 is storage that is available to a machine anywhere on the internet. It is a fee-based service, though the cost should be low for the amount of traffic that GMS would generate. The service offers both a REST and a non-REST interface, and a client library for accessing it already exists. Information is stored in "buckets," and each bucket is identified by a name that is unique within S3. The GMS runtime, as a client to the discovery service, shouldn't care whether the service is our own home-grown one or Amazon S3, but we may need to make some changes to accommodate the limitations of S3. For instance, S3 is "eventually consistent," so we may need to poll for changes after setting a master for a group. For more on the data consistency of S3, see the developer guide, starting on page 9.

TBD:

  • How to write a Jersey client to call S3. The alternative would be to use the Amazon 3rd party library.
  • How to handle the "eventually consistent" data model. A solution to this, such as having group masters periodically poll the service and take some action to resign as master, could solve other split-group issues as well.
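
One possible shape for the polling idea is sketched below. None of this is decided: the MasterConfirmation class, the Outcome values, and the retry parameters are hypothetical, and the read of the stored master is abstracted as a Supplier so the sketch does not depend on any S3 client. A member that has just written itself as master re-reads the entry until a value becomes visible, then checks whether it is still the master.

```java
import java.util.function.Supplier;

// Hypothetical sketch of confirming a master write under eventual consistency.
class MasterConfirmation {

    enum Outcome { CONFIRMED, LOST, UNKNOWN }

    // readMaster returns the member name currently stored for the group,
    // or null if no entry is visible yet.
    static Outcome confirm(Supplier<String> readMaster, String myName,
                           int maxAttempts, long delayMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String observed = readMaster.get();
            if (observed != null) {
                // Another member's name means this member should resign.
                return observed.equals(myName) ? Outcome.CONFIRMED : Outcome.LOST;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Outcome.UNKNOWN;
            }
        }
        return Outcome.UNKNOWN; // nothing became visible within the allowed attempts
    }

    public static void main(String[] args) {
        System.out.println(confirm(() -> "member1", "member1", 3, 10L));
        System.out.println(confirm(() -> "member2", "member1", 3, 10L));
    }
}
```

A single successful read is not proof under eventual consistency, so a master would presumably repeat this check periodically, which is also where the resign-as-master action would fit.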

User Requirements:

  • Location of S3 service can be included in the GMS_DISCOVERY_REGISTRY_URI_LIST property, signalling that GMS should use S3 for master discovery.
  • We will need to either generate an S3 bucket name or let the user supply one.
  • Security is already built into S3, but we need a way for a user to specify the access information for that account.

The Client

In this discussion, any GMS member is a client to the discovery service. See the one pager for more information on what a member does with the information. As to how a member uses the service, a client has been written that abstracts away the actual client-server protocol. The current client we use is:

/**
 * REST client for operations on a group's master. If created with
 * more than one URI, it will use all the services as noted in the method
 * documentation.
 */
public class MasterResourceClient {

    /**
     * Creates the client used to access discovery registries. The
     * constructor takes a comma-separated list of URIs that are
     * used in the business methods. It is expected that an instance
     * is only created when this string exists and is not empty.
     *
     * @param baseUrisString Comma-separated list of URIs for discovery
     * services, e.g. "http://server:port/contextroot".
     * @throws IllegalArgumentException if the passed in string is
     * empty or null.
     */
    public MasterResourceClient(String baseUrisString) {...}

    /**
     * Gets the master, if one exists. If not, returns null. This method
     * checks the services in the order their URIs were passed into the
     * constructor. It returns the first master found and stops checking
     * other services.
     *
     * If a GET call throws an exception other than for a 204 response, it
     * is logged and the next service is checked. (A 204 http response
     * indicates "No content," which happens when there is no master for
     * the given group id.)
     *
     * @param gid Id of the group.
     * @return The MemberInfo of the master, if there is one.
     * @throws IllegalArgumentException if the GroupId parameter
     * is null or contains null fields.
     */
    public MemberInfo getMaster(GroupId gid) {...}

    /**
     * Put the new master for a group. There can only be one
     * master for a group. If a master is already set, then this
     * call has no effect. It is expected that discovery services
     * replicate information among themselves. Therefore, this method
     * attempts to PUT to the first service in the URI list. If it
     * receives an error (such as a 404 returned for the service),
     * then it proceeds with the next on the list.
     *
     * @param mi The member to set.
     * @return True if the master was set, otherwise false.
     * @throws IllegalArgumentException if the MemberInfo parameter
     * does not contain a valid GroupId.
     */
    public boolean putMaster(MemberInfo mi) {...}

    /**
     * Remove the master for a group id. If this member is not the
     * master, the current master is not removed and this method
     * returns false. It is expected that discovery services
     * replicate information among themselves. Therefore, this method
     * attempts to DELETE to the first service in the URI list. If it
     * receives an error (such as a 404 returned for the service),
     * then it proceeds with the next on the list.
     *
     * The current logic for removing a member for a group is to
     * compare the member name to the current master. So only the
     * GroupId and memberName inside the MemberInfo are important.
     *
     * @param mi The member info to delete as master.
     * @return True if the member was deleted.
     * @throws IllegalArgumentException if the MemberInfo parameter
     * does not contain a valid GroupId.
     */
    public boolean deleteMaster(MemberInfo mi) {...}

    /**
     * Releases the resources held by the underlying client. Call this
     * when the client is no longer needed.
     */
    public void close() {
        client.destroy();
    }

}
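
The ordered lookup described for getMaster can be illustrated on its own. In this sketch the actual GET is abstracted as a Function from service URI to master name, so OrderedLookup and its fetch parameter are hypothetical and no Jersey classes are involved. A null result stands in for the 204 "No content" case, and a thrown exception stands in for any other failure.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of trying discovery services in order.
class OrderedLookup {

    // Returns the first master found, or null if no service reports one.
    static String firstMaster(List<URI> services, Function<URI, String> fetch) {
        for (URI service : services) {
            try {
                String master = fetch.apply(service);
                if (master != null) {
                    return master;          // stop at the first master found
                }
                // null models a 204 response: no master known to this service
            } catch (RuntimeException e) {
                // Log the failure and move on to the next service in the list.
                System.err.println("Lookup failed at " + service + ": " + e.getMessage());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<URI> services = Arrays.asList(
                URI.create("http://host1:8080/discovery"),
                URI.create("http://host2:8080/discovery"));
        // Pretend host1 is down and host2 knows the master.
        String master = firstMaster(services, uri -> {
            if ("host1".equals(uri.getHost())) {
                throw new IllegalStateException("connection refused");
            }
            return "member1";
        });
        System.out.println(master);
    }
}
```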

For my reference, here is the Jersey API

Model API

Snapshot of the model classes used by the client:

public class MemberInfo {
    private GroupId groupId;
    private Location location;
    private String memberName;
    private boolean isMaster;
}

public class GroupId {
    private String groupName;
    private String nameSpace;
}

public class Location {
    // todo: switch to java.net classes later if we want to
    private String ipAddress;
    private int port;

    // todo: make this an enum once we have the REST part working
    private String protocol;
}
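
Since the service maps groups to masters, whatever identifies a group needs value equality before it can be used as a map key. The sketch below shows one way to get that. GroupKey is a hypothetical stand-in that mirrors the two fields of GroupId; whether the real GroupId defines equals and hashCode is not shown in the snapshot above, so this is an assumption about what a map-backed implementation would need.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical value-type key mirroring GroupId's two fields.
final class GroupKey {
    private final String groupName;
    private final String nameSpace;

    GroupKey(String groupName, String nameSpace) {
        this.groupName = Objects.requireNonNull(groupName, "groupName");
        this.nameSpace = Objects.requireNonNull(nameSpace, "nameSpace");
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof GroupKey)) {
            return false;
        }
        GroupKey other = (GroupKey) o;
        return groupName.equals(other.groupName) && nameSpace.equals(other.nameSpace);
    }

    @Override
    public int hashCode() {
        return Objects.hash(groupName, nameSpace);
    }
}

class GroupKeyDemo {
    public static void main(String[] args) {
        Map<GroupKey, String> masters = new HashMap<>();
        masters.put(new GroupKey("group1", "ns"), "member1");
        // A separately constructed but equal key finds the same entry.
        System.out.println(masters.get(new GroupKey("group1", "ns")));
    }
}
```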