GlassFish Server Open Source Edition 3.2 - Platform Services One Pager (template version 1.92)
1. Introduction
1.1 Project/Component Working Name:
GlassFish Server Open Source Edition 3.2 Platform Services Improvements
1.2 Name and Email Address of Author
1.3. Date of This Document:
Started: 04/18/11
2. Project Summary
2.1. Project Description:
Platform Services Improvements
2.2. Risks and Assumptions:
Platform Services are, obviously, dependent on native code and tools for multiple operating systems. There are many flavors of Linux. We assume that the ancient common denominator of init.d scripts will work on all Linux deployments. Is it possible for a savvy and/or clueless user to foul up his or her operating system configuration so badly that our Platform Services cannot work? Yes! E.g. he or she can redefine all the standard run-levels. We will not support things like that.
Assumption:
3. Problem Summary
We need richer support for running the native tools on multiple platforms for handling the lifecycle of GlassFish servers.
3.1. Problem Area:
We currently support creating services on Windows, all versions of Linux, and Solaris 10+ SMF. We need to extend this to include all versions of UNIX including non-SMF Solaris. After creation the user is on his or her own and must use the platform's native tools to manage the services.
3.2. Justification:
It is important to do this in order to bring Platform Services closer to the Java ideal of "write once, run anywhere". Services for our product will be platform-independent, as far as the user is concerned.
4. Technical Description:
4.1. Details:
Linux and non-SMF UNIX will be done at the lowest common denominator: we manipulate scripts directly in the special services area of the file system. This mechanism is decades old, which makes it old-fashioned and very time-consuming to work with, but phenomenally well-tested, well-documented and robust. SMF will be worked with directly when available on the platform. Windows will be worked with directly. You can see in the existing code precisely how we interact with these 3 main implementation areas. Each area will be expanded to include the new features.
4.2. Bug/RFE Number:
4.3. In Scope:
4.3.1 New commands
New asadmin commands will be developed to join the already existing create-service command:
4.3.2 New Auto-Restart Behavior
Until now, a user who wanted a service to restart automatically had to set this up using the platform's native tools. We will now take this over internally. We will add a new Service Property (see create-service details below) that sets the number of times a restart is attempted after a crash. By default we will try up to 3 times to restart a server that crashes, and then give up.
4.4. Out of Scope:
We will not support the multiple ad hoc tools available on particular Linux versions. E.g. Ubuntu has one way of working with services and Red Hat has a completely different solution. They all share the ancient, tried-and-true solution.
4.5. Interfaces:
4.5.1 Public Interfaces
As mentioned before, the new public interfaces are the new asadmin commands discussed above.
4.5.1.1 create-service
For completeness, here is the usage for the existing command from 3.1. One change is planned: --force will be true by default instead of false.
4.5.1.1.1 AutoRestart option
RESTART_TRIES=number, where -1 means restart indefinitely, 0 means do not restart, and any other number means try that many times before giving up. Any of these commands resets the counter (as appropriate to the server type, of course):
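The RESTART_TRIES values described above can be sketched as a small counter. This is only an illustration of the stated semantics (-1, 0, or a positive limit, plus a reset), not GlassFish code: the class name RestartTracker and its methods are hypothetical, and which commands trigger the reset is left to the caller.

```java
/**
 * Hypothetical sketch of the RESTART_TRIES semantics described above.
 * Class and method names are illustrative and are not GlassFish APIs.
 */
final class RestartTracker {
    static final int INFINITE = -1;      // RESTART_TRIES=-1: always restart
    static final int DEFAULT_TRIES = 3;  // default stated in section 4.3.2

    private final int restartTries;
    private int attempts;                // restarts attempted since the last reset

    RestartTracker(int restartTries) {
        if (restartTries < INFINITE) {
            throw new IllegalArgumentException(
                    "RESTART_TRIES must be >= -1: " + restartTries);
        }
        this.restartTries = restartTries;
    }

    /** Called after a crash: returns true and counts the attempt if a restart is allowed. */
    boolean tryRestart() {
        if (restartTries == INFINITE) {
            return true;                 // never give up
        }
        if (attempts >= restartTries) {
            return false;                // limit reached (always the case for 0)
        }
        attempts++;
        return true;
    }

    /** Called when one of the counter-resetting commands runs. */
    void reset() {
        attempts = 0;
    }
}
```

With new RestartTracker(RestartTracker.DEFAULT_TRIES), three calls to tryRestart() succeed and the fourth is refused until reset() is called.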
4.5.1.2 delete-service
This command exists in 3.1 as the undocumented command named _delete-service.
4.5.1.3 list-services
This command will report on all GlassFish services that are discoverable, i.e. all services that were created by create-service.
4.5.1.4 start-service
The start-service command will start a service that is not currently running.
4.5.1.5 stop-service
The stop-service command will stop a service that is currently running.
4.5.1.6 restart-service
The restart-service command will restart a service that is currently running.
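The preconditions of the three lifecycle commands can be summarized as a tiny state model. This is a sketch under assumptions: the names ServiceState and ServiceLifecycle are hypothetical, and rejecting a command whose precondition does not hold is an assumption made here, since the text above does not say what happens in that case.

```java
/** Hypothetical model of a service's run state; not a GlassFish type. */
enum ServiceState { STOPPED, RUNNING }

/** Illustrative preconditions for start-service, stop-service and restart-service. */
final class ServiceLifecycle {
    private ServiceLifecycle() {}

    /** start-service: only valid for a service that is not currently running. */
    static ServiceState start(ServiceState current) {
        require(current == ServiceState.STOPPED,
                "start-service needs a service that is not running");
        return ServiceState.RUNNING;
    }

    /** stop-service: only valid for a service that is currently running. */
    static ServiceState stop(ServiceState current) {
        require(current == ServiceState.RUNNING,
                "stop-service needs a running service");
        return ServiceState.STOPPED;
    }

    /** restart-service: modeled as a stop followed by a start. */
    static ServiceState restart(ServiceState current) {
        return start(stop(current));
    }

    // Assumption: a failed precondition is reported as an error.
    private static void require(boolean ok, String message) {
        if (!ok) {
            throw new IllegalStateException(message);
        }
    }
}
```

For example, ServiceLifecycle.restart(ServiceState.RUNNING) yields RUNNING, while ServiceLifecycle.restart(ServiceState.STOPPED) is rejected in this sketch.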
4.5.2 Private Interfaces
No new private interfaces that are observable externally.
4.5.3 Deprecated/Removed Interfaces:
The asadmin command _delete-service will be removed. It is, by definition, not supported, so it is not really a public interface.
4.6. Doc Impact:
Rather extensive new documentation will be required for this feature.
4.7. Admin/Config Impact:
This change is part of Admin, so the impact has been discussed already above.
4.8. HA Impact:
None.
4.9. I18N/L10N Impact:
No impact.
4.10. Packaging, Delivery & Upgrade:
4.10.1. Packaging
No new packages will be necessary.
4.10.2. Delivery
No impact.
4.10.3. Upgrade and Migration:
No backward compatibility issues. We plan on requiring zero changes to domain.xml, so there are no upgrade issues there.
4.11. Security Impact:
Obviously there are huge security implications for running services on any platform. But there is nothing new to consider for this feature enhancement.
4.12. Compatibility Impact
Old interfaces remain the same. We are adding new commands and functionality, not modifying the existing behavior.
4.13. Dependencies:
4.13.1 Internal Dependencies
Common utilities.
4.13.2 External Dependencies
winsw.exe is a C# program for wrapping Java programs as Windows services. It is a Kenai project and is in GlassFish 3.1 already, so it should not be an issue.
4.14. Testing Impact:
This is a very difficult area to test. We need many different platforms. I don't believe it was tested thoroughly for GF 3.0 or 3.1. I did plenty of manual testing on the platforms I have access to for 3.1 (Solaris 10, Windows XP, Ubuntu).
5. Reference Documents:
List of related documents, if any (BugID's, RFP's, papers). Explain how/where to obtain the documents, and what each contains, not just their titles.
6. Schedule:
Use this link to see the issues and assigned milestones
6.1. Projected Availability:
Indicate which milestones from the current schedule the project
7.0 Reviewer Comments
From Tom Mueller, April 26, 2011
1. Section 3.1 talks about supporting more platforms (non-SMF Solaris), but I don't see this in RFE 16311. Where did this come from? I'd suggest removing it. Would this include adding AIX too?
2. Section 3.1 doesn't talk about implementing 3a. Specifically, in the case where you have a service that will automatically restart the server, if the user runs stop-instance, the instance shouldn't be restarted. Also, on Windows, there is the notion of a service being in Manual or Automatic mode. Does a start/stop-instance affect that setting? Maybe what I'm looking for is to see your last comment in the one-pager. Or maybe a reference to the RFE in section 3.1 saying that the details of exactly what will be implemented are there.
3. I'm not convinced that we really need start-service, stop-service or restart-service. P3?
4. Section 4.5.1.1.1: is RESTART_TRIES passed in as a "--serviceproperties" option? Is it really all uppercase with an underscore?
5. WRT restarting a server, is this implemented by the GlassFish code or via the automatic mechanisms of the OS? I seem to recall that SMF has a restart mechanism itself.
6. If RESTART_TRIES is set to 3, I can see not trying to start the server again if it fails to start 3 times in a row. But what happens if the server crashes once per day for 3 days? Or how about once a month for 3 months? Does it not get restarted then? Should there be some timeout after which the count is reset?
7. Section 4.12: Isn't the change of the default value for the --force option a compatibility issue?
8. Maybe this is for the design spec, but it would be really helpful to have a state transition diagram that shows what happens for various actions, including the GF commands (stop-instance, etc.) as well as the OS commands "svcadm ..." or GUI actions.
From John Clingan, April 27, 2011
3.1: Technically, "We need to extend this to include all versions of UNIX including non-SMF Solaris." is not a requirement. The solution only needs to include supported platforms, but the supported platform matrix may change in the future :-)
4.3.1: Regarding Tom's comment about (re)start/stop-service, how would (re)start/stop-instance (or domain) differ? Wouldn't (re)start/stop-instance do the same thing? Byron, is there a semantic difference in your mind?
4.3.2: Can you clarify "we will now take this over internally"? Does this mean the DAS? The solution must be able to watchdog 100 remote instances today, and perhaps 100s in the future. Also, note that *all* the solutions require administrator/root privileges (or "process management" privileges in an RBAC environment). Since this is the case, why not use inittab for non-SMF Unix/Linux environments to distribute the watchdog role? Also, how would auto-restart deal with the competing SMF and Windows Service restarting roles?