GlassFish Wiki : MonitoringReviewComments

Response for Review Comments

Jan Luehe

Comment	Response
4.1.1.1 Probe Provider Factory. ProbeProviderFactory defined as interface, but it is a real class. createInstance should be createProvider.When you say The method returns a class that implements this interface, explain how.	Addressed. ProbeProviderFactory has a getProvider method instead of getInstance - updated spec.
4.1.1.2 Probe Providers. Should the TxManager interface not be annotated with @ProbeProvider? When you say that the return value from this method will be a class that implements the TxManager interface, you might want to explain how the impl is created, and what the impl looks like initially (i.e., zero-overhead methods).	Addressed - a brief explanation is provided.
4.1.1.5 Emitting probe events Minor typo in code snippet: txProvider should be txMgr, or the other way round	Addressed
4.1.1.6 ProviderRegistry. Can you explain the meaning of the pattern components in :::. I think it's as follows: moduleId:providerName:appName:probeName. What are the naming conventions for moduleId and appName?	Out of scope. Addressed in the out of scope section.
4.1.1.12 Registering clients. Definition of ProbeProviderInfo from 4.1.1.7 repeated and probably out-of-scope here. The terms Client and Listener seem to be used interchangeably. Should we use Listener consistently in the API, i.e., change Client to Listener in ProbeClientMediator and registerClient? Should ProbeClientMediator also define a method for unregistering a listener/client?	ProbeClientMediator is different from the code
4.1.1.13 Note about gfProbe clients. A client that does not have any of its methods annotated with ProbeListener will never receive any probe events and will be ignored by the framework, right?	Yes - clarified in the spec
4.1.1.14 ProbeContainer. So the plan is to be able to deploy probe listeners: Packaged in standalone JAR files Packaged in WAR- and EAR-bundled JAR files As part of a WAR file's WEB-INF/classes (would they not inherit the permissions granted to the WAR file's other classes, which would make it difficult for the security system to enforce the restriction that probe listeners must not be able to open server sockets or create threads?) Will the probe listeners be automatically unregistered from the ProbeClientMediator during undeployment?	Out of scope. The listeners will work under the same SecurityManager as that of the standalone war or ear. As soon as the jar is undeployed - it will be unregistered
4.1.2.3.1 Probe Listener. listeners for a provider will be registered or unregistered based on the lifecycle event from the provider (when provider is coming up or going down): Do providers implement some kind of state machine? When a provider is created using ProbeProviderFactory.createProvider, will it automatically be put in STARTED?	The provider is free to register the provider as they deem appropriate. Once registered, the provider is in the started state (although we do not have an lifecycle event that deems it as started)
4.1.2.3.3 OVH. It consists of interfaces for the Telemetry objects to register/unregister themselves as the tree nodes. Not sure I completely understand, but does this mean that in addition to registering itself by calling ProbeClientMediator.registerClient, a client must also register with the OVH? How does it do that, and under what name? Is this interface the same as the TreeNodeFactory in 4.1.2.6? What does the term client mean here? In earlier sections, client meant ProbeListener, but in the context of the OVH, it also refers to REST and CLI (once part of the tree, they are exposed to any client).	The telemetry object will register themselves and any other nodes that they want to be used during runtime to provide monitoring data into the tree node. The OVH follows the same pattern as v2.
4.1.2.4 Custom ProbeListeners or Scripts. What is the difference between ProbeClient and ProbeListener? How can a ProbeClient be turned into a script?	There is no difference between ProbeClient/ProbeListener - they have been used interchangeably. Removed reference to clients / scripts. Moved the section out under 4.4 (Out of Scope)
4.1.2.6 Framework Utilities Code snippet: What happens to TreeNode child? Is it ever used?	It will be used by admin cli/ui to return the value the child object - explained it in the spec
4.5.1.2 gfProbes Infrastructure Exported Interfaces. ProbeProviderFactory: Is it createProvider or createInstance? Is ProbeProviderFactory really an interface? How do you acquire an instance of it ProbeProvider, ProbeProviderInfo, ProbeProviderRegistry missing?	Changed getInstance to getProvider. You get an instance of the ProbeProviderFactory through the Habitat.
4.5.1.3 Utility Framework Classes. Why does FlashlightMonitoringRegistry have Flashlight in its name, when none of the other classes (e.g., TreeNodeFactory) do?	Addressed. FlashlighMonitoringRegistry renamed as MonitoringDataRuntimeRegistry

Nazrul Islam

comments	response
1) Section 4.1.2.3.3: Please ensure that the object view hierarchy is aligned with GFv2 and the dotted names are backward compatible.	Addressed
2) Section 4.1.2.5: Please ensure that monitor_type is correctly mapped for GFv2 types also.	Addressed
3) Section 4.1.2.5: asadmin set command does not work without domain.xml persistence.	Addressed
4) Section 4.12: Please specify how GFv2 monitoring levels (OFF, LOW, HIGH) will work.	Addressed
5) Please ensure that this works for both JDK 5 and JDK6.	It will work on JDK 5 and JDK 6
6) Please expose the monitoring information with JMX MBeans. REST support may come post Prelude.	Evaluating the feasibility, will update
7) Please check all the APIs for naming consistency and align.	Done
8) Secion 4.1.1.1: Please specify that appName can be null.	done
9) Section 4.3: Please specify what is in scope for GFv3 Prelude.	Specified all items that are not in scope in the out of scope section
10) Section 4.4: Please clearly specify what is out of scope so that future design goals are understood.	Done
11) Section 4.5.1.2: Do we need to expose the probe annotations (@ProbeName, @ProbeParams, @MethodEntry) and ProbeProviderInfo?	Yes, Providers need to know this to signal what they want to expose
12) Section 4.5.1.2: Is there any risk with exposing /remove/ method on TreeNode to 3rd parties? Should we have exposed only the integration point (example parentId)? See example at: http://wiki.glassfish.java.net/Wiki.jsp?page=GFV3PluggabilityOnePager#section-GFV3PluggabilityOnePager-4.1.1AdminConsole	Do not need to expose it.
13) Section 4.1.2.5: Question Example shows "gf:tx:*". Should the first tupple be "gf" in the example? How does this relate to DTrace?	First tuple is related to the moduleName as outlined in 4.1.1.1. DTrace is out of scope for this release, however an example is provided on how DTrace monitoring can be provided in JDK 1.7
14) Please clarify support for JVM, JDBC, JPA, jRubby monitoring and CallFlow support.	All these will be provided Post Prelude. CallFlow was mentioned as post-prelude in the Compatibility section - it is now moved to the out of scope section.

Lloyd Chambers

comments	response
llc-01 Monitoring via JMX could happen for Prelude, but seems "tight". JMX+Notifications should be added to the API stack in 4.1.0.	Sure its tight. We are exposing the Monitoring through the OVH tree structure. We have two possible ways to go about. Will discuss with you and update the decision.
llc-02 the creation and removal and listing/status of probes should be exposed via JMX I think. Perhaps an annotation to that effect can be placed upon a probe. Then the GUI could easily search for and fine all probes, as well as create and remove them.	Mahesh?
llc-03 I'm a bit confused by section 4.1.1.5. If it is a zero overhead system, why is the TransactionManagerImpl doing explicit calls to onTxBegin? Or is it that this always needs to be done, and the probe can be one of the listeners?	If there are no underlying listeners listening to this event, this call is a no-op call. If there are listeners registered for this Probe then they receive these events.
llc-04 The ProviderRegistry and ProbeProviderInfo looks perfect for exposing via MBeans.	ProviderRegistry is out of scope for Prelude and has been marked as such.
llc-05 The Stats classes as well as Maps are awkward in a CLI. Is there a better way to expose this data?	The cli/ui will call the TreeNode api and invoke getValue on a particular node to get the data.
llc-06 In 4.1.2.5, monitoring seems to be based on config dotted names. But the components to be monitored are often runtime entities which don't exist in config eg Servlets.	Good point. We will be relying on the Probe events like ServletLoaded or JSPLoaded, to create and populate the tree.
llc-07 For compatibility layer we may need "get --monitor". But the V3 final dotted names will not distinguish that way; dotted names are pervasive and share one hierarchy for all names.	We will have to distinguish some way as there are several overlaps, and I think --monitor is the way to do that.
llc-08 Packages names include "gfprobe" but others use "flashlight". Shouldn't they share the same terminology?	They should all use Flashlight - modified to be consistent.

Sanjeeb Sahoo

comments	response
section #2.1: I don't see why we should write a tool that can be used by a wide variety of Java applications. Surely, application server will be one of the primary users of such a tool, but not the only one. This infrastructure should be be provided by underlying platform, which in our case is JDK.	Our framework contains a probe container, which hosts Probe applications. A probe application is simply a jar containing probe clients. By doing this, we can leverage our deployment framework to deploy the app. The ProbeContainer (which is just another V3 container) can host these probe clients and do a bunch of things like publishing the monitored data in MBeanServer, registering with any JSR 311 implementaiton (like Jersey). Also, doing the above in appserver will help out monitization story.
section #2.2: Since JDK 6 is the minimum platform, I can't see how this can satisfy needs of V3 Prelude, which is anticipated to be used in JDK 5 more often than not. Pl. correct me if I am wrong here.	We do support JDK 1.5 with the provider approach. Note that this approach is still better and it is definitely "lighter" compared to V2.
section #3.1: Is there not a need to monitor a cluster of application server instances? btrace allows this.	Can you explain how BTrace does this? Does Btrace understand enough about our GlassFish cluster. As explained previously, we just use our existing deployment model which allows a probe app to be deployed onto our cluster. You will be able to monitor clusters, each instance in a cluster will have its own OVH that can be aggregated at the cluster level.
section #4.1.1: What happens when some other parts of runtime tries to insert ClassFileTransformer? e.g., JPA implementations are known to do so.	The Instrumentation API doesn't prohibit multiple Transformers in the system. When retransform is called, it simply passes the output (of the transformed class) to the next Transformer.
section #4.1.1: It says /Application server would be the right product to build this intelligence of generating the lightweight data in production environment and analysing them to present it in a meaningful way, so the administrators can get to the root cause of the problem./ Can you please describe how you achieve zero overhead when the code has been statically instrumented to emit events as described in section #4.1.1.5?	OK, almost Zero in this case and definitely zero for non provider case. Also, it is fairly easy to retransform the caller itself to make it really zero overhead.
section #4.1.1.15: It says > /This means that there is no client here./ This is incorrect. There is a client API for BTrace – is used by command line client and VisualVM client. In a private email conversation, btrace author, Sundar ( http://blogs.oracle.com/sundararajan/ ), has confirmed that a JMX cover is being planned. He says: /BTrace author can annotate a bit more in trace class so that the trace class registers itself as a MXBean. Specially annotated fields will be exposed as MXBean attributes. This way any JMX client can consume trace data. / i.e., something like " @BTrace(MBean="com.acme.MyBean"), public class X Unknown macro: { @Attribute(name="cacheSize") private static int i; 'i' is updated from various action methods of various probes }	Good. So it is not there yet. Also, we are talking of providing support for JMX, REST etc. More importantly, the client need not worry about writing transport specific way
section #4.1.1.15: It says /Also, the set of operations that a Btrace client can perform is also limited./ It is fairly simple to relax BTrace strictness. Sundar says they need to change BTrace code in just two places. If desired, we can allow restricted method calls (say calls to specific Sun classes only). How to write higher level probes (rather than trace author using internal code names which would be unstable) using btrace? It is given below: /BTrace allows one or more probe descriptors be written. The probe descriptors map abstract provide/probe name to concrete class/method (or some other concrete probes). This way trace author can write trace code that is "stable". For example, see "$BTRACE/samples/SocketTracker1.java". The probe descriptor file is $BTRACE/samples/java.net.Socket.xml This file maps JDK's internal class/method names to higher level provide/probe names. This method is better compared to doing code changes to traced code. i.e., DTrace expects traced program authors to insert macros in their code. While that approach is okay for C/C++ world, for java world any code change in traced program just for observability is undesirable. Even a simple "event out" method will be a cost. The XML mapping scheme removes that disadvantage. You may want probe points at specific locations which may not be expressible as BTrace probe points. Even in such cases it is possible to call an empty static method from a class – which can be instrumented by BTrace. The empty methods are inlined and optimized away by hotspot and other VMs anyway. (whereas event firing methods may not optimized that way).	Maybe we could use bTrace's byte code retransformation capabilities. If bTrace would provide an API to register callback methods, then we treat Btrace as one of the providers of byteCode transformation engine.