This page provides links to review drafts of new and changed documentation for the Group Management Service project as listed in the GMS Documentation Plan.

Mandatory reviewers for each item are listed in each section.

Changes to existing documentation since the last release are marked with change bars. No changes are marked in new documentation.

Please provide your feedback by adding a comment to this page. To simplify the processing of your comments, please use the format shown in the sample comment. Review existing comments to see known issues and avoid duplicates.

Changes to Books

High Availability Administration Guide Changes

Section | Documentation Impact | Reviewers
Group Management Service | Major | Joe Fialli, Bobby Bissett

See the Clustering Infrastructure Doc Review Page

Upgrade Guide Changes

Section | Documentation Impact | Reviewers
Changes to Group Management Service Settings | NEW | Joe Fialli, Bobby Bissett

See the Upgrade Doc Review Page

Changes to Online Help

Changes to Man Pages

Mandatory reviewers in addition to the reviewers that are listed in the table are as follows:

  • Kazem Ardekanian
  • Tom Mueller

Man Page Name and Section | Documentation Impact | Reviewers
get-health(1) | Moderate | Joe Fialli, Bobby Bissett
validate-multicast(1) | NEW | Joe Fialli, Bobby Bissett

get-health.1.pdf (application/pdf)
validate-multicast.1.pdf (application/pdf)
Comment ID | Location | Comment
RJP-001 | get-health(1) man page | Sample comment. Should provide a proposed fix and correct content if applicable.
RJP-002 | validate-multicast(1) man page | Another sample comment.
Posted by rebeccaparks at Nov 03, 2010 14:59
Comment ID | Location | Comment
trm-1 | get-health(1) man page | This page doesn't have the usual comment about this being available in remote mode only.
trm-2 | validate-multicast(1) man page | Shouldn't the default value for --bindinterface be 0.0.0.0 rather than null?
trm-3 | validate-multicast | This page is missing the usual comment about being available in local mode only.
trm-4 | validate-multicast | The description should explain that this command is to be used by running it at the same time on each of the hosts to be validated.
Posted by trmueller at Nov 17, 2010 12:59
Comment ID | Location | Comment
jmf-2 | validate-multicast(1) man page | In example1 under Examples, remove "--timeout 3". This option runs the command for only 3 seconds. Because UDP multicast messages are lossy, it is not sufficient to send only one multicast message and consider it a failure if that message is not received. As currently configured, the default of 20 seconds ensures that 10 multicast messages are sent.

In my example, I had arbitrarily set the value to 30 seconds to increase the time that all three commands were running at the same time (to ensure that NONE of the multicast messages were getting through to the one machine that was not seeing any multicast traffic). It is best to keep the example as simple as possible and not introduce parameters that are not necessary. Perhaps when the example was first written, validate-multicast printed each message it received. Now it prints only the first one it receives, so there is no need to set the timeout to 3 seconds.

Additionally, to keep things simple in example1 for machine2, remove "--bindaddress 10.133.184.219" and set the bind address echoed by the command to null.
After these changes, the text should look like the following:

Run from host machine2:
asadmin> validate-multicast
Will use address 228.9.3.1
Will use bind address null
Will use wait period 2,000 (in milliseconds)

jmf-1 | validate-multicast | I strongly second comment "trm-4". The example shows running the command on only one machine. At a minimum, one must run this command on two or more machines at the same time to verify that each machine's multicast messages are seen by all other invocations. When I used the command to debug an internal issue, I ran it on three machines at the same time, in different terminal windows through ssh. Two of the machines could see each other, and the third machine could see only its own multicast messages. The issue ended up being that the machine that was not seeing the others' multicast traffic was on a different physical switch, and that switch was not configured properly to see the other multicast traffic on the internal network. I will definitely add this information to a GlassFish FAQ entry, but it is too bad that the minimal usage of this command is not illustrated in the man page.

Here is that email:

I am running it on 3 different subnet 184 machines at one time.
Below is confirmation that the two other 184 subnet machines are seeing each other using the "asadmin validate-multicast" diagnosis program. I placed the key info in *bold*.

*root@bigapp-oblade-1* /cygdrive/c/export/glassfish3/bin
$ ./asadmin.bat validate-multicast
Will use port 2048
Will use address 228.9.3.1
Will use bind interface null
Will use wait period 2,000 (in milliseconds)

Listening for data...
*Sending message with content "bigapp-oblade-1" every 2,000 milliseconds*
*Received data from bigapp-oblade-1 (loopback)*
*Received data from asqe-oblade-15*
Exiting after 20 seconds. To change this timeout, use the --timeout command line option.
Command validate-multicast executed successfully.

*************

From asqe-oblade-15 /cygdrive/c/export/glassfish3/bin
$ ./asadmin.bat validate-multicast --bindaddress 10.133.184.219 --timetolive 128
Will use port 2048
Will use address 228.9.3.1
Will use bind interface 10.133.184.219
Will use wait period 2,000 (in milliseconds)
Will use time-to-live 128

Listening for data...
*Sending message with content "asqe-oblade-15" every 2,000 milliseconds*
*Received data from asqe-oblade-15 (loopback)*
*Received data from bigapp-oblade-1*
*Received data from asqe-oblade-15*
Exiting after 30 seconds. To change this timeout, use the --timeout command line option.
Command validate-multicast executed successfully.

*************

Note that this process running at same time is not seeing the other two instances.

From *bigapp-opt-2* /cygdrive/c/export/glassfish3/bin
$ ./asadmin.bat validate-multicast --bindaddress 10.133.184.20 --timetolive 128 --timeout 30
Timeout set to 30 seconds
Will use port 2048
Will use address 228.9.3.1
Will use bind interface 10.133.184.20
Will use wait period 2,000 (in milliseconds)
Will use time-to-live 128

Listening for data...
Sending message with content "bigapp-opt-2" every 2,000 milliseconds
*Received data from bigapp-opt-2 (loopback)*
Exiting after 30 seconds. To change this timeout, use the --timeout command line option.
Command validate-multicast executed successfully.

The issue ended up being that the isolated machine was on a different network switch than the other machines.
Either the network switches need to be configured to allow multicast traffic to flow between switches (describing how is beyond the scope of this comment),
or all machines need to be on the same switch (an easier change, but not always possible).
 


Posted by joe.fialli at Feb 10, 2011 06:02
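
For reviewers who want to see the mechanics behind the comments above, here is a rough Python sketch of a multi-host multicast check. It is not the validate-multicast implementation. The function names (messages_sent, membership_request, label, probe) are hypothetical, and the message-count model (one message per full send period) is an assumption chosen because it matches the figures quoted in jmf-2: 10 messages in 20 seconds and only 1 message in 3 seconds. The group address, port, and send period are taken from the command output quoted in jmf-1.

```python
import socket
import time

GROUP = "228.9.3.1"  # multicast address echoed in the quoted command output
PORT = 2048          # port echoed in the quoted command output


def messages_sent(timeout_s, period_s=2):
    """Assumed model: one message per full send period within the timeout."""
    return int(timeout_s // period_s)


def membership_request(group, bind_address="0.0.0.0"):
    """Pack the ip_mreq bytes (group address + local interface) for a join."""
    return socket.inet_aton(group) + socket.inet_aton(bind_address)


def label(sender, local_name):
    """Format one report line, marking our own messages as loopback."""
    suffix = " (loopback)" if sender == local_name else ""
    return f"Received data from {sender}{suffix}"


def probe(timeout_s=20.0, period_s=2.0):
    """Send our host name to the group and report each new sender once."""
    local_name = socket.gethostname()
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(GROUP))
    sock.settimeout(0.2)  # short receive timeout so the send schedule is kept
    seen = set()
    deadline = time.monotonic() + timeout_s
    next_send = time.monotonic()
    try:
        while time.monotonic() < deadline:
            if time.monotonic() >= next_send:
                sock.sendto(local_name.encode(), (GROUP, PORT))
                next_send += period_s
            try:
                data, _ = sock.recvfrom(1024)
            except socket.timeout:
                continue
            sender = data.decode(errors="replace")
            if sender not in seen:  # report only the first message per sender
                seen.add(sender)
                print(label(sender, local_name))
    finally:
        sock.close()
    return seen


if __name__ == "__main__":
    print(f"A 3-second run sends about {messages_sent(3)} message(s); "
          f"a 20-second run sends about {messages_sent(20)}.")
    probe()
```

As with the command discussed in trm-4 and jmf-1, a check like this is only meaningful when it runs on two or more hosts at the same time. A host that reports only its own loopback line is not receiving the other hosts' multicast traffic.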