Automatic Transaction Failover with EJB Timer Migration Demo This demo shows automatic transaction failover combined with the timer migration if an instance that owns EJB Timers in a cluster is killed. The demo uses ejb devtest
v2/appserv-tests/devtests/transaction/ee/timer_with_autorecovery
to step through various stages Team
Setup and Cluster creation
- Compile the classes and create the WAR file to be deployed:
- Start JavaDB database
- Start DAS.
- Create an XA resource that can be shared by all instances in the cluster (_jdbc/xa_)
- Register this resource as the _ejb-timer-service.timer-datasource_ with the _default-config_.
- Create and start the cluster of 2 instances
Running the test
- Programmatic timer is created on _in1_ in a separate method call to get it going
- You can see in the instance log _glassfishv3/glassfish/nodes/localhost/in1/logs/server.log_ the following output:
[#|2010-09-29T16:18:16.371-0700|INFO|glassfish3.1|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin|_ThreadID=15;_ThreadName=Thread-1;|tx-ee-timer-with-autorecovery-web was successfully deployed in 12,952 milliseconds.|#]
[#|2010-09-29T16:18:48.907-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|In ___MyBean:timeout___ timer01 28080|#]
- After that 3 rows are inserted on in1 into the default and xa databases:
[#|2010-09-29T16:18:49.993-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Call # 1|#]
[#|2010-09-29T16:18:49.996-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Call # 2|#]
[#|2010-09-29T16:18:49.998-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Call # 3|#]
[#|2010-09-29T16:18:50.001-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Insert successfully|#]
[#|2010-09-29T16:18:50.023-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Call # 1|#]
[#|2010-09-29T16:18:50.025-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Call # 2|#]
[#|2010-09-29T16:18:50.025-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Call # 3|#]
[#|2010-09-29T16:18:50.026-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Insert successfully|#]
- ... But the transaction is left waiting between 2 phases of the two-phase commit (2PC):
[#|2010-09-29T16:18:50.029-0700|WARNING|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [0]|#]
[#|2010-09-29T16:18:50.033-0700|WARNING|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [null]|#]
[#|2010-09-29T16:18:50.040-0700|WARNING|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [1]|#]
[#|2010-09-29T16:18:50.043-0700|WARNING|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [null]|#]
[#|2010-09-29T16:18:50.044-0700|WARNING|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=15;_ThreadName=Thread-1;|JTS5057: FailPoint : [2]|#]
[#|2010-09-29T16:18:53.745-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|In ___MyBean:timeout___ timer01 28080|#]
- At this point _in1_ is killed
- Instance _in2_ receives GMS failure recovery notification after a short period of time (see instance log _glassfishv3/glassfish/nodes/localhost/in2/logs/server.log_)
[#|2010-09-29T16:19:15.966-0700|INFO|glassfish3.1|ShoalLogger|_ThreadID=15;_ThreadName=Thread-1;|GMS1016: Analyzing new membership snapshot received as part of event : FAILURE_EVENT for Member: in1 of Group: c1|#]
[#|2010-09-29T16:19:15.966-0700|INFO|glassfish3.1|ShoalLogger|_ThreadID=15;_ThreadName=Thread-1;|GMS1019: The following member has failed: in1 of Group: c1|#]
[#|2010-09-29T16:19:15.968-0700|INFO|glassfish3.1|ShoalLogger|_ThreadID=15;_ThreadName=Thread-1;|GMS1102: Appointed Recovery Server:in2:for failed member:in1:for group:c1|#]
[#|2010-09-29T16:19:15.980-0700|INFO|glassfish3.1|ShoalLogger|_ThreadID=15;_ThreadName=Thread-1;|GMS1004: Sending FailureRecoveryNotification to component TRANSACTION-RECOVERY-SERVICE|#]
[#|2010-09-29T16:19:15.983-0700|INFO|glassfish3.1|ShoalLogger|_ThreadID=15;_ThreadName=Thread-1;|GMS1004: Sending FailureRecoveryNotification to component GlassfishV31|#]
[#|2010-09-29T16:19:15.983-0700|INFO|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.jta|_ThreadID=15;_ThreadName=Thread-1;|[GMSCallBack] failure recovery signal: com.sun.enterprise.ee.cms.impl.common.FailureRecoverySignalImpl@64d74895|#]
- The Transaction Service waits 60 seconds (or the user set value) to ensure that indoubt xids are updated into the database, otherwise while doing the recovery an instance may not get all the correct indoubt xids.
- The recovery process starts (see messages from _javax.enterprise.system.core.transaction.com.sun.enterprise.transaction.jts.recovery_) and when it is executed, you can see
[#|2010-09-29T16:20:16.017-0700|INFO|glassfish3.1|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.inbound|_ThreadID=15;_ThreadName=Thread-1;|Recovery of Inbound Transactions started.|#]
- When resource recovery completes, automatic EJB Timer migration starts as the after event:
[#|2010-09-29T16:20:16.463-0700|INFO|glassfish3.1|javax.enterprise.system.container.ejb.com.sun.ejb.containers|_ThreadID=15;_ThreadName=Thread-1;|[DistributedEJBTimerServiceImpl] afterRecovery event for instance in1|#]
[#|2010-09-29T16:20:16.463-0700|INFO|glassfish3.1|javax.enterprise.system.container.ejb.com.sun.ejb.containers|_ThreadID=15;_ThreadName=Thread-1;|[DistributedEJBTimerServiceImpl] migrating timers from in1|#]
[#|2010-09-29T16:20:16.463-0700|INFO|glassfish3.1|javax.enterprise.system.container.ejb.com.sun.ejb.containers|_ThreadID=15;_ThreadName=Thread-1;|Beginning timer migration process from owner in1 to in2|#]
[#|2010-09-29T16:20:16.699-0700|INFO|glassfish3.1|javax.enterprise.system.container.ejb.com.sun.ejb.containers|_ThreadID=15;_ThreadName=Thread-1;|Timer migration phase 1 complete. Changed ownership of 1 timers. Now reactivating timers...|#]
[#|2010-09-29T16:20:16.701-0700|INFO|glassfish3.1|javax.enterprise.system.container.ejb.com.sun.ejb.containers|_ThreadID=15;_ThreadName=Thread-1;|Rescheduling missed expiration for periodic timer '1@@1285802327613@@in1@@in1' 'TimedObject = MyBean' 'Application = tx-ee-timer-with-autorecovery-web' 'CREATED' 'PERIODIC' 'Container ID = 84266337573208064' 'Wed Sep 29 16:18:48 PDT 2010' '5000' . Last timer expiration occurred at Wed Sep 29 16:19:08 PDT 2010|#]
- Delegated recovery removes the locks, and the EJB Timers are being fired:
[#|2010-09-29T16:20:17.735-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|In ___MyBean:timeout___ timer01 28080|#]
[#|2010-09-29T16:20:18.773-0700|SEVERE|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|In ___MyBean:timeout___ timer01 28080|#]
- Validation code queries the table on the XA resource and finds 3 rows:
[#|2010-09-29T16:20:35.743-0700|INFO|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Found: BAA280800 : BBB280800|#]
[#|2010-09-29T16:20:35.744-0700|INFO|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Found: BAA280801 : BBB280801|#]
[#|2010-09-29T16:20:35.744-0700|INFO|glassfish3.1|null|_ThreadID=15;_ThreadName=Thread-1;|Found: BAA280802 : BBB280802|#]
Cleanup To undeploy the application, stop and remove the cluster do:
This is the end of the demo and the test. |