A deep dive into 2-Phase-Commit with Wildfly and Oracle RAC

Overview and Key Facts

  • JTA uses  javax.transaction.UserTransaction API to manage transaction bounderies
  • Each UserTransaction is associated with a Thread and this thread can only  handle 1 active Transaction in parallel
  • If a UserTransaction times out the Transaction will be rolled back and Transaction status will be set to :  STATUS_ROLLING_BACK
  • To manage Transaction Timout use following API:  ut.setTransactionTimeout(txTimeout)
  • In this sample 2 Wildfly we have configured RAC XA datasources : 
    • RAC bankA maintained and running in London
    • RAC bankB maintained and running in Frankfurt
  • A typical XA transaction consists out of at least 3 different processing steps
    • Step 1:  Data processing : xa_start  DML operation xa_end
    • Step 2:  Prepare the transaction:  xa_prepare
    • Step 3:  Commit the transaction: xa_commit
    • Step 2 and 3 are called 2-Phase-Commit or short 2PC
    • If clusterwide XA transaction are enabled any RAC instance can run any of the above steps
  • 2PC Commit Protocol details:
    • Prepare phase: The TM asks participating RMs to promise to commit (or rollback) the transaction as requested, even if there is a failure. If any RM cannot prepare, the transaction is rolled back.
    •  Commit phase:  If all RMs respond to the TM that they are prepared, then the TM asks all RMs commit the transaction.
  • ut.begin() starts the XA transaction whereas ut.commit() ends, prepares and commits the XA transaction
  • DBA_PENDING_TRANSACTIONS describes unresolved transactions (either due to failure or if the coordinator has not sent a commit/rollback).
  • DBA_2PC_PENDING  describes in-doubt transactions ( after tx timeout the RAC database records these XA transactions missing a commit/rollback)
  • A global XA transaction can span Oracle RAC instances by default ( if not using DTP services ) –
  • The GLOBAL_TXN_PROCESSES initialization parameter, which is set to 1 by default, specifies the initial number of GTX n background processes for each Oracle RAC instance.
  • If GLOBAL_TXN_PROCESSES > 1 clusterwide XA transactions are enabled.

Configuration and Setup

  • Test Framework – click on the image below to find out details !

Framwork_XaTesting

Display_Driver
JDK Version:     1.7.0_71
JDBC-Driver:     Oracle JDBC driver - Version: 12.1.0.2.0

Check_running_Instances
DB Name:     BANKA
Running Instances:     
10:06:16.646 Instance Name: bankA_1 - Host: hract22.example.com
10:06:16.647 Instance Name: bankA_3 - Host: hract21.example.com
DB Name:     BANKB
Running Instances:     
10:06:17.029 Instance Name: bankb_3 - Host: hract22.example.com
10:06:17.029 Instance Name: bankb_1 - Host: hract21.example.com

In this tutorial we will test and evaluate a 2-Phase-Commit scenario between: 
  RAC database bankA and RAC database bankB

RAC db version: 12.1.0.2 / Wildfyl 8.0.2 

To run this code you need to create 2 Oracle XA datasources named 
   java:jboss/datasources/xa_rac12g_banka
   java:jboss/datasources/xa_rac12g_bankb

For details creating a XA datasource with Oracle RAC please read following article:
  Setup a XA Datasource by using the RAC SCAN address

Java Code to run the 2-Phase-Commit operation

  
  @Named("xaTestBean")
  @SessionScoped
  public class XaTestBean implements Serializable
    {   
    @Resource(lookup="java:jboss/datasources/xa_rac12g_banka")
    private DataSource ds1;
   
    @Resource(lookup="java:jboss/datasources/xa_rac12g_bankb")
    private DataSource ds2;
 
    @Resource 
    private UserTransaction ut;
    ..   

   public String transferMoney()
      {
        initTopLevel();
        setTxInfo(Tools.add_hmtl_pre_tag("Calling transferMoney() in progress - try getting XA connections !" )) ;
        try 
          {          
            try ( Connection c1 = ds1.getConnection();
                  Connection c2 = ds2.getConnection(); )              
              {                 
               
                setSess1(Tools.add_hmtl_pre_tag(c1.getMetaData().getURL()));
                setSess2(Tools.add_hmtl_pre_tag(c2.getMetaData().getURL()));               
                setSess1(getSess1() + Tools.add_hmtl_pre_tag(getSessionInfo(c1)));
                System.out.println(getSess1()); System.out.flush();
                setSess2(getSess2() + Tools.add_hmtl_pre_tag( getSessionInfo(c2)));
                System.out.println(getSess2() ); System.out.flush();
                          
                setTxInfo(Tools.add_hmtl_pre_tag("No TX active - before ut.begin() "  + returnUtStatus(ut) + " - Transaction Timeout: " + txTimeout )) ;              
                ut.setTransactionTimeout(txTimeout) ;
                setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("Setting New Tansaction Timeout : " + txTimeout)); 
                ut.begin();              
                setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("After  ut.begin() "  + returnUtStatus(ut) )) ;
                doUpdate(c1, " - Session 1", "update Accounts set balance = balance - 4000");
                doUpdate(c2, " - Session 2", "update Accounts set balance = balance + 4000");
                setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("Tx status before ut.commit() - "  + returnUtStatus(ut) + " - Now sleeping : " 
                        + sleepTime + " [ seconds] " )) ;      
                Thread.sleep(1000*sleepTime); 
                ut.commit();
                setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("Tx status after ut.commit() "  + returnUtStatus(ut) )) ;                         
                    // Update Status in index.html too 
                setBankOperation(getBankOperation() + Tools.add_hmtl_pre_tag(getAccountinfo(c1)));
                setBankOperation(getBankOperation() + Tools.add_hmtl_pre_tag(getAccountinfo(c2)));
                setBankOperation(getBankOperation() +  Tools.add_hmtl_pre_tag("Leaving transferMoney() without any errors"));
                c1.close();              
                c2.close();
                setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("Leaving transferMoney() without any errors")); 
              }
           
          } catch ( Exception ex ) 
          {   
            String e = ex.getMessage();
            genericExp("Error in JSF top level function: transferMoney() - Need to ROLLBACK transaction", ex);
            setBankOperation(getBankOperation()  + Tools.add_hmtl_pre_tag("FATAL ERROR in transferMoney(): " + e )); 
            try 
              {
                 setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("FATAL ERROR: "));
                 setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag(  "Tx status after Exception before ut.rollback(): "  + 
                         returnUtStatus(ut)  + " - Exception: "+ e));              
                ut.rollback();
                setTxInfo(getTxInfo() + Tools.add_hmtl_pre_tag("Tx status after Exception - but  sucessful ut.rollback(): "   + returnUtStatus(ut)  ));               
              } catch( Exception e2 )
                    {
                       setTxInfo(getTxInfo() +Tools.add_hmtl_pre_tag( "Tx status after Exception: FATAL error during ut.rollback() :"  + 
                                returnUtStatus(ut)  + " - Exception: "  +e2));                   
                    }                    
            return "bankInfo";
          }  
        return "bankInfo";
      }

Testing SQL Exceptions in function transferMoney():

Prepared Test case:
Press : Init_Accounts
Press Transfer Money 3x : TransferMoney_via_2PC 

Output from index.xhtml:
09:59:54.830 Calling  checkAccounts() in progress ... 
09:59:54.835 Database Name:BANKA -- Account: User99_at_BANKA -- Balance: 10000.0
09:59:54.839 Database Name:BANKB -- Account: User98_at_BANKB -- Balance: 10000.0
09:59:54.839 Leaving checkAccounts() without any Error 

10:00:12.397 Database Name:BANKA -- Account: User99_at_BANKA -- Balance: 6000.0
10:00:12.405 Database Name:BANKB -- Account: User98_at_BANKB -- Balance: 14000.0
10:00:12.405 Leaving transferMoney() without any errors

10:00:18.232 Database Name:BANKA -- Account: User99_at_BANKA -- Balance: 2000.0
10:00:18.248 Database Name:BANKB -- Account: User98_at_BANKB -- Balance: 18000.0
10:00:18.249 Leaving transferMoney() without any errors

10:00:21.376 FATAL ERROR in transferMoney(): ORA-02290: check constraint (SCOTT.S_LOWER_CHK) violated


Last SQL:     
10:00:21.373 update Accounts set balance = balance - 4000  - Session 1

Last SQL Error:     
10:00:21.373 Exception in Class: com.hhu.XaTestBean - Methode: doUpdate
java.sql.SQLIntegrityConstraintViolationException: ORA-02290: check constraint (SCOTT.S_LOWER_CHK) violated
  Url: jdbc:oracle:thin:@ract2-scan.grid12c.example.com:1521/banka User: SCOTT
   Stmt: update Accounts set balance = balance - 4000   - Session 1

TX Info:     

10:00:21.267 No TX active - before ut.begin() TX status:  6 - STATUS_NO_TRANSACTION - Transaction Timeout: 5
10:00:21.267 Setting New Tansaction Timeout : 5
10:00:21.273 After  ut.begin() TX status:  0 - STATUS_ACTIVE
10:00:21.377 FATAL ERROR: 
10:00:21.377 Tx status after Exception before ut.rollback(): TX status:  0 - STATUS_ACTIVE 
             - Exception: ORA-02290: check constraint (SCOTT.S_LOWER_CHK) violated
10:00:21.383 Tx status after Exception - but  sucessful ut.rollback(): TX status:  6 - STATUS_NO_TRANSACTION

Code Explanation

  • After calling transferMoney() a 3rd time the database  check contraint gets fired as our balance moves to a negative value
  • Inside the exeception handler the XA transaction was successfully rolled back and our thread ends up with setting its XA status to  : STATUS_NO_TRANSACTION

 

Testing automatic transaction recovery after a COMMIT record was written to Wildlfly LOG-STORE

  • Both XA resources have voted to COMMIT the transaction
  • Wildfly has written a COMMIT record to the  Wildlfly LOG-STORE
  • After this all transaction branches needs to be committed [ either by application server or by the DB admin ]
  • After a system crash Periodic Recovery Thread should use
    XA recover and XA commit to commit this transaction without any user intervention
Setup Testcase :

Press Init Accounts :Init_Accounts
11:04:21.039 Calling  checkAccounts() in progress ... 
11:04:21.088 Database Name:BANKA -- Account: User99_at_BANKA -- Balance: 10000.0
11:04:21.128 Database Name:BANKB -- Account: User98_at_BANKB -- Balance: 10000.0
11:04:21.128 Leaving checkAccounts() without any Error 

Breakpoints : Class oracle.jdbc.xa.OracleXAResource : Methode entry for prepare()  and commit()

Start application by pressing                : TransferMoney_via_2PC

Use debugger and stop processing at Entry to first OracleXAResource.commit call [ Press F5 until you reach this code point ]
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-6.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-6.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.commit at line 553 by thread default task-6.
-> Note:  Wildfly has already written this transaction to its log 

Verify Wildfly LOG-STORE
[oracle@wls1 XA]$  $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}
0:ffffc0a805c9:-46761d77:55438636:7ee
-> Wildfly has saved the transaction to its LOG-STORE

Check transaction status at Database level
SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankA_1

SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;

STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     7.6.4154        131077.00000000000000000000FFFFC0A805C9B989E28955438636000007EE31
03-MAY-15 08:44:32              03-MAY-15 08:45:05 wls1.example.com       oracle       24656988

SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankb_3

SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;

STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     16.17.405        131077.00000000000000000000FFFFC0A805C9B989E28955438636000007EE31
03-MAY-15 08:44:29              03-MAY-15 08:45:20 wls1.example.com       oracle       9841282
**********************************************************
***  -> kill java process/program and restart Wildlfy ****  
**********************************************************
[oracle@wls1 XA]$ ps -elf | grep java
0 S oracle    9264  9210 19  80   0 - 414974 futex_ May02 ?       01:44:51 /usr/java/latest/bin/java -D[Standalone] -Xms128m -Xmx512m -XX:MaxPermSize=256m -server ..
[oracle@wls1 XA]$ kill -9 9264

-> Restart Wildfly - Wildfly Server Log reports 

2015-05-03 08:50:45,017 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016037: 
  Could not find new XAResource to use for recovering non-serializable XAResource XAResourceRecord 
  < resource:null, txid:< formatId=131077, gtrid_length=29, bqual_length=36, 
   tx_uid=0:ffffc0a805c9:-46761d77:55438636:7ee, node_name=1, 
     branch_uid=0:ffffc0a805c9:-46761d77:55438636:7f0, subordinatenodename=null, 
     eis_name=java:jboss/datasources/xa_rac12g_bankb >, 
   heuristic: TwoPhaseOutcome.FINISH_OK, product: 
     Oracle/Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
     With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
     Advanced Analytics and Real Application Testing options, 
    jndiName: java:jboss/datasources/xa_rac12g_bankb com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord@519e22cc >

2015-05-03 08:50:45,017 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016037: 
Could not find new XAResource to use for recovering non-serializable XAResource XAResourceRecord 
  < resource:null, txid:< formatId=131077, gtrid_length=29, bqual_length=36, 
   tx_uid=0:ffffc0a805c9:-46761d77:55438636:7ee, node_name=1, 
   branch_uid=0:ffffc0a805c9:-46761d77:55438636:7f3, subordinatenodename=null, 
   eis_name=java:jboss/datasources/xa_rac12g_banka >, 
   heuristic: TwoPhaseOutcome.FINISH_OK, 
     product: Oracle/Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
     With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
     Advanced Analytics and Real Application Testing options, 
    jndiName: java:jboss/datasources/xa_rac12g_banka com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord@64fea8f0 >

-> Periodic Recovery Thread successfully re-connects to both databases
   Periodic Recovery Thread calls XA recover to retrieve the prepared XIDs from the server
   Periodic Recovery Thread finally commits the data as Wildfly LOG-STORE confirms that all branches are prepared  

Verify that that the data was really committed 
GLOBAL_NAME
--------------------------------------------------------------------------------
BANKA
ACCOUNT                 BALANCE
-------------------------------- ----------
User99_at_BANKA                6000


GLOBAL_NAME
--------------------------------------------------------------------------------
BANKB
ACCOUNT                 BALANCE
-------------------------------- ----------
User98_at_BANKB               14000

Verify Wildfly log :
[oracle@wls1 XA]$  $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}

-> After Wildfly restart the transaction was committed and the Wildfly LOG-Store was cleaned !

Testing automatic transaction recovery before a COMMIT record was written to Wildlfly LOG-STORE

  • Only one XA resources have voted to COMMIT the transaction [ ==only 1 prepared  XA transaction branch ]
  • Therefore Wildfly has not written a COMMIT record to  the  Wildlfly LOG-STORE
  • After a system crash the whole transaction should be rolled back
  • After a system crash Periodic Recovery Thread should use  XA recover and  XA rollback to rollback this transaction without any user intervention
Prepare Testcase

Press Init Accounts : Init_Accounts
11:04:21.039 Calling  checkAccounts() in progress ... 
11:04:21.088 Database Name:BANKA -- Account: User99_at_BANKA -- Balance: 10000.0
11:04:21.128 Database Name:BANKB -- Account: User98_at_BANKB -- Balance: 10000.0
11:04:21.128 Leaving checkAccounts() without any Error 

Set Breakpoints : Class oracle.jdbc.xa.OracleXAResource : Methode entry for prepare(), commit() and rollback()
-> Don't forget to set breackpoint for rollback to avoid the Periodic Recovery Thread to rollback the 
   transaction before we have rebooted Wildfly server !

Start Application by pressing: TransferMoney_via_2PC
Use debugger and stop at Entry of 2.nd  oracle.jdbc.xa.client.OracleXAResource.prepare call
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-6.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-6.

Verify Wildfly LOG-STORE
[oracle@wls1 XA]$   $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}
-> No transaction reference in Wildfly LOG-STORE

SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankA_1

SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;
no rows selected

SQL> set echo off
Connected.
SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankb_3
SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;
STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     14.2.389        131077.00000000000000000000FFFFC0A805C9614B927B5545C53A000000A631
03-MAY-15 09:23:47              03-MAY-15 09:26:12 wls1.example.com       oracle       9848205
-> No entry in Wildfls LOG-STORE - only one Transaction Branch is prepared on instance bankb_3

Note : During testing Periodic recovery Threads stops at our breakpoint in oracle.jdbc.xa.client.OracleXAResource.rollback
"Periodic Recovery"
oracle.jdbc.xa.client.OracleXAResource.rollback(OracleXAResource.java:945)
org.jboss.jca.adapters.jdbc.xa.XAManagedConnection.rollback(XAManagedConnection.java:346)
org.jboss.jca.core.tx.jbossts.XAResourceWrapperImpl.rollback(XAResourceWrapperImpl.java:196)
com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.handleOrphan(XARecoveryModule.java:786)
com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.xaRecoverySecondPass(XARecoveryModule.java:692)
com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.bottomUpRecovery(XARecoveryModule.java:431)
com.arjuna.ats.internal.jta.recovery.arjunacore.XARecoveryModule.periodicWorkSecondPass(XARecoveryModule.java:212)
com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:789)
com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371)
-> This means with a breakpoint at rollback() the Periodic Recovery Thread would have already 
   delete this XA transaction. 

Now Kill Java program and verify whether records can be locked  for DML operations: 
[oracle@wls1 XA]$ ps -elf | grep java
0 S oracle   14493 14438 18  80   0 - 413832 futex_ 08:50 ?       00:07:13 /usr/java/latest/bin/java -D[Standalone] -Xms128m -Xmx512m -XX:MaxPermSize=256m -server ...
[oracle@wls1 XA]$ kill -9 14493
GLOBAL_NAME
--------------------------------------------------------------------------------
BANKA
ACCOUNT                 BALANCE
-------------------------------- ----------
User99_at_BANKA               10000
Connected.
--> No problem at bankA

GLOBAL_NAME
--------------------------------------------------------------------------------
BANKB
select * from accounts for update              *
ERROR at line 1:
ORA-01591: lock held by in-doubt distributed transaction 14.2.389
-> As we have only one prepared Transaction - ORA-1591 comes up only at bankB

-> Now Restart Wildfly to fix the ORA-1591 problem 
SQL> @check_it
GLOBAL_NAME
--------------------------------------------------------------------------------
BANKA
ACCOUNT                 BALANCE
-------------------------------- ----------
User99_at_BANKA               10000

GLOBAL_NAME
--------------------------------------------------------------------------------
BANKB
ACCOUNT                 BALANCE
-------------------------------- ----------
User98_at_BANKB               10000
-> After Wildfly restart the transaction was rolled back and we can access 
   both records for DML operations. 

Heuristically committed transaction – A major Pitfall for DBAs

Assume we have following configruation:
System A : RAC database running database bankA running London
System B : RAC database running database bankB running in Frankfurt 
System C : Wildfly application server 

After a network outage all the systems are not able to communicate with each other but 
these systems are still up and running.
 
Now System C - our Wildfly server - is crashing during a 2-Phase-Commit. 
The crash occurs after having prepared both transaction an for database bankA, bankB 
and Wildfly has already written a COMMIT record to its LOG-STAORE.

Setup testcase: Init_Accounts
09:10:41.653 Calling  checkAccounts() in progress ... 
09:10:41.677 Database Name:BANKA -- Account: User99_at_BANKA -- Balance: 10000.0
09:10:41.717 Database Name:BANKB -- Account: User98_at_BANKB -- Balance: 10000.0
09:10:41.719 Leaving checkAccounts() without any Error 

Setup testcase:
Tx Timeout : 60  Change_Timeout

-> Set breakpoint in commit() metode of oracle.jdbc.xa.OracleXAResource 
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.commit at line 553 by thread default task-10.
Thread default task-10 stopped at OracleXAResource.java:553.

Start the application by pressing: TransferMoney_via_2PC
-> Wait until we get ORA-1591 accessing these records:
BANKA:
SQL> connect scott/tiger@ract2-scan.grid12c.example.com:1521/banka
SQL> select * from accounts for update;
select * from accounts for update              *
ERROR at line 1:
ORA-01591: lock held by in-doubt distributed transaction 18.24.2100

BANKB:
SQL> connect scott/tiger@ract2-scan.grid12c.example.com:1521/bankb
Connected.
SQL> select * from accounts for update;
select * from accounts for update
ERROR at line 1:
ORA-01591: lock held by in-doubt distributed transaction 12.26.389

Now a lot of end users calls in and raise complaints to their DBAs that a major application running 
on RAC DB bankA and RAC DB bankB are not working  anymore.

The DBA on bankA ( London)  checks   dba_2pc_pending and found a prepared TX:
SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;

STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     18.24.2100        131077.00000000000000000000FFFFC0A805C9B989E289554386360000038B31
02-MAY-15 09:12:03              02-MAY-15 09:17:31 wls1.example.com       oracle       24303841
 
The DBA on bankB ( Frankfurt ) checks  dba_2pc_pending and found a prepared TX: 
SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;

STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     12.26.389        131077.00000000000000000000FFFFC0A805C9B989E289554386360000038B31
02-MAY-15 09:12:02              02-MAY-15 09:18:30 wls1.example.com       oracle       9400626

Now things goes wrong. 
-> The database itself does not have any ideas whether Wildfly has written a COMMIT record to the LOG-STORE or not. 
   The DBAs doesn't check the Wildfly LOG-STORE 
   Due to the pressure to get ONLINE asap the  DBA's does the following :

DBA for RAC DB bankA ( London) 
SQL> connect sys/sys@ract2-scan.grid12c.example.com:1521/bankA as sysdba
Connected.
SQL> select STATE, LOCAL_TRAN_ID  from dba_2pc_pending;
STATE         LOCAL_TRAN_ID
---------------- ----------------------
prepared     18.24.2100

SQL> rollback force '18.24.2100';
Rollback complete.

DBA for RAC DB bankB ( Frankfurt) 
SQL>  connect sys/sys@ract2-scan.grid12c.example.com:1521/bankB as sysdba
Connected.
SQL>  select STATE, LOCAL_TRAN_ID  from dba_2pc_pending;
STATE         LOCAL_TRAN_ID
---------------- ----------------------
prepared     12.26.389
SQL> commit force '12.26.389';
Commit complete.

Now both DBAs verify whether we can access these records for DML operation
SQL> connect scott/tiger@ract2-scan.grid12c.example.com:1521/banka
Connected.
SQL> 
SQL> select * from accounts for update;
ACCOUNT                 BALANCE
-------------------------------- ----------
User99_at_BANKA               10000


SQL> connect scott/tiger@ract2-scan.grid12c.example.com:1521/bankb
SQL> select * from accounts for update;
ACCOUNT                 BALANCE
-------------------------------- ----------
User98_at_BANKB               14000

-> Works fine - really ?
   The DBAs have logically corrupted ozr system integrity !!
   We have successfully added the money to User98_at_BANKB but not reduced the money 
   from account  User99_at_BANKA !
   
Note this should never happen at your production system -  or you may loose your DBA job very easily !!
                         
How we can avoid this problem ?
1. Check whether Wildfly has written a commit record 
[oracle@wls1 XA]$  $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}
0:ffffc0a805c9:-46761d77:55438636:38b
--> There is a  COMMIT record for XID :  0:ffffc0a805c9:-46761d77:55438636:38b -  so it is save to call : 
       COMMIT FORCE 'Local_Tran_ID'
    If we can't find the XID within Wildfly it is save to call on all involved databases : 
       ROLLBACK FORCE 'Local_Tran_ID';   

A closer look on Exceptions 
ut.commit() fail with :
TX Info:     
10:21:58.711 No TX active - before ut.begin() TX status:  6 - STATUS_NO_TRANSACTION - Transaction Timeout: 60
10:21:58.720 Setting New Tansaction Timeout : 60
10:22:04.640 After  ut.begin() TX status:  0 - STATUS_ACTIVE
10:22:04.709 Tx status before ut.commit() - TX status:  0 - STATUS_ACTIVE - Now sleeping : 1 [ seconds] 
10:28:03.579 FATAL ERROR: 
10:28:03.582 Tx status after Exception before ut.rollback(): TX status:  6 - STATUS_NO_TRANSACTION - Exception: null
10:28:03.585 Tx status after Exception: FATAL error during ut.rollback() :TX status:  6 - STATUS_NO_TRANSACTION 

Java Code throws javax.transaction.HeuristicMixedException : 
10:28:03.558 Error in JSF top level function: transferMoney() - Need to ROLLBACK transaction
10:28:03.576 javax.transaction.HeuristicMixedException
    at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1174)
    at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
    at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
    at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:173)
    at com.hhu.XaTestBean.transferMoney(XaTestBean.java:778)

Wildfly server.log reports :
2015-05-02 10:27:56,445 WARN  [com.arjuna.ats.jta] (default task-1) 
ARJUNA016036: commit on < formatId=131077, gtrid_length=29, bqual_length=36, 
tx_uid=0:ffffc0a805c9:-46761d77:55438636:4d3, node_name=1, 
branch_uid=0:ffffc0a805c9:-46761d77:55438636:4d5, subordinatenodename=null, 
eis_name=java:jboss/datasources/xa_rac12g_bankb > (XAResourceWrapperImpl@172657b2[xaResource=org.jboss.jca.adapters.jdbc.xa.XAManagedConnection@489e2cb1 pad=false 
  overrideRmValue=null productName=Oracle productVersion=Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options 
 jndiName=java:jboss/datasources/xa_rac12g_bankb]) 
  failed with exception $XAException.XAER_NOTA: oracle.jdbc.xa.OracleXAException
        at oracle.jdbc.driver.T4CXAResource.kputxrec(T4CXAResource.java:965)
        at oracle.jdbc.driver.T4CXAResource.doCommit(T4CXAResource.java:454)
        at oracle.jdbc.xa.client.OracleXAResource.commit(OracleXAResource.java:583)
        at org.jboss.jca.adapters.jdbc.xa.XAManagedConnection.commit(XAManagedConnection.java:338)
        at org.jboss.jca.core.tx.jbossts.XAResourceWrapperImpl.commit(XAResourceWrapperImpl.java:107)
        at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelCommit(XAResourceRecord.java:461)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2810)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2726)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.phase2Commit(BasicAction.java:1820)
        at com.arjuna.ats.arjuna.coordinator.BasicAction.End(BasicAction.java:1504)
        at com.arjuna.ats.arjuna.coordinator.TwoPhaseCoordinator.end(TwoPhaseCoordinator.java:96)
        at com.arjuna.ats.arjuna.AtomicAction.commit(AtomicAction.java:162)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1166)
        at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
        at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
        at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:173)
        at com.hhu.XaTestBean.transferMoney(XaTestBean.java:778) [classes:]
->  JAVA Code is failing with $XAException.XAER_NOTA: oracle.jdbc.xa.OracleXAException as the related XA transaction is 
    already committed using COMMIT FORCE 'Local_Tran_ID'  or rolled back using ROLLBACK force  'Local_Tran_ID' at Database level.

Cleanup system state after above getting a heuristically committed XA transaction 
 - Try to repair the logical corruption from any logfile we can find tracking SQL change and TX id.
 - Cleanup the system [ Wildfly LOG-STORE and Oracle dba_2pc_pending table ]

Cleanup Wildfly LOG-STORE
Verify currents XA transaction status  
[oracle@wls1 XA]$  $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}
0:ffffc0a805c9:-46761d77:55438636:38b
0:ffffc0a805c9:-46761d77:55438636:439
0:ffffc0a805c9:-46761d77:55438636:4d3

SQL> connect sys/sys@ract2-scan.grid12c.example.com:1521/bankb as sysdba
SQL>  select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID from dba_2pc_pending;
STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- ----------------------------------------------------------------------
forced commit     12.26.389        131077.00000000000000000000FFFFC0A805C9B989E289554386360000038B31
forced commit     17.21.390        131077.00000000000000000000FFFFC0A805C9B989E289554386360000043931
forced commit     14.26.382        131077.00000000000000000000FFFFC0A805C9B989E28955438636000004D331

SQL> connect sys/sys@ract2-scan.grid12c.example.com:1521/banka as sysdba
SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID from dba_2pc_pending;
STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- ----------------------------------------------------------------------
forced rollback  18.24.2100        131077.00000000000000000000FFFFC0A805C9B989E289554386360000038B31
forced rollback  14.1.2092        131077.00000000000000000000FFFFC0A805C9B989E289554386360000043931
forced rollback  16.4.2077        131077.00000000000000000000FFFFC0A805C9B989E28955438636000004D331

Purge the XA transaction from Wildfly LOG-STORE
$WILDFLY_HOME/bin/jboss-cli.sh --connect 
   "/subsystem=transactions/log-store=log-store/transactions=0\:ffffc0a805c9\:-46761d77\:55438636\:38b:delete"
{"outcome" => "success"}
[oracle@wls1 XA]$ $WILDFLY_HOME/bin/jboss-cli.sh --connect
    "/subsystem=transactions/log-store=log-store/transactions=0\:ffffc0a805c9\:-46761d77\:55438636\:439:delete"
{"outcome" => "success"}
[oracle@wls1 XA]$  $WILDFLY_HOME/bin/jboss-cli.sh --connect 
     "/subsystem=transactions/log-store=log-store/transactions=0\:ffffc0a805c9\:-46761d77\:55438636\:4d3:delete"
{"outcome" => "success"}
[oracle@wls1 XA]$ $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}

Finally purge XA transactions at DB level run on both RAC nodes:
SQL> execute DBMS_TRANSACTION.PURGE_MIXED('18.24.2100');
SQL> execute DBMS_TRANSACTION.PURGE_MIXED('14.1.2092');
SQL> execute DBMS_TRANSACTION.PURGE_MIXED('16.4.2077');

Understanding ORA-1591 : lock held by in-doubt distributed transaction

Change Transaction Timeout: 60  Change_Timeout 
Set Breakpoints in : oracle.jdbc.xa.OracleXAResource prepare() and commit() functions 
Press F5 untill we reach first Commit breakpoint:
  Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-1.
  Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-1.
  Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.commit at line 553 by thread default task-1.

Press : TransferMoney_via_2PC
Transaction status - immediate after stopping: 
Connected.
SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankA_2

SQL> select * from  sys.dba_pending_transactions;
  FORMATID GLOBALID                                                      BRANCHID
---------- ------------------------------------------------------------ -------------------------------------------------------
    131077 00000000000000000000FFFFC0A805C9B989E289554386360000005931    00000000000000000000FFFFC0A805C9B989E289554386360000005
                                                                         E0000000200000000

SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;
no rows selected


SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankb_3

SQL> select * from  sys.dba_pending_transactions;
  FORMATID GLOBALID                                                      BRANCHID
---------- ------------------------------------------------------------ -------------------------------------------------------
    131077 00000000000000000000FFFFC0A805C9B989E289554386360000005931    00000000000000000000FFFFC0A805C9B989E289554386360000005
                                                                         B0000000300000000

SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;
no rows selected

Script to track XA transactions within Wildfly:
[oracle@wls1 XA]$ cat xa.cli
/subsystem=transactions/log-store=log-store/:probe()
ls /subsystem=transactions/log-store=log-store/transactions

[oracle@wls1 XA]$  $WILDFLY_HOME/bin/jboss-cli.sh --connect --file=xa.cli
{"outcome" => "success"}
0:ffffc0a805c9:-46761d77:55438636:59

-> The prepared transaction branchess are visible in sys.dba_pending_transactions - but not yet in  dba_2pc_pending
   Wildfly reports the transaction in its LOG-STORE

Now wait 60 seconds until Oracle times out the XA Transaction and  creates an in-doubt transaction 
SQL> select host_name,instance_name from v$instance;
HOST_NAME                      INSTANCE_NAME
------------------------------ ----------------
hract22.example.com            bankA_2
SQL> select * from  sys.dba_pending_transactions;
  FORMATID GLOBALID                                                     BRANCHID
---------- ------------------------------------------------------------ -------------------------------------------------------
    131077 00000000000000000000FFFFC0A805C9B989E289554386360000005931    00000000000000000000FFFFC0A805C9B989E289554386360000005
                                                                         E0000000200000000
SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;
STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     3.18.4315        131077.00000000000000000000FFFFC0A805C9B989E289554386360000005931
01-MAY-15 16:15:02              01-MAY-15 16:15:35 wls1.example.com       oracle       24161321

SQL> select host_name,instance_name from v$instance;
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
hract22.example.com           bankb_3

SQL> select * from  sys.dba_pending_transactions;
  FORMATID GLOBALID                                                      BRANCHID
---------- ------------------------------------------------------------ -------------------------------------------------------
    131077 00000000000000000000FFFFC0A805C9B989E289554386360000005931    00000000000000000000FFFFC0A805C9B989E289554386360000005
                                                                         B0000000300000000
SQL> select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
  2       to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME,
  3       HOST, OS_USER, COMMIT# from dba_2pc_pending;
STATE         LOCAL_TRAN_ID        GLOBAL_TRAN_ID
---------------- ---------------------- --------------------------------------------------------------------------------
FAIL_TIME       FORCE_TIME          RETRY_TIME     HOST               OS_USER       COMMIT#
------------------ ------------------ ------------------ ------------------------- --------------- ----------------
prepared     15.23.381        131077.00000000000000000000FFFFC0A805C9B989E289554386360000005931
01-MAY-15 16:15:00              01-MAY-15 16:15:51 wls1.example.com       oracle       9263234

-> The prepared transaction branches  are visible in sys.dba_pending_transactions and  dba_2pc_pending
   Wildfly reports the transactions as stored in the log store !

Note: Accessing a record  being part of a prepared distributed transaction will throw ORA-1591 
      and print out the local transaction ID. 

Connected.
select * from accounts for update              *
ERROR at line 1:
ORA-01591: lock held by in-doubt distributed transaction 3.18.4315

Connected.
select * from accounts for update
ERROR at line 1:
ORA-01591: lock held by in-doubt distributed transaction 15.23.381
 
Now Press F5 again until we get back control via our HTML page

Note: Even the entry was in dba_2pc_pending running the commit sucessfully finshed the distributed transaction 
      Entries in  sys.dba_pending_transactions and  dba_2pc_pending are removed and Wildfly LOG-STORE 
      was cleaned up.

Understanding ORA-2049 error: distributed transaction waiting for lock

Lock a record via sqlplus 
SQL>  select BALANCE from Accounts for update;
   BALANCE
----------
     10000
Press:  TransferMoney_via_2PC
 
Generic Error:     
17:05:19.248 Error in JSF top level function: transferMoney() - Need to ROLLBACK transaction
17:05:19.248 ORA-02049: timeout: distributed transaction waiting for lock
17:05:19.249 java.sql.SQLSyntaxErrorException: ORA-02049: timeout: distributed transaction waiting for lock
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:450)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:399)
    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1059)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:522)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587)
    at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:210)
    at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:30)
    at oracle.jdbc.driver.T4CStatement.executeForRows(T4CStatement.java:931)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1150)
    at oracle.jdbc.driver.OracleStatement.executeUpdateInternal(OracleStatement.java:1707)
    at oracle.jdbc.driver.OracleStatement.executeUpdate(OracleStatement.java:1670)
    at oracle.jdbc.driver.OracleStatementWrapper.executeUpdate(OracleStatementWrapper.java:310)
    at org.jboss.jca.adapters.jdbc.WrappedStatement.executeUpdate(WrappedStatement.java:375)
    at com.hhu.XaTestBean.doUpdate(XaTestBean.java:603)

Transaction status 
17:04:18.911 No TX active - before ut.begin() TX status:  6 - STATUS_NO_TRANSACTION - Transaction Timeout: 5
17:04:18.911 Setting New Tansaction Timeout : 5
17:04:18.990 After  ut.begin() TX status:  0 - STATUS_ACTIVE
17:05:19.250 FATAL ERROR: 
17:05:19.250 Tx status after Exception before ut.rollback(): TX status:  9 - STATUS_ROLLING_BACK 
            - Exception: ORA-02049: timeout: distributed transaction waiting for lock
17:05:19.284 Tx status after Exception - but  successful ut.rollback(): TX status:  6 - STATUS_NO_TRANSACTION 

Verify the related database init.ora parameter
SQL> show parameter distributed_lock_timeout
NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
distributed_lock_timeout         integer     60


 Note: It's not a good idea to increase distributed_lock_timeout without understanding what is going on.
       In a first approach try to re-run the failed code.
       Any OLTP application should not lock a record for more than 60 seconds
   
       It if a session times out within Oracle, then the Oracle database’s Process Monitor (PMON) will delete 
         the transaction branch if not prepared  or cause the transaction to become in-doubt if prepared.
      
     Generic Timeout Settings : Transaction Timeout for JDBC  <  DISTRIBUTED_LOCK_TIMEOUT . 

Explanation:

  • The java code tries to lock the record which is locked by our sqlplus session
  • After waiting 60 seconds the SQL processing returns an exception:
          java.sql.SQLSyntaxErrorException: ORA-02049: timeout: distributed transaction waiting for lock
  •  This wait time is controlled by the init.ora parameter distributed_lock_timeout
    SQL> show parameter distributed_lock_timeout
    NAME                        TYPE     VALUE
    —————————— ———– —————-
    distributed_lock_timeout         integer     60
  • As for any other exception the XA transactions needs to be rolled back. 

 

Understanding Transaction Timeout

Setup testcase: 
Transaction Timeout: 5         
Sleep Time:         10  Change_Timeout
Press                :  TransferMoney_via_2PC

Error Stack 
17:53:21.729 Error in JSF top level function: transferMoney() - Need to ROLLBACK transaction
17:53:21.729 ARJUNA016102: The transaction is not active! Uid is 0:ffffc0a805c9:410b1af0:55424451:c6
17:53:21.730 javax.transaction.RollbackException: ARJUNA016102: The transaction is not active! Uid is 0:ffffc0a805c9:410b1af0:55424451:c6
    at com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionImple.commitAndDisassociate(TransactionImple.java:1156)
    at com.arjuna.ats.internal.jta.transaction.arjunacore.BaseTransaction.commit(BaseTransaction.java:126)
    at com.arjuna.ats.jbossatx.BaseTransactionManagerDelegate.commit(BaseTransactionManagerDelegate.java:75)
    at org.jboss.tm.usertx.client.ServerVMClientUserTransaction.commit(ServerVMClientUserTransaction.java:173)
    at com.hhu.XaTestBean.transferMoney(XaTestBean.java:769)

Tx Status:
17:53:11.556 No TX active - before ut.begin() TX status:  6 - STATUS_NO_TRANSACTION - Transaction Timeout: 5
17:53:11.556 Setting New Tansaction Timeout : 5
17:53:11.563 After  ut.begin() TX status:  0 - STATUS_ACTIVE
17:53:11.726 Tx status before ut.commit() - TX status:  0 - STATUS_ACTIVE - Now sleeping : 10 [ seconds] 
17:53:21.730 FATAL ERROR: 
17:53:21.730 Tx status after Exception before ut.rollback(): TX status:  6 - STATUS_NO_TRANSACTION - 
             Exception: ARJUNA016102: The transaction is not active! Uid is 0:ffffc0a805c9:410b1af0:55424451:c6
17:53:21.730 Tx status after Exception: FATAL error during ut.rollback() :TX status:  6 - STATUS_NO_TRANSACTION - 
             Exception: java.lang.IllegalStateException: BaseTransaction.rollback - ARJUNA016074: no transaction!

Note :
  Transaction Timeout for JDBC  <  DISTRIBUTED_LOCK_TIMEOUT ( default : 60 seconds ) .

Code  Explanation:

  • After 10 seconds [ our sleeptime ] the transaction processing step has finished and ut.commit() is called
  • Because we have reached transaction Timeout of 5 seconds the ut.commit() call fails
  • In the exception handler the ut.rollback() call becomes a NOOP as the parent transaction was already deleted.

 

Debugging  UserTransaction API and XAResource API by using a Netbeans Debugger

Using Netbeans to debug the JAVA XA code more in detail 
1. Start Wildfly in Debug Mode
2. Add Breakpoint: 
   Debugger   : JAVA    Breakpoint Type : Methode
   Class Name :  oracle.jdbc.xa.OracleXAResource       
   Check      : All methods for a given Class
   Method Name: <all methods>
   Stop on    : Method Entry 
3. Debug -> Debug Main Program
4. Set breakpoint at line ut.begin() 
5. Debug -> Run to Cursor  
6. Press F5 until our HTML page gets  back control 

Breakpoint hit at line 762 in class com.hhu.XaTestBean by thread default task-15.
Thread default task-15 stopped at XaTestBean.java:762.

ut.beginn() translates to the following XA calls
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.setTransactionTimeout at line 1063 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.start at line 163 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.saveAndAlterAutoCommitModeForGlobalTransaction at line 797 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.checkError at line 1188 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.createOrUpdateXid at line 296 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.getMatchingXidListEntry at line 234 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.enterGlobalTxnMode at line 1179 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameRM at line 995 by thread default task-15.
-> Transaction for bankA is started, isSameRM() is used to check whether bankA and bankB are using the same Resource Manager 
   As we have 2 different RAC systems this call should return false ! 

Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.getConnection at line 945 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.setTransactionTimeout at line 1063 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.start at line 163 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.saveAndAlterAutoCommitModeForGlobalTransaction at line 797 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.checkError at line 1188 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.createOrUpdateXid at line 296 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.getMatchingXidListEntry at line 234 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.enterGlobalTxnMode at line 1179 by thread default task-15.
-> Transaction for bankB is started 

ut.commit() translates to the following XA calls 
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.end at line 358 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.suspendStacked at line 430 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameXid at line 256 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isXidSuspended at line 398 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.getMatchingXidListEntry at line 234 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.removeXidFromList at line 244 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameXid at line 256 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isXidListEmpty at line 274 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.exitGlobalTxnMode at line 1167 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.checkError at line 1188 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.restoreAutoCommitModeForGlobalTransaction at line 768 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-15.
-> For bankA the XA transaction is finished by OracleXAResource.end() and prepared by OracleXAResource.prepare()

Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.end at line 358 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.suspendStacked at line 430 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameXid at line 256 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isXidSuspended at line 398 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.getMatchingXidListEntry at line 234 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.removeXidFromList at line 244 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameXid at line 256 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isXidListEmpty at line 274 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.exitGlobalTxnMode at line 1167 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.checkError at line 1188 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.restoreAutoCommitModeForGlobalTransaction at line 768 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.prepare at line 693 by thread default task-15.
--> For bankB the XA transaction is finished by OracleXAResource.end() and prepared by OracleXAResource.prepare()
    Now as all XA branches are prepared the global transaction can be commited !

Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.commit at line 553 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.removeXidFromList at line 244 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameXid at line 256 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.exitGlobalTxnMode at line 1167 by thread default task-15.
--> Commit XA transaction at bankA 

Method breakpoint hit in oracle.jdbc.xa.client.OracleXAResource.commit at line 553 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.removeXidFromList at line 244 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.isSameXid at line 256 by thread default task-15.
Method breakpoint hit in oracle.jdbc.xa.OracleXAResource.exitGlobalTxnMode at line 1167 by thread default task-15.
--> Commit XA transaction at bankB 

Note : The Periodic Recovery Thread is also using our XAResources and tries to complete any failed XA transaction
        

 

SQL Scripts and Wildfly Scripts

Create tables : cr_tab.sql
connect scott/tiger@ract2-scan.grid12c.example.com:1521/banka 
drop table ACCOUNTS;
CREATE TABLE ACCOUNTS
( ACCOUNT varchar2(32) CONSTRAINT ACCOUNTS_PK PRIMARY KEY,
  BALANCE NUMBER(10,2) constraint S_LOWER_CHK  check (BALANCE >= 0)
);
insert into ACCOUNTS values ('User99_at_BANKA', 10000);
commit;
select * from global_name;
select * from accounts;
update Accounts set balance = balance - 11000;
rollback;

connect scott/tiger@ract2-scan.grid12c.example.com:1521/bankb 
drop table ACCOUNTS;
CREATE TABLE ACCOUNTS
( ACCOUNT varchar2(32) CONSTRAINT ACCOUNTS_PK PRIMARY KEY,
  BALANCE NUMBER(10,2) constraint S_LOWER_CHK  check (BALANCE >= 0)
);

insert into ACCOUNTS values ('User98_at_BANKB', 10000);
commit;
select global_name from global_name;
select * from accounts;
update Accounts set balance = balance - 11000;
rollback;

Test whether the records are locked : check_it.sql
connect scott/tiger@ract2-scan.grid12c.example.com:1521/banka 
select * from global_name;
select * from accounts for update;

connect scott/tiger@ract2-scan.grid12c.example.com:1521/bankb 
select global_name from global_name;
select * from accounts for update;

Generic script the test XA staus of both RAC databases: check_xa.sql 
set echo on
connect sys/sys@ract2-scan.grid12c.example.com:1521/banka as sysdba
@tx 
connect sys/sys@ract2-scan.grid12c.example.com:1521/bankb as sysdba
@tx 
exit

Script to display database details for distributed transactions: tx.sql
/*
   If you run a 2PC you need to run  tx.sql on both RAC databases .
   Sample:
    sqlplus  sys/sys@ract2-scan.grid12c.example.com:1521/banka as sysdba @tx
    sqlplus  sys/sys@ract2-scan.grid12c.example.com:1521/bankb as sysdba @tx
*/ 
set echo off
set linesize 132
set pagesize 132
col HOST_NAME format A30
col GLOBAL_TRAN_ID format A80
col GLOBALID format A60
col BRANCHID format A55
col HOST format A25
col OS_USER format A15
set echo on 
select host_name,instance_name from v$instance;
select * from  sys.dba_pending_transactions;
select STATE, LOCAL_TRAN_ID, GLOBAL_TRAN_ID , to_char(FAIL_TIME,'DD-MON-YY HH24:MI:SS') FAIL_TIME,
    to_char(FORCE_TIME,'DD-MON-YY HH24:MI:SS') FORCE_TIME,  to_char(RETRY_TIME,'DD-MON-YY HH24:MI:SS') RETRY_TIME, 
    HOST, OS_USER, COMMIT# from dba_2pc_pending;
set echo off

Script to track XA transactions within Wildfly:
[oracle@wls1 XA]$ cat xa.cli
/subsystem=transactions/log-store=log-store/:probe()
ls /subsystem=transactions/log-store=log-store/transactions

Source Code

  JSF Code: 
        bankInfo.xhtml  - JSF page to display Transaction Info, Exception Info, failed SQL
        index.xhtml     - Main Entry JSF page - displays Account details 
        jdbc_info.xhtml - JSF Page to diplay JDBC driver details, Exceptions, Transaction processing Steps
        rac_info.xhtml  - JSF Page to display RAC instances details

  Maven Project Object Model    
       pom.xml

  JAVA Code:  
       XaTestBean.java

Reference

Leave a Reply

Your email address will not be published. Required fields are marked *