Disclaimer: Note that your performance results may differ, as I ran my tests on a standard PC using VirtualBox VMs to host a 3-node cluster. My main concern is that slow network latency will produce inaccurate test results.
If you are able to rerun my tests, please post your results!
Download and compile TPM source
[oracle@gract1 PERF]$ setenv CLASSPATH $ORACLE_HOME/jdbc/lib/ojdbc6_g.jar:.
[oracle@gract1 PERF]$ javac Xatest_sw.java
[oracle@gract1 PERF]$ ls *.class
Message.class  stats.class  TXProducer.class  Xatest_sw.class  XAWorker.class
Preparing the system by running non-XA tests and tuning for COMMIT performance
Testing with a RAC 12.1.0.1 system (VirtualBox) having I/O problems

Test 1: Single session connected to a single RAC instance

$ java Xatest sql on 1 1000 0 jdbc:oracle:thin:@gract1:1521/ERP
+++ Connected to : ERP_3
++ Elapsed Time=24.347 [s]
++ Thread: 1  CPU used by this session: 42   SQL*Net roundtrips to/from client: 2007
++ Sum:       CPU used by this session: 42   SQL*Net roundtrips to/from client: 2007

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2004        0.00          0.01
  SQL*Net message from client                  2004        0.15          1.36
  row cache lock                                  1        0.00          0.00
  enq: HW - contention                            1        0.00          0.00
  gc current grant 2-way                          4        0.00          0.00
  db file sequential read                         2        0.00          0.00
  log file sync                                1001        0.08         21.59

Test summary:
- The single session executes about 2000 roundtrips (1000 inserts + 1000 commits).
- The system runs at about 40 TPS (1000 inserts / 24 s elapsed time).
- The log file sync wait event is very bad - the max wait time to get a redo log I/O done is 0.08 s.
- Note that nearly every commit is reported as a log file sync wait.

Conclusion:
- The system needs to be tuned for log file sync / commit performance.

Test 2: 2 sessions connected to 2 different RAC instances

$ java Xatest sql on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=32.529 [s]
++ Thread: 1  CPU used by this session: 46   SQL*Net roundtrips to/from client: 2007
++ Thread: 2  CPU used by this session: 58   SQL*Net roundtrips to/from client: 2006
++ Sum:       CPU used by this session: 104  SQL*Net roundtrips to/from client: 4013

Interpreting the TKPROF output:

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2004        0.00          0.01
  SQL*Net message from client                  2004        0.24          4.95
  row cache lock                                  2        0.00          0.00
  enq: HW - contention                            2        0.00          0.00
  gc current grant 2-way                          5        0.00          0.00
  db file sequential read                         2        0.00          0.00
  gc current multi block request                  2        0.00          0.00
  log file sync                                1002        0.42         28.81
  gc current block 2-way                          5        0.00          0.00
  gc cr block 2-way                               4        0.00          0.00
  gc current grant busy                           4        0.00          0.00

Test summary:
- Each session executes about 2000 roundtrips (1000 inserts + 1000 commits).
- A single session runs at about 30 TPS (1000 inserts / ~32 s elapsed time) = 60 TPS in total.
- Even though the 2nd session is connected to a 2nd instance, the workload does not scale up well (expected: ~80 TPS).
- The log file sync waits are now very, very bad - the max wait time to get a redo log I/O done increased from 0.08 s to 0.42 s.
- Again, each commit is reported as a log file sync wait (with 2 sessions we now have 2000 commits).

Conclusion:
- Here the root problem is obvious - all I/O on my 3-node VirtualBox RAC system uses the same physical disk.
- This explains well why the max wait time for log file sync increases to 0.42 s: we have doubled the I/O rate.
- To get proper XA test results I'll put my redo logs on an SSD disk first.
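One way to verify the commit latency problem (and later the effect of moving the redo logs to SSD) is to look at the wait-time histogram for log file sync. Below is a minimal JDBC sketch, assuming a user with access to gv$event_histogram; the URL and credentials are placeholders for this test environment.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Sketch: print the cluster-wide latency histogram for 'log file sync'.
public class LogFileSyncHistogram {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@gract1:1521/ERP";          // placeholder URL
        try (Connection con = DriverManager.getConnection(url, "scott", "tiger");
             PreparedStatement ps = con.prepareStatement(
                 "SELECT inst_id, wait_time_milli, wait_count " +
                 "FROM gv$event_histogram WHERE event = 'log file sync' " +
                 "ORDER BY inst_id, wait_time_milli");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.printf("inst %d  < %d ms : %d waits%n",
                        rs.getInt(1), rs.getInt(2), rs.getLong(3));
            }
        }
    }
}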
Test results after switching to SSD disks: 2 sessions / 2 RAC instances

[oracle@gract1 PERF]$ java Xatest sql on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
TRUNCATE- Thread: 1
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=5.062 [s]
++ Thread: 1  CPU used by this session: 51   SQL*Net roundtrips to/from client: 2007
++ Thread: 2  CPU used by this session: 55   SQL*Net roundtrips to/from client: 2006
++ Sum:       CPU used by this session: 106  SQL*Net roundtrips to/from client: 4013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2004        0.00          0.01
  SQL*Net message from client                  2004        0.10          2.36
  row cache lock                                  2        0.00          0.00
  gc current grant 2-way                          4        0.00          0.00
  db file sequential read                         3        0.01          0.01
  gc current multi block request                  2        0.00          0.00
  log file sync                                 848        0.00          0.61
  library cache pin                               1        0.00          0.00
  KJC: Wait for msg sends to complete             1        0.00          0.00
  gc current block 2-way                          3        0.00          0.00
  gc cr block 2-way                               2        0.00          0.00
  gc current grant busy                           3        0.05          0.05
  enq: TT - contention                            1        0.00          0.00
  Disk file operations I/O                        1        0.00          0.00
  enq: FB - contention                            1        0.00          0.00

Test summary:
- log file sync is still the dominant wait event, but its total wait time dropped from 28 s to 0.61 s.
- The TX rate is now about 200 TPS (not bad, running all 6 cluster redo logs on a single SSD disk).

Conclusion:
- Our reference platform, using 2 threads / 2 instances and doing 2000 inserts / 4000 roundtrips, performs at a rate of 200 TPS.
- The overall time for running the test case is around 5.0 seconds.
- In our SQL tests we treat 2 inserts as a single TX to be comparable to a real XA 2PC commit.
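For reference, the roundtrip counts above (about 2 roundtrips per insert/commit pair) match a plain JDBC loop like the following. This is only a sketch of what the sql mode of Xatest presumably does - the actual source may differ, and the table name, user and password are illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Sketch of a non-XA worker: 1000 single-row inserts, each followed by a commit.
// Every iteration costs roughly 2 SQL*Net roundtrips (execute + commit) and
// triggers one log file sync on the server.
public class SqlWorker {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@gract1:1521/ERP";          // instance to test
        try (Connection con = DriverManager.getConnection(url, "scott", "tiger")) {
            con.setAutoCommit(false);
            long start = System.nanoTime();
            try (PreparedStatement ps =
                     con.prepareStatement("INSERT INTO perf_test (id, payload) VALUES (?, ?)")) {
                for (int i = 0; i < 1000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row " + i);
                    ps.executeUpdate();   // roundtrip 1
                    con.commit();         // roundtrip 2 -> log file sync on the server
                }
            }
            System.out.printf("Elapsed Time=%.3f [s]%n", (System.nanoTime() - start) / 1e9);
        }
    }
}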
Running XA tests
Test XA1: 2 sessions pointing to a single RAC instance

[oracle@gract1 PERF]$ java Xatest xa on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract1:1521/ERP
+++ Connected to : ERP_3
+++ Connected to : ERP_3
++ Elapsed Time=9.76 [s]
++ Thread: 1  CPU used by this session: 77   SQL*Net roundtrips to/from client: 4574
++ Thread: 2  CPU used by this session: 71   SQL*Net roundtrips to/from client: 4439
++ Sum:       CPU used by this session: 148  SQL*Net roundtrips to/from client: 9013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    4437        0.00          0.01
  SQL*Net message from client                  4437        0.25          5.69
  DFS lock handle                               858        0.00          0.32
  rdbms ipc reply                              1469        0.01          1.15
  latch free                                    389        0.00          0.00
  latch: ges resource hash list                 388        0.00          0.00
  log file sync                                 837        0.00          0.71
  Disk file operations I/O                        2        0.00          0.00
  KJC: Wait for msg sends to complete            21        0.00          0.00
  row cache lock                                  2        0.00          0.00
  enq: TT - contention                            1        0.00          0.00
  resmgr:cpu quantum                              1        0.00          0.00
  library cache lock                              1        0.00          0.00
  library cache pin                               1        0.00          0.00
  enq: FB - contention                            1        0.00          0.00
  gc current multi block request                  1        0.00          0.00

Test summary:
- Server roundtrips increase from about 2000 to about 4500 per session. Why? The roundtrips for a 2PC TX can be calculated as follows:

      Branch 1            Branch 2            Roundtrips
      xa_start BR1        xa_start BR2             2
      DML BR1             DML BR2                  2
      xa_end BR1          xa_end BR2               2
      xa_prepare BR1      xa_prepare BR2           2
      -                   xa_commit                1
      --> 9 roundtrips per XA TX (a code sketch of this call sequence follows the XA test results below)

- Each session runs 500 XA TX, so about 4500 roundtrips per session are well explainable.
- For synchronizing cluster-wide XA TX, the DFS lock handle waits take 0.32 s in total - not too much, but this lock serializes all XA work.
- When LGWR is waiting (via a semtimedop call) for posts from the user sessions, that wait time is accounted to the 'rdbms ipc message' event (see the orainternals post referenced below - still need to verify this).

Conclusion:
- The TX rate dropped from 200 TPS to about 100 TPS, which is well explainable as we have roughly doubled the roundtrips for our XA operations.

Test XA2: 2 sessions connecting to 2 different instances - branch affinity to a specific RAC instance

Transaction layout: all XA operations of a given XA branch run on the same RAC instance:

      Instance 1                         Instance 2
      xa_start BR1                       xa_start BR2
      DML BR1                            DML BR2
      xa_end BR1                         xa_end BR2
      xa_prepare BR1 --> XA_RDONLY       xa_prepare BR2 --> XA_OK
      -- do nothing here --              xa_commit BR2

[oracle@gract1 PERF]$ java Xatest xa on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=35.347 [s]
++ Thread: 1  CPU used by this session: 249  SQL*Net roundtrips to/from client: 4779
++ Thread: 2  CPU used by this session: 228  SQL*Net roundtrips to/from client: 4234
++ Sum:       CPU used by this session: 477  SQL*Net roundtrips to/from client: 9013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    4776        0.00          0.02
  SQL*Net message from client                  4776        0.13         11.00
  DFS lock handle                              2550        1.00         10.28
  rdbms ipc reply                              1474        0.01          0.16
  row cache lock                                  1        0.00          0.00
  gc current grant busy                         809        0.04          0.38
  gc current grant 2-way                          4        0.00          0.00
  db file sequential read                         2        0.00          0.00
  gc current multi block request                  2        0.00          0.00
  enq: FB - contention                            1        0.00          0.00
  KJC: Wait for msg sends to complete            80        0.00          0.00
  log file sync                                4116        0.01          4.82
  library cache pin                               1        0.00          0.00
  gc current block 2-way                        180        0.00          0.08
  gc cr block 2-way                              65        0.00          0.02
  enq: TX - contention                          773        0.01          1.27
  Disk file operations I/O                        1        0.00          0.00
  gc current block busy                         533        0.00          1.08
  latch: ges resource hash list                1912        0.00          0.10
  resmgr:cpu quantum                              1        0.00          0.00
  transaction                                     1        1.00          1.00
  latch: enqueue hash chains                      2        0.00          0.00
  global enqueue expand wait                      2        0.00          0.00
  gc buffer busy release                          3        0.00          0.00
  latch free                                      2        0.00          0.00
  latch: row cache objects                        1        0.00          0.00
  gc buffer busy acquire                          1        0.00          0.00

Test summary:
- DFS lock handle waits increase dramatically from 0.32 s to 10.28 s, with a max single wait of 1.0 s.
- log file sync increases from 837 waits taking 0.71 s to 4116 waits taking 4.82 s.
- We see some significant cluster waits: gc current block busy with 533 waits taking 1.08 s.
- We see some lock contention: enq: TX - contention with 773 waits taking 1.27 s.

Conclusion:
- The overall performance drops from 100 TPS - when all branches point to the same instance - to about 30 TPS.

Test XA3: 2 sessions connecting to 2 different instances - NO branch affinity to a specific RAC instance

      Instance 1                         Instance 2
      xa_start BR1                       xa_start BR2
      DML BR1                            DML BR2
      xa_end BR1                         xa_end BR2
            <--- call switch_xid() --->
      xa_prepare BR2 --> XA_RDONLY       xa_prepare BR1 --> XA_OK
      -- do nothing here --              xa_commit BR1

[oracle@gract1 PERF]$ java Xatest_sw xa on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=39.402 [s]
++ Thread: 1  CPU used by this session: 270  SQL*Net roundtrips to/from client: 4793
++ Thread: 2  CPU used by this session: 234  SQL*Net roundtrips to/from client: 4220
++ Sum:       CPU used by this session: 504  SQL*Net roundtrips to/from client: 9013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    4790        0.00          0.02
  SQL*Net message from client                  4790        2.53         14.26
  DFS lock handle                              2666        1.00         11.94
  row cache lock                                  3        0.00          0.00
  enq: HW - contention                            2        0.00          0.00
  gc current grant 2-way                          5        0.00          0.00
  db file sequential read                         2        0.00          0.00
  gc current multi block request                  2        0.00          0.00
  KJC: Wait for msg sends to complete            52        0.00          0.00
  log file sync                                4163        0.00          4.94
  enq: TX - contention                          786        0.01          1.32
  Disk file operations I/O                        1        0.00          0.00
  gc current block busy                         551        0.00          1.17
  gc current grant busy                         817        0.02          0.26
  library cache pin                               1        0.00          0.00
  gc current block 2-way                        178        0.00          0.09
  gc cr block 2-way                              31        0.00          0.01
  gc current block 3-way                          2        0.00          0.00
  enq: FB - contention                            1        0.00          0.00
  latch: ges resource hash list                2238        0.00          0.12
  rdbms ipc reply                              1607        0.00          0.15
  latch free                                     10        0.00          0.00
  latch: enqueue hash chains                      3        0.00          0.00
  gc buffer busy acquire                          1        0.00          0.00
  gc buffer busy release                          3        0.00          0.00
  latch: row cache objects                        1        0.00          0.00
  global enqueue expand wait                      1        0.00          0.00

Test summary:
- About 2 seconds of the roughly 4 s longer application runtime are accounted to the additional DFS lock handle wait time.
- DFS lock handle increases from 2550 waits taking 10.28 s to 2666 waits taking 11.94 s.

Conclusion:
- Running the prepare and commit processing on a different instance than the SQL work drops the TPS rate by another roughly 10%.
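The 9-roundtrip sequence listed under test XA1 corresponds to a tightly coupled 2PC flow over the Oracle JDBC XA classes. The following is only a simplified sketch of what the XAWorker presumably does - the real test source is not shown here, so the class name, Xid values, table name and credentials are illustrative. The two branches share one global transaction ID and differ only in the branch qualifier; with branch affinity both data sources point to the same URL, for the cross-instance tests they point to gract1 and gract2.

import java.sql.Connection;
import java.sql.Statement;
import javax.sql.XAConnection;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import oracle.jdbc.xa.OracleXid;
import oracle.jdbc.xa.client.OracleXADataSource;

// Sketch: one tightly coupled XA transaction with two branches (BR1, BR2).
public class XaTwoBranchSketch {
    static XAConnection open(String url) throws Exception {
        OracleXADataSource ds = new OracleXADataSource();
        ds.setURL(url);
        ds.setUser("scott");
        ds.setPassword("tiger");
        return ds.getXAConnection();
    }

    public static void main(String[] args) throws Exception {
        XAConnection xac1 = open("jdbc:oracle:thin:@gract1:1521/ERP");
        XAConnection xac2 = open("jdbc:oracle:thin:@gract2:1521/ERP");
        XAResource xar1 = xac1.getXAResource();
        XAResource xar2 = xac2.getXAResource();
        Connection c1 = xac1.getConnection();
        Connection c2 = xac2.getConnection();

        byte[] gtrid = {0x01, 0x02, 0x03};                       // shared global transaction id
        Xid br1 = new OracleXid(0x1234, gtrid, new byte[]{1});   // branch qualifier 1
        Xid br2 = new OracleXid(0x1234, gtrid, new byte[]{2});   // branch qualifier 2

        xar1.start(br1, XAResource.TMNOFLAGS);                   // RT 1
        xar2.start(br2, XAResource.TMNOFLAGS);                   // RT 2
        try (Statement s1 = c1.createStatement(); Statement s2 = c2.createStatement()) {
            s1.executeUpdate("INSERT INTO perf_test VALUES (1, 'branch 1')");   // RT 3
            s2.executeUpdate("INSERT INTO perf_test VALUES (2, 'branch 2')");   // RT 4
        }
        xar1.end(br1, XAResource.TMSUCCESS);                     // RT 5
        xar2.end(br2, XAResource.TMSUCCESS);                     // RT 6

        int rc1 = xar1.prepare(br1);                             // RT 7
        int rc2 = xar2.prepare(br2);                             // RT 8
        if (rc2 == XAResource.XA_OK) {
            xar2.commit(br2, false);                             // RT 9 - second phase of 2PC
        }
        if (rc1 == XAResource.XA_OK) {
            xar1.commit(br1, false);                             // only needed if BR1 was not read-only
        }
        c1.close();
        c2.close();
        xac1.close();
        xac2.close();
    }
}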
Overall Summary
- Using cluster-wide transactions across multiple cluster instances can have a huge performance impact – try to avoid that!
- If the SQL processing and the prepare call run on different instances, we can expect another 10% performance reduction.
- Note that the test results may differ depending on platform and Oracle release – they should only give you a rough idea of what to expect when you use cluster-wide transactions.
- If you are using cluster-wide XA transactions, still try to use only a single instance. In that case DFS lock handle waits still occur – but there are fewer of them, with much lower wait times (see the sketch after this list).
- In any case, don't expect a huge performance gain from distributing your XA transactions across multiple instances.
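One simple way to keep all branches on a single instance, as done in test XA1 above, is to point every branch's XADataSource at the same instance (or at a service that runs on only one instance). A short sketch - the host name comes from the test setup, while user and password are placeholders:

import oracle.jdbc.xa.client.OracleXADataSource;

// Sketch: give both XA branches affinity to one RAC instance by using the
// same connect string for every branch data source (here node gract1).
public class SingleInstanceAffinity {
    public static OracleXADataSource branchDataSource() throws Exception {
        OracleXADataSource ds = new OracleXADataSource();
        ds.setURL("jdbc:oracle:thin:@gract1:1521/ERP");   // same URL for every branch
        ds.setUser("scott");
        ds.setPassword("tiger");
        return ds;
    }
}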
A closer look at the DFS lock handle wait
Comparing waits for 10 XA transactions doing 2PC with both branches pointing to the same instance

$ java Xatest_sw xa on 2 10 1 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract1:1521/ERP
WAIT #0: nam='DFS lock handle' ela= 661 type|mode=1146617859 id1=3336165566 id2=0 obj#=1 tim=36404200504
WAIT #0: nam='DFS lock handle' ela= 5401 type|mode=1146617861 id1=3336165566 id2=0 obj#=1 tim=36404327920
WAIT #0: nam='DFS lock handle' ela= 636 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36404399133
WAIT #0: nam='DFS lock handle' ela= 1966 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36404401476
WAIT #0: nam='DFS lock handle' ela= 366 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36404412218
WAIT #0: nam='DFS lock handle' ela= 5543 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36404418375
--> Only 6 waits - max ela time 5543 µs (~5.5 ms)

Comparing waits for 10 XA transactions doing 2PC with branches pointing to different RAC instances

$ java Xatest_sw xa on 2 10 1 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
WAIT #0: nam='DFS lock handle' ela= 31029 type|mode=1146617861 id1=3336165566 id2=0 obj#=92453 tim=36642306182
WAIT #0: nam='DFS lock handle' ela= 491 type|mode=1146617861 id1=3336165566 id2=0 obj#=92453 tim=36642332322
WAIT #0: nam='DFS lock handle' ela= 452 type|mode=1146617859 id1=3336165566 id2=0 obj#=92453 tim=36642339941
WAIT #0: nam='DFS lock handle' ela= 3880 type|mode=1146617861 id1=2419845326 id2=0 obj#=92453 tim=36642367345
WAIT #0: nam='DFS lock handle' ela= 4839 type|mode=1146617861 id1=2419845326 id2=0 obj#=-1 tim=36642374473
WAIT #0: nam='DFS lock handle' ela= 3269 type|mode=1146617861 id1=2419845326 id2=0 obj#=-1 tim=36642377996
WAIT #0: nam='DFS lock handle' ela= 3580 type|mode=1146617859 id1=2419845326 id2=0 obj#=-1 tim=36642384805
WAIT #0: nam='DFS lock handle' ela= 1634 type|mode=1146617861 id1=2419845326 id2=0 obj#=-1 tim=36642386597
WAIT #0: nam='DFS lock handle' ela= 3339 type|mode=1146617861 id1=1247011735 id2=0 obj#=-1 tim=36642407619
WAIT #0: nam='DFS lock handle' ela= 3747 type|mode=1146617861 id1=1247011735 id2=0 obj#=-1 tim=36642411531
WAIT #0: nam='DFS lock handle' ela= 2728 type|mode=1146617859 id1=1247011735 id2=0 obj#=-1 tim=36642415863
WAIT #0: nam='DFS lock handle' ela= 366 type|mode=1146617859 id1=1247011735 id2=0 obj#=-1 tim=36642425262
WAIT #0: nam='DFS lock handle' ela= 6349 type|mode=1146617859 id1=2012011321 id2=0 obj#=-1 tim=36642451711
WAIT #0: nam='DFS lock handle' ela= 3268 type|mode=1146617861 id1=2012011321 id2=0 obj#=-1 tim=36642455143
WAIT #0: nam='DFS lock handle' ela= 2643 type|mode=1146617861 id1=4227502411 id2=0 obj#=-1 tim=36642474412
WAIT #0: nam='DFS lock handle' ela= 1959 type|mode=1146617859 id1=4227502411 id2=0 obj#=-1 tim=36642483878
WAIT #0: nam='DFS lock handle' ela= 1211 type|mode=1146617861 id1=4227502411 id2=0 obj#=-1 tim=36642485179
WAIT #0: nam='DFS lock handle' ela= 11718 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36642511084
WAIT #0: nam='DFS lock handle' ela= 5743 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36642517024
WAIT #0: nam='DFS lock handle' ela= 3634 type|mode=1146617859 id1=1372417527 id2=0 obj#=-1 tim=36642522303
WAIT #0: nam='DFS lock handle' ela= 829 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36642523311
WAIT #0: nam='DFS lock handle' ela= 2886 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36642543243
WAIT #0: nam='DFS lock handle' ela= 1796 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36642545280
WAIT #0: nam='DFS lock handle' ela= 265 type|mode=1146617859 id1=4217040809 id2=0 obj#=-1 tim=36642547563
WAIT #0: nam='DFS lock handle' ela= 1333 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36642552281
WAIT #0: nam='DFS lock handle' ela= 6677 type|mode=1146617861 id1=233389099 id2=0 obj#=-1 tim=36642584821
WAIT #0: nam='DFS lock handle' ela= 3113 type|mode=1146617861 id1=233389099 id2=0 obj#=-1 tim=36642589082
WAIT #0: nam='DFS lock handle' ela= 5758 type|mode=1146617859 id1=233389099 id2=0 obj#=-1 tim=36642599617
WAIT #0: nam='DFS lock handle' ela= 3695 type|mode=1146617861 id1=233389099 id2=0 obj#=-1 tim=36642603466
WAIT #0: nam='DFS lock handle' ela= 14638 type|mode=1146617861 id1=4083478780 id2=0 obj#=0 tim=36642637211
WAIT #0: nam='DFS lock handle' ela= 3250 type|mode=1146617861 id1=4083478780 id2=0 obj#=0 tim=36642643439
WAIT #0: nam='DFS lock handle' ela= 12738 type|mode=1146617859 id1=4083478780 id2=0 obj#=0 tim=36642660398
WAIT #0: nam='DFS lock handle' ela= 3571 type|mode=1146617861 id1=4083478780 id2=0 obj#=0 tim=36642664149
WAIT #0: nam='DFS lock handle' ela= 8979 type|mode=1146617861 id1=1699082971 id2=0 obj#=0 tim=36642695974
WAIT #0: nam='DFS lock handle' ela= 3209 type|mode=1146617861 id1=1699082971 id2=0 obj#=-1 tim=36642700417
--> Many more waits with increased wait times - the max ela time is quite often > 10,000 µs (10 ms)
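The type|mode values in the trace lines above can be decoded by treating the two high-order bytes as the ASCII enqueue type and the low-order 16 bits as the requested lock mode (the approach described in the orainternals DFS lock handle post referenced below). A small sketch:

// Sketch: decode the type|mode parameter of a 'DFS lock handle' wait.
public class DecodeDfsLockHandle {
    public static void main(String[] args) {
        long[] samples = {1146617859L, 1146617861L};   // values seen in the traces above
        for (long p1 : samples) {
            char c1 = (char) ((p1 >> 24) & 0xFF);      // first character of the lock type
            char c2 = (char) ((p1 >> 16) & 0xFF);      // second character of the lock type
            long mode = p1 & 0xFFFF;                   // requested lock mode
            System.out.printf("type|mode=%d -> enqueue %c%c, mode %d%n", p1, c1, c2, mode);
        }
    }
}

Both values seen in the traces decode to the DX (distributed transaction) enqueue, requested in modes 3 and 5 - consistent with the white paper extract below.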
An Oracle White Paper, June 2010 – XA and Oracle Controlled Distributed Transactions
- Extract from the PDF referenced below:
Distributed Transactions and Database Locking All tightly coupled distributed transactions acquire a DX (Distributed Transaction) enqueue while they are actively working on the transaction. This enqueue (memory lock) is used to serialize access to the database across multiple branches of a distributed transaction, such that only one branch of the transaction can have SQL active at any point in time. Within a single instance database deployment all distributed transactions are tightly coupled by default. The XA transaction manager can override this at the start of a distributed transaction. Similarly within a multiple instance RAC environment all distributed transactions are tightly coupled by default. However prior to Oracle Database 11gR1 RAC, DX enqueues were maintained independently within each instance. Since Oracle Database 11gR1 RAC this has changed and a cluster wide DX enqueue is acquired for each tightly coupled distributed transaction. This cluster wide DX enqueue is managed by the Global Enqueue Service (GES) within RAC, and control of this enqueue is transferred between database instances when distributed transactions span more than one database instance in the cluster. However, only one tightly coupled transaction branch can be active at any one time.
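To observe the cluster-wide DX enqueue described in the extract while an XA transaction is in flight, gv$lock can be queried for lock type 'DX'. A minimal JDBC sketch - the URL and credentials are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch: list sessions currently holding or requesting the DX
// (distributed transaction) enqueue across all RAC instances.
public class ShowDxEnqueues {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@gract1:1521/ERP";
        try (Connection con = DriverManager.getConnection(url, "system", "manager");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT inst_id, sid, lmode, request FROM gv$lock WHERE type = 'DX'")) {
            while (rs.next()) {
                System.out.printf("inst %d sid %d holds mode %d, requests mode %d%n",
                        rs.getInt(1), rs.getInt(2), rs.getInt(3), rs.getInt(4));
            }
        }
    }
}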
References
- http://orainternals.wordpress.com/2011/11/08/troubleshooting-dfs-lock-handle-waits/
- http://orainternals.wordpress.com/2008/07/07/tuning-log-file-sync-wait-events/
- http://www.oracle.com/technetwork/database/clustering/overview/distributed-transactions-and-xa-163941.pdf
- https://orastory.wordpress.com/category/dx-enqueue/