Disclaimer: Note that your performance results may differ, as I ran my tests on a standard PC using VirtualBox VMs to host a 3-node cluster. My main concern is that slow network latency will produce inaccurate test results.
If you are able to rerun my tests, please post your results!
Download and compile TPM source
[oracle@gract1 PERF]$ setenv CLASSPATH $ORACLE_HOME/jdbc/lib/ojdbc6_g.jar:.
[oracle@gract1 PERF]$ javac Xatest_sw.java
[oracle@gract1 PERF]$ ls *.class
Message.class  stats.class  TXProducer.class  Xatest_sw.class  XAWorker.class
Preparing the system by running non-XA tests and tuning for COMMIT performance
Testing with a RAC 12.1.0.1 system (VirtualBox) having I/O problems

Test 1: Single session connected to a single RAC instance

$ java Xatest sql on 1 1000 0 jdbc:oracle:thin:@gract1:1521/ERP
+++ Connected to : ERP_3
++ Elapsed Time=24.347 [s]
++ Thread: 1  CPU used by this session: 42   SQL*Net roundtrips to/from client: 2007
++ Sum:       CPU used by this session: 42   SQL*Net roundtrips to/from client: 2007

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2004        0.00          0.01
  SQL*Net message from client                  2004        0.15          1.36
  row cache lock                                  1        0.00          0.00
  enq: HW - contention                            1        0.00          0.00
  gc current grant 2-way                          4        0.00          0.00
  db file sequential read                         2        0.00          0.00
  log file sync                                1001        0.08         21.59

Test summary:
- The single session executes about 2000 roundtrips (1000 inserts + 1000 commits).
- The system runs at about 40 TPS (1000 inserts / 24 s elapsed time).
- The log file sync wait event is very bad - the max wait time to get a redo log I/O done is 0.08 s.
- Note that nearly every commit is reported as a log file sync wait.

Conclusion:
- The system needs to be tuned for log file sync / commit performance.

Test 2: 2 sessions connected to 2 different RAC instances

$ java Xatest sql on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=32.529 [s]
++ Thread: 1  CPU used by this session: 46   SQL*Net roundtrips to/from client: 2007
++ Thread: 2  CPU used by this session: 58   SQL*Net roundtrips to/from client: 2006
++ Sum:       CPU used by this session: 104  SQL*Net roundtrips to/from client: 4013

Interpreting the TKPROF output:

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2004        0.00          0.01
  SQL*Net message from client                  2004        0.24          4.95
  row cache lock                                  2        0.00          0.00
  enq: HW - contention                            2        0.00          0.00
  gc current grant 2-way                          5        0.00          0.00
  db file sequential read                         2        0.00          0.00
  gc current multi block request                  2        0.00          0.00
  log file sync                                1002        0.42         28.81
  gc current block 2-way                          5        0.00          0.00
  gc cr block 2-way                               4        0.00          0.00
  gc current grant busy                           4        0.00          0.00

Test summary:
- Each session executes about 2000 roundtrips (1000 inserts + 1000 commits).
- A single session runs at about 30 TPS (1000 inserts / ~32 s elapsed time) = 60 TPS in total.
- Even though the 2nd session is connected to a 2nd instance, the workload does not scale up well (expected: ~80 TPS).
- The log file sync waits are now very, very bad - the max wait time to get a redo log I/O done increased from 0.08 s to 0.42 s.
- Again, each commit is reported as a log file sync wait (with 2 sessions we now have 2000 commits).

Conclusion:
- Here the root problem is obvious - all I/O on my 3-node VirtualBox RAC system uses the same physical disk.
- This explains well why the max wait time for log file sync increases to 0.42 s: we have doubled the I/O rate.
- To get proper XA test results I'll put my redo logs on an SSD disk first.
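One way to verify the commit latency problem (and later the effect of moving the redo logs to SSD) is to look at the wait-time histogram for log file sync. Below is a minimal JDBC sketch, assuming a user with access to gv$event_histogram; the URL and credentials are placeholders for this test environment.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Sketch: print the cluster-wide latency histogram for 'log file sync'.
public class LogFileSyncHistogram {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@gract1:1521/ERP";          // placeholder URL
        try (Connection con = DriverManager.getConnection(url, "scott", "tiger");
             PreparedStatement ps = con.prepareStatement(
                 "SELECT inst_id, wait_time_milli, wait_count " +
                 "FROM gv$event_histogram WHERE event = 'log file sync' " +
                 "ORDER BY inst_id, wait_time_milli");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.printf("inst %d  < %d ms : %d waits%n",
                        rs.getInt(1), rs.getInt(2), rs.getLong(3));
            }
        }
    }
}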
Test results after switching to SSD disks: 2 sessions / 2 RAC instances

[oracle@gract1 PERF]$ java Xatest sql on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
TRUNCATE- Thread: 1
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=5.062 [s]
++ Thread: 1  CPU used by this session: 51   SQL*Net roundtrips to/from client: 2007
++ Thread: 2  CPU used by this session: 55   SQL*Net roundtrips to/from client: 2006
++ Sum:       CPU used by this session: 106  SQL*Net roundtrips to/from client: 4013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    2004        0.00          0.01
  SQL*Net message from client                  2004        0.10          2.36
  row cache lock                                  2        0.00          0.00
  gc current grant 2-way                          4        0.00          0.00
  db file sequential read                         3        0.01          0.01
  gc current multi block request                  2        0.00          0.00
  log file sync                                 848        0.00          0.61
  library cache pin                               1        0.00          0.00
  KJC: Wait for msg sends to complete             1        0.00          0.00
  gc current block 2-way                          3        0.00          0.00
  gc cr block 2-way                               2        0.00          0.00
  gc current grant busy                           3        0.05          0.05
  enq: TT - contention                            1        0.00          0.00
  Disk file operations I/O                        1        0.00          0.00
  enq: FB - contention                            1        0.00          0.00

Test summary:
- log file sync is still the dominant wait event, but its total wait time dropped from 28 s to 0.61 s.
- The TX rate is now about 200 TPS (not bad, running all 6 cluster redo logs on a single SSD disk).

Conclusion:
- Our reference platform, using 2 threads / 2 instances and doing 2000 inserts / 4000 roundtrips, performs at a rate of 200 TPS.
- The overall time for running the test case is around 5.0 seconds.
- In our SQL tests we treat 2 inserts as a single TX to be comparable to a real XA 2PC commit.
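For reference, the roundtrip counts above (about 2 roundtrips per insert/commit pair) match a plain JDBC loop like the following. This is only a sketch of what the sql mode of Xatest presumably does - the actual source may differ, and the table name, user and password are illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Sketch of a non-XA worker: 1000 single-row inserts, each followed by a commit.
// Every iteration costs roughly 2 SQL*Net roundtrips (execute + commit) and
// triggers one log file sync on the server.
public class SqlWorker {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@gract1:1521/ERP";          // instance to test
        try (Connection con = DriverManager.getConnection(url, "scott", "tiger")) {
            con.setAutoCommit(false);
            long start = System.nanoTime();
            try (PreparedStatement ps =
                     con.prepareStatement("INSERT INTO perf_test (id, payload) VALUES (?, ?)")) {
                for (int i = 0; i < 1000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, "row " + i);
                    ps.executeUpdate();   // roundtrip 1
                    con.commit();         // roundtrip 2 -> log file sync on the server
                }
            }
            System.out.printf("Elapsed Time=%.3f [s]%n", (System.nanoTime() - start) / 1e9);
        }
    }
}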
Running XA tests
Test XA1: 2 sessions pointing to a single RAC instance

[oracle@gract1 PERF]$ java Xatest xa on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract1:1521/ERP
+++ Connected to : ERP_3
+++ Connected to : ERP_3
++ Elapsed Time=9.76 [s]
++ Thread: 1  CPU used by this session: 77   SQL*Net roundtrips to/from client: 4574
++ Thread: 2  CPU used by this session: 71   SQL*Net roundtrips to/from client: 4439
++ Sum:       CPU used by this session: 148  SQL*Net roundtrips to/from client: 9013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    4437        0.00          0.01
  SQL*Net message from client                  4437        0.25          5.69
  DFS lock handle                               858        0.00          0.32
  rdbms ipc reply                              1469        0.01          1.15
  latch free                                    389        0.00          0.00
  latch: ges resource hash list                 388        0.00          0.00
  log file sync                                 837        0.00          0.71
  Disk file operations I/O                        2        0.00          0.00
  KJC: Wait for msg sends to complete            21        0.00          0.00
  row cache lock                                  2        0.00          0.00
  enq: TT - contention                            1        0.00          0.00
  resmgr:cpu quantum                              1        0.00          0.00
  library cache lock                              1        0.00          0.00
  library cache pin                               1        0.00          0.00
  enq: FB - contention                            1        0.00          0.00
  gc current multi block request                  1        0.00          0.00

Test summary:
- Server roundtrips increase from about 2000 to about 4500 per session. Why? The roundtrips for a 2PC TX can be calculated as follows:

      Branch 1            Branch 2            Roundtrips
      xa_start BR1        xa_start BR2             2
      DML BR1             DML BR2                  2
      xa_end BR1          xa_end BR2               2
      xa_prepare BR1      xa_prepare BR2           2
      -                   xa_commit                1
      --> 9 roundtrips per XA TX (a code sketch of this call sequence follows the XA test results below)

- Each session runs 500 XA TX, so about 4500 roundtrips per session are well explainable.
- For synchronizing cluster-wide XA TX, the DFS lock handle waits take 0.32 s in total - not too much, but this lock serializes all XA work.
- When LGWR is waiting (via a semtimedop call) for posts from the user sessions, that wait time is accounted to the 'rdbms ipc message' event (see the orainternals post referenced below - still need to verify this).

Conclusion:
- The TX rate dropped from 200 TPS to about 100 TPS, which is well explainable as we have roughly doubled the roundtrips for our XA operations.

Test XA2: 2 sessions connecting to 2 different instances - branch affinity to a specific RAC instance

Transaction layout: all XA operations of a given XA branch run on the same RAC instance:

      Instance 1                         Instance 2
      xa_start BR1                       xa_start BR2
      DML BR1                            DML BR2
      xa_end BR1                         xa_end BR2
      xa_prepare BR1 --> XA_RDONLY       xa_prepare BR2 --> XA_OK
      -- do nothing here --              xa_commit BR2

[oracle@gract1 PERF]$ java Xatest xa on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=35.347 [s]
++ Thread: 1  CPU used by this session: 249  SQL*Net roundtrips to/from client: 4779
++ Thread: 2  CPU used by this session: 228  SQL*Net roundtrips to/from client: 4234
++ Sum:       CPU used by this session: 477  SQL*Net roundtrips to/from client: 9013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    4776        0.00          0.02
  SQL*Net message from client                  4776        0.13         11.00
  DFS lock handle                              2550        1.00         10.28
  rdbms ipc reply                              1474        0.01          0.16
  row cache lock                                  1        0.00          0.00
  gc current grant busy                         809        0.04          0.38
  gc current grant 2-way                          4        0.00          0.00
  db file sequential read                         2        0.00          0.00
  gc current multi block request                  2        0.00          0.00
  enq: FB - contention                            1        0.00          0.00
  KJC: Wait for msg sends to complete            80        0.00          0.00
  log file sync                                4116        0.01          4.82
  library cache pin                               1        0.00          0.00
  gc current block 2-way                        180        0.00          0.08
  gc cr block 2-way                              65        0.00          0.02
  enq: TX - contention                          773        0.01          1.27
  Disk file operations I/O                        1        0.00          0.00
  gc current block busy                         533        0.00          1.08
  latch: ges resource hash list                1912        0.00          0.10
  resmgr:cpu quantum                              1        0.00          0.00
  transaction                                     1        1.00          1.00
  latch: enqueue hash chains                      2        0.00          0.00
  global enqueue expand wait                      2        0.00          0.00
  gc buffer busy release                          3        0.00          0.00
  latch free                                      2        0.00          0.00
  latch: row cache objects                        1        0.00          0.00
  gc buffer busy acquire                          1        0.00          0.00

Test summary:
- DFS lock handle waits increase dramatically from 0.32 s to 10.28 s, with a max single wait of 1.0 s.
- log file sync increases from 837 waits taking 0.71 s to 4116 waits taking 4.82 s.
- We see some significant cluster waits: gc current block busy with 533 waits taking 1.08 s.
- We see some lock contention: enq: TX - contention with 773 waits taking 1.27 s.

Conclusion:
- The overall performance drops from 100 TPS - when all branches point to the same instance - to about 30 TPS.

Test XA3: 2 sessions connecting to 2 different instances - NO branch affinity to a specific RAC instance

      Instance 1                         Instance 2
      xa_start BR1                       xa_start BR2
      DML BR1                            DML BR2
      xa_end BR1                         xa_end BR2
            <--- call switch_xid() --->
      xa_prepare BR2 --> XA_RDONLY       xa_prepare BR1 --> XA_OK
      -- do nothing here --              xa_commit BR1

[oracle@gract1 PERF]$ java Xatest_sw xa on 2 1000 0 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
+++ Connected to : ERP_1
+++ Connected to : ERP_3
++ Elapsed Time=39.402 [s]
++ Thread: 1  CPU used by this session: 270  SQL*Net roundtrips to/from client: 4793
++ Thread: 2  CPU used by this session: 234  SQL*Net roundtrips to/from client: 4220
++ Sum:       CPU used by this session: 504  SQL*Net roundtrips to/from client: 9013

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                    4790        0.00          0.02
  SQL*Net message from client                  4790        2.53         14.26
  DFS lock handle                              2666        1.00         11.94
  row cache lock                                  3        0.00          0.00
  enq: HW - contention                            2        0.00          0.00
  gc current grant 2-way                          5        0.00          0.00
  db file sequential read                         2        0.00          0.00
  gc current multi block request                  2        0.00          0.00
  KJC: Wait for msg sends to complete            52        0.00          0.00
  log file sync                                4163        0.00          4.94
  enq: TX - contention                          786        0.01          1.32
  Disk file operations I/O                        1        0.00          0.00
  gc current block busy                         551        0.00          1.17
  gc current grant busy                         817        0.02          0.26
  library cache pin                               1        0.00          0.00
  gc current block 2-way                        178        0.00          0.09
  gc cr block 2-way                              31        0.00          0.01
  gc current block 3-way                          2        0.00          0.00
  enq: FB - contention                            1        0.00          0.00
  latch: ges resource hash list                2238        0.00          0.12
  rdbms ipc reply                              1607        0.00          0.15
  latch free                                     10        0.00          0.00
  latch: enqueue hash chains                      3        0.00          0.00
  gc buffer busy acquire                          1        0.00          0.00
  gc buffer busy release                          3        0.00          0.00
  latch: row cache objects                        1        0.00          0.00
  global enqueue expand wait                      1        0.00          0.00

Test summary:
- About 2 seconds of the roughly 4 s longer application runtime are accounted to the additional DFS lock handle wait time.
- DFS lock handle increases from 2550 waits taking 10.28 s to 2666 waits taking 11.94 s.

Conclusion:
- Running the prepare and commit processing on a different instance than the SQL work drops the TPS rate by another roughly 10%.
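The 9-roundtrip sequence listed under test XA1 corresponds to a tightly coupled 2PC flow over the Oracle JDBC XA classes. The following is only a simplified sketch of what the XAWorker presumably does - the real test source is not shown here, so the class name, Xid values, table name and credentials are illustrative. The two branches share one global transaction ID and differ only in the branch qualifier; with branch affinity both data sources point to the same URL, for the cross-instance tests they point to gract1 and gract2.

import java.sql.Connection;
import java.sql.Statement;
import javax.sql.XAConnection;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import oracle.jdbc.xa.OracleXid;
import oracle.jdbc.xa.client.OracleXADataSource;

// Sketch: one tightly coupled XA transaction with two branches (BR1, BR2).
public class XaTwoBranchSketch {
    static XAConnection open(String url) throws Exception {
        OracleXADataSource ds = new OracleXADataSource();
        ds.setURL(url);
        ds.setUser("scott");
        ds.setPassword("tiger");
        return ds.getXAConnection();
    }

    public static void main(String[] args) throws Exception {
        XAConnection xac1 = open("jdbc:oracle:thin:@gract1:1521/ERP");
        XAConnection xac2 = open("jdbc:oracle:thin:@gract2:1521/ERP");
        XAResource xar1 = xac1.getXAResource();
        XAResource xar2 = xac2.getXAResource();
        Connection c1 = xac1.getConnection();
        Connection c2 = xac2.getConnection();

        byte[] gtrid = {0x01, 0x02, 0x03};                       // shared global transaction id
        Xid br1 = new OracleXid(0x1234, gtrid, new byte[]{1});   // branch qualifier 1
        Xid br2 = new OracleXid(0x1234, gtrid, new byte[]{2});   // branch qualifier 2

        xar1.start(br1, XAResource.TMNOFLAGS);                   // RT 1
        xar2.start(br2, XAResource.TMNOFLAGS);                   // RT 2
        try (Statement s1 = c1.createStatement(); Statement s2 = c2.createStatement()) {
            s1.executeUpdate("INSERT INTO perf_test VALUES (1, 'branch 1')");   // RT 3
            s2.executeUpdate("INSERT INTO perf_test VALUES (2, 'branch 2')");   // RT 4
        }
        xar1.end(br1, XAResource.TMSUCCESS);                     // RT 5
        xar2.end(br2, XAResource.TMSUCCESS);                     // RT 6

        int rc1 = xar1.prepare(br1);                             // RT 7
        int rc2 = xar2.prepare(br2);                             // RT 8
        if (rc2 == XAResource.XA_OK) {
            xar2.commit(br2, false);                             // RT 9 - second phase of 2PC
        }
        if (rc1 == XAResource.XA_OK) {
            xar1.commit(br1, false);                             // only needed if BR1 was not read-only
        }
        c1.close();
        c2.close();
        xac1.close();
        xac2.close();
    }
}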
Overall Summary
- Using cluster-wide transactions across multiple cluster instances can have a huge performance impact – try to avoid that!
- If the SQL processing and the prepare call run on different instances, we can expect another 10% performance reduction.
- Note that the test results may differ depending on platform and Oracle release – they should only give you a rough idea of what to expect when you use cluster-wide transactions.
- If you are using cluster-wide XA transactions, still try to use only a single instance. In that case DFS lock handle waits still occur – but there are fewer of them, with much lower wait times (see the sketch after this list).
- In any case, don't expect a huge performance gain from distributing your XA transactions across multiple instances.
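One simple way to keep all branches on a single instance, as done in test XA1 above, is to point every branch's XADataSource at the same instance (or at a service that runs on only one instance). A short sketch - the host name comes from the test setup, while user and password are placeholders:

import oracle.jdbc.xa.client.OracleXADataSource;

// Sketch: give both XA branches affinity to one RAC instance by using the
// same connect string for every branch data source (here node gract1).
public class SingleInstanceAffinity {
    public static OracleXADataSource branchDataSource() throws Exception {
        OracleXADataSource ds = new OracleXADataSource();
        ds.setURL("jdbc:oracle:thin:@gract1:1521/ERP");   // same URL for every branch
        ds.setUser("scott");
        ds.setPassword("tiger");
        return ds;
    }
}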
A closer look at the DFS lock handle wait
Comparing waits for 10 XA transactions doing 2PC with both branches pointing to the same instance

$ java Xatest_sw xa on 2 10 1 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract1:1521/ERP
WAIT #0: nam='DFS lock handle' ela= 661 type|mode=1146617859 id1=3336165566 id2=0 obj#=1 tim=36404200504
WAIT #0: nam='DFS lock handle' ela= 5401 type|mode=1146617861 id1=3336165566 id2=0 obj#=1 tim=36404327920
WAIT #0: nam='DFS lock handle' ela= 636 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36404399133
WAIT #0: nam='DFS lock handle' ela= 1966 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36404401476
WAIT #0: nam='DFS lock handle' ela= 366 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36404412218
WAIT #0: nam='DFS lock handle' ela= 5543 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36404418375
--> Only 6 waits - max ela time 5543 µs (~5.5 ms)

Comparing waits for 10 XA transactions doing 2PC with branches pointing to different RAC instances

$ java Xatest_sw xa on 2 10 1 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
WAIT #0: nam='DFS lock handle' ela= 31029 type|mode=1146617861 id1=3336165566 id2=0 obj#=92453 tim=36642306182
WAIT #0: nam='DFS lock handle' ela= 491 type|mode=1146617861 id1=3336165566 id2=0 obj#=92453 tim=36642332322
WAIT #0: nam='DFS lock handle' ela= 452 type|mode=1146617859 id1=3336165566 id2=0 obj#=92453 tim=36642339941
WAIT #0: nam='DFS lock handle' ela= 3880 type|mode=1146617861 id1=2419845326 id2=0 obj#=92453 tim=36642367345
WAIT #0: nam='DFS lock handle' ela= 4839 type|mode=1146617861 id1=2419845326 id2=0 obj#=-1 tim=36642374473
WAIT #0: nam='DFS lock handle' ela= 3269 type|mode=1146617861 id1=2419845326 id2=0 obj#=-1 tim=36642377996
WAIT #0: nam='DFS lock handle' ela= 3580 type|mode=1146617859 id1=2419845326 id2=0 obj#=-1 tim=36642384805
WAIT #0: nam='DFS lock handle' ela= 1634 type|mode=1146617861 id1=2419845326 id2=0 obj#=-1 tim=36642386597
WAIT #0: nam='DFS lock handle' ela= 3339 type|mode=1146617861 id1=1247011735 id2=0 obj#=-1 tim=36642407619
WAIT #0: nam='DFS lock handle' ela= 3747 type|mode=1146617861 id1=1247011735 id2=0 obj#=-1 tim=36642411531
WAIT #0: nam='DFS lock handle' ela= 2728 type|mode=1146617859 id1=1247011735 id2=0 obj#=-1 tim=36642415863
WAIT #0: nam='DFS lock handle' ela= 366 type|mode=1146617859 id1=1247011735 id2=0 obj#=-1 tim=36642425262
WAIT #0: nam='DFS lock handle' ela= 6349 type|mode=1146617859 id1=2012011321 id2=0 obj#=-1 tim=36642451711
WAIT #0: nam='DFS lock handle' ela= 3268 type|mode=1146617861 id1=2012011321 id2=0 obj#=-1 tim=36642455143
WAIT #0: nam='DFS lock handle' ela= 2643 type|mode=1146617861 id1=4227502411 id2=0 obj#=-1 tim=36642474412
WAIT #0: nam='DFS lock handle' ela= 1959 type|mode=1146617859 id1=4227502411 id2=0 obj#=-1 tim=36642483878
WAIT #0: nam='DFS lock handle' ela= 1211 type|mode=1146617861 id1=4227502411 id2=0 obj#=-1 tim=36642485179
WAIT #0: nam='DFS lock handle' ela= 11718 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36642511084
WAIT #0: nam='DFS lock handle' ela= 5743 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36642517024
WAIT #0: nam='DFS lock handle' ela= 3634 type|mode=1146617859 id1=1372417527 id2=0 obj#=-1 tim=36642522303
WAIT #0: nam='DFS lock handle' ela= 829 type|mode=1146617861 id1=1372417527 id2=0 obj#=-1 tim=36642523311
WAIT #0: nam='DFS lock handle' ela= 2886 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36642543243
WAIT #0: nam='DFS lock handle' ela= 1796 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36642545280
WAIT #0: nam='DFS lock handle' ela= 265 type|mode=1146617859 id1=4217040809 id2=0 obj#=-1 tim=36642547563
WAIT #0: nam='DFS lock handle' ela= 1333 type|mode=1146617861 id1=4217040809 id2=0 obj#=-1 tim=36642552281
WAIT #0: nam='DFS lock handle' ela= 6677 type|mode=1146617861 id1=233389099 id2=0 obj#=-1 tim=36642584821
WAIT #0: nam='DFS lock handle' ela= 3113 type|mode=1146617861 id1=233389099 id2=0 obj#=-1 tim=36642589082
WAIT #0: nam='DFS lock handle' ela= 5758 type|mode=1146617859 id1=233389099 id2=0 obj#=-1 tim=36642599617
WAIT #0: nam='DFS lock handle' ela= 3695 type|mode=1146617861 id1=233389099 id2=0 obj#=-1 tim=36642603466
WAIT #0: nam='DFS lock handle' ela= 14638 type|mode=1146617861 id1=4083478780 id2=0 obj#=0 tim=36642637211
WAIT #0: nam='DFS lock handle' ela= 3250 type|mode=1146617861 id1=4083478780 id2=0 obj#=0 tim=36642643439
WAIT #0: nam='DFS lock handle' ela= 12738 type|mode=1146617859 id1=4083478780 id2=0 obj#=0 tim=36642660398
WAIT #0: nam='DFS lock handle' ela= 3571 type|mode=1146617861 id1=4083478780 id2=0 obj#=0 tim=36642664149
WAIT #0: nam='DFS lock handle' ela= 8979 type|mode=1146617861 id1=1699082971 id2=0 obj#=0 tim=36642695974
WAIT #0: nam='DFS lock handle' ela= 3209 type|mode=1146617861 id1=1699082971 id2=0 obj#=-1 tim=36642700417
--> Many more waits with increased wait times - the max ela time is quite often > 10,000 µs (10 ms)
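The type|mode values in the trace lines above can be decoded by treating the two high-order bytes as the ASCII enqueue type and the low-order 16 bits as the requested lock mode (the approach described in the orainternals DFS lock handle post referenced below). A small sketch:

// Sketch: decode the type|mode parameter of a 'DFS lock handle' wait.
public class DecodeDfsLockHandle {
    public static void main(String[] args) {
        long[] samples = {1146617859L, 1146617861L};   // values seen in the traces above
        for (long p1 : samples) {
            char c1 = (char) ((p1 >> 24) & 0xFF);      // first character of the lock type
            char c2 = (char) ((p1 >> 16) & 0xFF);      // second character of the lock type
            long mode = p1 & 0xFFFF;                   // requested lock mode
            System.out.printf("type|mode=%d -> enqueue %c%c, mode %d%n", p1, c1, c2, mode);
        }
    }
}

Both values seen in the traces decode to the DX (distributed transaction) enqueue, requested in modes 3 and 5 - consistent with the white paper extract below.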
An Oracle White Paper, June 2010 – XA and Oracle Controlled Distributed Transactions
- Extract from the PDF referenced below:
Distributed Transactions and Database Locking All tightly coupled distributed transactions acquire a DX (Distributed Transaction) enqueue while they are actively working on the transaction. This enqueue (memory lock) is used to serialize access to the database across multiple branches of a distributed transaction, such that only one branch of the transaction can have SQL active at any point in time. Within a single instance database deployment all distributed transactions are tightly coupled by default. The XA transaction manager can override this at the start of a distributed transaction. Similarly within a multiple instance RAC environment all distributed transactions are tightly coupled by default. However prior to Oracle Database 11gR1 RAC, DX enqueues were maintained independently within each instance. Since Oracle Database 11gR1 RAC this has changed and a cluster wide DX enqueue is acquired for each tightly coupled distributed transaction. This cluster wide DX enqueue is managed by the Global Enqueue Service (GES) within RAC, and control of this enqueue is transferred between database instances when distributed transactions span more than one database instance in the cluster. However, only one tightly coupled transaction branch can be active at any one time.
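To observe the cluster-wide DX enqueue described in the extract while an XA transaction is in flight, gv$lock can be queried for lock type 'DX'. A minimal JDBC sketch - the URL and credentials are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Sketch: list sessions currently holding or requesting the DX
// (distributed transaction) enqueue across all RAC instances.
public class ShowDxEnqueues {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@gract1:1521/ERP";
        try (Connection con = DriverManager.getConnection(url, "system", "manager");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT inst_id, sid, lmode, request FROM gv$lock WHERE type = 'DX'")) {
            while (rs.next()) {
                System.out.printf("inst %d sid %d holds mode %d, requests mode %d%n",
                        rs.getInt(1), rs.getInt(2), rs.getInt(3), rs.getInt(4));
            }
        }
    }
}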
References
- http://orainternals.wordpress.com/2011/11/08/troubleshooting-dfs-lock-handle-waits/
- http://orainternals.wordpress.com/2008/07/07/tuning-log-file-sync-wait-events/
- http://www.oracle.com/technetwork/database/clustering/overview/distributed-transactions-and-xa-163941.pdf
- https://orastory.wordpress.com/category/dx-enqueue/