Overview
- Download, compile and run the TPM Java code: Xatest_sw.java
Goal: Write a low-level TPM to test Oracle clusterwide transactions
- Parallelize work where possible by using queues and threads
- Currently the code runs with 3 threads ( 1 TXProducer and 2 worker threads ) using 3 queues ( global receive queue and 2 worker queues )
Wishlist - not yet implemented
- Using a UCP connection pool to test more threads for a heavier cluster load test
- Using xa_recover to test XA transaction recovery
Status
WARNING: This code isn't well tested - use it at your own risk and run it only on your test system - and of course there is no SUPPORT!
What is coming next ?
- Simulating the WebLogic feature: XA Transaction without Transaction Logs - and Optimization
For details see: WebLogic Server 12.1.3 New JTA feature description - XA Transaction without Transaction Logs - and Optimization
XA transaction layout sample ( run this in parallel )
  Instance 1: xa_start DML xa_end BR 001
  Instance 2: xa_start DML xa_end BR 002
  Instance 3: xa_start DML xa_end BR 103 ( Note: we flag the branch with 10 )
--> This means Instance 3 is our Determiner Resource Manager, flagged with branch prefix 10
Now run xa_prepare on the NON-Determiner Resource Managers Instance 1 and Instance 2 ( run this in parallel )
  Instance 1: xa_prepare BR 001
  Instance 2: xa_prepare BR 002
Now run prepare on the Determiner RM
  Instance 3: xa_prepare BR 103
--> When this goes through we know that we need to commit the TX, even if the last prepare crashes before returning.
After the crash we need to call xa_recover on all RMs. When we get back a flagged pending transaction we need to commit the data by running xa_commit.
Questions: Do we need to call xa_commit on all instances, given that on RAC we only need to commit the last branch? We need to test whether this is true for a recovered TX.
Note: If we don't get back a flagged TX from the Determiner RM we need to roll back.
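The recovery rule described above (commit if the Determiner RM reports a flagged pending branch, roll back otherwise) can be sketched as a small decision function. This is only an illustration of the rule, not code from Xatest_sw.java: the class name, the `decide` method, and the representation of branch IDs as strings are assumptions, and the list passed in stands for whatever xa_recover would return from the Determiner RM.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DeterminerRecoverySketch {
    // Assumption: determiner branches are recognizable by the prefix "10", as in BR 103.
    static final String DETERMINER_PREFIX = "10";

    enum Outcome { COMMIT, ROLLBACK }

    /**
     * Decides the outcome after a crash, given the pending branch IDs
     * reported by the Determiner RM (hypothetical result of xa_recover).
     */
    static Outcome decide(List<String> pendingBranchesOnDeterminer) {
        for (String branch : pendingBranchesOnDeterminer) {
            if (branch.startsWith(DETERMINER_PREFIX)) {
                return Outcome.COMMIT;   // the determiner prepare went through
            }
        }
        return Outcome.ROLLBACK;         // no flagged pending TX found
    }

    public static void main(String[] args) {
        System.out.println(decide(Arrays.asList("103")));                 // prints COMMIT
        System.out.println(decide(Collections.<String>emptyList()));      // prints ROLLBACK
    }
}
```

Whether xa_commit must then be issued on every instance or only on the last branch is the open question listed above; the sketch does not answer it.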
Short code explanation of the TXProducer and XAWorker classes
TXProducer has a generic queue and spawns up to 2 XAWorker threads to handle 2 XA branches in parallel.
The XAWorkers inherit the global receive queue RQ from TXProducer and use that queue to send status messages back to the
TXProducer after running a specific XA operation. The local XAWorker queues are named LQ1 and LQ2.
Each XAWorker reads from its local queue, where it waits for work requests assigned by the TXProducer thread.
Typical message protocol:
Loop:
  TXProducer ---> writes message | local queue LQ1 created by XAWorker 1 | ---> XAWorker 1 reads from LQ1
             ---> writes message | local queue LQ2 created by XAWorker 2 | ---> XAWorker 2 reads from LQ2
  ++++ --> After the XAWorkers have processed their jobs <--- ++++
  TXProducer <--- reads messages | global receive queue RQ created by TXProducer | <--- XAWorker 1 writes processing status to RQ
                                 | global receive queue RQ created by TXProducer | <--- XAWorker 2 writes processing status to RQ
  +++ TXProducer reads the XA status messages and assigns the next worker actions +++
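One round of the protocol above can be sketched with standard java.util.concurrent queues. This is a minimal sketch, not the code from Xatest_sw.java: the `Msg` and `Worker` classes, the "stop" action, and the status strings are hypothetical, and the worker only echoes a status instead of running an XA operation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class ProtocolSketch {
    /** Hypothetical message type; the real Message class may look different. */
    static final class Msg {
        final int threadId;   // 1 or 2, identifies the worker
        final String action;  // e.g. "xa_end", "xa_prepare", or "stop"
        final String status;  // "new" for requests, "ok" for replies
        Msg(int threadId, String action, String status) {
            this.threadId = threadId;
            this.action = action;
            this.status = status;
        }
    }

    /** Reads requests from its local queue (LQ1/LQ2) and reports to the shared queue RQ. */
    static final class Worker implements Runnable {
        final int threadId;
        final BlockingQueue<Msg> localQueue = new LinkedBlockingQueue<>();
        private final BlockingQueue<Msg> receiveQueue;
        Worker(int threadId, BlockingQueue<Msg> receiveQueue) {
            this.threadId = threadId;
            this.receiveQueue = receiveQueue;
        }
        @Override
        public void run() {
            try {
                while (true) {
                    Msg request = localQueue.take();
                    if (request.action.equals("stop")) {
                        return;
                    }
                    // A real worker would run the XA operation here.
                    receiveQueue.put(new Msg(threadId, request.action, "ok"));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Runs one protocol round and returns the sorted replies as "threadId:action:status". */
    static List<String> runRound(String action) {
        BlockingQueue<Msg> rq = new LinkedBlockingQueue<>();
        Worker[] xaw = { new Worker(1, rq), new Worker(2, rq) };
        List<Thread> threads = new ArrayList<>();
        List<String> replies = new ArrayList<>();
        try {
            for (Worker w : xaw) {
                Thread t = new Thread(w);
                threads.add(t);
                t.start();
            }
            for (Worker w : xaw) {                       // producer writes to LQ1 and LQ2
                w.localQueue.put(new Msg(w.threadId, action, "new"));
            }
            for (int i = 0; i < xaw.length; i++) {       // producer reads both replies from RQ
                Msg reply = rq.take();
                replies.add(reply.threadId + ":" + reply.action + ":" + reply.status);
            }
            for (Worker w : xaw) {
                w.localQueue.put(new Msg(w.threadId, "stop", "new"));
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        Collections.sort(replies);
        return replies;
    }

    public static void main(String[] args) {
        System.out.println(runRound("xa_prepare")); // prints [1:xa_prepare:ok, 2:xa_prepare:ok]
    }
}
```

The replies are sorted only to make the output deterministic; on RQ itself the two workers may answer in either order.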
To test clusterwide XA transactions, XAWorker xaw[0] should connect to Instance 1 and XAWorker xaw[1] should connect to Instance 2.
For performance testing you can also point both connections to a single instance.
TXProducer holds a message array: in_mesg[0] and in_mesg[1].
XAWorker xaw[0] uses message in_mesg[0] and xaw[1] uses message in_mesg[1] - don't change the thread_id in these specific messages.
Each XAWorker has its own queue to communicate with the TXProducer instance. The TXProducer uses write_message(Message out) to send a message
to the XAWorker.
Depending on the value of out.get_thread_id(), the TXProducer picks the right XAWorker queue by calling
s = xaw[ out.get_thread_id()-1].get_queue();
This guarantees that the write_message call sends the message to the correct queue.
write_message(in_mesg[0]) -> in_mesg[0].get_thread_id() returns 1 --> Queue 1 --> Thread 1
write_message(in_mesg[1]) -> in_mesg[1].get_thread_id() returns 2 --> Queue 2 --> Thread 2
For generic messages m you need to call m.set_thread_id() ( with 1 or 2 ) to address the right queue.
Note: the write_message() API is not clustered and needs to be invoked for every worker thread you want to send an action to.
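The routing rule (thread_id minus 1 selects the worker queue) can be sketched in isolation. The method names write_message, get_thread_id and set_thread_id are taken from the text above; the `Message` class body, the `pending` helper and the use of plain in-memory queues instead of XAWorker objects are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class RoutingSketch {
    /** Hypothetical stand-in for the Message class of Xatest_sw.java. */
    static final class Message {
        private int threadId;
        final String action;
        Message(int threadId, String action) {
            set_thread_id(threadId);
            this.action = action;
        }
        int get_thread_id() {
            return threadId;
        }
        void set_thread_id(int threadId) {
            if (threadId < 1 || threadId > 2) {
                throw new IllegalArgumentException("thread_id must be 1 or 2");
            }
            this.threadId = threadId;
        }
    }

    // One queue per worker: index 0 serves thread 1, index 1 serves thread 2.
    private final List<Queue<Message>> workerQueues = new ArrayList<>();

    RoutingSketch() {
        workerQueues.add(new ArrayDeque<>());
        workerQueues.add(new ArrayDeque<>());
    }

    /** Sends the message to the queue addressed by its thread id. */
    void write_message(Message out) {
        workerQueues.get(out.get_thread_id() - 1).add(out);
    }

    /** Hypothetical helper: number of messages waiting for the given thread. */
    int pending(int threadId) {
        return workerQueues.get(threadId - 1).size();
    }

    public static void main(String[] args) {
        RoutingSketch producer = new RoutingSketch();
        producer.write_message(new Message(1, "xa_prepare"));
        producer.write_message(new Message(2, "xa_prepare"));
        System.out.println(producer.pending(1) + " " + producer.pending(2)); // prints "1 1"
    }
}
```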
read_message() is different. All branches of an XA transaction must be synchronized at certain points ( after xa_end, after xa_prepare )
before we can schedule the next XA processing step, so the read_message() API needs to make sure that all outstanding branches/instances have
returned a correct XA status code to the TXProducer thread.
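The synchronization point can be sketched as a loop that blocks until one status per branch has arrived on the global receive queue. This is an assumption-laden sketch rather than the actual read_message(): the `Status` class, the `readAll` name, the timeout, and throwing an exception on error (the original is described as terminating the program) are all illustrative choices.

```java
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class BarrierSketch {
    /** Hypothetical status message written by a worker to RQ. */
    static final class Status {
        final int threadId;   // 1-based worker id
        final int xaCode;     // XA return code of the finished step
        final boolean error;  // true if the worker reported an error
        Status(int threadId, int xaCode, boolean error) {
            this.threadId = threadId;
            this.xaCode = xaCode;
            this.error = error;
        }
    }

    /** Waits for one status per branch; returns the XA codes indexed by threadId - 1. */
    static int[] readAll(BlockingQueue<Status> rq, int branches, long timeoutSeconds) {
        int[] codes = new int[branches];
        boolean[] seen = new boolean[branches];
        int received = 0;
        while (received < branches) {
            Status s;
            try {
                s = rq.poll(timeoutSeconds, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            if (s == null) {
                throw new IllegalStateException("timed out waiting for a branch status");
            }
            if (s.error) {
                throw new IllegalStateException("branch " + s.threadId + " reported an error");
            }
            if (seen[s.threadId - 1]) {
                throw new IllegalStateException("duplicate status from branch " + s.threadId);
            }
            seen[s.threadId - 1] = true;
            codes[s.threadId - 1] = s.xaCode;
            received++;
        }
        return codes;
    }

    public static void main(String[] args) {
        BlockingQueue<Status> rq = new LinkedBlockingQueue<>();
        rq.add(new Status(2, 3, false)); // worker 2 answers first
        rq.add(new Status(1, 0, false));
        System.out.println(Arrays.toString(readAll(rq, 2, 5))); // prints [0, 3]
    }
}
```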
A typical scenario
TXProducer sends the xa_prepare message in_mesg[0] to XAWorker xaw[0] using queue xaw[0].get_queue()
TXProducer sends the xa_prepare message in_mesg[1] to XAWorker xaw[1] using queue xaw[1].get_queue()
After the XAWorkers have finished their work they send back their responses using the global receive queue RQ.
This applies to the xa_processing step ( the sequence xa_start() DML xa_end() ) and to the xa_prepare step.
For the xa_commit step we send a commit message only to the node that returned XA_OK ( = 0 ) from xa_prepare ( RAC specific ).
We don't send an xa_commit to the instance that returned XA_RDONLY ( = 3 ).
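The commit rule can be sketched with the constants from the standard javax.transaction.xa.XAResource interface (XA_OK is 0, XA_RDONLY is 3). The class and method names below are hypothetical, and treating any other prepare result as a failure is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAResource;

final class CommitDecisionSketch {
    /**
     * Returns the 1-based thread ids whose branch must receive xa_commit.
     * prepareCodes[i] is the xa_prepare result of thread i + 1.
     */
    static List<Integer> branchesToCommit(int[] prepareCodes) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < prepareCodes.length; i++) {
            if (prepareCodes[i] == XAResource.XA_OK) {
                result.add(i + 1);   // prepared with changes: needs xa_commit
            } else if (prepareCodes[i] != XAResource.XA_RDONLY) {
                // Assumption: anything else is treated as a failed prepare.
                throw new IllegalStateException("unexpected prepare result " + prepareCodes[i]);
            }
            // XA_RDONLY: nothing to commit on this branch.
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(branchesToCommit(new int[] {0, 3})); // prints [1]
    }
}
```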
Testing XID affinity across instances
One major goal of this program is to test the impact when an XA transaction branch is processed and prepared by different instances.
XA transaction layout using 2 branches Br1, Br2 and switching XIDs using 2PC
  Instance 1                      Instance 2
  xa_start Br1                    xa_start Br2
  DML Br1                         DML Br2
  xa_end Br1                      xa_end Br2
            <--- call switch_xids() ---->
  xa_prepare Br2 -> XA_OK         xa_prepare Br1 --> XA_RDONLY
  xa_commit Br2                   -- do nothing here --
To see how the code works set debug level 1 and run only a single transaction:
  [oracle@gract1 PERF]$ java Xatest_sw xa on 2 1 1 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
  ..
  + log_XA() 0x1025-00000001.01 Thr_ID:1 Inst:ERP_3 Xa_err:0 Xa_prep:0 xa_end
  + log_XA() 0x1025-00000001.02 Thr_ID:2 Inst:ERP_1 Xa_err:0 Xa_prep:0 xa_end
   ------------- Switching XIDs before XA_prepare
  + log_XA() 0x1025-00000001.02 Thr_ID:1 Inst:ERP_3 Xa_err:0 Xa_prep:0 xa_prepare
  + log_XA() 0x1025-00000001.01 Thr_ID:2 Inst:ERP_1 Xa_err:0 Xa_prep:3 xa_prepare
  + log_XA() 0x1025-00000001.02 Thr_ID:1 Inst:ERP_3 Xa_err:0 Xa_prep:0 xa_commit
--> The processing step for XID 0x1025-00000001.01 occurs on instance ERP_3 whereas the prepare step for the same XID happens on instance ERP_1.
    The processing step for XID 0x1025-00000001.02 occurs on instance ERP_1 whereas the prepare step for the same XID happens on instance ERP_3.
Here we can see that the XIDs are switched between the XA branches for the prepare step.
The following lines are active:
  // Here comes only 2PC-XA processing
  {
  stats.dump_info(" ------------- Switching XIDs before XA_prepare " , 1);
  switch_xids(in_mesg[0], in_mesg[1] );
If you comment out these lines ( around line 325 ) the following output should be seen:
  // Here comes only 2PC-XA processing
  {
  // stats.dump_info(" ------------- Switching XIDs before XA_prepare " , 1);
  // switch_xids(in_mesg[0], in_mesg[1] );
  + log_XA() 0x1025-00000001.01 Thr_ID:1 Inst:ERP_3 Xa_err:0 Xa_prep:0 xa_end
  + log_XA() 0x1025-00000001.02 Thr_ID:2 Inst:ERP_1 Xa_err:0 Xa_prep:0 xa_end
  + log_XA() 0x1025-00000001.01 Thr_ID:1 Inst:ERP_3 Xa_err:0 Xa_prep:0 xa_prepare
  + log_XA() 0x1025-00000001.02 Thr_ID:2 Inst:ERP_1 Xa_err:0 Xa_prep:3 xa_prepare
  + log_XA() 0x1025-00000001.01 Thr_ID:1 Inst:ERP_3 Xa_err:0 Xa_prep:0 xa_commit
--> Here XID 0x1025-00000001.01 is processed by the first instance ( ERP_3 ) for the xa processing, xa_prepare and xa_commit steps.
    XID 0x1025-00000001.02 is processed by the second instance ( ERP_1 ) for the xa processing and xa_prepare steps.
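The effect of the XID switch can be sketched as a swap of the XIDs carried by the two messages while their thread ids stay untouched, so each worker prepares the branch the other worker processed. Only the name switch_xids and its two-message call come from the text above; the `Message` fields and the string form of the XID are assumptions, and the real implementation may differ.

```java
final class SwitchXidSketch {
    /** Hypothetical message carrying the target thread and the XID to work on. */
    static final class Message {
        final int threadId; // never changed: decides which worker gets the message
        String xid;         // swapped between the two messages
        Message(int threadId, String xid) {
            this.threadId = threadId;
            this.xid = xid;
        }
    }

    /** Swaps the XIDs of the two messages and leaves the thread ids as they are. */
    static void switch_xids(Message a, Message b) {
        String tmp = a.xid;
        a.xid = b.xid;
        b.xid = tmp;
    }

    public static void main(String[] args) {
        Message m1 = new Message(1, "0x1025-00000001.01");
        Message m2 = new Message(2, "0x1025-00000001.02");
        switch_xids(m1, m2);
        System.out.println("Thr_ID:" + m1.threadId + " prepares " + m1.xid); // Thr_ID:1 prepares 0x1025-00000001.02
        System.out.println("Thr_ID:" + m2.threadId + " prepares " + m2.xid); // Thr_ID:2 prepares 0x1025-00000001.01
    }
}
```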
Parameter usage and GRANTs
$ java Xatest_sw xa on 2 5 1 jdbc:oracle:thin:@gract1:1521/ERP jdbc:oracle:thin:@gract2:1521/ERP
                 |  |  | | | |                                 +--> URL for Branch 1
                 |  |  | | | +-- URL for Branch 2
                 |  |  | | +---- Debug level: 0, 1, 2, 3, 4
                 |  |  | +------ Number of transactions
                 |  |  +--------- Number of transaction branches ( valid: 1 or 2 )
                 |  +------------ 10046 tracing: ON/OFF
                 +--------------- Mode: xa/sql - run XA transactions / run pure SQL tests
Needed grants:
  grant select on v_$instance to scott;
  grant select on v_$statname to scott;
  grant select on v_$mystat to scott;
  grant alter session to scott;
Error handling
- Errors should be printed as soon as they occur, using Java exceptions: e.getErrorCode() and e.getMessage()
- Set the message status to error: m.set_status("error")
- TXProducer's read_message() method checks for errors and terminates the program if needed
- *** Not yet implemented: Here we should roll back/recover failed XA transactions ( ORA-1591 errors )
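The first three points can be sketched as follows. The calls e.getErrorCode() and e.getMessage() are the standard java.sql.SQLException methods and set_status("error") is named in the list above; the `Message` body, `get_status`, the `Step` interface, `runStep` and `shouldTerminate` are hypothetical helpers, and the simulated ORA-1591 exception only stands in for a real database error.

```java
import java.sql.SQLException;

final class ErrorHandlingSketch {
    /** Hypothetical stand-in for the Message class. */
    static final class Message {
        private String status = "ok";
        void set_status(String status) {
            this.status = status;
        }
        String get_status() {
            return status;
        }
    }

    /** Hypothetical unit of work that may fail with a SQLException. */
    interface Step {
        void run() throws SQLException;
    }

    /** Runs a step; on failure prints the error at once and flags the message. */
    static void runStep(Message m, Step step) {
        try {
            step.run();
        } catch (SQLException e) {
            System.err.println("Error " + e.getErrorCode() + ": " + e.getMessage());
            m.set_status("error");
        }
    }

    /** What a read_message()-style check could look at before scheduling the next step. */
    static boolean shouldTerminate(Message m) {
        return "error".equals(m.get_status());
    }

    public static void main(String[] args) {
        Message m = new Message();
        runStep(m, () -> {
            throw new SQLException("simulated ORA-1591 failure", null, 1591);
        });
        System.out.println(shouldTerminate(m)); // prints true
    }
}
```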