Table of Contents
COMMIT processing flow
- User session issues a commit and goes into a ‘log file sync’ wait
- LGWR is posted to flush the redo buffer to disk
- LGWR issues the I/O and waits on ‘log file parallel write’
- LGWR returns from I/O, clears wait, and posts the user session to signal completion
- User session clears ‘log file sync’
- Always test your: sequential REDO write performance with OS tools like dd
AWR statistics
AWR statistics ----------------------------------------------------------------------------------- Event Wait Class Waits Time(s) Avg Time(ms) %DB time ------------------------------ ------------- ------------ ------------ ------------- ----------- log file sync Commit 100,030 2,084.1 20.8 95.5 log file parallel write System I/O 14,032 0.9 0.1 0.0 redo synch time 208,871 s redo synch writes 100,030
- This application runs about 100.000 commits waiting 208 second for log file sync Wait Event
- Over 95.5 % of the DB time is waiting on this event – which is really bad scenario
- Note log file parallel write is with 0.9s quite low compared to 200s for log file sync wait event
- CPU scheduling and/or Performance of REDO disk ( delay between issuing ASYC IO request via io_submit() and getting I/O status via io_getevents() ) may be responsible for the WAIT Time
Prepare testcase
Java program : UCPDemo.java Download ucp.jar : http://www.oracle.com/technetwork/database/enterprise-edition/downloads/ucp-085009.html Classpath : setenv CLASSPATH $ORACLE_HOME/jdbc/lib/ojdbc6.jar:/home/oracle/RAC/JDBC/ucp.jar:. Compile : javac UCPDemo.java Create table : create table rac_perftest ( id number, inst_name varchar(8), host_name varchar(24), ins_date date);
Runtime environment
Session1 : instance grac1: redo logs on SDD disk using 10 threads to do 10.000 parallel inserts per thread on a 3-node RAC system Session2 : instance grac2: redo logs on USB3 disk using 10 threads to do 10.000 parallel inserts per thread on a 3-node RAC system Check Redo log location: THREAD# GROUP# STATUS MEMBER ---------- ---------- ---------------- ---------------------------------------------------------------- 1 1 CURRENT +SSD/grac4/onlinelog/group_1.258.831669597 1 2 ACTIVE +SSD/grac4/onlinelog/group_2.259.831669619 2 3 CURRENT +DATA/grac4/onlinelog/group_3.269.826220095 2 4 INACTIVE +DATA/grac4/onlinelog/group_4.270.82622
- Redo logs for instance 1 are located on SSD disks whereas redo logs for instance 2 are located on USB3 disk
- To move redo logs to SSD disk please read following link.
Test results for session 1 ( Redo on SDD )
$ java UCPDemo grac41 10 10000 -noseq -nodebug Started at: Thu Nov 21 19:16:57 CET 2013 - URL: jdbc:oracle:thin:@//grac4-scan.grid4.example.com:1521/grac41 - Number of Threads: 10 - tx_cnt:10000 - mode: 2 Debug mode: false -> SQL: insert into rac_perftest values (?, ?, ?, sysdate) -> Create UCP POOL Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com Thread_id: 8 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 193 - Thread TPS: 147.08 Thread_id: 6 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 194 - Thread TPS: 147.04 Thread_id: 4 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 234 - Thread TPS: 147.03 Thread_id: 3 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 372 - Thread TPS: 146.98 Thread_id: 1 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 157 - Thread TPS: 146.90 Thread_id: 9 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 192 - Thread TPS: 146.80 Thread_id: 5 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 370 - Thread TPS: 146.78 Thread_id: 10 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 220 - Thread TPS: 146.77 Thread_id: 2 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 382 - Thread TPS: 146.73 Thread_id: 7 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 191 - Thread TPS: 146.71 Runtime : 68 [s] - TPS: 1460.07 Count system calls by using strace # strace -tt -T -cp 6290 Process 6290 attached - interrupt to quit ^C^CProcess 6290 detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 36.35 2.738053 261 10488 io_submit 33.42 2.517088 7 343083 gettimeofday 21.66 1.631455 169 9649 semop 3.11 0.234546 11 20831 sendmsg 2.80 0.210728 5 44220 times 1.41 0.105844 10 10488 io_getevents 1.03 0.077204 3 25817 3735 recvmsg 0.13 0.010036 4 2542 poll 0.04 0.002791 34 81 semctl
Test results for session 2 ( Redo on USB3 disk )
$ java UCPDemo grac42 10 10000 -noseq -nodebug Started at: Sun Nov 17 13:06:31 CET 2013 - URL: jdbc:oracle:thin:@//grac4-scan.grid4.example.com:1521/grac42 - Number of Threads: 10 - tx_cnt:10000 - mode: 2 Debug mode: false -> SQL: insert into rac_perftest values (?, ?, ?, sysdate) -> Create UCP POOL Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com Thread_id: 8 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 407 - Thread TPS: 35.06 Thread_id: 4 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 634 - Thread TPS: 35.04 Thread_id: 1 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 606 - Thread TPS: 35.04 Thread_id: 5 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 226 - Thread TPS: 34.93 Thread_id: 2 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 217 - Thread TPS: 34.89 Thread_id: 9 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 225 - Thread TPS: 34.85 Thread_id: 7 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 353 - Thread TPS: 34.84 Thread_id: 6 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 206 - Thread TPS: 34.83 Thread_id: 10 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 189 - Thread TPS: 34.80 Thread_id: 3 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]: 1 - MAX Commit time [ms]: 486 - Thread TPS: 34.79 Runtime : 287 [s] - TPS: 347.50 # strace -tt -T -cp 6290 ( lgwr PID ) Process 6290 attached - interrupt to quit % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 79.75 2.714592 207 13115 io_submit 15.30 0.520801 20 25788 sendmsg 3.03 0.103031 10 10138 semop 0.78 0.026472 6 4515 poll 0.43 0.014569 0 56431 times 0.32 0.011000 15 728 11 semtimedop 0.20 0.006669 0 30411 4343 recvmsg 0.09 0.003146 4 891 semctl 0.04 0.001390 0 13138 io_getevents
Compare test results
- Run time improves from 287s to 68s ( about 4x )
- Having Redo on SDD ( 1460 TPS ) performs about 4x faster than having Redo logs on USB3 disk (347 TPS )
- The number of lgwr AIOs are reduced for the SSD case by 30 % ( SSD: 13115 AIOs, USB Disk : 10488 )
- Both test are using AIOs (each io_submit/io_getevents represents 1 AIO )
- 100.000 Commits are translated/reduced to about 1000/1300 AIOs ( Group Commit feature )
- Note using strace for lgwr can have severe performance impacts ( like: strace -tt -T -p lgwr_PID )
- During above tests attaching strace reduces the SSD test case performance from 1400 TPS to 500 TPS )
Comparision and interpretation of AWR reports
Workload Comparison ~~~~~~~~~~~~~~~~~~~ 1st Per Sec 2nd Per Sec %Diff 1st Per Txn 2nd Per Txn %Diff --------------- --------------- ------ --------------- --------------- ------ DB time: 6.0 6.7 13.3 0.0 0.0 100.0 Physical write (blocks): 55.5 13.3 -76.0 0.0 0.0 -20.0 Write IO requests: 19.5 4.7 -76.0 0.0 0.0 0.0 Executes (SQL): 1,149.6 319.9 -72.2 1.0 1.0 1.0 Transactions: 1,125.3 309.4 -72.5 --> First session1 runs about 4x more TPS and writes about 4x more redo blocks per second thab session 2 Session1 ( SSD disk) ------------------------------------------------------------------------------------------------ Event Wait Class Waits Time(s) Avg Time(ms) %DB time ------------------------------ ------------- ------------ ------------ ------------- ----------- log file sync Commit 100,018 408.5 4.1 77.2 log file parallel write System I/O 22,179 1.1 0.0 0.2 redo synch time 41,333 redo synch writes 100,027 Session2 ( USB3 disk ) ------------------------------------------------------------------------------------------------ Event Wait Class Waits Time(s) Avg Time(ms) %DB time ------------------------------ ------------- ------------ ------------ ------------- ----------- log file sync Commit 100,030 2,084.1 20.8 95.5 log file parallel write System I/O 14,032 0.9 0.1 0.0 redo synch time 208,871 redo synch writes 100,030
- Lgwr wait time ( redo sync time ) was reduced by factor 4 using SSD disk
- SSD disks does not improve log file parallel write wait time
- If redo synch time is high always try to use a faster disk system like SSD for your redo logs
- If lgwr is busy all sessions waiting on lgwr are reporting log file sync event
- sometimes AWR data are not very accurate