Log File Sync wait event and SSD disks

 COMMIT processing flow

  • User session issues a commit and goes into a ‘log file sync’ wait
  • LGWR is posted to flush the redo buffer to disk
  • LGWR issues the I/O and waits on ‘log file parallel write’
  • LGWR returns from I/O, clears wait, and posts the user session to signal completion
  • User session clears ‘log file sync’
  • Always test your sequential redo write performance with OS tools like dd ( see the sketch below )
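A minimal dd-based check of sequential write latency – a sketch only: the target path is a placeholder and must point to the redo disk, and oflag=direct,dsync forces small synchronous writes roughly resembling lgwr's I/O pattern:

# write 10000 x 4 KB blocks synchronously, bypassing the page cache
# dd if=/dev/zero of=/u01/oradata/redo_io_test bs=4k count=10000 oflag=direct,dsync
# rm /u01/oradata/redo_io_test

The throughput dd reports for this pattern gives a rough upper bound for the achievable log file parallel write performance on that device.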

AWR statistics

----------------------------------------------------------------------------------- 
Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time    
------------------------------ ------------- ------------ ------------ ------------- ----------- 
 log file sync                 Commit             100,030      2,084.1          20.8        95.5
 log file parallel write       System I/O          14,032          0.9           0.1         0.0
 redo synch time                208,871  ( centiseconds )
 redo synch writes              100,030
  • This application runs about 100,000 commits, waiting about 2,084 seconds in total on the log file sync wait event
  • Over 95 % of the DB time is spent waiting on this event – a really bad scenario
  • Note that log file parallel write, at 0.9 s, is quite low compared to the 2,084 s spent on the log file sync wait event
  • CPU scheduling and/or the performance of the redo disk ( the delay between issuing an async I/O request via io_submit() and reaping its status via io_getevents() ) may be responsible for the wait time – the histogram query below can help to narrow this down
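To see whether the time is lost in the I/O itself or in posting/CPU scheduling, the wait-latency distribution can be inspected; a sketch using v$event_histogram:

SQL> select event, wait_time_milli, wait_count
       from v$event_histogram
      where event in ('log file sync','log file parallel write')
      order by event, wait_time_milli;

If log file parallel write stays in the low-millisecond buckets while log file sync spills into much higher ones, the gap points at CPU scheduling/posting rather than at the redo disk.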

Prepare testcase

 Java program     : UCPDemo.java
 Download ucp.jar : http://www.oracle.com/technetwork/database/enterprise-edition/downloads/ucp-085009.html
 Classpath        : setenv CLASSPATH $ORACLE_HOME/jdbc/lib/ojdbc6.jar:/home/oracle/RAC/JDBC/ucp.jar:.
 Compile          : javac UCPDemo.java
 Create table     : create table rac_perftest ( id number, inst_name varchar(8), host_name varchar(24), ins_date date);

 

Runtime environment

Session1 : instance grac41 : redo logs on SSD disk, using 10 threads to do 10,000 inserts per thread, on a 3-node RAC system
Session2 : instance grac42 : redo logs on USB3 disk, using 10 threads to do 10,000 inserts per thread, on a 3-node RAC system
Check Redo log location:
   THREAD#     GROUP# STATUS           MEMBER
---------- ---------- ---------------- ----------------------------------------------------------------
         1        1 CURRENT           +SSD/grac4/onlinelog/group_1.258.831669597
         1        2 ACTIVE            +SSD/grac4/onlinelog/group_2.259.831669619
         2        3 CURRENT           +DATA/grac4/onlinelog/group_3.269.826220095
         2        4 INACTIVE          +DATA/grac4/onlinelog/group_4.270.82622
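The listing above can be produced with a join of v$log and v$logfile along these lines ( column formatting omitted ):

SQL> select l.thread#, l.group#, l.status, f.member
       from v$log l, v$logfile f
      where l.group# = f.group#
      order by l.thread#, l.group#;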
  • Redo logs for instance 1 are located on SSD disks whereas redo logs for instance 2 are located on a USB3 disk
  • To move redo logs to an SSD disk, please read the following link.

 

Test results for session 1 ( Redo on SSD )

$ java UCPDemo grac41 10 10000  -noseq -nodebug
Started at:  Thu Nov 21 19:16:57 CET 2013 - URL: jdbc:oracle:thin:@//grac4-scan.grid4.example.com:1521/grac41 - Number of Threads: 10 - tx_cnt:10000 - mode: 2 Debug mode: false
  -> SQL: insert into rac_perftest values (?, ?, ?, sysdate) 
  -> Create UCP POOL 
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Connected at: 19:16:58 to instance: grac41 at host: grac41.example.com
Thread_id:   8 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  193 - Thread TPS: 147.08
Thread_id:   6 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  194 - Thread TPS: 147.04
Thread_id:   4 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  234 - Thread TPS: 147.03
Thread_id:   3 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  372 - Thread TPS: 146.98
Thread_id:   1 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  157 - Thread TPS: 146.90
Thread_id:   9 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  192 - Thread TPS: 146.80
Thread_id:   5 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  370 - Thread TPS: 146.78
Thread_id:  10 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  220 - Thread TPS: 146.77
Thread_id:   2 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  382 - Thread TPS: 146.73
Thread_id:   7 - Instance: grac41 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  191 - Thread TPS: 146.71
Runtime : 68 [s] - TPS: 1460.07

Count lgwr system calls by using strace
#  strace -tt -T -cp 6290   ( lgwr PID )
Process 6290 attached - interrupt to quit
^C^CProcess 6290 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 36.35    2.738053         261     10488           io_submit
 33.42    2.517088           7    343083           gettimeofday
 21.66    1.631455         169      9649           semop
  3.11    0.234546          11     20831           sendmsg
  2.80    0.210728           5     44220           times
  1.41    0.105844          10     10488           io_getevents
  1.03    0.077204           3     25817      3735 recvmsg
  0.13    0.010036           4      2542           poll
  0.04    0.002791          34        81           semctl

Test results for session 2 ( Redo on USB3 disk )

$ java UCPDemo grac42  10 10000 -noseq -nodebug
Started at:  Sun Nov 17 13:06:31 CET 2013 - URL: jdbc:oracle:thin:@//grac4-scan.grid4.example.com:1521/grac42 - Number of Threads: 10 - tx_cnt:10000 - mode: 2 Debug mode: false
  -> SQL: insert into rac_perftest values (?, ?, ?, sysdate) 
  -> Create UCP POOL 
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Connected at: 13:06:31 to instance: grac42 at host: grac42.example.com
Thread_id:   8 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  407 - Thread TPS: 35.06
Thread_id:   4 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  634 - Thread TPS: 35.04
Thread_id:   1 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  606 - Thread TPS: 35.04
Thread_id:   5 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  226 - Thread TPS: 34.93
Thread_id:   2 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  217 - Thread TPS: 34.89
Thread_id:   9 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  225 - Thread TPS: 34.85
Thread_id:   7 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  353 - Thread TPS: 34.84
Thread_id:   6 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  206 - Thread TPS: 34.83
Thread_id:  10 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  189 - Thread TPS: 34.80
Thread_id:   3 - Instance: grac42 - Row Inserted : 10000 - MIN Commit time [ms]:    1 - MAX Commit time [ms]:  486 - Thread TPS: 34.79
Runtime : 287 [s] - TPS: 347.50

#  strace -tt -T -cp 6290  ( lgwr PID ) 
Process 6290 attached - interrupt to quit
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79.75    2.714592         207     13115           io_submit
 15.30    0.520801          20     25788           sendmsg
  3.03    0.103031          10     10138           semop
  0.78    0.026472           6      4515           poll
  0.43    0.014569           0     56431           times
  0.32    0.011000          15       728        11 semtimedop
  0.20    0.006669           0     30411      4343 recvmsg
  0.09    0.003146           4       891           semctl
  0.04    0.001390           0     13138           io_getevents

 

Compare test results

 

  • Run time improves from 287 s to 68 s ( about 4x )
  • Redo on SSD ( 1460 TPS ) performs about 4x faster than redo logs on a USB3 disk ( 347 TPS )
  • The number of lgwr AIOs is about 20 % lower in the SSD case ( SSD: 10,488 io_submit calls – USB disk: 13,115 )
  • Both tests are using AIO ( each io_submit()/io_getevents() pair represents 1 AIO )
  • 100,000 commits are translated/reduced to about 10,000–13,000 AIOs, i.e. on average around 8–10 commits are batched into a single redo write ( group commit feature )
  • Note that using strace on lgwr can have a severe performance impact ( e.g. strace -tt -T -p lgwr_PID )
  • During the above tests, attaching strace reduced the SSD test case performance from about 1400 TPS to 500 TPS – a more targeted trace is sketched below
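If a trace is needed at all, it can at least be restricted to the two AIO calls; a sketch ( lgwr_PID is a placeholder – note that ptrace still intercepts every system call, so the slowdown does not go away, only the output shrinks ):

# print timing for the AIO system calls only
# strace -T -e trace=io_submit,io_getevents -p <lgwr_PID>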

Comparison and interpretation of AWR reports

Workload Comparison
~~~~~~~~~~~~~~~~~~~                      1st Per Sec          2nd Per Sec      %Diff              1st Per Txn          2nd Per Txn      %Diff
                                     ---------------      ---------------     ------          ---------------      ---------------     ------
                       DB time:                  6.0                  6.7       13.3                      0.0                  0.0      100.0
       Physical write (blocks):                 55.5                 13.3      -76.0                      0.0                  0.0      -20.0
             Write IO requests:                 19.5                  4.7      -76.0                      0.0                  0.0        0.0
                Executes (SQL):              1,149.6                319.9      -72.2                      1.0                  1.0        1.0
                  Transactions:              1,125.3                309.4      -72.5
--> Session 1 runs about 4x more TPS and writes about 4x more redo blocks per second than session 2

Session1 ( SSD disk)                                                                                     
------------------------------------------------------------------------------------------------ 
Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time    
------------------------------ ------------- ------------ ------------ ------------- ----------- 
 log file sync                 Commit             100,027        408.5           4.1        77.2
 log file parallel write       System I/O          22,179          1.1           0.0         0.2
 redo synch time                41,333  ( centiseconds )
 redo synch writes             100,027

Session2 ( USB3 disk )
------------------------------------------------------------------------------------------------ 
Event                          Wait Class           Waits      Time(s)  Avg Time(ms)    %DB time    
------------------------------ ------------- ------------ ------------ ------------- ----------- 
 log file sync                 Commit             100,030      2,084.1          20.8        95.5
 log file parallel write       System I/O          14,032          0.9           0.1         0.0
 redo synch time                208,871  ( centiseconds )
 redo synch writes              100,030
  • The sessions' commit wait time ( redo synch time ) drops by about a factor of 5 with the SSD disk ( 41,333 vs. 208,871 centiseconds )
  • SSD disks do not improve the log file parallel write wait time here – it is already sub-millisecond in both cases
  • If redo synch time is high, always try to use a faster disk system like SSD for your redo logs
  • If lgwr is busy, all sessions waiting on lgwr report the log file sync wait event
  • Sometimes AWR data are not very accurate – the raw counters from v$sysstat ( see below ) are a useful cross-check
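As a cross-check against AWR, the underlying statistics can be read directly from v$sysstat ( redo synch time is reported in centiseconds ):

SQL> select name, value
       from v$sysstat
      where name in ('redo synch writes','redo synch time');

-- avg commit wait [ms] = redo synch time * 10 / redo synch writes
-- e.g. session 2: 208,871 * 10 / 100,030 ≈ 20.9 ms – matching the 20.8 ms AWR average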
