Debugging a RAC Node Eviction with CHM

 

Reference for CHM statistics – you may want to review this first

Displaying and interpreting the CHM data leading up to a node eviction

  • Try to find two time windows that let us easily compare a good and a bad system state:

    • 03-22-14 09.55.00 – 09.55.59 : [ WORKING SCENARIO ]
    • 03-22-14 10.03.00 – 10.03.59 : [ PROBLEM SCENARIO ] – system freeze, heavy swapping, no CPU available
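
To extract the raw CHM data for both windows yourself, oclumon can dump the node view for an explicit time range. A minimal sketch, assuming the node name grac41 and the standard "YYYY-MM-DD HH24:MI:SS" timestamp format:

% oclumon dumpnodeview -n grac41 -s "2014-03-22 09:55:00" -e "2014-03-22 09:56:00"    # working window
% oclumon dumpnodeview -n grac41 -s "2014-03-22 10:03:00" -e "2014-03-22 10:04:00"    # problem window

The print_sys.sh, print_disk.sh, print_top.sh and print_proc.sh helper scripts used below presumably condense such dumpnodeview output into one line per snapshot.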

System Metrics: 
WORKING SCENARIO: % ~/print_sys.sh grac41_CHMOS pcpus | egrep '#pcpus|cpuq:|03-22-14 09.55'
#pcpus: 2   #vcpus: 2   cpuht: N   chipname: Intel(R)  swaptotal: 5210108  physmemtotal: 4354292 #sysfdlimit: 6815744 #disks: 27 #nics: 6  
                   cpu:  cpuq: memfree:  mcache:  swapfree:  ior:  iow:  ios: swpin:  swpout: pgin: pgout: netr: netw: procs: rtprocs:  #fds: nicErrors:  
03-22-14 09.55.00  71.57   4   1851948    931104   3524948    546   599   582    135       0   390    473   754   797    365       14  19488     0 
03-22-14 09.55.05  72.62  85   1831188    932568   3525120    208   692   244     42       0   115    404   357   379    364       14  19488     0 
03-22-14 09.55.10  99.51  71   1765596    932800   3525144    101   127    37     12       0    55     72    58    18    364       15  19520     0  CPU > 90% 
03-22-14 09.55.15  99.75  85   1696272    933380   3525172    151   381    48     15       0    83    196    18    14    369       15  19552     0  CPU > 90% 
03-22-14 09.55.20  100.0  89   1616644    933664   3525196     39    79    26      1       0    21     44    32    18    376       15  19520     0  CPU > 90% 
03-22-14 09.55.25  99.39  89   1557244    933736   3525228    122    99    40     25       0    70     60    30    15    366       15  19552     0  CPU > 90% 
03-22-14 09.55.30  100.0  75   1498456    934120   3525236    164   872    63      6       0    90    442    14    10    369       15  19648     0  CPU > 90% 
03-22-14 09.55.35  100.0  80   1437248    934300   3525288    145   441   108     38       0    80    227    69    35    378       15  19648     0  CPU > 90% 
03-22-14 09.55.40  99.87  68   1381740    934524   3525376    220   291    59     55       0   146    157    30    22    367       15  19648     0  CPU > 90% 
03-22-14 09.55.45  99.75  67   1315868    935176   3525424    161   418    56     43       0    82    215    38    16    367       15  19584     0  CPU > 90% 
03-22-14 09.55.50  99.51  77   1268612    935332   3525460    126   170    51     38       0    74     94    24    16    367       15  19648     0  CPU > 90% 
03-22-14 09.55.55  99.87 111   1207984    935820   3525508    209   921   148     25       0   106    493    31    23    373       15  19584     0  CPU > 90% 
Interpreting the data:
 - cpuq jumps from 4 to 85 at 09.55.05 and stays high for the rest of the window, reaching 111 at the end (a quick way to flag such snapshots automatically is sketched below)
 - CPU usage climbs from about 71% to nearly 100% and stays there from 09.55.10 on
 - Free memory decreases steadily from 1851948 KByte to 1207984 KByte
 - Paging and swapping are not yet a problem
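
A sketch of such a filter on the print_sys.sh output, assuming the column layout shown above ($3 = cpu, $4 = cpuq) and the rule of thumb cpuq > 4 x number of vCPUs (here: 8):

% ~/print_sys.sh grac41_CHMOS pcpus | awk '$4+0 > 8 { print $1, $2, "cpu=" $3, "cpuq=" $4 }'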

PROBLEM SCENARIO:
% ~/print_sys.sh  grac41_CHMOS pcpus | egrep '#pcpus|cpuq:|03-22-14 10.03'
#pcpus: 2   #vcpus: 2   cpuht: N   chipname: Intel(R)  swaptotal: 5210108  physmemtotal: 4354292 #sysfdlimit: 6815744 #disks: 27 #nics: 6  
                   cpu:  cpuq: memfree:  mcache:  swapfree:  ior:  iow:  ios: swpin:  swpout: pgin: pgout: netr: netw: procs: rtprocs:  #fds: nicErrors:  
03-22-14 10.03.00  11.74   1     93960    171796    172100   6049  3475   286   1899    1593  3033   1650    14    10    379       15  19584     0  MEMLOW < 10%
03-22-14 10.03.05   7.52   1    103496    172052    167384   6020  3678   129   1292    1524  3006   1580    24    10    384       15  19584     0  MEMLOW < 10%
03-22-14 10.03.10   6.60   2    113480    173020    188364   4170  1933   669   1078     960  2030    996    18     9    388       15  19488     0  MEMLOW < 10%
03-22-14 10.03.15   7.24   1    109800    173584    181808   6279  3576   190   1495    1655  3189   1679    27    17    381       15  19520     0  MEMLOW < 10%
03-22-14 10.03.20  24.20  61     85236    173572    195528   4258  2247   789   1174    1135  2088   1172    48    11    394       15  19456     0  MEMLOW < 10%
03-22-14 10.03.25  19.96  64     87344    166056    138244    255 11165   113     69    5647    98   5650    19     4    394       15  19456     0  MEMLOW < 10%
03-22-14 10.03.30  100.0  17     89932    164992    129100 35716 475205 126771 31478  472380 35716  472380  823   885    394       15  19456     0  CPU > 90%  MEMLOW < 10%
03-22-14 10.03.35  12.18   1     87380    168188    120572   4599  5959   278   1435    2051  2234   2175    49     9    379       15  19392     0  MEMLOW < 10%
03-22-14 10.03.40   6.85   1     86364    172672    113828   4973  3378   129   1288    1649  2490   1791    15     4    390       15  19456     0  MEMLOW < 10%
03-22-14 10.03.45   5.22   1     86804    175592    112524   2504  1439   352    523     364  1314    452    15    14    392       15  19456     0  MEMLOW < 10%
03-22-14 10.03.50   6.89   1     86768    175572    105128   4016  3705    93    836    1697  1908   1791     8     3    391       15  19488     0  MEMLOW < 10%
03-22-14 10.03.55   51.5 124     69692    171176     18184    956 35388   311    213   17409   416  17555    13     8    393       15  19488     0  MEMLOW < 10%

  - memfree (down to ~69 MByte) and swapfree (down from ~181 MByte to only ~18 MByte at 10.03.55) are reaching critical values
  - swpout activity peaks at about 472 MByte at 10.03.30
  - Heavy pagein activity (35716 pages) at 10.03.30
  - cpuq shows only 1 runnable process, but very likely there is a high count of blocked processes
  - CPU usage is well below 100%, but again very likely because processes are waiting on swapping/paging
  - This system is about to freeze even though cpuq and CPU usage look healthy (10.03.40, 10.03.45, 10.03.50); the memory check below makes the pressure explicit
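
The MEMLOW < 10% flag matches memfree falling below 10% of physmemtotal. A quick sanity check with the values from the 10.03.00 snapshot (93960 of 4354292 KByte):

% awk 'BEGIN { printf "memfree: %.1f%% of physical memory\n", 100 * 93960 / 4354292 }'
memfree: 2.2% of physical memory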

Let's investigate our swap device.
WORKING SCENARIO:
% ~/print_disk.sh grac41_CHMOS dm-1 pcpus dm-1 | egrep 'ior:|Device|03-22-14 09.55'
Device_name: dm-1         ior:       iow:       ios:      qlen:   wait:   type:       Disk/Part 
'03-22-14 09.55.00'    135.175      0.000         33          0      18     SWAP      dm-1 
'03-22-14 09.55.05'     42.441      0.000         10          0      18     SWAP      dm-1 
'03-22-14 09.55.10'     12.808      0.000          3          0      20     SWAP      dm-1 
'03-22-14 09.55.15'     15.168      0.000          3          0      68     SWAP      dm-1 
'03-22-14 09.55.20'      1.600      0.000          0          0      83     SWAP      dm-1 
'03-22-14 09.55.25'     25.629      0.000          6          0      20     SWAP      dm-1 
'03-22-14 09.55.30'      6.394      0.000          1          0      22     SWAP      dm-1 
'03-22-14 09.55.35'     38.419      0.000          9          0      98     SWAP      dm-1 
'03-22-14 09.55.40'     55.234      0.000         13          0      23     SWAP      dm-1 
'03-22-14 09.55.45'     43.253      0.000         10          0      35     SWAP      dm-1 
'03-22-14 09.55.50'     38.357      0.000          9          0      33     SWAP      dm-1 
  - At 03-22-14 09.55 the swap device looks good: low qlen and low ios (about 15 for the complete monitoring window)
  - No writes to the swap device, only some reads

PROBLEM SCENARIO: % ~/print_disk.sh grac41_CHMOS dm-1 pcpus dm-1 | egrep 'ior:|Device|03-22-14 10.03'
Device_name: dm-1         ior:       iow:       ios:      qlen:   wait:   type:       Disk/Part 
'03-22-14 10.03.00'   1899.896   1593.513        873       1188    4885     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 4885 ms
'03-22-14 10.03.05'   1292.842   1524.049        704       1268    2217     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 2217 ms
'03-22-14 10.03.10'   1078.232    960.563        509        988    1827     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 1827 ms
'03-22-14 10.03.15'   1495.491   1655.267        787        534     780     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 780 ms
'03-22-14 10.03.20'   1174.470   1135.348        577        418     779     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 779 ms
'03-22-14 10.03.25'     69.639   5658.866       1432        801     448     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 448 ms
'03-22-14 10.03.30'  31478.585 475205.568     126671       1831       0     SWAP      dm-1 
'03-22-14 10.03.35'   1435.188   2016.783        862       2002    2494     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 2494 ms
'03-22-14 10.03.40'   1288.316   1649.205        734       1111    1282     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 1282 ms
'03-22-14 10.03.45'    523.714    364.601        222        701    3597     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 3597 ms
'03-22-14 10.03.50'    836.427   1697.619        633        601    1001     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 1001 ms
'03-22-14 10.03.55'    213.961  17409.152       4405        225     101     SWAP      dm-1   :  Disk spent too much time waiting for I/O (> 100 msecs) : 101 ms
  - The swap device is extremely busy: at 10.03.30 we see 126671 IOs, and at 10.03.00 the service time was 4885 ms (almost 5 seconds)!
  - At 10.03.30 about 475 MByte were written to swap – clearly not an acceptable value
  - For nearly every snapshot CHM reports: Disk spent too much time waiting for I/O (> 100 msecs); iostat can confirm this live, see the sketch below
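
If the node is still responsive, the CHM disk figures can be cross-checked live with iostat from the sysstat package; its await and avgqu-sz columns correspond roughly to CHM's wait and qlen. A sketch using the device name from the output above:

% iostat -xk dm-1 5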

Let's now investigate our top CPU consumers.
WORKING SCENARIO: % ~/print_top.sh grac41_CHMOS topcpu: | egrep 'topcpu|03-22-14 09.55'
-> Search Str 1   :  topcpu:
                         topcpu_processs CPU_usage    topprivmem_process    PRIVMEM      topshmmem_process     SHMMEM         topfd_processs FD_usage  topthread_process THERAD_NO 
03-22-14 09.55.00            java(12693)    17.60             java(2566)     233508         ologgerd(5661)      64536             java(2566)   357  console-kit-dae(4484)    64  
03-22-14 09.55.05            java(12693)    15.59             java(2566)     233508    oraclegrac41(12762)      65056             java(2566)   357  console-kit-dae(4484)    64  
03-22-14 09.55.10        ocssd.bin(5283)     5.99             java(2566)     233508    oraclegrac41(12762)      66112             java(2566)   357  console-kit-dae(4484)    64  
03-22-14 09.55.15       mp_stress(14181)   180.80             java(2566)     233508    oraclegrac41(12762)      66112             java(2566)   357  console-kit-dae(4484)    64  
03-22-14 09.55.20       mp_stress(14181)   179.19             java(2566)     233520    oraclegrac41(12762)      66112             java(2566)   363  console-kit-dae(4484)    64  
03-22-14 09.55.25       mp_stress(14181)   177.91             java(2566)     233508    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
03-22-14 09.55.30       mp_stress(14181)   179.90       mp_stress(14181)     273036    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
03-22-14 09.55.35       mp_stress(14181)   178.80       mp_stress(14181)     331544    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
03-22-14 09.55.40       mp_stress(14181)   178.58       mp_stress(14181)     390128    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
03-22-14 09.55.45       mp_stress(14181)   174.20       mp_stress(14181)     448892    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
03-22-14 09.55.50       mp_stress(14181)   183.35       mp_stress(14181)     507468    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
03-22-14 09.55.55       mp_stress(14181)   179.77       mp_stress(14181)     566240    oraclegrac41(12762)      66112             java(2566)   359  console-kit-dae(4484)    64  
 - From 09.55.15 on the mp_stress program is eating up our CPU!
 - mp_stress remains our top CPU consumer for all remaining snapshots – not a good situation! (pidstat can confirm such a consumer live, see below)
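
On a live system such a runaway consumer can be confirmed with pidstat from the sysstat package. A sketch using the mp_stress PID reported above:

% pidstat -u -p 14181 5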

PROBLEM SCENARIO: % ~/print_top.sh grac41_CHMOS topcpu: | egrep 'topcpu|03-22-14 10.03'
-> Search Str 1   :  topcpu:
                         topcpu_processs CPU_usage    topprivmem_process    PRIVMEM      topshmmem_process     SHMMEM         topfd_processs FD_usage  topthread_process THERAD_NO 
03-22-14 10.03.00       mp_stress(16867)   166.39       mp_stress(16867)    3072956         ologgerd(5661)      64536            java(17355)   376  console-kit-dae(4484)    64  
03-22-14 10.03.05   asm_vktm_+ASM1(7540)     0.95       mp_stress(16867)    3069048         ologgerd(5661)      64536            java(17355)   376  console-kit-dae(4484)    64  
03-22-14 10.03.10 ora_vktm_grac41(27027)     4.80       mp_stress(16867)    3064628         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.15 ora_vktm_grac41(27027)     4.59       mp_stress(16867)    3060724         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.20   asm_vktm_+ASM1(7540)     5.39       mp_stress(16867)    3055136         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.25   asm_vktm_+ASM1(7540)     5.39       mp_stress(16867)    3055136         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.30   asm_vktm_+ASM1(7540)     5.39       mp_stress(16867)    3055136         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.35       mp_stress(16867)   157.25       mp_stress(16867)    3126540         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.40       mp_stress(16867)     5.14       mp_stress(16867)    3139152         ologgerd(5661)      64536             java(2566)   373  console-kit-dae(4484)    64  
03-22-14 10.03.45   asm_vktm_+ASM1(7540)     5.39       mp_stress(16867)    3134516         ologgerd(5661)      64536            java(17527)   378  console-kit-dae(4484)    64  
03-22-14 10.03.50 ora_vktm_grac41(27027)     4.79       mp_stress(16867)    3132116         ologgerd(5661)      64536             java(2566)   377  console-kit-dae(4484)    64  
03-22-14 10.03.55 ora_vktm_grac41(27027)     5.19       mp_stress(16867)    3128736         ologgerd(5661)      64536             java(2566)   377  console-kit-dae(4484)    64  
  - mp_stress is now the top CPU consumer in only 2 of the 12 snapshots (10.03.00 and 10.03.35)
  - mp_stress now holds about 3.1 GByte of private memory – this system is ready to freeze!
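
The per-process details examined in the next sections come from the verbose node view. If you prefer the raw CHM records over the helper scripts, a sketch like this should pull the mp_stress entries directly:

% oclumon dumpnodeview -n grac41 -v -s "2014-03-22 10:03:00" -e "2014-03-22 10:04:00" | grep mp_stress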

Process details for mp_stress program:
WORKING SCENARIO: 
% ~/print_proc.sh  grac41_CHMOS  mp_stress | egrep 'name:|03-22-14 09.55'
                    name:          pid: #procfdlimit: cpuusage: privmem:   shm:   #fd: #threads: priority: nice: 
'03-22-14 09.55.10' 'mp_stress'  14181         1024        0.0    38740      556     3        5      -51    0
'03-22-14 09.55.15' 'mp_stress'  14181         1024      180.8    97248      556     3        5      -51    0  CPU usage is abnormally high (> 90): 180.8  
'03-22-14 09.55.20' 'mp_stress'  14181         1024     179.19   156216      556     3        5      -51    0  CPU usage is abnormally high (> 90): 179.19  
'03-22-14 09.55.25' 'mp_stress'  14181         1024     177.91   214264      556     3        5      -51    0  CPU usage is abnormally high (> 90): 177.91  
'03-22-14 09.55.30' 'mp_stress'  14181         1024     179.90   273036      556     3        5      -51    0  CPU usage is abnormally high (> 90): 179.90  
'03-22-14 09.55.35' 'mp_stress'  14181         1024     178.80   331544      556     3        5      -51    0  CPU usage is abnormally high (> 90): 178.80  
'03-22-14 09.55.40' 'mp_stress'  14181         1024     178.58   390128      556     3        5      -51    0  CPU usage is abnormally high (> 90): 178.58  
'03-22-14 09.55.45' 'mp_stress'  14181         1024     174.20   448892      556     3        5      -51    0  CPU usage is abnormally high (> 90): 174.20  
'03-22-14 09.55.50' 'mp_stress'  14181         1024     183.35   507468      556     3        5      -51    0  CPU usage is abnormally high (> 90): 183.35  
'03-22-14 09.55.55' 'mp_stress'  14181         1024     179.77   566240      556     3        5      -51    0  CPU usage is abnormally high (> 90): 179.77  
 - From 09.55.15 on mp_stress consumes a lot of CPU (about 180% out of the maximum of 200% on this 2-CPU system)
 - Memory usage grows from 38 MByte to 566 MByte by the end of our monitoring window
 - The mp_stress program runs at a real-time priority of -51, which is quite high (see the priority check below)
 - 5 threads (i.e., 4 worker threads) lead to heavy CPU contention, as we end up with 2 RT threads
   per CPU
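
CHM reports the kernel's internal priority value here; -51 corresponds to a real-time priority of 50 (top uses the same convention). On a live system the scheduling class and real-time priority can be verified with ps or chrt. A sketch using the PID from above:

% ps -o pid,cls,rtprio,ni,comm -p 14181
% chrt -p 14181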


PROBLEM SCENARIO: % ~/print_proc.sh grac41_CHMOS mp_stress | egrep 'name:|03-22-14 10.03'
                    name:          pid: #procfdlimit: cpuusage: privmem:   shm:   #fd: #threads: priority: nice: 
'03-22-14 10.03.00' 'mp_stress'  16867         1024     166.39  3072956      556     3        5      -51    0  CPU usage is abnormally high (> 90): 166.39  
'03-22-14 10.03.05' 'mp_stress'  16867         1024       0.15  3069048      556     3        5      -51    0
'03-22-14 10.03.10' 'mp_stress'  16867         1024        0.0  3064628      556     3        5      -51    0
'03-22-14 10.03.15' 'mp_stress'  16867         1024        0.0  3060724      556     3        5      -51    0
'03-22-14 10.03.20' 'mp_stress'  16867         1024        0.0  3055136      556     3        5      -51    0
'03-22-14 10.03.25' 'mp_stress'  16867         1024        0.0  3055136      556     3        5      -51    0
'03-22-14 10.03.30' 'mp_stress'  16867         1024        0.0  3055136      556     3        5      -51    0
'03-22-14 10.03.35' 'mp_stress'  16867         1024     157.25  3126540      556     3        5      -51    0  CPU usage is abnormally high (> 90): 157.25  
'03-22-14 10.03.40' 'mp_stress'  16867         1024       5.14  3139152      556     3        5      -51    0
'03-22-14 10.03.45' 'mp_stress'  16867         1024       0.19  3134516      556     3        5      -51    0
'03-22-14 10.03.50' 'mp_stress'  16867         1024        0.0  3132116      556     3        5      -51    0
'03-22-14 10.03.55' 'mp_stress'  16867         1024        0.0  3128736      556     3        5      -51    0
 - Memory usage has now grown to about 3 GByte!
 - Only 2 of the 12 snapshots report: CPU usage is abnormally high
 - mp_stress no longer gets much CPU in most snapshots, very likely because the system is busy swapping
   after mp_stress grew from 38 MByte to 3 GByte in less than 8 minutes!

Process details for the LMS0 process
As the RAC alert.log tells us, the LMS0 process is in trouble:
   IPC Send timeout detected. Sender: ospid 9116 [oracle@grac43.example.com (LMS0)]
Let's monitor the LMS0 process too:
WORKING SCENARIO: % ~/print_proc.sh grac41_CHMOS ora_lms0 | egrep 'name:|03-22-14 09.55'
                    name:               pid: #procfdlimit:  cpuusage: privmem:   shm:    #fd: #threads: priority: nice: 
'03-22-14 09.55.00' 'ora_lms0_grac41'  27131        65536      13.80     9984    47660    19        1       -2    0
'03-22-14 09.55.05' 'ora_lms0_grac41'  27131        65536      12.59     9984    49768    19        1       -2    0
'03-22-14 09.55.10' 'ora_lms0_grac41'  27131        65536       4.59     9988    50504    19        1       -2    0
'03-22-14 09.55.15' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.20' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.25' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.30' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.35' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.40' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.45' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.50' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
'03-22-14 09.55.55' 'ora_lms0_grac41'  27131        65536        0.0     9988    50504    19        1       -2    0
--> The LMS0 process gets CPU only until 09.55.15

PROBLEM SCENARIO: % ~/print_proc.sh grac41_CHMOS ora_lms0 | egrep 'name:|03-22-14 10.03'
                    name:               pid: #procfdlimit:  cpuusage: privmem:   shm:   #fd: #threads: priority: nice: 
'03-22-14 10.03.00' 'ora_lms0_grac41'  27131        65536       0.25     3556    13040    19        1       -2    0
'03-22-14 10.03.05' 'ora_lms0_grac41'  27131        65536        0.0     3700    12820    19        1       -2    0
'03-22-14 10.03.10' 'ora_lms0_grac41'  27131        65536        0.0     3664    12720    19        1       -2    0
'03-22-14 10.03.15' 'ora_lms0_grac41'  27131        65536        0.0     3664    12652    19        1       -2    0
'03-22-14 10.03.20' 'ora_lms0_grac41'  27131        65536        0.0     3664    12640    19        1       -2    0
'03-22-14 10.03.25' 'ora_lms0_grac41'  27131        65536        0.0     3664    12640    19        1       -2    0
'03-22-14 10.03.30' 'ora_lms0_grac41'  27131        65536        0.0     3664    12640    19        1       -2    0
'03-22-14 10.03.35' 'ora_lms0_grac41'  27131        65536        0.0     3752    12120    19        1       -2    0
'03-22-14 10.03.40' 'ora_lms0_grac41'  27131        65536        0.0     3752    12112    19        1       -2    0
'03-22-14 10.03.45' 'ora_lms0_grac41'  27131        65536        0.0     3752    12104    19        1       -2    0
'03-22-14 10.03.50' 'ora_lms0_grac41'  27131        65536        0.0     3920    12072    19        1       -2    0
'03-22-14 10.03.55' 'ora_lms0_grac41'  27131        65536        0.0     3916    12000    19        1       -2    0
  - ora_lms0_grac41 does not get any CPU in most of the snapshots
  - ora_lms0_grac41 is not scheduled onto a CPU (cpuusage = 0.0) because it runs at priority -2,
    whereas mp_stress runs at priority -51 (see the comparison below)
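
A priority of -2 translates to real-time priority 1, while -51 translates to real-time priority 50, so the scheduler always runs mp_stress threads ahead of LMS0. On a live system the two can be compared side by side; a sketch using the PIDs from the outputs above:

% ps -o pid,cls,rtprio,ni,comm -p 27131,16867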

Root cause

  • CPU starvation and heavy paging/swapping were the root cause of the IPC send timeout, which led to the eviction of instance 1
  • Of course, the bad I/O distribution (all the paging traffic hitting a single swap device) also contributed to the eviction
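
If such a runaway real-time process is caught before the eviction, demoting it to the normal scheduling class gives LMS0 and the other cluster processes a chance to run again. A hedged sketch using the mp_stress PID from above:

% chrt -o -p 0 16867    # move mp_stress back to SCHED_OTHER (static priority 0)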
