GES Locks and Deadlocks

——————————————————————————————–

This article  follows closely chapter 11 of following book:

 Expert Oracle RAC 12c  written by Rivaj Shamsudeen

——————————————————————————————–

System Setup

  • CRS: 11.2.0.4
  • Rdbms: 11.2.0.4
  • 3-Node cluster running on OEL 6.4

 

Overview

Resource types:
BL resource:      Buffer Lock resource are used in Cache Fusion and protect database buffers from concurrent 
                  modifications by multiple instances. The GCS layer handles BL resources
NON BL resource:  Resources like TX, TMS SQ ... are non-BL resources and are handled by the GES layer.
                  Each resource can have a list of locks ( GRANT Queue )
                  Each resource can have a list of waiting locks ( CONVERT Queue) 

GES 
Latches      :    Latches are instance specific
Enqueues     :    Enqueues can be both local to an instance and global in a clustered environment 
                  Enqueues serialize access to database resources

 

Resource Names and their relationiship to object IDs / Transaction IDs

Lock table creates a TM resource type which can be used to identify the involved object
As user SCOTT:
SQL> lock table emp in exclusive mode;
Table(s) Locked.

As user SYS retrieve object ID in Dec and Hex 
SQL> select    object_id ,lower(to_char(OBJECT_ID,'XXXXX')) ID_hex  from dba_objects where object_name = 'EMP';
 OBJECT_ID ID_HEX
---------- ------
     87452  1559c

Find the instance TM lock
SQL> select * from gv$lock where lmode=6 and type ='TM';
  INST_ID ADDR             KADDR            SID TY      ID1    ID2 LMODE   REQUEST CTIME      BLOCK
  ------ ---------------- ---------------- ---- -- --------- ----- ------  ------- ------- --------
       1 00007FE138BAC380 00007FE138BAC3E0   32 TM    87452      0      6        0    187         2

Find GES related lock 
SQL> select INST_ID,resource_name, RESP,ON_CONVERT_Q,ON_GRANT_Q,PERSISTENT_RES,MASTER_NODE, NEXT_CVT_LEVEL, VALUE_BLK_STATE
             from gv$ges_resource where resource_name like '%1559c%';
   INST_ID RESOURCE_NAME                  RESP             ON_CONVERT_Q ON_GRANT_Q PERSISTENT_RES MASTER_NODE NEXT_CVT_ VALUE_BLK_STATE
---------- ------------------------------ ---------------- ------------ ---------- -------------- ----------- --------- ------------------------
         3 [0x1559c][0x0],[TM][ext 0x0,0x 00000000AD981D28          0            1              0           2  KJUSERNL    KJUSERVS_NOVALUE
         1 [0x1559c][0x0],[TM][ext 0x0,0x 00000000ADA38638          0            1              0           2  KJUSERNL    KJUSERVS_NOVALUE

--> Summary
   The first part of the RESOURCE_NAME identifies the object_id in hex for TM resource 
   Note for other GES lock types like [TX],  [LB], .. the meaning for the resource name changes 

 

TM lock contention in a RAC eviroment

TM lock contention sample
Session 1 on instance 1
SQL> lock table emp in exclusive mode;
Table(s) Locked.

Session 2 on instance 2
SQL> lock table emp in exclusive mode;
--> hang 

Verify local table locks
SQL> select INST_ID, SID, TYPE, ID1, ID2, LMODE, REQUEST, CTIME, BLOCK, 
     lower(to_char(ID1,'XXXXX')) OBJ_HEX from gv$lock where lmode=6 and type ='TM';
   INST_ID	  SID TY	ID1	   ID2	    LMODE    REQUEST	  CTIME      BLOCK OBJ_HE
---------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ------
	 2	   47 TM      87452	     0		6	   0	  10106 	 2  1559c
-> Instance 2 holds a global lock on OBJECT with ID 87452 ( ID1) . This OBJECT_ID is 1559c in Hex. 

Check GES locks
SQL> select be.inst_id,  be.pid, resource_name1, be.state, owner_node, blocked, blocker , s.sid SESSION_ID
      from gv$ges_blocking_enqueue be, gv$process p, gv$session s
    	     where be.resource_name1 like '%1559c%' and p.spid=be.pid and be.inst_id  = p.inst_id 
             and  be.inst_id  = s.inst_id and s.paddr = p.addr
    	 order by resource_name1;
   INST_ID	  PID RESOURCE_NAME1		     STATE     OWNER_NODE    BLOCKED    BLOCKER SESSION_ID
---------- ---------- ------------------------------ -------- ---------- ---------- ---------- ----------
	 1	 9794 [0x1559c][0x0],[TM][ext 0x0,0x OPENING	       0	  1	     0	       32
	 2	22756 [0x1559c][0x0],[TM][ext 0x0,0x GRANTED	       1	  0	     1	       47
->Summary
    Instance 2 has been GRANTED this resource . The local session ID is 47
    Session with SID=32 on Instance 1 is waiting on this resource ( STATE = OPENING) and is blocked ( BLOCKED = 1) .

As session is waiting on Instance 1 lets have closer look on this session:
SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, 
       SS.PROCESS Client_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, 
       SS.SECONDS_IN_WAIT,  AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT, 
        W.EVENT EVENT, SS.status, AR.SQL_fullTEXT SQL
    FROM V$SESSION_WAIT W, V$SQLAREA AR, V$SESSION SS, v$timer T WHERE SS.SQL_ADDRESS = AR.ADDRESS
    	AND SS.SQL_HASH_VALUE = AR.HASH_VALUE AND SS.SID = w.SID (+) AND ss.STATUS = 'ACTIVE' 
        AND W.EVENT != 'client message' and username='SCOTT'
    ORDER BY  SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC;
USERNAME USER	     CLIENT		            CLIENT_PID	  SID   LOAD_TIME	     SECONDS_IN_WAIT DISK_READS
-------- ---------- ----------------------------    -----------  -----  ---------------  --------------- ----------
BUFFER_GETS LOCKWAIT   EVENT				STATUS	 SQL
----------- ---------- -------------------------------- -------- ---------- -------------------------------------------
SCOTT	 oracle     SQL*Plus @ grac41.example.com   9793	   32   25-Oct 11:10:18	     10467	    1
	 24 00000000AA enq: TM - contention		ACTIVE	 lock table emp in exclusive mode
--> Session 32 tries to run a lock table command and is now waiting 10467 seconds  on a TM contention.

Lets check the session on instance 2. Here we have a look as the previous SQL as the command creating 
the TM lock is already finished. If we are lucky we can find the command responsible for the TM lock.

SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, 
      SS.PROCESS Client_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, 
      SS.SECONDS_IN_WAIT,  AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, 
      SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT,SS.status, AR.SQL_fullTEXT SQL
    FROM  V$SQLAREA AR, V$SESSION SS  WHERE SS.PREV_SQL_ADDR = AR.ADDRESS
    	AND SS.PREV_HASH_VALUE = AR.HASH_VALUE	and username='SCOTT'
    ORDER BY  SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC;
USERNAME USER	        CLIENT				 CLIENT_PID   SID  LOAD_TIME	   SECONDS_IN_WAIT DISK_READS
-------- -------------- -------------------------------- ----------- ----  --------------- --------------- ----------
BUFFER_GETS LOCKWAIT   STATUS	SQL
----------- ---------- -------- --------------------------------------------------------------------------------
SCOTT	 oracle 	SQL*Plus @ grac41.example.com	 10014         47  25-Oct 16:10:33	    6086	  0
	 24	       INACTIVE lock table emp in exclusive mode

--> Summary 
   Indeed session with SID 47 has run lock table emp in exclusive mode via a sqlplus session .

TX Enqueue contention in a RAC environment

Prepare TX Enqueue contention in RAC env
Session 1:
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
grac42.example.com           grac42

SQL> update emp set sal=sal+102 where empno=7902;
1 row updated.
SQL> select dbms_transaction.local_transaction_id from dual;
LOCAL_TRANSACTION_ID
--------------------------------------------------------------------------------
20.9.563
--> Note you can use the following statement to get your local transaction ID:
   SQL> select XIDUSN || '.' || XIDSLOT   || '.' ||  XIDSQN from v$transaction;

Session 2 on instance 1:
HOST_NAME               INSTANCE_NAME
------------------------------ ----------------
grac41.example.com           grac41
SQL> update emp set sal=sal+101 where empno=7902;
--> Session hangs

Check global TX locks 
SQL> select * from gv$lock where type='TX';
   INST_ID ADDR             KADDR             SID TY      ID1     ID2   LMODE   REQUEST  CTIME  BLOCK
---------- ---------------- ---------------- ---- --  --------  -----  ------- --------  -----  -----
         2 00000000AAF89718 00000000AAF89790   32 TX  1310729     563       6         0    101     2
         1 00000000AFD3BBA8 00000000AFD3BC00   57 TX  1310729     563       0         6     92     0
--> Session 32 on instance2 is holding and excl. lock ( LMODE=6 ) where as session 57 on instance 1 
       is requesting that lock 
    ID1,ID2 uniquely defined a lock - this means instance 1 and instance 2 are requesting the same 
        lock because ID1,ID2 are equal.

Checking GES locks
SQL> select be.inst_id,  be.pid, resource_name1, be.state, owner_node, blocked, blocker
      from gv$ges_blocking_enqueue be
             where be.resource_name1 like '%TX%';
INST_ID      PID RESOURCE_NAME1                    STATE    OWNER_NODEBLOCKED    BLOCKER
------- -------- -------------------------------  -------- ---------- ---------- ----------
     2     6740   [0x140009][0x233],[TX][ext 0x5  GRANTED           1      0         1
     1    22506   [0x140009][0x233],[TX][ext 0x5  OPENING           0      1         0
--> The TX ID 20.9.563 is reported in Hex Notation in our RESOURCE_NAME ( 20  = 0x14 - 9 = 0x9 - 563 = 0x223 )  
    This means we can always calculate the real  TX ID from a resource name with type [TX] ! 
--> PID 22506  on instance 1 can be easily detected as our sqlplus session - status opening  which means waiting

Let's running our session_wait script on instance 1 to learn more about process 22506
SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, 
      SS.PROCESS Client_PID, p.spid SERVER_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, 
      SS.SECONDS_IN_WAIT,  AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT, 
      W.EVENT EVENT, SS.status, AR.SQL_fullTEXT SQL
    FROM V$SESSION_WAIT W, V$SQLAREA AR, V$SESSION SS, v$timer T, V$PROCESS p WHERE SS.SQL_ADDRESS = AR.ADDRESS
        AND SS.SQL_HASH_VALUE = AR.HASH_VALUE AND SS.SID = w.SID (+) AND ss.STATUS = 'ACTIVE' 
        AND W.EVENT != 'client message' and p.addr=ss.paddr
        and ss.username='SCOTT'
    ORDER BY  SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC;
USERNAME   USER       CLIENT                          CLIENT_PID  SERVER_PID   SID LOAD_TIME       SECONDS_IN_WAIT
---------- ---------  -----------------------------  -----------  ------------ --- --------------- ---------------
DISK_READS BUFFER_GETS LOCKWAIT   EVENT                STATUS   SQL
---------- ----------- ---------- -------------------------------- -------- -------------------------------------
SCOTT       oracle     SQL*Plus @ grac41.example.com  22505        22506        57 25-Oct 20:10:34        1145
     0         6 00000000AF enq: TX - row lock contention    ACTIVE    update emp set sal=sal+101  where empno=7902
--> Session SID 57 with  OSPID 22506 runs following SQL:  update emp set sal=sal+101  where empno=7902; and gets 
    blocked to due TX - row lock contention

According to gv$lock SID 32 on instance 2 is holding that lock - lets run our previous SQL script 
SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, 
       SS.PROCESS Client_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, 
       SS.SECONDS_IN_WAIT,  AR.DISK_READS DISK_READS,
       AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT,SS.status, AR.SQL_fullTEXT SQL
    FROM  V$SQLAREA AR, V$SESSION SS  WHERE SS.PREV_SQL_ADDR = AR.ADDRESS
        AND SS.PREV_HASH_VALUE = AR.HASH_VALUE
        and username='SCOTT'
    ORDER BY  SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC;
USERNAME USER        CLIENT                           CLIENT_PID SID LOAD_TIME       SECONDS_IN_WAIT DISK_READS
-------- ----------  -------------------------------- ---------- --- --------------- --------------- ----------
BUFFER_GETS LOCKWAIT   STATUS    SQL
----------- ---------- -------- --------------------------------------------------------------------------------
SCOTT     oracle     SQL*Plus @ grac41.example.com     24529      32 26-Oct 08:10:53           6          0
      8           INACTIVE  update emp set sal=sal+102  where empno=7902
--> Here we see that this Session with SID 32 did run following SQL :  
      update emp set sal=sal+102  where empno=7902 and can identifid as the blocker  

The problem is that this OSPID for our server process doesn't match 
SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, 
      SS.PROCESS Client_PID, SS.SID SID, p.spid SERVER_PID,  
      TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT,  AR.DISK_READS DISK_READS,
      AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT,SS.status, 
       AR.SQL_fullTEXT SQL
    FROM  V$SQLAREA AR, V$SESSION SS, V$PROCESS p WHERE SS.PREV_SQL_ADDR = AR.ADDRESS 
       and p.addr=ss.paddr AND SS.PREV_HASH_VALUE = AR.HASH_VALUE and ss. username='SCOTT'
    ORDER BY  SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC;
USERNAME USER     CLIENT                             CLIENT_PID SID SERVER_PID LOAD_TIME
-------- -------_  -------------------------------- ----------- --- ---------- -----------
SECONDS_IN_WAIT DISK_READS BUFFER_GETS LOCKWAIT   STATUS   SQL
--------------- ---------- ----------- ---------- -------- -----------------------------------
SCOTT     oracle   SQL*Plus @ grac41.example.com     24529       32 16767      26-Oct 08:10:53
        904      0         8          INACTIVE  update emp set sal=sal+102    where empno=7902
--> From above query we can see that for SID 32 the reported SERVER_PID is 16767 
So what Server PID is reported by  gv$ges_blocking_enqueue ? 
     2     6740 [0x140009][0x233],[TX][ext 0x5 GRANTED               1      0         1
Run a ps command on instance 2 
#  ps -elf | grep 6740
0 S oracle    6740     1  0  80   0 - 392687 poll_s Oct25 ?       00:02:08 ora_lmd0_grac42
--> Ok the LMD0 process is reported as the grant owner !! 

 

Deadlocks in a RAC environment

  • Deadlock detection is executed by LMD process which will detetect deadlocks in nearly 10 seconds
  • After 10s a waiting session is chosen and the statement is rolled back followed by an ORA-60
Create a deadlock scenario:
Session 1 on grac41: SQL> update emp set sal=sal+101   where empno=7934;
Session 2 on grac42: SQL> update emp set sal=sal+102  where empno=7902
Session 1 on grac41: SQL> update emp set sal=sal+101  where empno=7902;
       --> Hang
Session 2 on grac42: SQL> update emp set sal=sal+102  where empno=7934;
       --> Deadlock !!

Print global locks
SQL> select * from gv$lock where type='TX';
   INST_ID ADDR         KADDR             SID TY      ID1      ID2  LMODE   REQUEST  CTIME BLOCK
---------- ---------------- ------------- --- -- ---------- ------  ------  -------  ----- ----------
     2 00000000AFD3AD90 00000000AFD3ADE8   32 TX     589829   1345      0         6      3       0
     2 00000000AAF89718 00000000AAF89790   32 TX    1114145    559      6         0     15       2
     1 00000000AAF97680 00000000AAF976F8   57 TX     589829   1345      6         0     11       2
     1 00000000AFD3B110 00000000AFD3B168   57 TX    1114145    559      0         6      9       0

Print GES locks
SQL> select be.inst_id,  be.pid, resource_name1, be.state, owner_node, blocked, blocker
      from gv$ges_blocking_enqueue be where be.resource_name1 like '%TX%';
INST_ID     PID RESOURCE_NAME1                 STATE      OWNER_NODE    BLOCKED    BLOCKER
------- ------- ------------------------------ ---------  ---------- ---------- ----------
     2     6740 [0x110021][0x22f],[TX][ext 0x5 GRANTED             1      0         1
     2    16767 [0x90005][0x541],[TX][ext 0x2, OPENING             1      1         0
     1    22506 [0x110021][0x22f],[TX][ext 0x5 OPENING             0      1         0
     1     7255 [0x90005][0x541],[TX][ext 0x2, GRANTED             0      0         1
--> Here we have 2 sessions involved with 2 locks ( Lock1: 589829/1345 Lock2: 1114145/559  )  
    From gv$ges_blocking_enqueue  we see the Deadlock: 
      Both GES locks are GRANTED on one Node but requested on the other Node 

Reading related  LMD Traces from grac41,grac2

grac42 lmd0 trace reports the following after a Deadlock detection:
Global blockers dump start:---------------------------------
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x90005][0x541],[TX][ext 0x2,0x0]

Now nearly at the same time node grac41 reports an ENQUEUE DUMP REQUEST for the same  
ENQUEUE DUMP REQUEST: from 2 spnum 12 on [0x90005][0x541],[TX][ext 0x2,0x0] for reason 3 mtype 0
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x90005][0x541],[TX][ext 0x2,0x0] 

First section is about the resource details ( of course Local inst should be different )
DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x90005][0x541],[TX][ext 0x2,0x0]
----------resource 0xad9a9a88----------------------
resname       : [0x90005][0x541],[TX][ext 0x2,0x0]
hash mask     : x3
Local inst    : 1
dir_inst      : 1
master_inst   : 1

grac41: 
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW  KJUSEREX
count         : 1         0         0         0         0         1
...
Granted_locks : 1
Cvting_locks  : 1 
..
lp 0xadbf4ec0 gl KJUSEREX rp 0xad9a9a88 [0x90005][0x541],[TX][ext 0x2,0x0]
  master 1 gl owner 0xaf806b90 possible pid 22506 xid 30000-0001-000000CC bast 0 rseq 9 mseq 0
  history CVT2GRCANCL > REM_AST > MSGSENT > FREE > REF_RES > LOC_AST
  open opt KJUSERDEADLOCK
CONVERT_Q:
lp 0xadc06d30 gl KJUSERNL rl KJUSEREX rp 0xad9a9a88 [0x90005][0x541],[TX][ext 0x2,0x0]
  master 1 owner 2  bast 1 rseq 5 mseq 0x1
  history MSGSENT > FREE > REF_RES > REM_AST > GR2CVT > MSGSENT
  convert opt KJUSERGETVALUE

grac42:
grant mode    : KJUSERNL  KJUSERCR  KJUSERCW  KJUSERPR  KJUSERPW KJUSEREX 
count         : 1         0         0         0         0         0
..
Granted_locks : 0
Cvting_locks  : 1
..
GRANTED_Q :
CONVERT_Q:
lp 0xadd3c3a8 gl KJUSERNL rl KJUSEREX rp 0xada50a70 [0x90005][0x541],[TX][ext 0x2,0x0]
  master 1 gl owner 0xaf853170 possible pid 16767 xid 27000-0002-000000DD bast 0 rseq 5 mseq 0
  history LOC_AST > CLOSE > FREE > REF_RES > MSGSENT > GR2CVT
  convert opt KJUSERGETVALUE
--> grac41 has been granted this resource in exclusive mode - where grac42 is waiting on 
e   exclusive lock 

Information about Locks are printed in the Enqueue Section
----------enqueue 0xadbf4ec0------------------------
grant_level      : KJUSEREX
req_level        : KJUSEREX
..
pid              : 7255
possible pid     : 22506
..
user session for deadlock lock 0xadbf4ec0
  sid: 57 ser: 1007 audsid: 641691 user: 85/SCOTT
  pid: 48 O/S info: user: oracle, term: UNKNOWN, ospid: 22506
    image: oracle@grac41.example.com (TNS V1-V3)
  client details:
    O/S info: user: oracle, term: pts/7, ospid: 22505
    machine: grac41.example.com program: sqlplus@grac41.example.com (TNS V1-V3)
    application name: SQL*Plus, hash value=3669949024
  current SQL:
  update emp set sal=sal+101  where empno=7902
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK

on grac2:
----------enqueue 0xadd3c3a8------------------------
Owner inst       : 2
grant_level      : KJUSERNL
req_level        : KJUSEREX
..
pid              : 16767
opid             : 16767
..
req_level        : KJUSEREer session for deadlock lock 0xadd3c3a8
  sid: 32 ser: 525 audsid: 660372 user: 85/SCOTT
  pid: 39 O/S info: user: grid, term: UNKNOWN, ospid: 16767
    image: oracle@grac42.example.com
  client details:
    O/S info: user: oracle, term: pts/5, ospid: 24529
    machine: grac41.example.com program: sqlplus@grac41.example.com (TNS V1-V3)
    application name: SQL*Plus, hash value=3669949024
  current SQL:
   update emp set sal=sal+102  where empno=7934
DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK

Summary:
  Process 16767 is running SQL : update emp set sal=sal+102  where empno=7934 and tries to aquire a a lock which is 
    hold by process 22506 ( see possible pid     : 22506) on  instance 1.
  Even the current reported SQLs : 
    Instance 1 : update emp set sal=sal+101  where empno=7902;
    Instance 2 : update emp set sal=sal+102  where empno=7934;
  you need to aware the the root cause could be a former SQL run in that transaction/session. 
  
Note: If you are lucky you can retrieve this info by using PREV_SQL_ADDR, PREV_HASH_VALUE from V$session

 

RAC and DFS lock handle with Sequences  [  SV Resource ]

  •  DFS stands for Distributed File System and has been used since OPS ( Oracle Prallel Server )
  •  DFS lock is used for typical RAC actions like Global Checkpointing or for DDL operations like truncate table –
  • DFS lock handles are also used for  sequence operation
  • Sequence Performance of combined options ordered from  worst to best for RAC

NOCACHE with ORDER

NOCACHE with NOORDER

CACHE with ORDER

CACHE with NOORDER

Create a sequence with oder attribute  and access this sequence from 2 different cluster nodes
Session 1 on Instance 1:
    SQL>  ALTER SESSION SET EVENTS '10046 trace name context forever, level 12 ';
    Session altered.
    SQL> create sequence t1_seq order cache 100;
    Sequence created.
    SQL> select t1_seq.nextval from dual;
       NEXTVAL
    ----------
         1
Session 2 on Instance 2:
    SQL> select t1_seq.nextval from dual;
       NEXTVAL
    ----------
         2
Session 1 on Instance 1:
    SQL>  select t1_seq.nextval from dual;
       NEXTVAL
    ----------
         3
Session 2 on Instance 2:
    SQL> select t1_seq.nextval from dual;
       NEXTVAL
    ----------
         4

Review created trace file from session 1 on instance 1:
PARSING IN CURSOR #140722143364984 len=32 dep=0 uid=85 oct=3 lid=85 tim=1382859615260938 hv=789899430 
  ad='a1adeb70'  sqlid='21ckb1wrj9u56' select t1_seq.nextval from dual
..
WAIT #140722143364984: nam='enq: SV -  contention' ela= 704 type|mode=1398145029 id1=87918 id2=0 obj#=0
   tim=1382859615265874 

The mode value ( 1398145029 ) in the tracefile can be translated to :
SQL> select chr(bitand(&&mode,-16777216)/16777215) || chr(bitand(&&mode,16711680)/65535) LOCK_TYPE,
     mod(&&mode,16) LOCK_MODE from dual;
LOCK_TYPE   LOCK_MODE
---------- ----------
SV            5
-> Lock type : SV  
-> Lock Mode: 5 ( == KJUSERPW -> Protected Write ) 

Create a resource name for this sequence:
SQL> select distinct '[0x'|| trim(to_char(object_id,'xxxxxxxx'))||'][0x0],[SV]' res_name, 
     object_id, owner from dba_objects where object_name=upper('&&obj_name')
RES_NAME                  OBJECT_ID OWNER
----------------------- ---------- ------------------------------
[0x1576e][0x0],[SV]         87918 SCOTT

First fetch from sequence and comvert value to internal Oracle Number format in Hex format:
SQL>   select t1_seq.nextval from dual;
   NEXTVAL
----------
    12
SQL> select dump(12) from dual;
DUMP(12)
-------------------
Typ=2 Len=2: 193,13
SQL> select to_char(193,'xx') || to_char(13,'xx') from dual;
TO_CHA
------
 c1  d

Verify that gv$ges_resource already knows this resource:  
SQL> select INST_ID,resource_name, VALUE_BLK_STATE ,  VALUE_BLK
        from gv$ges_resource where resource_name like
        (select distinct '%[0x'|| trim(to_char(object_id,'xxxxxxxx'))||'][0x0],[SV]%' 
     from dba_object    where  object_name='T1_SEQ' and owner='SCOTT');
   INST_ID RESOURCE_NAME                    VALUE_BLK_STATE    VALUE_BLK
---------- ------------------------------ -------------------- ----------------------------------
     1     [0x1576e][0x0],[SV][ext 0x0,0x KJUSERVS_VALUE       0x82c10e00000000000000000000000000 
     2     [0x1576e][0x0],[SV][ext 0x0,0x KJUSERVS_VALUE       0x82c10d00000000000000000000000000
--> c10d, c10e in value block represents the sequence value - As we have fetched the seq. on 
    instance 1 this represents the current value.

As expected locks are aquired on the SV resource to coordinate the sequence value. During 
sequence access GRANT_LEVEL and REQUEST_LEVEL should be change to  KJUSEREX. 
SQL> select INST_ID,resource_name1,pid, state , owner_node, grant_level, request_level
       from gv$ges_enqueue where resource_name1 like
       (select distinct '%[0x'|| trim(to_char(object_id,'xxxxxxxx'))||'][0x0],[SV]%' 
          from dba_objects         where  object_name='T1_SEQ' and owner='SCOTT');
   INST_ID RESOURCE_NAME1               PID STATE    OWNER_NODE GRANT_LEV REQUEST_L
---------- ------------------------   ----- -------  ---------- --------- ---------
     2 [0x1576e][0x0],[SV][ext 0x0,0x  6783 GRANTED           1 KJUSERNL  KJUSERNL
     1 [0x1576e][0x0],[SV][ext 0x0,0x  7300 GRANTED           0 KJUSERNL  KJUSERNL
     1 [0x1576e][0x0],[SV][ext 0x0,0x     0 GRANTED           1 KJUSERNL  KJUSERNL

Verify this PIDs from  gv$ges_enqueue:
#ps  -elf |grep 7300
0 S oracle    7300     1  0  80   0 - 389080 poll_s Oct25 ?       00:01:46 ora_lck0_grac41
#ps  -elf | grep 6783
0 S oracle    6783     1  0  80   0 - 389080 poll_s Oct25 ?       00:01:31 ora_lck0_grac42

 

Library Cache Locks/Pins in a RAC env

  • Library Cache Locks are globalized as resource types LA through LP
  • Library Cache  Pins are globalized as resource types NA through NZ
  •  GRD uses first 16 Byte of column  kglnahsv  from table x$kglob to create a KGL resourses (KGL = Kernel General Library )
  • KGL Resource Sample: [0x619d0263][0xc6de206e].[LB] – [0x619d0263] = Byte 1-8 – [0xc6de206e] = Byte 9-16 from column  kglnahsv

 

Create a KGL resource name from kglnahsv col of table x$kglob
SQL> select    '[0x' || substr(kglnahsv,1,8) || '][0x'|| substr(kglnahsv,9,8)    || ']' kgl_resname, 
     kglnaown kgl_owner, kglnaobj kgl_obj_name
        from  x$kglob where kglnaobj like upper('&&OBJECT_NAME');
KGL_RESNAME               KGL_OWNER             KGL_OBJNAME
------------------------ ---------------------- --------------------------------
[0x619d0263][0xc6de206e] SCOTT                  T_LIBTEST

Display KGL Resource Details for object :  SCOTT.T_LIBTEST
SQL> declare
    begin
      for c1 in
        (select     '[0x' || substr(kglnahsv,1,8) || '][0x'|| substr(kglnahsv,9,8)  || ']' kgl_resname, 
          kglnaown kgl_owner, kglnaobj kgl_obj_name
             from  x$kglob where kglnaobj like upper('&&OBJECT_NAME') )
         loop
           dbms_output.put_line('--------------------------------------------');
           dbms_output.put_line('Object Details:  ' || c1.kgl_owner || '.' || c1.kgl_obj_name);
           dbms_output.put_line('--------------------------------------------');
           dbms_output.put_line('Resource Details:  ');
           for c2 in
             (select resource_name,master_node from V$ges_resource where resource_name like '%'||c1.kgl_resname||'%')
             loop
                 dbms_output.put_line('->    Resource name: ' || c2.resource_name ||' - Master: ' 
                    || c2.master_node );
             end loop;
           dbms_output.put_line('Lock Details:  ');
           for c3 in
             ( select resource_name1, transaction_id0, pid, state, owner_node, grant_level from v$ges_enqueue 
               where resource_name1 like '%'||c1.kgl_resname||'%')
             loop
                dbms_output.put_line('->    Resource name: ' || c3.resource_name1 || ' - Owner_node: ' || 
                  c3.owner_node  ||  ' - Transaction_id0: ' || c3.transaction_id0  ||    ' - State: ' || 
                  c3.state || ' - Grant_level:  ' || c3.grant_level );
              end loop;
           dbms_output.put_line('--------------------------------------------');
        end loop;
   end;
   /
old   5:     from  x$kglob where kglnaobj like upper('&&OBJECT_NAME') )
new   5:     from  x$kglob where kglnaobj like upper('T_LIBTEST') )
--------------------------------------------
Object Details:  SCOTT.T_LIBTEST
--------------------------------------------
Resource Details:
->   Resource name: [0x619d0263][0xc6de206e],[NB][ - Master: 1
->   Resource name: [0x619d0263][0xc6de206e],[LB][ - Master: 0
Lock Details:
->   Resource name: [0x619d0263][0xc6de206e],[NB][ - Owner_node: 0 - Transaction_id0: 3145729 - State: GRANTED - Grant_level:  KJUSERPR
->   Resource name: [0x619d0263][0xc6de206e],[LB][ - Owner_node: 0 - Transaction_id0: 3145729 - State: GRANTED - Grant_level:  KJUSERPR
--------------------------------------------

 

Troubleshoot  a Library Cache Contention problem

Troubleshoot  a Library Cache Contention problem
Session 1 on instance grac1: Create a long running SQL statement
drop table t_libtest;
create table t_libtest ( n1 number );

set serveroutput on
declare 
  v_sqltext CLOB;
  c1 number;
begin
  v_sqltext := ' select a0.* from  t_libtest a0'; 
  c1 := dbms_sql.open_cursor;
  for i in 1..1024
    loop
    v_sqltext := v_sqltext || ' , t_libtest a'|| i;
    end loop;
  dbms_output.put_line(v_sqltext);
  dbms_sql.parse(c1, v_sqltext, dbms_sql.native);
end;
/

Session 2 on instance 2 adds a column 
  SQL>  alter table  T_LIBTEST add ( n3 number );
   --> Session hangs

Start investigation on instance 2 as this session hangs
SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, 
       SS.PROCESS Client_PID, p.spid SERVER_PID, SS.SID SID, 
       TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT,  AR.DISK_READS DISK_READS,
      AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT, W.EVENT EVENT, SS.status, AR.SQL_fullTEXT SQL
    FROM V$SESSION_WAIT W, V$SQLAREA AR, V$SESSION SS, v$timer T, V$PROCESS p WHERE SS.SQL_ADDRESS = AR.ADDRESS
        AND SS.SQL_HASH_VALUE = AR.HASH_VALUE AND SS.SID = w.SID (+) AND ss.STATUS = 'ACTIVE' 
        AND W.EVENT != 'client message' and p.addr=ss.paddr and ss.username='SCOTT'
    ORDER BY  SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC;
USERNAME USER    CLIENT                        CLIENT_PID  SERVER_PID SID LOAD_TIME       SECONDS_IN_WAIT DISK_READS
-------- ------  ----------------------------  ----------- ---------- --- --------------- --------------- ----------
BUFFER_GETS LOCKWAIT   EVENT                STATUS     SQL
----------- ---------- -------------------------------- -------- ----------------------------------------------------
SCOTT     oracle SQL*Plus @ grac41.example.com  10429       4838       53 27-Oct 15:10:44          364           0
     42           library cache lock        ACTIVE     alter table  T_LIBTE
--> Session with SID 53  and Server OPSID 4838 hangs running SQL:  alter table ..
--> Wait Event: library cache lock

Can we find this OS PID 4838 in  gv$ges_blocking_enqueue ?
SQL> select be.inst_id,  be.pid, resource_name1, be.state, owner_node, blocked, blocker
  2    from gv$ges_blocking_enqueue be where be.resource_name1 like '%[0x619d0263][0xc6de206e]%';
   INST_ID      PID RESOURCE_NAME1             STATE          OWNER_NODE    BLOCKED    BLOCKER
---------- ---------- ------------------------------ ---------------- ---------- ---------- ----------
     2     4838 [0x619d0263][0xc6de206e],[LB][ OPENING               1      1         0
     1    22506 [0x619d0263][0xc6de206e],[LB][ GRANTED               0      0         1
--> Yep PID tries to lock resource [0x619d0263][0xc6de206e] - but is blocked from instance 1 with PID 22506 

Check what OSPID 22506 is doing on instance 1 
SQL> set echo  off
SQL> @sess_curr_SQL.sql
SQL> SELECT s.inst_id, s.sid, s.serial#, s.status, p.spid, s.username,ar.SQL_fullTEXT CURRENT_SQL
        FROM   gv$session s JOIN gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id
         JOIN gv$sqlarea ar on  s.SQL_ADDRESS = ar.ADDRESS AND s.SQL_HASH_VALUE = ar.HASH_VALUE 
         and ar.inst_id = s.inst_id
        WHERE  s.type != 'BACKGROUND' and s.username ='SCOTT';
   INST_ID      SID     SERIAL# STATUS   SPID         USERNAME    CURRENT_SQL
---------- ---------- ---------- -------- ---------- ---------- ---------------------------------
        2       53         719 ACTIVE   4838         SCOTT    alter table  T_LIBTEST
        1       57        1007 ACTIVE   22506        SCOTT    select a0.* from  t_libtest a0 , 
                                                              t_libtest a1 , t_libtest a2 , t_libtest a3 , t
--> A long running select blocks us from getting a lock on resource [0x619d0263][0xc6de206e].
--> After killing process 22506 on instance 1 and waiting on PMON cleanup the ALTER table should go through

References:

  • How to Find the Blocker of the ‘library cache pin’ in a RAC environment? (Doc ID 780514.1)
  •  Expert Oracle RAC 12c chapter 11  written by Rivaj Shamsudeen

2 thoughts on “GES Locks and Deadlocks”

  1. very informative, I really appreciate your time in explaining . thank you.

    I have a small doubt.. Under “TM lock contention in a RAC eviroment” Section
    I see the lock from instance 2, but you mentioned instance 1, am I understanding wrong or it is a typo?

    TM lock contention sample
    Session 1 on instance 1
    SQL> lock table emp in exclusive mode;
    Table(s) Locked.

    Session 2 on instance 2
    SQL> lock table emp in exclusive mode;
    –> hang

    Verify local table locks
    SQL> select INST_ID, SID, TYPE, ID1, ID2, LMODE, REQUEST, CTIME, BLOCK,
    lower(to_char(ID1,’XXXXX’)) OBJ_HEX from gv$lock where lmode=6 and type =’TM’;
    INST_ID SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK OBJ_HE
    ———- ———- — ———- ———- ———- ———- ———- ———- ——
    2 47 TM 87452 0 6 0 10106 2 1559c

  2. What could have caused the deadlock ?
    Global Enqueue Services Deadlock detected (DID = 20_1_34). More information in file
    ========================================================================
    Global Wait-For-Graph(WFG) for GES Deadlock ID=[20_1_34]
    ————————————————————————
    Victim : (instance=2, lock=0x1218186a0)
    Start (master) Instance : 2
    Number of Locks involved : 4
    Number of Sessions involved : 2

    User session identified by:
    {
    User Name : rac2$
    User Machine : rac2
    OS Terminal Name : unknown
    OS Process ID : 12340
    OS Program Name : JDBC Thin Client
    Application Name : JDBC Thin Client (TNS V1-V3))3)
    Action Name : Flush KSXM hash table actionotestatistics Flush
    Current SQL : UPDATE “EMP” SET LASTACCESSDATE = SYSDATE WHERE ENAME = :bv_tok
    Session Number : 291
    Session Serial Number : 22037
    Server Process ORAPID : 97
    Server Process OSPID : 99080
    Instance : 2
    }
    waiting for Lock 0x1218186a0 (Transaction):
    {
    Lock Level : KJUSEREX
    Resource Name : TX 0x230021.0x628e(ext 0x0,0x2)
    GES Transaction ID : 61000-0002-00008004
    }
    which is blocked by Lock 0x130282bb0 (Transaction):
    {
    Lock Level : KJUSEREX
    Resource Name : TX 0x230021.0x628e(ext 0x0,0x2)
    GES Transaction ID : 76000-0001-00006C12
    }
    owned by the
    User session identified by:
    {
    User Name : rac2$
    User Machine : rac2
    OS Terminal Name : unknown
    OS Process ID : 123474
    OS Program Name : JDBC Thin Client
    Application Name : JDBC Thin Client (TNS V1-V3)V3)
    Action Name : Flush KSXM hash table actionncesactionave
    Current SQL : UPDATE “EMP” SET LASTACCESSDATE = SYSDATE WHERE ENAME = :bv_tok
    Session Number : 1804
    Session Serial Number : 58770
    Server Process ORAPID : 118
    Server Process OSPID : 118274
    Instance : 1
    }
    waiting for Lock 0x130d4f1d0 (Transaction):
    {
    Lock Level : KJUSEREX
    Resource Name : TX 0x14001d.0xb22f(ext 0x0,0x5)
    GES Transaction ID : 76000-0001-00006C12
    }
    which is blocked by Lock 0x120282bb0 (Transaction):
    {
    Lock Level : KJUSEREX
    Resource Name : TX 0x14001d.0xb22f(ext 0x0,0x5)
    GES Transaction ID : 61000-0002-00008004
    }
    owned by the first user session of the WFG.
    ————————————————————————
    End of Global WFG for GES Deadlock ID=[20_1_34]
    ========================================================================

Leave a Reply

Your email address will not be published. Required fields are marked *