——————————————————————————————–
This article follows closely chapter 11 of following book:
Expert Oracle RAC 12c written by Rivaj Shamsudeen
——————————————————————————————–
Table of Contents
System Setup
- CRS: 11.2.0.4
- Rdbms: 11.2.0.4
- 3-Node cluster running on OEL 6.4
Overview
Resource types: BL resource: Buffer Lock resource are used in Cache Fusion and protect database buffers from concurrent modifications by multiple instances. The GCS layer handles BL resources NON BL resource: Resources like TX, TMS SQ ... are non-BL resources and are handled by the GES layer. Each resource can have a list of locks ( GRANT Queue ) Each resource can have a list of waiting locks ( CONVERT Queue) GES Latches : Latches are instance specific Enqueues : Enqueues can be both local to an instance and global in a clustered environment Enqueues serialize access to database resources
Resource Names and their relationiship to object IDs / Transaction IDs
Lock table creates a TM resource type which can be used to identify the involved object As user SCOTT: SQL> lock table emp in exclusive mode; Table(s) Locked. As user SYS retrieve object ID in Dec and Hex SQL> select object_id ,lower(to_char(OBJECT_ID,'XXXXX')) ID_hex from dba_objects where object_name = 'EMP'; OBJECT_ID ID_HEX ---------- ------ 87452 1559c Find the instance TM lock SQL> select * from gv$lock where lmode=6 and type ='TM'; INST_ID ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK ------ ---------------- ---------------- ---- -- --------- ----- ------ ------- ------- -------- 1 00007FE138BAC380 00007FE138BAC3E0 32 TM 87452 0 6 0 187 2 Find GES related lock SQL> select INST_ID,resource_name, RESP,ON_CONVERT_Q,ON_GRANT_Q,PERSISTENT_RES,MASTER_NODE, NEXT_CVT_LEVEL, VALUE_BLK_STATE from gv$ges_resource where resource_name like '%1559c%'; INST_ID RESOURCE_NAME RESP ON_CONVERT_Q ON_GRANT_Q PERSISTENT_RES MASTER_NODE NEXT_CVT_ VALUE_BLK_STATE ---------- ------------------------------ ---------------- ------------ ---------- -------------- ----------- --------- ------------------------ 3 [0x1559c][0x0],[TM][ext 0x0,0x 00000000AD981D28 0 1 0 2 KJUSERNL KJUSERVS_NOVALUE 1 [0x1559c][0x0],[TM][ext 0x0,0x 00000000ADA38638 0 1 0 2 KJUSERNL KJUSERVS_NOVALUE --> Summary The first part of the RESOURCE_NAME identifies the object_id in hex for TM resource Note for other GES lock types like [TX], [LB], .. the meaning for the resource name changes
TM lock contention in a RAC eviroment
TM lock contention sample Session 1 on instance 1 SQL> lock table emp in exclusive mode; Table(s) Locked. Session 2 on instance 2 SQL> lock table emp in exclusive mode; --> hang Verify local table locks SQL> select INST_ID, SID, TYPE, ID1, ID2, LMODE, REQUEST, CTIME, BLOCK, lower(to_char(ID1,'XXXXX')) OBJ_HEX from gv$lock where lmode=6 and type ='TM'; INST_ID SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK OBJ_HE ---------- ---------- -- ---------- ---------- ---------- ---------- ---------- ---------- ------ 2 47 TM 87452 0 6 0 10106 2 1559c -> Instance 2 holds a global lock on OBJECT with ID 87452 ( ID1) . This OBJECT_ID is 1559c in Hex. Check GES locks SQL> select be.inst_id, be.pid, resource_name1, be.state, owner_node, blocked, blocker , s.sid SESSION_ID from gv$ges_blocking_enqueue be, gv$process p, gv$session s where be.resource_name1 like '%1559c%' and p.spid=be.pid and be.inst_id = p.inst_id and be.inst_id = s.inst_id and s.paddr = p.addr order by resource_name1; INST_ID PID RESOURCE_NAME1 STATE OWNER_NODE BLOCKED BLOCKER SESSION_ID ---------- ---------- ------------------------------ -------- ---------- ---------- ---------- ---------- 1 9794 [0x1559c][0x0],[TM][ext 0x0,0x OPENING 0 1 0 32 2 22756 [0x1559c][0x0],[TM][ext 0x0,0x GRANTED 1 0 1 47 ->Summary Instance 2 has been GRANTED this resource . The local session ID is 47 Session with SID=32 on Instance 1 is waiting on this resource ( STATE = OPENING) and is blocked ( BLOCKED = 1) . As session is waiting on Instance 1 lets have closer look on this session: SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, SS.PROCESS Client_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT, AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT, W.EVENT EVENT, SS.status, AR.SQL_fullTEXT SQL FROM V$SESSION_WAIT W, V$SQLAREA AR, V$SESSION SS, v$timer T WHERE SS.SQL_ADDRESS = AR.ADDRESS AND SS.SQL_HASH_VALUE = AR.HASH_VALUE AND SS.SID = w.SID (+) AND ss.STATUS = 'ACTIVE' AND W.EVENT != 'client message' and username='SCOTT' ORDER BY SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC; USERNAME USER CLIENT CLIENT_PID SID LOAD_TIME SECONDS_IN_WAIT DISK_READS -------- ---------- ---------------------------- ----------- ----- --------------- --------------- ---------- BUFFER_GETS LOCKWAIT EVENT STATUS SQL ----------- ---------- -------------------------------- -------- ---------- ------------------------------------------- SCOTT oracle SQL*Plus @ grac41.example.com 9793 32 25-Oct 11:10:18 10467 1 24 00000000AA enq: TM - contention ACTIVE lock table emp in exclusive mode --> Session 32 tries to run a lock table command and is now waiting 10467 seconds on a TM contention. Lets check the session on instance 2. Here we have a look as the previous SQL as the command creating the TM lock is already finished. If we are lucky we can find the command responsible for the TM lock. SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, SS.PROCESS Client_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT, AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT,SS.status, AR.SQL_fullTEXT SQL FROM V$SQLAREA AR, V$SESSION SS WHERE SS.PREV_SQL_ADDR = AR.ADDRESS AND SS.PREV_HASH_VALUE = AR.HASH_VALUE and username='SCOTT' ORDER BY SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC; USERNAME USER CLIENT CLIENT_PID SID LOAD_TIME SECONDS_IN_WAIT DISK_READS -------- -------------- -------------------------------- ----------- ---- --------------- --------------- ---------- BUFFER_GETS LOCKWAIT STATUS SQL ----------- ---------- -------- -------------------------------------------------------------------------------- SCOTT oracle SQL*Plus @ grac41.example.com 10014 47 25-Oct 16:10:33 6086 0 24 INACTIVE lock table emp in exclusive mode --> Summary Indeed session with SID 47 has run lock table emp in exclusive mode via a sqlplus session .
TX Enqueue contention in a RAC environment
Prepare TX Enqueue contention in RAC env Session 1: HOST_NAME INSTANCE_NAME ------------------------------ ---------------- grac42.example.com grac42 SQL> update emp set sal=sal+102 where empno=7902; 1 row updated. SQL> select dbms_transaction.local_transaction_id from dual; LOCAL_TRANSACTION_ID -------------------------------------------------------------------------------- 20.9.563 --> Note you can use the following statement to get your local transaction ID: SQL> select XIDUSN || '.' || XIDSLOT || '.' || XIDSQN from v$transaction; Session 2 on instance 1: HOST_NAME INSTANCE_NAME ------------------------------ ---------------- grac41.example.com grac41 SQL> update emp set sal=sal+101 where empno=7902; --> Session hangs Check global TX locks SQL> select * from gv$lock where type='TX'; INST_ID ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK ---------- ---------------- ---------------- ---- -- -------- ----- ------- -------- ----- ----- 2 00000000AAF89718 00000000AAF89790 32 TX 1310729 563 6 0 101 2 1 00000000AFD3BBA8 00000000AFD3BC00 57 TX 1310729 563 0 6 92 0 --> Session 32 on instance2 is holding and excl. lock ( LMODE=6 ) where as session 57 on instance 1 is requesting that lock ID1,ID2 uniquely defined a lock - this means instance 1 and instance 2 are requesting the same lock because ID1,ID2 are equal. Checking GES locks SQL> select be.inst_id, be.pid, resource_name1, be.state, owner_node, blocked, blocker from gv$ges_blocking_enqueue be where be.resource_name1 like '%TX%'; INST_ID PID RESOURCE_NAME1 STATE OWNER_NODEBLOCKED BLOCKER ------- -------- ------------------------------- -------- ---------- ---------- ---------- 2 6740 [0x140009][0x233],[TX][ext 0x5 GRANTED 1 0 1 1 22506 [0x140009][0x233],[TX][ext 0x5 OPENING 0 1 0 --> The TX ID 20.9.563 is reported in Hex Notation in our RESOURCE_NAME ( 20 = 0x14 - 9 = 0x9 - 563 = 0x223 ) This means we can always calculate the real TX ID from a resource name with type [TX] ! --> PID 22506 on instance 1 can be easily detected as our sqlplus session - status opening which means waiting Let's running our session_wait script on instance 1 to learn more about process 22506 SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, SS.PROCESS Client_PID, p.spid SERVER_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT, AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT, W.EVENT EVENT, SS.status, AR.SQL_fullTEXT SQL FROM V$SESSION_WAIT W, V$SQLAREA AR, V$SESSION SS, v$timer T, V$PROCESS p WHERE SS.SQL_ADDRESS = AR.ADDRESS AND SS.SQL_HASH_VALUE = AR.HASH_VALUE AND SS.SID = w.SID (+) AND ss.STATUS = 'ACTIVE' AND W.EVENT != 'client message' and p.addr=ss.paddr and ss.username='SCOTT' ORDER BY SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC; USERNAME USER CLIENT CLIENT_PID SERVER_PID SID LOAD_TIME SECONDS_IN_WAIT ---------- --------- ----------------------------- ----------- ------------ --- --------------- --------------- DISK_READS BUFFER_GETS LOCKWAIT EVENT STATUS SQL ---------- ----------- ---------- -------------------------------- -------- ------------------------------------- SCOTT oracle SQL*Plus @ grac41.example.com 22505 22506 57 25-Oct 20:10:34 1145 0 6 00000000AF enq: TX - row lock contention ACTIVE update emp set sal=sal+101 where empno=7902 --> Session SID 57 with OSPID 22506 runs following SQL: update emp set sal=sal+101 where empno=7902; and gets blocked to due TX - row lock contention According to gv$lock SID 32 on instance 2 is holding that lock - lets run our previous SQL script SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, SS.PROCESS Client_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT, AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT,SS.status, AR.SQL_fullTEXT SQL FROM V$SQLAREA AR, V$SESSION SS WHERE SS.PREV_SQL_ADDR = AR.ADDRESS AND SS.PREV_HASH_VALUE = AR.HASH_VALUE and username='SCOTT' ORDER BY SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC; USERNAME USER CLIENT CLIENT_PID SID LOAD_TIME SECONDS_IN_WAIT DISK_READS -------- ---------- -------------------------------- ---------- --- --------------- --------------- ---------- BUFFER_GETS LOCKWAIT STATUS SQL ----------- ---------- -------- -------------------------------------------------------------------------------- SCOTT oracle SQL*Plus @ grac41.example.com 24529 32 26-Oct 08:10:53 6 0 8 INACTIVE update emp set sal=sal+102 where empno=7902 --> Here we see that this Session with SID 32 did run following SQL : update emp set sal=sal+102 where empno=7902 and can identifid as the blocker The problem is that this OSPID for our server process doesn't match SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, SS.PROCESS Client_PID, SS.SID SID, p.spid SERVER_PID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT, AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT,SS.status, AR.SQL_fullTEXT SQL FROM V$SQLAREA AR, V$SESSION SS, V$PROCESS p WHERE SS.PREV_SQL_ADDR = AR.ADDRESS and p.addr=ss.paddr AND SS.PREV_HASH_VALUE = AR.HASH_VALUE and ss. username='SCOTT' ORDER BY SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC; USERNAME USER CLIENT CLIENT_PID SID SERVER_PID LOAD_TIME -------- -------_ -------------------------------- ----------- --- ---------- ----------- SECONDS_IN_WAIT DISK_READS BUFFER_GETS LOCKWAIT STATUS SQL --------------- ---------- ----------- ---------- -------- ----------------------------------- SCOTT oracle SQL*Plus @ grac41.example.com 24529 32 16767 26-Oct 08:10:53 904 0 8 INACTIVE update emp set sal=sal+102 where empno=7902 --> From above query we can see that for SID 32 the reported SERVER_PID is 16767 So what Server PID is reported by gv$ges_blocking_enqueue ? 2 6740 [0x140009][0x233],[TX][ext 0x5 GRANTED 1 0 1 Run a ps command on instance 2 # ps -elf | grep 6740 0 S oracle 6740 1 0 80 0 - 392687 poll_s Oct25 ? 00:02:08 ora_lmd0_grac42 --> Ok the LMD0 process is reported as the grant owner !!
Deadlocks in a RAC environment
- Deadlock detection is executed by LMD process which will detetect deadlocks in nearly 10 seconds
- After 10s a waiting session is chosen and the statement is rolled back followed by an ORA-60
Create a deadlock scenario: Session 1 on grac41: SQL> update emp set sal=sal+101 where empno=7934; Session 2 on grac42: SQL> update emp set sal=sal+102 where empno=7902 Session 1 on grac41: SQL> update emp set sal=sal+101 where empno=7902; --> Hang Session 2 on grac42: SQL> update emp set sal=sal+102 where empno=7934; --> Deadlock !! Print global locks SQL> select * from gv$lock where type='TX'; INST_ID ADDR KADDR SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK ---------- ---------------- ------------- --- -- ---------- ------ ------ ------- ----- ---------- 2 00000000AFD3AD90 00000000AFD3ADE8 32 TX 589829 1345 0 6 3 0 2 00000000AAF89718 00000000AAF89790 32 TX 1114145 559 6 0 15 2 1 00000000AAF97680 00000000AAF976F8 57 TX 589829 1345 6 0 11 2 1 00000000AFD3B110 00000000AFD3B168 57 TX 1114145 559 0 6 9 0 Print GES locks SQL> select be.inst_id, be.pid, resource_name1, be.state, owner_node, blocked, blocker from gv$ges_blocking_enqueue be where be.resource_name1 like '%TX%'; INST_ID PID RESOURCE_NAME1 STATE OWNER_NODE BLOCKED BLOCKER ------- ------- ------------------------------ --------- ---------- ---------- ---------- 2 6740 [0x110021][0x22f],[TX][ext 0x5 GRANTED 1 0 1 2 16767 [0x90005][0x541],[TX][ext 0x2, OPENING 1 1 0 1 22506 [0x110021][0x22f],[TX][ext 0x5 OPENING 0 1 0 1 7255 [0x90005][0x541],[TX][ext 0x2, GRANTED 0 0 1 --> Here we have 2 sessions involved with 2 locks ( Lock1: 589829/1345 Lock2: 1114145/559 ) From gv$ges_blocking_enqueue we see the Deadlock: Both GES locks are GRANTED on one Node but requested on the other Node Reading related LMD Traces from grac41,grac2 grac42 lmd0 trace reports the following after a Deadlock detection: Global blockers dump start:--------------------------------- DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x90005][0x541],[TX][ext 0x2,0x0] Now nearly at the same time node grac41 reports an ENQUEUE DUMP REQUEST for the same ENQUEUE DUMP REQUEST: from 2 spnum 12 on [0x90005][0x541],[TX][ext 0x2,0x0] for reason 3 mtype 0 DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x90005][0x541],[TX][ext 0x2,0x0] First section is about the resource details ( of course Local inst should be different ) DUMP LOCAL BLOCKER/HOLDER: block level 5 res [0x90005][0x541],[TX][ext 0x2,0x0] ----------resource 0xad9a9a88---------------------- resname : [0x90005][0x541],[TX][ext 0x2,0x0] hash mask : x3 Local inst : 1 dir_inst : 1 master_inst : 1 grac41: grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX count : 1 0 0 0 0 1 ... Granted_locks : 1 Cvting_locks : 1 .. lp 0xadbf4ec0 gl KJUSEREX rp 0xad9a9a88 [0x90005][0x541],[TX][ext 0x2,0x0] master 1 gl owner 0xaf806b90 possible pid 22506 xid 30000-0001-000000CC bast 0 rseq 9 mseq 0 history CVT2GRCANCL > REM_AST > MSGSENT > FREE > REF_RES > LOC_AST open opt KJUSERDEADLOCK CONVERT_Q: lp 0xadc06d30 gl KJUSERNL rl KJUSEREX rp 0xad9a9a88 [0x90005][0x541],[TX][ext 0x2,0x0] master 1 owner 2 bast 1 rseq 5 mseq 0x1 history MSGSENT > FREE > REF_RES > REM_AST > GR2CVT > MSGSENT convert opt KJUSERGETVALUE grac42: grant mode : KJUSERNL KJUSERCR KJUSERCW KJUSERPR KJUSERPW KJUSEREX count : 1 0 0 0 0 0 .. Granted_locks : 0 Cvting_locks : 1 .. GRANTED_Q : CONVERT_Q: lp 0xadd3c3a8 gl KJUSERNL rl KJUSEREX rp 0xada50a70 [0x90005][0x541],[TX][ext 0x2,0x0] master 1 gl owner 0xaf853170 possible pid 16767 xid 27000-0002-000000DD bast 0 rseq 5 mseq 0 history LOC_AST > CLOSE > FREE > REF_RES > MSGSENT > GR2CVT convert opt KJUSERGETVALUE --> grac41 has been granted this resource in exclusive mode - where grac42 is waiting on e exclusive lock Information about Locks are printed in the Enqueue Section ----------enqueue 0xadbf4ec0------------------------ grant_level : KJUSEREX req_level : KJUSEREX .. pid : 7255 possible pid : 22506 .. user session for deadlock lock 0xadbf4ec0 sid: 57 ser: 1007 audsid: 641691 user: 85/SCOTT pid: 48 O/S info: user: oracle, term: UNKNOWN, ospid: 22506 image: oracle@grac41.example.com (TNS V1-V3) client details: O/S info: user: oracle, term: pts/7, ospid: 22505 machine: grac41.example.com program: sqlplus@grac41.example.com (TNS V1-V3) application name: SQL*Plus, hash value=3669949024 current SQL: update emp set sal=sal+101 where empno=7902 DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK on grac2: ----------enqueue 0xadd3c3a8------------------------ Owner inst : 2 grant_level : KJUSERNL req_level : KJUSEREX .. pid : 16767 opid : 16767 .. req_level : KJUSEREer session for deadlock lock 0xadd3c3a8 sid: 32 ser: 525 audsid: 660372 user: 85/SCOTT pid: 39 O/S info: user: grid, term: UNKNOWN, ospid: 16767 image: oracle@grac42.example.com client details: O/S info: user: oracle, term: pts/5, ospid: 24529 machine: grac41.example.com program: sqlplus@grac41.example.com (TNS V1-V3) application name: SQL*Plus, hash value=3669949024 current SQL: update emp set sal=sal+102 where empno=7934 DUMP LOCAL BLOCKER: initiate state dump for DEADLOCK Summary: Process 16767 is running SQL : update emp set sal=sal+102 where empno=7934 and tries to aquire a a lock which is hold by process 22506 ( see possible pid : 22506) on instance 1. Even the current reported SQLs : Instance 1 : update emp set sal=sal+101 where empno=7902; Instance 2 : update emp set sal=sal+102 where empno=7934; you need to aware the the root cause could be a former SQL run in that transaction/session. Note: If you are lucky you can retrieve this info by using PREV_SQL_ADDR, PREV_HASH_VALUE from V$session
RAC and DFS lock handle with Sequences [ SV Resource ]
- DFS stands for Distributed File System and has been used since OPS ( Oracle Prallel Server )
- DFS lock is used for typical RAC actions like Global Checkpointing or for DDL operations like truncate table –
- DFS lock handles are also used for sequence operation
- Sequence Performance of combined options ordered from worst to best for RAC
NOCACHE with ORDER
NOCACHE with NOORDER
CACHE with ORDER
CACHE with NOORDER
Create a sequence with oder attribute and access this sequence from 2 different cluster nodes Session 1 on Instance 1: SQL> ALTER SESSION SET EVENTS '10046 trace name context forever, level 12 '; Session altered. SQL> create sequence t1_seq order cache 100; Sequence created. SQL> select t1_seq.nextval from dual; NEXTVAL ---------- 1 Session 2 on Instance 2: SQL> select t1_seq.nextval from dual; NEXTVAL ---------- 2 Session 1 on Instance 1: SQL> select t1_seq.nextval from dual; NEXTVAL ---------- 3 Session 2 on Instance 2: SQL> select t1_seq.nextval from dual; NEXTVAL ---------- 4 Review created trace file from session 1 on instance 1: PARSING IN CURSOR #140722143364984 len=32 dep=0 uid=85 oct=3 lid=85 tim=1382859615260938 hv=789899430 ad='a1adeb70' sqlid='21ckb1wrj9u56' select t1_seq.nextval from dual .. WAIT #140722143364984: nam='enq: SV - contention' ela= 704 type|mode=1398145029 id1=87918 id2=0 obj#=0 tim=1382859615265874 The mode value ( 1398145029 ) in the tracefile can be translated to : SQL> select chr(bitand(&&mode,-16777216)/16777215) || chr(bitand(&&mode,16711680)/65535) LOCK_TYPE, mod(&&mode,16) LOCK_MODE from dual; LOCK_TYPE LOCK_MODE ---------- ---------- SV 5 -> Lock type : SV -> Lock Mode: 5 ( == KJUSERPW -> Protected Write ) Create a resource name for this sequence: SQL> select distinct '[0x'|| trim(to_char(object_id,'xxxxxxxx'))||'][0x0],[SV]' res_name, object_id, owner from dba_objects where object_name=upper('&&obj_name') RES_NAME OBJECT_ID OWNER ----------------------- ---------- ------------------------------ [0x1576e][0x0],[SV] 87918 SCOTT First fetch from sequence and comvert value to internal Oracle Number format in Hex format: SQL> select t1_seq.nextval from dual; NEXTVAL ---------- 12 SQL> select dump(12) from dual; DUMP(12) ------------------- Typ=2 Len=2: 193,13 SQL> select to_char(193,'xx') || to_char(13,'xx') from dual; TO_CHA ------ c1 d Verify that gv$ges_resource already knows this resource: SQL> select INST_ID,resource_name, VALUE_BLK_STATE , VALUE_BLK from gv$ges_resource where resource_name like (select distinct '%[0x'|| trim(to_char(object_id,'xxxxxxxx'))||'][0x0],[SV]%' from dba_object where object_name='T1_SEQ' and owner='SCOTT'); INST_ID RESOURCE_NAME VALUE_BLK_STATE VALUE_BLK ---------- ------------------------------ -------------------- ---------------------------------- 1 [0x1576e][0x0],[SV][ext 0x0,0x KJUSERVS_VALUE 0x82c10e00000000000000000000000000 2 [0x1576e][0x0],[SV][ext 0x0,0x KJUSERVS_VALUE 0x82c10d00000000000000000000000000 --> c10d, c10e in value block represents the sequence value - As we have fetched the seq. on instance 1 this represents the current value. As expected locks are aquired on the SV resource to coordinate the sequence value. During sequence access GRANT_LEVEL and REQUEST_LEVEL should be change to KJUSEREX. SQL> select INST_ID,resource_name1,pid, state , owner_node, grant_level, request_level from gv$ges_enqueue where resource_name1 like (select distinct '%[0x'|| trim(to_char(object_id,'xxxxxxxx'))||'][0x0],[SV]%' from dba_objects where object_name='T1_SEQ' and owner='SCOTT'); INST_ID RESOURCE_NAME1 PID STATE OWNER_NODE GRANT_LEV REQUEST_L ---------- ------------------------ ----- ------- ---------- --------- --------- 2 [0x1576e][0x0],[SV][ext 0x0,0x 6783 GRANTED 1 KJUSERNL KJUSERNL 1 [0x1576e][0x0],[SV][ext 0x0,0x 7300 GRANTED 0 KJUSERNL KJUSERNL 1 [0x1576e][0x0],[SV][ext 0x0,0x 0 GRANTED 1 KJUSERNL KJUSERNL Verify this PIDs from gv$ges_enqueue: #ps -elf |grep 7300 0 S oracle 7300 1 0 80 0 - 389080 poll_s Oct25 ? 00:01:46 ora_lck0_grac41 #ps -elf | grep 6783 0 S oracle 6783 1 0 80 0 - 389080 poll_s Oct25 ? 00:01:31 ora_lck0_grac42
Library Cache Locks/Pins in a RAC env
- Library Cache Locks are globalized as resource types LA through LP
- Library Cache Pins are globalized as resource types NA through NZ
- GRD uses first 16 Byte of column kglnahsv from table x$kglob to create a KGL resourses (KGL = Kernel General Library )
- KGL Resource Sample: [0x619d0263][0xc6de206e].[LB] – [0x619d0263] = Byte 1-8 – [0xc6de206e] = Byte 9-16 from column kglnahsv
Create a KGL resource name from kglnahsv col of table x$kglob SQL> select '[0x' || substr(kglnahsv,1,8) || '][0x'|| substr(kglnahsv,9,8) || ']' kgl_resname, kglnaown kgl_owner, kglnaobj kgl_obj_name from x$kglob where kglnaobj like upper('&&OBJECT_NAME'); KGL_RESNAME KGL_OWNER KGL_OBJNAME ------------------------ ---------------------- -------------------------------- [0x619d0263][0xc6de206e] SCOTT T_LIBTEST Display KGL Resource Details for object : SCOTT.T_LIBTEST SQL> declare begin for c1 in (select '[0x' || substr(kglnahsv,1,8) || '][0x'|| substr(kglnahsv,9,8) || ']' kgl_resname, kglnaown kgl_owner, kglnaobj kgl_obj_name from x$kglob where kglnaobj like upper('&&OBJECT_NAME') ) loop dbms_output.put_line('--------------------------------------------'); dbms_output.put_line('Object Details: ' || c1.kgl_owner || '.' || c1.kgl_obj_name); dbms_output.put_line('--------------------------------------------'); dbms_output.put_line('Resource Details: '); for c2 in (select resource_name,master_node from V$ges_resource where resource_name like '%'||c1.kgl_resname||'%') loop dbms_output.put_line('-> Resource name: ' || c2.resource_name ||' - Master: ' || c2.master_node ); end loop; dbms_output.put_line('Lock Details: '); for c3 in ( select resource_name1, transaction_id0, pid, state, owner_node, grant_level from v$ges_enqueue where resource_name1 like '%'||c1.kgl_resname||'%') loop dbms_output.put_line('-> Resource name: ' || c3.resource_name1 || ' - Owner_node: ' || c3.owner_node || ' - Transaction_id0: ' || c3.transaction_id0 || ' - State: ' || c3.state || ' - Grant_level: ' || c3.grant_level ); end loop; dbms_output.put_line('--------------------------------------------'); end loop; end; / old 5: from x$kglob where kglnaobj like upper('&&OBJECT_NAME') ) new 5: from x$kglob where kglnaobj like upper('T_LIBTEST') ) -------------------------------------------- Object Details: SCOTT.T_LIBTEST -------------------------------------------- Resource Details: -> Resource name: [0x619d0263][0xc6de206e],[NB][ - Master: 1 -> Resource name: [0x619d0263][0xc6de206e],[LB][ - Master: 0 Lock Details: -> Resource name: [0x619d0263][0xc6de206e],[NB][ - Owner_node: 0 - Transaction_id0: 3145729 - State: GRANTED - Grant_level: KJUSERPR -> Resource name: [0x619d0263][0xc6de206e],[LB][ - Owner_node: 0 - Transaction_id0: 3145729 - State: GRANTED - Grant_level: KJUSERPR --------------------------------------------
Troubleshoot a Library Cache Contention problem
Troubleshoot a Library Cache Contention problem Session 1 on instance grac1: Create a long running SQL statement drop table t_libtest; create table t_libtest ( n1 number ); set serveroutput on declare v_sqltext CLOB; c1 number; begin v_sqltext := ' select a0.* from t_libtest a0'; c1 := dbms_sql.open_cursor; for i in 1..1024 loop v_sqltext := v_sqltext || ' , t_libtest a'|| i; end loop; dbms_output.put_line(v_sqltext); dbms_sql.parse(c1, v_sqltext, dbms_sql.native); end; / Session 2 on instance 2 adds a column SQL> alter table T_LIBTEST add ( n3 number ); --> Session hangs Start investigation on instance 2 as this session hangs SQL> SELECT SUBSTR(SS.USERNAME,1,8) USERNAME, SS.OSUSER "USER", AR.MODULE || ' @ ' || SS.machine CLIENT, SS.PROCESS Client_PID, p.spid SERVER_PID, SS.SID SID, TO_CHAR(AR.LAST_LOAD_TIME, 'DD-Mon HH24:MM:SS') LOAD_TIME, SS.SECONDS_IN_WAIT, AR.DISK_READS DISK_READS, AR.BUFFER_GETS BUFFER_GETS, SUBSTR(SS.LOCKWAIT,1,10) LOCKWAIT, W.EVENT EVENT, SS.status, AR.SQL_fullTEXT SQL FROM V$SESSION_WAIT W, V$SQLAREA AR, V$SESSION SS, v$timer T, V$PROCESS p WHERE SS.SQL_ADDRESS = AR.ADDRESS AND SS.SQL_HASH_VALUE = AR.HASH_VALUE AND SS.SID = w.SID (+) AND ss.STATUS = 'ACTIVE' AND W.EVENT != 'client message' and p.addr=ss.paddr and ss.username='SCOTT' ORDER BY SS.LOCKWAIT ASC, SS.USERNAME, AR.DISK_READS DESC; USERNAME USER CLIENT CLIENT_PID SERVER_PID SID LOAD_TIME SECONDS_IN_WAIT DISK_READS -------- ------ ---------------------------- ----------- ---------- --- --------------- --------------- ---------- BUFFER_GETS LOCKWAIT EVENT STATUS SQL ----------- ---------- -------------------------------- -------- ---------------------------------------------------- SCOTT oracle SQL*Plus @ grac41.example.com 10429 4838 53 27-Oct 15:10:44 364 0 42 library cache lock ACTIVE alter table T_LIBTE --> Session with SID 53 and Server OPSID 4838 hangs running SQL: alter table .. --> Wait Event: library cache lock Can we find this OS PID 4838 in gv$ges_blocking_enqueue ? SQL> select be.inst_id, be.pid, resource_name1, be.state, owner_node, blocked, blocker 2 from gv$ges_blocking_enqueue be where be.resource_name1 like '%[0x619d0263][0xc6de206e]%'; INST_ID PID RESOURCE_NAME1 STATE OWNER_NODE BLOCKED BLOCKER ---------- ---------- ------------------------------ ---------------- ---------- ---------- ---------- 2 4838 [0x619d0263][0xc6de206e],[LB][ OPENING 1 1 0 1 22506 [0x619d0263][0xc6de206e],[LB][ GRANTED 0 0 1 --> Yep PID tries to lock resource [0x619d0263][0xc6de206e] - but is blocked from instance 1 with PID 22506 Check what OSPID 22506 is doing on instance 1 SQL> set echo off SQL> @sess_curr_SQL.sql SQL> SELECT s.inst_id, s.sid, s.serial#, s.status, p.spid, s.username,ar.SQL_fullTEXT CURRENT_SQL FROM gv$session s JOIN gv$process p ON p.addr = s.paddr AND p.inst_id = s.inst_id JOIN gv$sqlarea ar on s.SQL_ADDRESS = ar.ADDRESS AND s.SQL_HASH_VALUE = ar.HASH_VALUE and ar.inst_id = s.inst_id WHERE s.type != 'BACKGROUND' and s.username ='SCOTT'; INST_ID SID SERIAL# STATUS SPID USERNAME CURRENT_SQL ---------- ---------- ---------- -------- ---------- ---------- --------------------------------- 2 53 719 ACTIVE 4838 SCOTT alter table T_LIBTEST 1 57 1007 ACTIVE 22506 SCOTT select a0.* from t_libtest a0 , t_libtest a1 , t_libtest a2 , t_libtest a3 , t --> A long running select blocks us from getting a lock on resource [0x619d0263][0xc6de206e]. --> After killing process 22506 on instance 1 and waiting on PMON cleanup the ALTER table should go through
References:
- How to Find the Blocker of the ‘library cache pin’ in a RAC environment? (Doc ID 780514.1)
- Expert Oracle RAC 12c chapter 11 written by Rivaj Shamsudeen
very informative, I really appreciate your time in explaining . thank you.
I have a small doubt.. Under “TM lock contention in a RAC eviroment” Section
I see the lock from instance 2, but you mentioned instance 1, am I understanding wrong or it is a typo?
TM lock contention sample
Session 1 on instance 1
SQL> lock table emp in exclusive mode;
Table(s) Locked.
Session 2 on instance 2
SQL> lock table emp in exclusive mode;
–> hang
Verify local table locks
SQL> select INST_ID, SID, TYPE, ID1, ID2, LMODE, REQUEST, CTIME, BLOCK,
lower(to_char(ID1,’XXXXX’)) OBJ_HEX from gv$lock where lmode=6 and type =’TM’;
INST_ID SID TY ID1 ID2 LMODE REQUEST CTIME BLOCK OBJ_HE
———- ———- — ———- ———- ———- ———- ———- ———- ——
2 47 TM 87452 0 6 0 10106 2 1559c
What could have caused the deadlock ?
Global Enqueue Services Deadlock detected (DID = 20_1_34). More information in file
========================================================================
Global Wait-For-Graph(WFG) for GES Deadlock ID=[20_1_34]
————————————————————————
Victim : (instance=2, lock=0x1218186a0)
Start (master) Instance : 2
Number of Locks involved : 4
Number of Sessions involved : 2
User session identified by:
{
User Name : rac2$
User Machine : rac2
OS Terminal Name : unknown
OS Process ID : 12340
OS Program Name : JDBC Thin Client
Application Name : JDBC Thin Client (TNS V1-V3))3)
Action Name : Flush KSXM hash table actionotestatistics Flush
Current SQL : UPDATE “EMP” SET LASTACCESSDATE = SYSDATE WHERE ENAME = :bv_tok
Session Number : 291
Session Serial Number : 22037
Server Process ORAPID : 97
Server Process OSPID : 99080
Instance : 2
}
waiting for Lock 0x1218186a0 (Transaction):
{
Lock Level : KJUSEREX
Resource Name : TX 0x230021.0x628e(ext 0x0,0x2)
GES Transaction ID : 61000-0002-00008004
}
which is blocked by Lock 0x130282bb0 (Transaction):
{
Lock Level : KJUSEREX
Resource Name : TX 0x230021.0x628e(ext 0x0,0x2)
GES Transaction ID : 76000-0001-00006C12
}
owned by the
User session identified by:
{
User Name : rac2$
User Machine : rac2
OS Terminal Name : unknown
OS Process ID : 123474
OS Program Name : JDBC Thin Client
Application Name : JDBC Thin Client (TNS V1-V3)V3)
Action Name : Flush KSXM hash table actionncesactionave
Current SQL : UPDATE “EMP” SET LASTACCESSDATE = SYSDATE WHERE ENAME = :bv_tok
Session Number : 1804
Session Serial Number : 58770
Server Process ORAPID : 118
Server Process OSPID : 118274
Instance : 1
}
waiting for Lock 0x130d4f1d0 (Transaction):
{
Lock Level : KJUSEREX
Resource Name : TX 0x14001d.0xb22f(ext 0x0,0x5)
GES Transaction ID : 76000-0001-00006C12
}
which is blocked by Lock 0x120282bb0 (Transaction):
{
Lock Level : KJUSEREX
Resource Name : TX 0x14001d.0xb22f(ext 0x0,0x5)
GES Transaction ID : 61000-0002-00008004
}
owned by the first user session of the WFG.
————————————————————————
End of Global WFG for GES Deadlock ID=[20_1_34]
========================================================================