Debugging Clusterware

Enabling Debugging for Oracle Clusterware Modules

You can enable debugging for Oracle Clusterware modules and their components, and for resources, by setting environment variables or by 
running crsctl set log commands:

To enable debugging for Oracle Clusterware:
    # crsctl set log crs "CRSRTI:1,CRSCOMM:2"
To enable debugging for OCR:
    # crsctl set log crs "CRSRTI:1,CRSCOMM:2,OCRSRV:4"
To enable debugging for EVM:
    # crsctl set log evm "EVMCOMM:1"
To enable debugging for resources
    # crsctl set log res "resname:1"

Enabling Additional Tracing for Oracle Clusterware Components

Oracle Support may ask you to enable tracing to capture additional information. Because the procedures described in this 
section may affect performance, only perform these activities with the assistance of My Oracle Support.

You can enable tracing for Oracle Clusterware resources by running the crsctl set trace command, using the following syntax:
crsctl set trace module_name "component_name=tracing_level,..."

tracing_level: A number from 1 to 5 to indicate the level of detail you want the trace command to return, where 1 is the least 
amount of tracing output and 5 provides the most detailed tracing output.

Set trace level for clusterware modules like CRS,CSS, ..
[root@grac41 Desktop]# crsctl  set  trace crs "all=3"
Set CRSD Module: AGENT  Trace Level: 3
Set CRSD Module: AGFW  Trace Level: 3
..
[root@grac41 Desktop]# crsctl   get trace crs  All
Get CRSD Module: AGENT  Trace Level: 3
Get CRSD Module: AGFW  Trace Level: 3
..

Set trace level for a certain clusterware component
[root@grac41 Desktop]# crsctl  set  trace crs "CRSTIMER:4"
Set CRSD Module: CRSTIMER  Trace Level: 4
[root@grac41 Desktop]# crsctl get  trace crs CRSTIMER
Get CRSD Module: CRSTIMER  Trace Level: 4

Enabling Debugging for Oracle Clusterware Resources

You can enable debugging for Oracle Clusterware resources by running the crsctl set log command, using the following syntax:
    crsctl set log res "resource_name=debugging_level"
Run the crsctl set log command as the root user, and supply the following information:
    resource_name  : The name of the resource to debug.
    debugging_level: A number from 1 to 5 to indicate the level of detail you want the debug command to return, where 1 is the 
                     least amount of debugging output and 5 provides the most detailed debugging output. The default debugging level is 2.
You can dynamically change the debugging level in the crsctl command 

Display current resource log level ( Upper and lower CW stack )
[root@grac41 Desktop]#   crsctl get log  res  ora.cluster_interconnect.haip -init
Get Resource ora.cluster_interconnect.haip: 0
[root@grac41 Desktop]#  crsctl get log  res ora.net1.network 
Get Resource ora.net1.network Log Level: 1

Set log level for a CW resoruce ( Upper and lower CW stack )
[root@grac41 Desktop]#  crsctl set log  res  ora.cluster_interconnect.haip=5 -init
Set Resource ora.cluster_interconnect.haip Log Level: 5
[root@grac41 Desktop]# crsctl set log  res   ora.net1.network=5
Set Resource ora.net1.network Log Level: 5

[root@grac41 Desktop]#  crsctl get log css All
Get CSSD Module: CLSF  Log Level: 0
Get CSSD Module: CSSD  Log Level: 2
Get CSSD Module: GIPCCM  Log Level: 2
Get CSSD Module: GIPCGM  Log Level: 2
Get CSSD Module: GIPCNM  Log Level: 2
Get CSSD Module: GPNP  Log Level: 1
Get CSSD Module: OLR  Log Level: 0
Get CSSD Module: SKGFD  Log Level: 0
[root@grac41 Desktop]# crsctl set log css CSSD:3
Set CSSD Module: CSSD  Log Level: 3
[root@grac41 Desktop]#  crsctl get log css CSSD
Get CSSD Module: CSSD  Log Level: 3

Setting CSSD generic log level

Session 1: Run following command in a separate terminal 
[root@grac41 Desktop]# tail -f ocssd.log | grep -i misstime 

Session 2: 
Try to find CSSD component from css modules
[root@grac41 Desktop]#  crsctl get log css All
Get CSSD Module: CLSF  Log Level: 0
Get CSSD Module: CSSD  Log Level: 2
Get CSSD Module: GIPCCM  Log Level: 2
Get CSSD Module: GIPCGM  Log Level: 2
Get CSSD Module: GIPCNM  Log Level: 2
Get CSSD Module: GPNP  Log Level: 1
Get CSSD Module: OLR  Log Level: 0
Get CSSD Module: SKGFD  Log Level: 0

Increase  log level
[root@grac41 Desktop]# crsctl set log css CSSD:3
Set CSSD Module: CSSD  Log Level: 3
[root@grac41 Desktop]#  crsctl get log css CSSD
Get CSSD Module: CSSD  Log Level: 3
--> After increasing log level session 1 dumps every second following tracing info:
2014-05-09 16:56:07.250: [    CSSD][664778496]clssnmHBInfo: css timestmp 1399647367 250 slgtime 26402514 DTO 27740 (index=1) biggest misstime 930 NTO 27570
2014-05-09 16:56:07.647: [    CSSD][666355456]clssnmPollingThread: node 2, grac42, ninfmisstime 320, misstime 320, skgxnbit 4, vcwmisstime 0, syncstage 0
2014-05-09 16:56:07.647: [    CSSD][666355456]clssnmPollingThread: node 3, grac43, ninfmisstime 520, misstime 520, skgxnbit 8, vcwmisstime 0, syncstage 0
2014-05-09 16:56:08.250: [    CSSD][664778496]clssnmHBInfo: css timestmp 1399647368 250 slgtime 26403514 DTO 27750 (index=1) biggest misstime 930 NTO 27570
2014-05-09 16:56:08.647: [    CSSD][666355456]clssnmPollingThread: node 2, grac42, ninfmisstime 320, misstime 320, skgxnbit 4, vcwmisstime 0, syncstage 0
2014-05-09 16:56:08.647: [    CSSD][666355456]clssnmPollingThread: node 3, grac43, ninfmisstime 520, misstime 520, skgxnbit 8, vcwmisstime 0, syncstage 0

Reset log level and above tracing output will be stopped immediate
[root@grac41 Desktop]#  crsctl set log css CSSD:2

Increase log level to debug CSSD and all GIPCD modules
[root@grac42 Desktop]#  crsctl set log css CSSD:5,GIPCCM:5,GIPCGM:5,GIPCNM:5
Set CSSD Module: CSSD  Log Level: 5
Set CSSD Module: GIPCCM  Log Level: 5
Set CSSD Module: GIPCGM  Log Level: 5
Set CSSD Module: GIPCNM  Log Level: 5

Enable HAIP/GIPC resource logging

  • Grid Interprocess Communication (GIPC): A support daemon that enables Redundant Interconnect Usage.
Set log level 5 for HAIP
[root@grac41 log]#  crsctl set log  res  ora.cluster_interconnect.haip=5  -init
Set Resource ora.cluster_interconnect.haip Log Level: 5

Check resource properties
[root@grac41 log]#     crsctl status res ora.cluster_interconnect.haip -init -f | egrep 'CHECK|AGENT'
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
CHECK_INTERVAL=30
--> orarootagent is mainting this resource 
    orarootagent checks this resource every 30 seconds  

Monitor ohasd orarootagent logfile 
grid@grac41 log]$ tail -f ./grac41/agent/ohasd/orarootagent_root/orarootagent_root.log

2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][835454720][F-ALGO]{0:0:2} CHECK initiated by timer for: ora.cluster_interconnect.haip 1 1
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] Executing command: check for resource: ora.cluster_interconnect.haip 1 1
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] (:CLSN00109:) clsn_agent::check {
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] actionLock {
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] actionLock }
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] actionUnlock {
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] actionUnlock }
2014-05-09 15:40:49.632: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] (:CLSN00109:) clsn_agent::check 0 }
2014-05-09 15:40:49.633: [ora.cluster_interconnect.haip][331339520][F-ALGO]{0:0:2} [check] Command check for resource: ora.cluster_interconnect.haip 1 1 completed with status ONLINE
....

Monitor GIPCD logfile
Every 5 second GIPC deamon should test the local cluster interconnect 
[grid@grac41 log]$ tail -f ./grac41/gipcd/gipcd.log | grep CLSINET 
2014-05-09 16:20:51.716: [ CLSINET][1858270976] Returning NETDATA: 1 interfaces
2014-05-09 16:20:51.716: [ CLSINET][1858270976] # 0 Interface 'eth2',ip='192.168.2.101',mac='08-00-27-6b-e2-bd',mask='255.255.255.0',net='192.168.2.0',use='cluster_interconnect'
2014-05-09 16:20:56.718: [ CLSINET][1858270976] Returning NETDATA: 1 interfaces
2014-05-09 16:20:56.718: [ CLSINET][1858270976] # 0 Interface 'eth2',ip='192.168.2.101',mac='08-00-27-6b-e2-bd',mask='255.255.255.0',net='192.168.2.0',use='cluster_interconnect'

Enable tracing for VIP and Public Networks

Check resource propterties
[root@grac41 log]# crsctl status res ora.net1.network -f   | egrep 'CHECK|AGENT'
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
CHECK_INTERVAL=1
OFFLINE_CHECK_INTERVAL=60

[root@grac41 log]# crsctl status res ora.grac41.vip -f   | egrep 'CHECK|AGENT'
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
CHECK_INTERVAL=1
CHECK_TIMEOUT=120
OFFLINE_CHECK_INTERVAL=0

Set maximum log level for these resources 
[root@grac41 network-scripts]#  crsctl set log  res   ora.grac41.vip=5 
Set Resource ora.grac41.vip Log Level: 5
[root@grac41 network-scripts]# crsctl set log  res ora.net1.network=5
Set Resource ora.net1.network Log Level: 5

Review crsd oraroot_agend log:
[grid@grac41 log]$ tail -f ./grac41/agent/crsd/orarootagent_root/orarootagent_root.log 
2014-05-09 16:32:08.744: [ora.grac41.vip][F-ALGO]{1:52826:5} CHECK initiated by timer for: ora.grac41.vip 1 1
2014-05-09 16:32:08.744: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] Executing command: check for resource: ora.grac41.vip 1 1
2014-05-09 16:32:08.744: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] (:CLSN00109:) clsn_agent::check {
2014-05-09 16:32:08.744: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] actionLock {
2014-05-09 16:32:08.744: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] actionLock }
2014-05-09 16:32:08.744: [ora.grac41.vip]{1:52826:5} [check] VipAgent::check{
2014-05-09 16:32:08.744: [ora.grac41.vip]{1:52826:5} [check] updateInterfaceNameFromNetwork Resource {
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getResAttrib: attrib name START_DEPENDENCIES value hard(ora.net1.network) pullup(ora.net1.network) weak(global:ora.gns) len 68
2014-05-09 16:32:08.745: [ora.grac41.vip][ENTER]{1:52826:5} [check] START_DEPENDENCIES of type ora.network.type: ora.net1.network
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] InterfaceName = eth1
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] updateInterfaceNameFromNetwork Resource }
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getResAttrib: attrib name ACTION_SCRIPT value  len 0
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getValueFromNetworkResource USR_ORA_AUTO {
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getResAttrib: attrib name START_DEPENDENCIES value hard(ora.net1.network) pullup(ora.net1.network) weak(global:ora.gns) len 68
2014-05-09 16:32:08.745: [ora.grac41.vip][ENTER]{1:52826:5} [check] START_DEPENDENCIES of type ora.network.type: ora.net1.network
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] NetworkAuto = dhcp
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getValueFromNetworkResource USR_ORA_AUTO  dhcp}
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] checkIp{
2014-05-09 16:32:08.745: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] Checking if IP 192.168.1.167 is present on NIC eth1
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] checkIp}
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getValueFromNetworkResource USR_ORA_AUTO {
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getResAttrib: attrib name START_DEPENDENCIES value hard(ora.net1.network) pullup(ora.net1.network) weak(global:ora.gns) len 68
2014-05-09 16:32:08.745: [ora.grac41.vip][ENTER]{1:52826:5} [check] START_DEPENDENCIES of type ora.network.type: ora.net1.network
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] NetworkAuto = dhcp
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getValueFromNetworkResource USR_ORA_AUTO  dhcp}
2014-05-09 16:32:08.745: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] Old value =dhcp, New Value = dhcp
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getValueFromNetworkResource USR_ORA_AUTO {
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getResAttrib: attrib name START_DEPENDENCIES value hard(ora.net1.network) pullup(ora.net1.network) weak(global:ora.gns) len 68
2014-05-09 16:32:08.745: [ora.grac41.vip][ENTER]{1:52826:5} [check] START_DEPENDENCIES of type ora.network.type: ora.net1.network
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] NetworkAuto = dhcp
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] getValueFromNetworkResource USR_ORA_AUTO  dhcp}
2014-05-09 16:32:08.745: [ora.grac41.vip]{1:52826:5} [check] DHCP Mode is enabled
2014-05-09 16:32:08.745: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] actionUnlock {
2014-05-09 16:32:08.745: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] actionUnlock }
2014-05-09 16:32:08.745: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] (:CLSN00109:) clsn_agent::check 0 }
2014-05-09 16:32:08.745: [ora.grac41.vip][F-ALGO]{1:52826:5} [check] Command check for resource: ora.grac41.vip 1 1 completed with status ONLINE

2014-05-09 16:32:09.412: [ora.net1.network][F-ALGO]{1:52826:5} CHECK initiated by timer for: ora.net1.network grac41 1
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] Executing command: check for resource: ora.net1.network grac41 1
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] (:CLSN00109:) clsn_agent::check {
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] actionLock {
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] actionLock }
2014-05-09 16:32:09.413: [ora.net1.network]{1:52826:5} [check] getResAttrib: attrib name USR_ORA_IF value eth1 len 4
2014-05-09 16:32:09.413: [ora.net1.network]{1:52826:5} [check] getResAttrib: attrib name USR_ORA_AUTO value dhcp len 4
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] NetInterface::getIpAddress eth1
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] NetInterface::sGetAnyIpAddress { eth1
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] NetInterface::sGetAnyIpAddress }
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] actionUnlock {
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] actionUnlock }
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] (:CLSN00109:) clsn_agent::check 0 }
2014-05-09 16:32:09.413: [ora.net1.network][F-ALGO]{1:52826:5} [check] Command check for resource: ora.net1.network grac41 1 completed with status ONLINE

New trace records gets written every secong - so don't extend the tracing period - Disable tracing asap by running
[root@grac41 network-scripts]# crsctl set log  res ora.net1.network=0
Set Resource ora.net1.network Log Level: 0
[root@grac41 network-scripts]# crsctl set log  res   ora.grac41.vip=0
Set Resource ora.grac41.vip Log Level: 0

Potential Error

[root@grac41 Desktop]#  crsctl set log  res  ora.cluster_interconnect.haip=5 -init
CRS-2545: Cannot operate on 'ora.cluster_interconnect.haip'. It is locked by 'SYSTEM' for command 'Resource Autostart : grac41'
CRS-0219: Could not update resource 'ora.cluster_interconnect.haip'.
CRS-4000: Command Set failed, or completed with errors.
--> Try to kill all CW process at OS level ( ps -efl | grep d.bin )

View modules and components for logging/tracing

Available Modules
[root@grac41 Desktop]#    crsctl lsmodules
Usage:
  crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
 where
   mdns  multicast Domain Name Server
   gpnp  Grid Plug-n-Play Service
   css   Cluster Synchronization Services
   crf   Cluster Health Monitor
   crs   Cluster Ready Services
   ctss  Cluster Time Synchronization Service
   evm   EventManager
   gipc  Grid Interprocess Communications

Display modules 
[root@grac41 Desktop]#  crsctl lsmodules css
List CSSD Debug Module: CLSF
List CSSD Debug Module: CSSD
List CSSD Debug Module: GIPCCM
List CSSD Debug Module: GIPCGM
List CSSD Debug Module: GIPCNM
List CSSD Debug Module: GPNP
List CSSD Debug Module: OLR
List CSSD Debug Module: SKGFD

CSS Modules Details: 
CLSF and SKGFD - are related to the I/O layer to the voting disks
CSSD           - same old one
GIPCCM         - gipc communication between applications and CSS
GIPCGM         - communication between peers in the GM layer
GIPCNM         - communication between nodes in the NM layer
GPNP           - trace for gpnp calls within CSS
OLR            - trace for olr calls within CSS  yy

 


Display modules and  current log level
[root@grac41 Desktop]#  crsctl get log  css ALL
Get CSSD Module: CLSF  Log Level: 0
Get CSSD Module: CSSD  Log Level: 2
Get CSSD Module: GIPCCM  Log Level: 2
Get CSSD Module: GIPCGM  Log Level: 2
Get CSSD Module: GIPCNM  Log Level: 2
Get CSSD Module: GPNP  Log Level: 1
Get CSSD Module: OLR  Log Level: 0
Get CSSD Module: SKGFD  Log Level: 0

Display modules and  current trace level
[root@grac41 Desktop]#  crsctl get trace css all
Get CSSD Module: CLSF  Trace Level: 0
Get CSSD Module: CSSD  Trace Level: 0
Get CSSD Module: GIPCCM  Trace Level: 0
Get CSSD Module: GIPCGM  Trace Level: 0
Get CSSD Module: GIPCNM  Trace Level: 0
Get CSSD Module: GPNP  Trace Level: 0
Get CSSD Module: OLR  Trace Level: 0
Get CSSD Module: SKGFD  Trace Level: 0

Setting highest log and trace  level for certain modules: 
[root@grac41 Desktop]# crsctl set log crs all:5
[root@grac41 Desktop]# crsctl set log gpnp  all:5
[root@grac41 Desktop]# crsctl set log gipc  all:5
[root@grac41 Desktop]# crsctl set log css all:5

[root@grac41 Desktop]# crsctl set trace  gpnp all:5
[root@grac41 Desktop]# crsctl set trace css all:5
[root@grac41 Desktop]# crsctl set trace gipc  all:5
[root@grac41 Desktop]# crsctl set trace css all:5

Cluster Modules and Functions

Cluster Synchronization Services (CSS) Modules and Functions
Module     Description
------- ------------------------
CSS     CSS client component
CSSD    CSS daemon component

Communication (COMM) Modules and Functions  (can be used by multiple componets like: CRS, CSS, EVM ) 
Module      Description
-------- -------------------------------------
COMMCRS  Clusterware communicatio n layer
COMMNS   NS communication layer

Functions performed by each CRS module
Oracle Clusterware (CRS) Modules and Functions
Module       Descriptions
--------- -----------------------------
CRSUI     User interface module
CRSCOMM   Communication module
CRSRTI    Resource management module
CRSMAIN   Main module/driver
CRSPLACE  CRS placement module
CRSAPP    CRS application
CRSRES    CRS resources
CRSOCR    Oracle Cluster Registry interface
CRSTIMER  Various timers related to CRS
CRSEVT    CRS EVM/event interface module
CRSD      CRS daemon

Oracle Cluster Registry (OCR) Component Names
Module     Description 
------- -----------------------------------
OCRAPI  OCR abstraction component
OCRCLI  OCR client compon
OCRSRV  OCR server component
OCRMAS  OCR master thread component
OCRMSG  OCR message component
OCRCAC  OCR cache component
OCRRAW  OCR raw device component
OCRUTL  OCR util component
OCROSD  OCR operating system dependent (OSD) layer
OCRASM  OCR ASM component

OCR tool modules.
OCRCONFIG  Modules and Functions
---------  -----------------------------------
Module        Description
OCRCONFIG  OCRCONFIG component for configuring OCR
OCRDUMP    OCRDUMP component that lists the Oracle Cluster Registry contents
OCRCHECK   OCRCHECK component that verifies all of the configured OCRs

 

Reference

  • Diagnosability for Oracle Clusterware (CRS or Grid Infrastructure) Component and Resource (Doc ID 357808.1)

Leave a Reply

Your email address will not be published. Required fields are marked *