Procwatcher Setup and Usage

Procwatcher provides the following data for debugging hang issues:

  • Wait chain data from v$wait_chains
  • Session wait data from v$session_wait
  • Active Session History data to see the recent history of this process
  • GES Enqueue data to look at RAC related locks.
  • Lock data from v$lock
  • Current SQL of the session
  • Ps output for the process
  • 3 short stacks of the process for more advanced troubleshooting by Oracle Support
  • Suspected final blocker information in later versions of Procwatcher

Install the Procwatcher prw.sh shell script

$ unzip /media/sf_mykits/prw_11.2.12.12.2.zip 
Archive:  /media/sf_mykits/prw_11.2.12.12.2.zip
  inflating: prw.sh

Prerequisites for running Procwatcher

Run as user root and verify that /bin and /usr/bin are in your $PATH (to find tools like grep ....)
# env | grep PATH
PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin:/root/bin

Verify that ORACLE_HOME is set to GRID_HOME
# env | grep HOME
GRID_HOME=/u01/app/11204/grid
ORACLE_HOME=/u01/app/11204/grid

Check that your gdb debugger is available
# file /usr/bin/gdb
/usr/bin/gdb: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), 
              for GNU/Linux 2.6.18, stripped
Because we want to debug clusterware processes, set EXAMINE_CLUSTER=true
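
The prerequisite checks above can be bundled into one small pre-flight function. This is only a sketch: the function names path_contains and prw_preflight are made up for this example and are not part of prw.sh, and the gdb location /usr/bin/gdb is taken from the output shown above.

```shell
# Pre-flight check sketch for Procwatcher (helper names are hypothetical).

# Return 0 if directory $2 is one of the colon-separated entries of $1.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print one line per check; return non-zero if any check fails.
prw_preflight() {
    rc=0
    for dir in /bin /usr/bin; do
        if path_contains "$PATH" "$dir"; then
            echo "OK    $dir is in PATH"
        else
            echo "FAIL  $dir is missing from PATH"; rc=1
        fi
    done
    if [ -n "${ORACLE_HOME:-}" ] && [ -d "$ORACLE_HOME" ]; then
        echo "OK    ORACLE_HOME=$ORACLE_HOME"
    else
        echo "FAIL  ORACLE_HOME is unset or not a directory"; rc=1
    fi
    if [ -x /usr/bin/gdb ]; then
        echo "OK    /usr/bin/gdb is executable"
    else
        echo "FAIL  /usr/bin/gdb not found or not executable"; rc=1
    fi
    return $rc
}
```

Run prw_preflight as root before ./prw.sh start. A non-zero return code means at least one prerequisite is missing.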

Starting procwatcher / Reconfiguring procwatcher

[root@grac41 procwatcher]# cd  /home/grid/PROC_WATCHER 
[root@grac41 PROC_WATCHER]# ./prw.sh  start
Sat Apr 26 11:17:54 CEST 2014: Starting Procwatcher
 
Sat Apr 26 11:17:54 CEST 2014: Thank you for using Procwatcher. :-)
Sat Apr 26 11:17:54 CEST 2014: Please add a comment to Oracle Support Note 459694.1
Sat Apr 26 11:17:54 CEST 2014: if you have any comments, suggestions, or issues with this tool.
Procwatcher files will be written to: /u01/app/11204/grid/log/procwatcher 
Sat Apr 26 11:17:54 CEST 2014: Started Procwatcher

[root@grac41 PROC_WATCHER]# ps -elf | grep prw.sh
0 S root     13826     1  1  80   0 - 27601 wait   11:17 pts/5    00:00:00 ksh /u01/app/11204/grid/log/procwatcher/prw.sh run
0 S root     14082     1  0  80   0 - 27569 poll_s 11:17 pts/5    00:00:00 ksh /u01/app/11204/grid/log/procwatcher/prw.sh housekeeper 7
0 S root     14489 13826  0  80   0 - 27369 wait   11:18 pts/5    00:00:00 ksh /u01/app/11204/grid/log/procwatcher/prw.sh gdbrun /usr/bin/gdb Linux 13675 mdnsd.bin mdnsd.bin /u01/app/11204/grid/log/procwatcher/PRW_CLUSTER/prw_mdnsd.bin_13675_04-26-14 OH_not_set /u01/app/11204/grid/bin

- Even though we start prw.sh from /home/grid/PROC_WATCHER, the start script copies prw.sh to /u01/app/11204/grid/log/procwatcher/ and runs it from there
- If we need to reconfigure Procwatcher, delete /u01/app/11204/grid/log/procwatcher first

Reconfigure Procwatcher
--> Change the parameters in prw.sh, then stop Procwatcher, remove the log directory and start it again
[root@grac41 PROC_WATCHER]# ./prw.sh stop
[root@grac41 PROC_WATCHER]# rm -rf  /u01/app/11204/grid/log/procwatcher
[root@grac41 PROC_WATCHER]# ./prw.sh  start
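
The "change parameter" step can be scripted instead of editing prw.sh by hand. The sketch below assumes that the parameters appear in prw.sh as plain NAME=value assignments at the start of a line. Check this against your copy of the script. The function name set_prw_param is hypothetical.

```shell
# set_prw_param FILE NAME VALUE
# Rewrite every line that starts with NAME= to NAME=VALUE.
# Assumption: parameters are plain NAME=value assignments at the start of a line.
# Limitation: VALUE must not contain the characters | or & (both are special to sed here).
set_prw_param() {
    file=$1
    name=$2
    value=$3
    if ! grep -q "^${name}=" "$file"; then
        echo "parameter $name not found in $file" >&2
        return 1
    fi
    tmp="${file}.tmp.$$"
    # Write the result back with cat so the file keeps its owner and execute bit.
    sed "s|^${name}=.*|${name}=${value}|" "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&
        rm -f "$tmp"
}

# Typical sequence (commented out, needs a real Procwatcher installation):
#   ./prw.sh stop
#   set_prw_param ./prw.sh EXAMINE_CLUSTER true
#   rm -rf /u01/app/11204/grid/log/procwatcher
#   ./prw.sh start
```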

Debug a RAC session hang scenario using Procwatcher

Create TX enqueue contention
Session 1:
SQL> select * from emp where empno=7934 for update;
     EMPNO ENAME      JOB           MGR HIREDATE        SAL       COMM    DEPTNO
---------- ---------- --------- ---------- --------- ---------- ----------  --------
      7934 MILLER     CLERK          7782 23-JAN-82       1806                  10

Session 2:
SQL> select * from emp where empno=7934 for update;
--> session 2 hangs

Procwatcher configuration 
EXAMINE_CLUSTER=false
EXAMINE_BG=false
SIDLIST=grac41
USE_SQL=true

Start debugging by reviewing the pw_waitchains_grac41_04-26-14.out file
# cd /u01/app/11204/grid/log/procwatcher/PRW_DB_grac41

[root@grac41 PRW_DB_grac41]# cat  pw_waitchains_grac41_04-26-14.out
################################################################################
Procwatcher waitchains report
################################################################################
SQL> SQL> -e V WAITCHAINS (top 100 rows) Snapshot Taken At: Sat Apr 26 11:53:43 CEST 2014
PROC 20008 : Current Process: 20008    SID: 33 SER#: 25     INST grac41         INST #: 1
PROC 20008 : Blocking Process: <none> from Instance        Number of waiters: 1
PROC 20008 : Final Blocking Process: <none> from Instance    Program:
PROC 20008 : Wait Event: SQL*Net message from client        P1: 1413697536     P2: 1          P3: 0
PROC 20008 : Seconds in Wait: 867                Seconds Since Last Wait:
PROC 20008 : Wait Chain: 1: 'SQL*Net message from client'<='enq: TX - row lock contention'
PROC 20008 : Blocking Wait Chain: <none>
------------------------------
PROC 23267 : Current Process: 23267    SID: 59 SER#: 37     INST grac41         INST #: 1
PROC 23267 : Blocking Process: 20008 from Instance 1        Number of waiters: 0
PROC 23267 : Final Blocking Process: 20008 from Instance 1    Program: oracle@grac41.example.com
PROC 23267 : Wait Event: enq: TX - row lock contention        P1: 1415053318     P2: 655386      P3: 19323
PROC 23267 : Seconds in Wait: 844                Seconds Since Last Wait:
PROC 23267 : Wait Chain: 1: 'SQL*Net message from client'<='enq: TX - row lock contention'
PROC 23267 : Blocking Wait Chain: <none>
------------------------------
Elapsed: 00:00:00.96

----------blkr----------
Sat Apr 26 11:53:58 CEST 2014: Suspected final blocker is:  Process: 20008 SID: 33 SER#: 25 INST grac41 INST #: 1
-------end blkr---------
grac41 Waitchains SessionCount:2-Instance:1
################################################################################

--> Processes 20008 and 23267 are involved, and 20008 is the blocker
    Procwatcher creates the following files for them
     -rwxrwxrwx. 1 root root 135726 Apr 26 12:22 prw_ora_fg_grac41_20008_04-26-14.out
     -rwxrwxrwx. 1 root root 319626 Apr 26 12:22 prw_ora_fg_grac41_23267_04-26-14.out
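
To pull the blocker out of a long waitchains report without reading it by eye, the "Suspected final blocker" line can be parsed with awk. This is a sketch based only on the line format shown above. The function name final_blocker is hypothetical, and other Procwatcher versions may format the line differently.

```shell
# final_blocker: read a waitchains report on stdin and print
# "pid sid serial instance" for each "Suspected final blocker" line.
final_blocker() {
    awk '/Suspected final blocker is:/ {
        pid = sid = ser = inst = ""
        for (i = 1; i < NF; i++) {
            if ($i == "Process:") pid = $(i + 1)
            if ($i == "SID:")     sid = $(i + 1)
            if ($i == "SER#:")    ser = $(i + 1)
            # "INST <name>" is the instance name, "INST #: <n>" is the number
            if ($i == "INST" && $(i + 1) != "#:") inst = $(i + 1)
        }
        print pid, sid, ser, inst
    }'
}

# Usage (commented out, needs a real report file):
#   final_blocker < pw_waitchains_grac41_04-26-14.out
```

The printed pid is the number to look for in the prw_ora_fg_<SID>_<pid>_<date>.out file names.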

Review prw_ora_fg_grac41_20008_04-26-14.out :
################################################################################
Procwatcher Debugging for Process 20008 ora_fg_grac41
################################################################################
SQL: Wait Chains Report for Process 20008 ora_fg_grac41
SQL> SQL> -e V WAITCHAINS (top 100 rows) Snapshot Taken At: Sat Apr 26 11:53:43 CEST 2014
PROC 20008 : Current Process: 20008    SID: 33 SER#: 25     INST grac41         INST #: 1
PROC 20008 : Blocking Process: <none> from Instance        Number of waiters: 1
PROC 20008 : Final Blocking Process: <none> from Instance    Program:
PROC 20008 : Wait Event: SQL*Net message from client        P1: 1413697536     P2: 1          P3: 0
PROC 20008 : Seconds in Wait: 867                Seconds Since Last Wait:
PROC 20008 : Wait Chain: 1: 'SQL*Net message from client'<='enq: TX - row lock contention'
PROC 20008 : Blocking Wait Chain: <none>
 
################################################################################
SQL: Lock Report for Process 20008 ora_fg_grac41
SQL> SQL> -e V LOCK Snapshot Taken At: Sat Apr 26 11:53:51 CEST 2014
PROC             PROC          TY        ID1        ID2    LMODE     REQUEST      BLOCK
-------------------- -------------------- -- ---------- ---------- ---------- ---------- ----------
PROC 20008         INST grac41      TX     655386      19323        6           0      1
--> Process 20008 holds the TX lock in exclusive mode (LMODE 6) and blocks another session (BLOCK 1)
################################################################################
SQL: Current SQL Report for Process 20008 ora_fg_grac41
SQL> SQL> Snapshot Taken At: Sat Apr 26 11:54:04 CEST 2014
PROC 20008 - select * from emp where empno=7934 for update
Sat Apr 26 11:54:07 CEST 2014
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
0 S oracle   20008     1  0  80   0 - 394868 sk_wai 11:38 ?       00:00:00 oraclegrac41 (LOCAL=NO)
 
SQL*Plus: Release 11.2.0.4.0 Production on Sat Apr 26 11:54:13 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Enter user-name: SQL> Oracle pid: 48, Unix process pid: 20008, image: oracle@grac41.example.com
Sat Apr 26 11:54:13 CEST 2014

ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-read()+14<-nttfprd()+343<-nsbasic_brc()+376
  <-nsbrecv()+69<-nioqrc()+495<-opikndf2()+978<-opitsk()+831<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103
  <-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
Sat Apr 26 11:54:13 CEST 2014

ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-read()+14<-nttfprd()+343<-nsbasic_brc()+376
  <-nsbrecv()+69<-nioqrc()+495<-opikndf2()+978<-opitsk()+831<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103
  <-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
Sat Apr 26 11:54:13 CEST 2014

ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-read()+14<-nttfprd()+343<-nsbasic_brc()+376
  <-nsbrecv()+69<-nioqrc()+495<-opikndf2()+978<-opitsk()+831<-opiino()+969<-opiodr()+917<-opidrv()+570<-sou2o()+103
  <-opimai_real()+133<-ssthrdmain()+265<-main()+201<-__libc_start_main()+253
Statement processed.
--> The application is not progressing: all three stacks are identical and the server process sits in read(), waiting for input from the client

Additional Tracefiles
Session Wait Info: pw_sessionwait_grac41_04-26-14.out
################################################################################
SQL> SQL> -e V SESSIONWAIT Snapshot Taken At: Sat Apr 26 12:37:16 CEST 2014
PROC             INST         STATE    EVENT                       P1      P2         P3        SEC
-------------------- --------------- ---------- ------------------------------ ---------- ---------- ---------- ----------
PROC 23267         INST grac41     WAITING    enq: TX - row lock contention  1415053318     655386      19323       3468
Elapsed: 00:00:00.00
################################################################################
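
The P1 and P2 values of the enq: TX wait above can be decoded with plain shell arithmetic. For enqueue waits, P1 carries the two-character lock type in its upper two bytes and the requested mode in its lower two bytes. For a TX enqueue, P2 (the ID1 of the lock) carries the undo segment number in the upper 16 bits and the slot in the lower 16 bits. The function names below are made up for this sketch.

```shell
# decode_enq_p1 P1 -> prints "<lock type> <requested mode>"
decode_enq_p1() {
    p1=$1
    c1=$(( (p1 >> 24) & 255 ))   # first character of the lock type
    c2=$(( (p1 >> 16) & 255 ))   # second character of the lock type
    mode=$(( p1 & 65535 ))       # requested lock mode
    # The inner printf produces the octal code, the outer one turns it into a character.
    printf "\\$(printf '%03o' "$c1")\\$(printf '%03o' "$c2") %d\n" "$mode"
}

# decode_tx_id1 ID1 -> prints "<undo segment number> <slot>"
decode_tx_id1() {
    printf '%d %d\n' $(( $1 >> 16 )) $(( $1 & 65535 ))
}

decode_enq_p1 1415053318    # P1 of process 23267
decode_tx_id1 655386        # P2 of process 23267
```

For the values in this report the two calls print "TX 6" and "10 26": process 23267 requests a TX lock in mode 6 (exclusive) on the transaction in undo segment 10, slot 26.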

Lock Info: pw_lock_grac41_04-26-14.out
################################################################################
SQL> SQL> -e V LOCK Snapshot Taken At: Sat Apr 26 12:38:39 CEST 2014
PROC             PROC          TY        ID1        ID2    LMODE     REQUEST      BLOCK
-------------------- -------------------- -- ---------- ---------- ---------- ---------- ----------
PROC 20008         INST grac41      TX     655386      19323        6           0      1
PROC 23267         INST grac41      TX     655386      19323        0           6      0
################################################################################
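
The lock report can be matched up mechanically: a holder has LMODE > 0 and BLOCK > 0, a waiter has REQUEST > 0, and both show the same TY, ID1 and ID2. A small awk sketch for this, assuming the column layout shown above (the function name lock_pairs is hypothetical):

```shell
# lock_pairs: read a Procwatcher V LOCK report on stdin and print, for every
# waiter, which process holds the same resource.
lock_pairs() {
    awk '$1 == "PROC" && $3 == "INST" && NF >= 10 {
        key = $5 SUBSEP $6 SUBSEP $7                   # lock type, ID1, ID2
        if ($8 > 0 && $10 > 0) holder[key] = $2        # holds the lock and blocks others
        if ($9 > 0) { waiter[++n] = $2; wkey[n] = key; wmode[n] = $9 }
    }
    END {
        for (i = 1; i <= n; i++) {
            split(wkey[i], k, SUBSEP)
            h = (wkey[i] in holder) ? holder[wkey[i]] : "unknown"
            printf "%s waits for %s %s-%s (mode %s) held by %s\n",
                   waiter[i], k[1], k[2], k[3], wmode[i], h
        }
    }'
}

# Usage (commented out, needs a real report file):
#   lock_pairs < pw_lock_grac41_04-26-14.out
```

The header line is skipped because its third field is not INST. If the holder runs on another RAC instance and is not part of the local report, it is printed as "unknown".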

Latch Holder: pw_latchholder_grac41_04-26-14.out

Global Enqueues: pw_gesenqueue_grac41_04-26-14.out
################################################################################
SQL> SQL> -e V GESENQUEUE Snapshot Taken At: Sat Apr 26 12:40:05 CEST 2014
PROC             INST         RESOURCE_NAME          GRANT_LEVEL  REQUEST_LEVE
-------------------- --------------- ------------------------ ------------ ------------
PROC 8583         INST grac41     [0x19][0x2],[RS][ext 0x0 KJUSERNL       KJUSEREX
PROC 23267         INST grac41     [0xa001a][0x4b7b],[TX][e KJUSEREX       KJUSEREX
PROC 23267         INST grac41     [0xa001a][0x4b7b],[TX][e KJUSERNL       KJUSEREX
Elapsed: 00:00:00.06
################################################################################
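
The TX resource name in the GES report is simply the hex form of the ID1 and ID2 values from the lock report (655386 = 0xa001a, 19323 = 0x4b7b). The sketch below builds that prefix so it can be used as a search string. The function name ges_resource_prefix is made up for this example.

```shell
# ges_resource_prefix ID1 ID2 TYPE -> prints "[0x<id1>][0x<id2>],[<type>]"
ges_resource_prefix() {
    printf '[0x%x][0x%x],[%s]\n' "$1" "$2" "$3"
}

ges_resource_prefix 655386 19323 TX

# Usage (commented out, needs a real report file):
#   grep -F "$(ges_resource_prefix 655386 19323 TX)" pw_gesenqueue_grac41_04-26-14.out
```

grep -F is used because the square brackets would otherwise be read as a regular expression.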

Debugging clusterware processes ocssd.bin using procwatcher

  • Note: this should be done only on a test cluster because of the risk of causing additional node evictions
To monitor a node eviction problem, the following parameters should be set in the prw.sh script:
EXAMINE_CLUSTER=true
EXAMINE_BG=false       <-- We are not interested in DB processes
INTERVAL=15            <-- Let's have at least 2 stack traces before a reboot takes place
USE_SQL=false          <-- We are not interested in any SQL 
CLUSTERPROCS="ocssd.bin"   <-- We are only interested in stack traces for ocssd.bin process

Note:  If the OS debugger suspends the ocssd.bin process for too long, *that* alone can cause a reboot of the machine.
       Instead of debugging a node eviction we would then have created an additional node eviction.
Before deploying the above configuration clusterwide, let's start locally and check how long it takes to get a stack
from the ocssd.bin process.

 

Starting procwatcher locally

# ./prw.sh start
Wed Oct 9 09:00:53 CEST 2013: Starting Procwatcher
Wed Oct 9 09:00:53 CEST 2013: Thank you for using Procwatcher. :-)
Wed Oct 9 09:00:53 CEST 2013: Please add a comment to Oracle Support Note 459694.1
Wed Oct 9 09:00:53 CEST 2013: if you have any comments, suggestions, or issues with this tool.
Procwatcher files will be written to: /u01/app/11204/grid/log/procwatcher
Wed Oct 9 09:00:53 CEST 2013: Started Procwatcher

 

Log File Locations

# pwd
/u01/app/11204/grid/log/procwatcher
# ls
PRW_CLUSTER  prw_grac43.log  prw.sh  PRW_SYS_grac43

Monitor Logfile
# ./prw.sh log runtime
Wed Oct 9 09:33:54 CEST 2013: Getting stack for ocssd.bin 3166 using gdb (24 threads) in /u01/app/11204/grid/log/procwatcher/PRW_CLUSTER/prw_ocssd.bin_3166_10-09-13.out
Wed Oct 9 09:33:55 CEST 2013: Stacks complete after 3 seconds (1 stacks - average seconds: 3)
Wed Oct 9 09:33:55 CEST 2013: Cycle complete after 3 seconds
Wed Oct 9 09:33:55 CEST 2013: Sleeping 12 seconds until time to run again per the INTERVAL setting (15 seconds)
--> We need only 3 seconds to take a stack, so the ocssd.bin process has 12 seconds to do its work.
    It looks safe to apply these settings clusterwide.
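
The same comparison can be read from the runtime log with a few lines of awk. This sketch relies on the message wording shown above ("Stacks complete after N seconds" and "per the INTERVAL setting (N seconds)"). The function name stack_headroom is hypothetical.

```shell
# stack_headroom: read the Procwatcher runtime log on stdin and print how long
# the last stack collection took compared with the configured INTERVAL.
stack_headroom() {
    awk '
        /Stacks complete after/ {
            for (i = 1; i < NF; i++) if ($i == "after") stack = $(i + 1)
        }
        /per the INTERVAL setting/ {
            v = $0
            sub(/.*INTERVAL setting \(/, "", v)
            sub(/ seconds\).*/, "", v)
            interval = v
        }
        END {
            if (stack == "" || interval == "") { print "incomplete log"; exit 1 }
            printf "stack=%d interval=%d headroom=%d\n", stack, interval, interval - stack
        }'
}

# Usage (commented out, needs a real log file):
#   stack_headroom < /u01/app/11204/grid/log/procwatcher/prw_grac43.log
```

Only the most recent cycle in the input is reported. A small or negative headroom means the debugger keeps ocssd.bin stopped for most of the interval.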
...

Check the Procwatcher process dump file
# more /u01/app/11204/grid/log/procwatcher/PRW_CLUSTER/prw_ocssd.bin_3166_10-09-13.out
################################################################################
Procwatcher Debugging for Process 3166 ocssd.bin
################################################################################
Wed Oct  9 09:13:21 CEST 2013
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S grid      3166     1  0 -40   - - 167108 futex_ Oct08 ?       00:07:34 /u01/app/11204/grid/bin/ocssd.bin 
Threads: 
F S UID        PID  PPID   LWP  C NLWP PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S grid      3166     1  3166  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:00 /u01/app/11204/grid/bin/ocssd.bin 
5 S grid      3166     1  3168  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:04 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3169  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:00 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3170  0   24 -40   - - 167108 poll_s Oct08 ?       00:00:00 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3171  0   24 -40   - - 167108 ep_pol Oct08 ?       00:01:34 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3183  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:00 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3201  0   24 -40   - - 167108 ep_pol Oct08 ?       00:02:40 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3202  0   24 -40   - - 167108 ep_pol Oct08 ?       00:00:06 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3846  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:17 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3847  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:03 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3848  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:21 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3850  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:16 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3851  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:02 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3852  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:21 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3854  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:17 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3855  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:02 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3856  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:21 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3857  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:06 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3858  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:00 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3859  0   24 -40   - - 167108 ep_pol Oct08 ?       00:00:10 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3864  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:05 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3865  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:19 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3866  0   24 -40   - - 167108 futex_ Oct08 ?       00:00:00 /u01/app/11204/grid/bin/ocssd.bin 
1 S grid      3166     1  3867  0   24 -40   - - 167108 ep_pol Oct08 ?       00:00:21 /u01/app/11204/grid/bin/ocssd.bin 
[Thread debugging using libthread_db enabled]
0x0000003ae180b7bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
Thread 24 (Thread 0x7f773382a700 (LWP 3168)):
#0  0x0000003ae180b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f7736041054 in sltspcwait () from /u01/app/11204/grid/lib/libclntsh.so.11.1
#2  0x00007f77379320e8 in clsd_logThread () from /u01/app/11204/grid/lib/libhasgen11.so
#3  0x0000003ae1807851 in start_thread () from /lib64/libpthread.so.0
#4  0x0000003ae10e894d in clone () from /lib64/libc.so.6
Thread 23 (Thread 0x7f7733473700 (LWP 3169)):
#0  0x0000003ae180b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f7736041054 in sltspcwait () from /u01/app/11204/grid/lib/libclntsh.so.11.1
#2  0x00007f77378cb44c in clsc_cvwait () from /u01/app/11204/grid/lib/libhasgen11.so
#3  0x00007f77378c29f7 in clsc_select_monitor () from /u01/app/11204/grid/lib/libhasgen11.so
#4  0x00007f77378b31ec in clscselect () from /u01/app/11204/grid/lib/libhasgen11.so
#5  0x00007f7737846301 in clsdms_thdmai () from /u01/app/11204/grid/lib/libhasgen11.so
#6  0x0000003ae1807851 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003ae10e894d in clone () from /lib64/libc.so.6
....
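
With 24 threads per dump, a one-line-per-thread summary makes it easier to spot which thread changed between two snapshots. The sketch below prints the LWP id and the innermost frame (#0) of every thread, based on the gdb output format shown above. The function name thread_top_frames is hypothetical.

```shell
# thread_top_frames: read a gdb thread dump on stdin and print
# "<LWP> <function of frame #0>" for each thread.
thread_top_frames() {
    awk '
        /^Thread [0-9]+ \(Thread / {
            lwp = $0
            sub(/.*\(LWP /, "", lwp)     # drop everything up to the LWP id
            sub(/\).*/, "", lwp)         # drop the closing brackets
            next
        }
        $1 == "#0" && lwp != "" {
            # "#0 0x... in func () from lib" or "#0 func () at file"
            print lwp, ($3 == "in" ? $4 : $2)
            lwp = ""
        }'
}

# Usage (commented out, needs a real dump file):
#   thread_top_frames < prw_ocssd.bin_3166_10-09-13.out | sort | uniq -c
```

The LWP ids match the LWP column of the "Threads:" listing at the top of the dump file.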

 

Deploy procwatcher as a cluster resource

If it is running, stop the local Procwatcher
# ./prw.sh stop

Deploy procwatcher 
# ./prw.sh deploy
Procwatcher already registered, deregistering
CRS-2679: Attempting to clean 'procwatcher' on 'grac42'
CRS-2679: Attempting to clean 'procwatcher' on 'grac41'
CRS-2679: Attempting to clean 'procwatcher' on 'grac43'
CRS-2680: Clean of 'procwatcher' on 'grac43' failed
CRS-2680: Clean of 'procwatcher' on 'grac42' failed
CRS-2681: Clean of 'procwatcher' on 'grac41' succeeded
CRS-4000: Command Stop failed, or completed with errors.
Registering clusterware resource
SETTING UP NODE grac41
SETTING UP NODE grac42
Copying Procwatcher to Node grac42
prw.sh                                                                                            100%  180KB 180.3KB/s   00:00    
SETTING UP NODE grac43
Copying Procwatcher to Node grac43
prw.sh                                                                                            100%  180KB 180.3KB/s   00:00    
CRS-2672: Attempting to start 'procwatcher' on 'grac42'
CRS-2672: Attempting to start 'procwatcher' on 'grac41'
CRS-2672: Attempting to start 'procwatcher' on 'grac43'
CRS-2676: Start of 'procwatcher' on 'grac41' succeeded
CRS-2676: Start of 'procwatcher' on 'grac43' succeeded
CRS-2676: Start of 'procwatcher' on 'grac42' succeeded
PROCWATCHER DEPLOYED

 

Display Procwatcher status and related OS processes

# ./prw.sh status all
Wed Oct 9 08:37:06 CEST 2013: PROCWATCHER VERSION: 11.2.12.12.2
Wed Oct 9 08:37:06 CEST 2013: ### Parameters ###
Wed Oct 9 08:37:06 CEST 2013: EXAMINE_CLUSTER=true
Wed Oct 9 08:37:06 CEST 2013: EXAMINE_BG=false
Wed Oct 9 08:37:06 CEST 2013: PRWPERM=777
Wed Oct 9 08:37:06 CEST 2013: RETENTION=7
Wed Oct 9 08:37:06 CEST 2013: WARNINGEMAIL=
Wed Oct 9 08:37:06 CEST 2013: INTERVAL=15
Wed Oct 9 08:37:06 CEST 2013: THROTTLE=5
Wed Oct 9 08:37:06 CEST 2013: IDLECPU=3
Wed Oct 9 08:37:06 CEST 2013: SIDLIST=
Wed Oct 9 08:37:06 CEST 2013: ### Advanced Parameters (non-default) ###
Wed Oct 9 08:37:06 CEST 2013: USE_SQL=false
Wed Oct 9 08:37:06 CEST 2013: CLUSTERPROCS=ocssd.bin
Wed Oct 9 08:37:06 CEST 2013: ### End Parameters ###
Wed Oct 9 08:37:06 CEST 2013: Procwatcher is currently running on local node grac41
Wed Oct 9 08:37:06 CEST 2013: Procwatcher files are be written to: /u01/app/11204/grid/log/procwatcher
Wed Oct 9 08:37:06 CEST 2013: There are 0 concurrent debug sessions running...
Wed Oct 9 08:37:06 CEST 2013: PROCWATCHER CLUSTERWARE STATUS:
NAME=procwatcher
TYPE=application
TARGET=ONLINE          , ONLINE          , ONLINE
STATE=ONLINE on grac42, ONLINE on grac43, ONLINE on grac41
# ps -elf | grep  prw.sh
4 S root     27380     1  0  80   0 - 26986 poll_s 12:15 ?        00:00:00 ksh /u01/app/11204/grid/log/procwatcher/prw.sh run
4 S root     27650     1  0  80   0 - 26938 poll_s 12:15 ?        00:00:00 ksh /u01/app/11204/grid/log/procwatcher/prw.sh housekeeper 7
0 D root     32035 27380  0  80   0 - 26763 fork   12:18 ?        00:00:00 ksh /u01/app/11204/grid/log/procwatcher/prw.sh gdbrun /usr/bin/gdb Linux 3877 ocssd.bin ocssd.bin /u01/app/11204/grid/log/procwatcher/PRW_CLUSTER/prw_ocssd.bin_3877_02-22-14 OH_not_set /u01/app/11204/grid/bin
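
On a larger cluster the STATE line of the status output is easier to check with a script. The sketch below prints every entry that is not ONLINE. It assumes the comma-separated "ONLINE on <node>" format shown above, and that a stopped resource shows up without a node name (for example a bare OFFLINE), which is an assumption and not taken from the output in this post. The function name offline_nodes is hypothetical.

```shell
# offline_nodes: read "prw.sh status all" output on stdin and print
# "<node> <state>" for every STATE entry that is not ONLINE.
offline_nodes() {
    awk '/^STATE=/ {
        line = $0
        sub(/^STATE=/, "", line)
        n = split(line, part, /, */)
        for (i = 1; i <= n; i++) {
            split(part[i], w, " ")       # e.g. "ONLINE on grac42"
            if (w[1] != "ONLINE")
                print (w[3] != "" ? w[3] : "unknown") " " w[1]
        }
    }'
}

# Usage (commented out, needs a deployed Procwatcher):
#   ./prw.sh status all | offline_nodes
```

No output means that every entry in the STATE line is ONLINE.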

 

Stop Procwatcher when running as a cluster resource

 
# ./prw.sh stop all
CRS-2673: Attempting to stop 'procwatcher' on 'grac43'
CRS-2673: Attempting to stop 'procwatcher' on 'grac41'
CRS-2673: Attempting to stop 'procwatcher' on 'grac42'
CRS-2677: Stop of 'procwatcher' on 'grac42' succeeded
CRS-2677: Stop of 'procwatcher' on 'grac43' succeeded
CRS-2677: Stop of 'procwatcher' on 'grac41' succeeded

Start Procwatcher as a cluster resource

# ./prw.sh start all
CRS-2672: Attempting to start 'procwatcher' on 'grac43'
CRS-2672: Attempting to start 'procwatcher' on 'grac41'
CRS-2672: Attempting to start 'procwatcher' on 'grac42'
CRS-2676: Start of 'procwatcher' on 'grac42' succeeded
CRS-2676: Start of 'procwatcher' on 'grac43' succeeded
CRS-2676: Start of 'procwatcher' on 'grac41' succeeded

Deregister Procwatcher from Clusterware and remove Cluster resource

# ./prw.sh deinstall
CRS-2673: Attempting to stop 'procwatcher' on 'grac42'
CRS-2673: Attempting to stop 'procwatcher' on 'grac43'
CRS-2673: Attempting to stop 'procwatcher' on 'grac41'
CRS-2677: Stop of 'procwatcher' on 'grac42' succeeded
CRS-2677: Stop of 'procwatcher' on 'grac43' succeeded
CRS-2677: Stop of 'procwatcher' on 'grac41' succeeded
De-registering procwatcher resource
DECONFIGURING NODE grac41
DECONFIGURING NODE grac42
DECONFIGURING NODE grac43
Removing /u01/app/11204/grid/log/procwatcher directory
Procwatcher Deinstalled

 

Reference

  • Procwatcher (prw_12.1.13.11.0.zip) – Document 459694.1
  • How To Troubleshoot Database Contention With Procwatcher (Doc ID 1352623.1)

3 thoughts on “Procwatcher Setup and Usage”

  1. I wish to try this – but where can I download the latest version?? Or the zip file listed in your instructions – prw_11.2.12.12.2.zip

    1. Download TFA ( Trace File Analyzer ) from Document 1513912.1
      Oracle Trace File Analyzer with database support tools bundle includes
      tools like procwatcher and more
