Cleanup / Removing a crashed node from OCR – RAC 10.2.0.1

Cleanup OCR

Backup your OCR
[root@ract1 ~]#   $ORA_CRS_HOME/bin/ocrconfig -export ocr_before_node_removal.exp
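
A dated file name keeps repeated exports from overwriting each other. The sketch below is only an illustration: the helper name ocr_export_name and the timestamp layout are assumptions, not part of the original procedure.

```shell
# Hypothetical helper: build a timestamped OCR export file name.
# Usage: ocr_export_name [prefix]
ocr_export_name() {
    prefix=${1:-ocr_before_node_removal}
    printf '%s_%s.exp\n' "$prefix" "$(date +%Y%m%d_%H%M%S)"
}

# Example (run as root on a surviving node):
#   $ORA_CRS_HOME/bin/ocrconfig -export "$(ocr_export_name)"
```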

Verify and drop the failing instance ract3
--> Note: ract3 is no longer available
Check current CRS status 
Name           Type           Target    State     Host
------------------------------------------------------------
ora....T3.inst application    ONLINE    OFFLINE               
ora....SM3.asm application    ONLINE    OFFLINE               
ora....T3.lsnr application    ONLINE    OFFLINE               
ora.ract3.gsd  application    ONLINE    OFFLINE               
ora.ract3.ons  application    ONLINE    OFFLINE               
ora.ract3.vip  application    ONLINE    ONLINE    ract1  
--> After crash only ora.ract3.vip  failed over to ract1 - other resources are OFFLINE

Stop all resources running on ract3
[root@ract1 ~]#  crs_stat -t | egrep 'Name|T3|t3|M3'
Name           Type           Target    State     Host        
ora....T3.inst application    ONLINE   OFFLINE               
ora....SM3.asm application    ONLINE   OFFLINE               
ora....T3.lsnr application    ONLINE   OFFLINE               
ora.ract3.gsd  application    ONLINE   OFFLINE               
ora.ract3.ons  application    ONLINE   OFFLINE               
ora.ract3.vip  application    ONLINE   ONLINE    ract1   
[root@ract1 ~]# crs_stop ora.RACT.RACT3.inst
Target set to OFFLINE for `ora.RACT.RACT3.inst`
[root@ract1 ~]# crs_stop ora.ract3.ASM3.asm
Target set to OFFLINE for `ora.ract3.ASM3.asm`
[root@ract1 ~]#  crs_stop ora.ract3.LISTENER_RACT3.lsnr
Target set to OFFLINE for `ora.ract3.LISTENER_RACT3.lsnr`

[root@ract1 ~]#  crs_stop ora.ract3.gsd
Target set to OFFLINE for `ora.ract3.gsd`

[root@ract1 ~]# crs_stop ora.ract3.ons
Target set to OFFLINE for `ora.ract3.ons`

[root@ract1 ~]# crs_stop ora.ract3.vip
Attempting to stop `ora.ract3.vip` on member `ract1`
Stop of `ora.ract3.vip` on member `ract1` succeeded.

[root@ract1 ~]# crs_stat -t | egrep 'Name|T3|t3|M3'
Name           Type           Target    State     Host        
ora....T3.inst application    OFFLINE   OFFLINE               
ora....SM3.asm application    OFFLINE   OFFLINE               
ora....T3.lsnr application    OFFLINE   OFFLINE               
ora.ract3.gsd  application    OFFLINE   OFFLINE               
ora.ract3.ons  application    OFFLINE   OFFLINE               
ora.ract3.vip  application    OFFLINE   OFFLINE      
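
The egrep filter above works on the abbreviated names of crs_stat -t. To get the full resource names for crs_stop, the plain crs_stat output can be filtered instead. This is a minimal sketch: resources_for_node is a hypothetical helper, and it assumes the NAME= line layout that plain crs_stat prints.

```shell
# Hypothetical helper: read plain `crs_stat` output on stdin and print the
# names of all resources whose name contains the node name (case-insensitive).
# Usage: crs_stat | resources_for_node ract3
resources_for_node() {
    awk -v node="$1" '
        BEGIN { node = tolower(node) }
        /^NAME=/ {
            name = substr($0, 6)          # strip the leading "NAME="
            if (index(tolower(name), node) > 0) print name
        }'
}
```

The match is a simple substring test, so ract3 would also match a node called ract30. Review the list before passing each name to crs_stop.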

Remove RDBMS instance
[oracle@ract1 ~]$  srvctl remove instance -d ract -i RACT3
Remove instance RACT3 from the database ract? (y/[n]) y

Remove ASM instance 
$ srvctl remove asm -f -n ract3

Remove RAC listener
The only way to remove the listener resource is the 'crs_unregister' command.
Use this command only in this particular scenario:
[root@ract1 ~]# crs_stat | grep lsnr
NAME=ora.ract1.LISTENER_RACT1.lsnr
NAME=ora.ract2.LISTENER_RACT2.lsnr
NAME=ora.ract3.LISTENER_RACT3.lsnr
[root@ract1 ~]# crs_unregister ora.ract3.LISTENER_RACT3.lsnr

Remove remaining nodeapps 
[oracle@ract1 ~]$  crs_stat -t | egrep 'Name|T3|t3|M3'
Name           Type           Target    State     Host        
ora.ract3.gsd  application    OFFLINE   OFFLINE               
ora.ract3.ons  application    OFFLINE   OFFLINE               
ora.ract3.vip  application    OFFLINE   OFFLINE 

[root@ract1 ~]# srvctl  remove  nodeapps -n ract3 
Please confirm that you intend to remove the node-level applications on node ract3 (y/[n]) y

Verify that all resources linked to ract3 are removed
[root@ract1 ~]#   crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....T1.inst application    ONLINE    ONLINE    ract1       
ora....T2.inst application    ONLINE    ONLINE    ract2       
ora.RACT.db    application    ONLINE    ONLINE    ract2       
ora....SM1.asm application    ONLINE    ONLINE    ract1       
ora....T1.lsnr application    ONLINE    ONLINE    ract1       
ora.ract1.gsd  application    ONLINE    ONLINE    ract1       
ora.ract1.ons  application    ONLINE    ONLINE    ract1       
ora.ract1.vip  application    ONLINE    ONLINE    ract1       
ora....SM2.asm application    ONLINE    ONLINE    ract2       
ora....T2.lsnr application    ONLINE    ONLINE    ract2       
ora.ract2.gsd  application    ONLINE    ONLINE    ract2       
ora.ract2.ons  application    ONLINE    ONLINE    ract2       
ora.ract2.vip  application    ONLINE    ONLINE    ract2 


Remove node by running ./rootdeletenode.sh
[root@ract1 ~]# olsnodes -n
ract1   1
ract2   2
ract3   3

[root@ract1 ~]# cd $ORA_CRS_HOME/install
[root@ract1 install]#  ./rootdeletenode.sh ract3,3
CRS-0210: Could not find resource 'ora.ract3.LISTENER_RACT3.lsnr'.
CRS-0210: Could not find resource 'ora.ract3.ons'.
CRS-0210: Could not find resource 'ora.ract3.vip'.
CRS-0210: Could not find resource 'ora.ract3.gsd'.
CRS-0210: Could not find resource ora.ract3.vip.
CRS nodeapps are deleted successfully
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
Successfully deleted 14 values from OCR.
Key SYSTEM.css.interfaces.noderact3 marked for deletion is not there. Ignoring.
Successfully deleted 5 keys from OCR.
Node deletion operation successful.
'ract3,3' deleted successfully

[root@ract1 install]#  olsnodes -n
ract1   1
ract2   2      
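
The "ract3,3" argument passed to rootdeletenode.sh is the node name and node number as reported by olsnodes -n. A small sketch of deriving it, where node_arg is a hypothetical helper name:

```shell
# Hypothetical helper: read `olsnodes -n` output on stdin and print the
# "name,number" pair for one node. Returns non-zero if the node is not listed.
# Usage: olsnodes -n | node_arg ract3
node_arg() {
    awk -v node="$1" '
        $1 == node { print $1 "," $2; found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

This has to be run before rootdeletenode.sh, while the node is still listed.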

Cleanup CRS and Rdbms Inventory   
$ cd $ORA_CRS_HOME/oui/bin
$ ./runInstaller
  Installed products
   OraCrs10g_home
    Cluster Nodes
     ract1
     ract2
     ract3
   OraDb10g_home1
    Cluster Nodes
     ract1
     ract2
     ract3
[oracle@ract1 bin]$  cd $ORA_CRS_HOME/oui/bin
[oracle@ract1 bin]$  ./runInstaller -updateNodeList ORACLE_HOME=$ORA_CRS_HOME "CLUSTER_NODES={ract1,ract2}" CRS=true
Starting Oracle Universal Installer...
No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
'UpdateNodeList' was successful.

[oracle@ract1 bin]$  cd $ORACLE_HOME/oui/bin
[oracle@ract1 bin]$  $ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME  "CLUSTER_NODES={ract1,ract2}"
Starting Oracle Universal Installer...
No pre-requisite checks found in oraparam.ini, no system pre-requisite checks will be executed.
'UpdateNodeList' was successful.

Verify Cluster Nodes
$ ./runInstaller
  Installed products
   OraCrs10g_home
    Cluster Nodes
     ract1
     ract2
   OraDb10g_home1
    Cluster Nodes
     ract1
     ract2
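
The CLUSTER_NODES argument used with -updateNodeList can be assembled from olsnodes output instead of being typed by hand. A minimal sketch, assuming one node name per line in the first column; cluster_nodes_arg is a hypothetical helper name:

```shell
# Hypothetical helper: read `olsnodes` output on stdin and print the
# CLUSTER_NODES argument for runInstaller -updateNodeList, leaving out one node.
# Usage: olsnodes | cluster_nodes_arg ract3
cluster_nodes_arg() {
    awk -v drop="$1" '
        NF && $1 != drop { list = list (list == "" ? "" : ",") $1 }
        END { print "CLUSTER_NODES={" list "}" }'
}
```

Only the first column is read, so the output of olsnodes -n works as input too.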

Cleanup crashed Node ( ract3 ) after reboot

Be careful when deleting files - depending on your setup you may damage a second database installation
# To remove RDBMS and CRS software run ( both crs and rdbms are installed under  /u01/app/oracle/ )
rm -rf  /u01/app/oracle/*   <-- dangerous !!
#
rm /etc/oracle/*
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab
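
To confirm that the rm commands above left nothing behind, the same paths can be checked afterwards. The sketch below is an illustration only: leftover_crs_files is a hypothetical helper, and the optional root prefix is an assumption added so the check can be tried against a scratch directory.

```shell
# Hypothetical helper: list which of the CRS files named above still exist.
# The optional argument is a root prefix (default: empty = the live system).
leftover_crs_files() {
    root=${1:-}
    for f in /etc/init.d/init.cssd /etc/init.d/init.crs \
             /etc/init.d/init.crsd /etc/init.d/init.evmd \
             /etc/rc2.d/K96init.crs /etc/rc2.d/S96init.crs \
             /etc/rc3.d/K96init.crs /etc/rc3.d/S96init.crs \
             /etc/rc5.d/K96init.crs /etc/rc5.d/S96init.crs \
             /etc/oracle/scls_scr /etc/inittab.crs
    do
        # -L also catches dangling symlinks, which -e reports as missing
        if [ -e "$root$f" ] || [ -L "$root$f" ]; then
            echo "$root$f"
        fi
    done
}
```

An empty result means none of the listed files is left.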

--> Reboot the cleaned-up node again and check that no cluster process has survived the cleanup!

Verify that no CRS process survived the cleanup
# ps -elf |grep d.bin
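
The grep d.bin pattern is loose and can match unrelated processes. A slightly stricter sketch follows, assuming the 10g daemon names crsd.bin, ocssd.bin and evmd.bin; surviving_crs_procs is a hypothetical helper name.

```shell
# Hypothetical helper: read `ps -elf` (or `ps -ef`) output on stdin and print
# only lines that mention one of the Clusterware daemons.
# Returns 0 if at least one such line was found, 1 if none.
# Usage: ps -elf | surviving_crs_procs
surviving_crs_procs() {
    grep -E '(crsd|ocssd|evmd)\.bin'
}
```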

Error: srvctl remove instance fails IF INSTANCE: TARGET=ONLINE AND STATE=OFFLINE

[root@ract1 ~]# srvctl remove instance -d ract -i RACT3
[root@ract1 ~]#  crs_stat  -t | egrep 'Name|t3|T3'
Name           Type           Target    State     Host        
ora....T3.inst application    ONLINE    OFFLINE       
--> Target for Instance RACT3 still ONLINE !!
See BUG 4423294: DELETE INSTANCE THROUGH SRVM FAILS IF INSTANCE: TARGET=ONLINE AND STATE=OFFLINE
--> crs_stop should fix this!

[root@ract1 ~]# crs_stop ora.RACT.RACT3.inst
Target set to OFFLINE for `ora.RACT.RACT3.inst`
[root@ract1 ~]#  crs_stat -t | egrep 'Name|t3|T3'
Name           Type           Target    State     Host        
ora....T3.inst application    OFFLINE   OFFLINE     

Now srvctl remove instance should work
[root@ract1 ~]# srvctl remove instance -d ract -i RACT3
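
The TARGET=ONLINE / STATE=OFFLINE combination that triggers this error can be found in advance from plain crs_stat output. A minimal sketch, assuming each resource is printed as NAME=, TYPE=, TARGET= and STATE= lines in that order; online_target_but_offline is a hypothetical helper name.

```shell
# Hypothetical helper: read plain `crs_stat` output on stdin and print the
# names of resources with TARGET=ONLINE but STATE=OFFLINE.
# Usage: crs_stat | online_target_but_offline
online_target_but_offline() {
    awk '
        /^NAME=/   { name = substr($0, 6); target = "" }
        /^TARGET=/ { target = substr($0, 8) }
        /^STATE=/  { state = substr($0, 7)
                     if (target ~ /^ONLINE/ && state ~ /^OFFLINE/) print name }'
}
```

Each name printed is a candidate for crs_stop before running srvctl remove.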

 

Reference

Cleanup root.sh configuration after a failed CRS 10g, 11.1 installation

Method 1: Using rootdelete.sh and rootdeinstall.sh on all RAC nodes

# cd /u01/app/oracle/product/crs/install
# ./rootdelete.sh 
# ./rootdeinstall.sh
#   rm -rf /var/tmp/.oracle

Now rerun root.sh 
on ract1
# /u01/app/oracle/product/crs/root.sh
Later run on ract2
# /u01/app/oracle/product/crs/root.sh

Method 2: Manual deletion

1) Try to stop nodeapps on all cluster nodes
[root@ract1 install]#  srvctl stop nodeapps -n ract1
PRKH-1010 : Unable to communicate with CRS services.
  [OCR Error(Native: prsr_initCLSS:[21])]

[root@ract2 ~]#   srvctl stop nodeapps -n ract2
PRKH-1010 : Unable to communicate with CRS services.
  [OCR Error(Native: prsr_initCLSS:[21])]

2) Cleanup files on all cluster nodes
rm /etc/oracle/*
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab

3) Check CRS processes and kill them on all cluster nodes
[root@ract2 rules.d]# ps -ef | egrep  'css|crs|evm'
root      4314     1  0 07:13 ?        00:00:00 /bin/sh /etc/init.d/init.evmd run
root      4315     1  0 07:13 ?        00:00:00 /bin/sh /etc/init.d/init.cssd fatal
root      4316     1  0 07:13 ?        00:00:00 /bin/sh /etc/init.d/init.crsd run
root      4343  4315  0 07:13 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root      4365  4314  0 07:13 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root      4516  4316  0 07:13 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root     16626  9372  0 08:57 pts/1    00:00:00 egrep css|crs|evm
[root@ract2 rules.d]# kill -9 4314 4315 4316 4343 4365 4516
[root@ract2 rules.d]# ps -ef | egrep  'css|crs|evm'
root     18063  9372  0 09:10 pts/1    00:00:00 egrep css|crs|evm
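
The PIDs passed to kill -9 above were copied by hand from the ps listing. They can also be extracted, as in this sketch; crs_init_pids is a hypothetical helper name, and it only looks for the /etc/init.d/init.cssd, init.crsd and init.evmd scripts shown in the listing.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PIDs
# (second column) of the init.cssd / init.crsd / init.evmd wrapper scripts.
# Usage: ps -ef | crs_init_pids
crs_init_pids() {
    awk '/\/etc\/init\.d\/init\.(cssd|crsd|evmd)/ { print $2 }'
}
```

Because the pattern requires the /etc/init.d/ path, the egrep line from the listing is not picked up. Check the printed PIDs before handing them to kill.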

4) Cleanup /var/tmp/.oracle/ and  /tmp/.oracle/ on all cluster nodes
If no other Oracle software is running (listeners, databases, etc.), you can remove the files in
   /var/tmp/.oracle and /tmp/.oracle (whichever exists on your platform).
[root@ract2 ~]# rm -f /var/tmp/.oracle/*
[root@ract2 ~]# rm -f /tmp/.oracle/*

5) Remove ocr.loc on all cluster nodes. It is usually found in /etc/oracle.
 --> should already have been removed in step 2

6)  Clean out the OCR and Voting Files with dd commands - only on a single node
[root@ract2 ~]# /bin/dd if=/dev/zero skip=25 bs=4k count=2560 of=/dev/raw/raw1
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 0.248064 seconds, 42.3 MB/s
[root@ract2 ~]#  /bin/dd if=/dev/zero skip=25 bs=4k count=2560 of=/dev/raw/raw2
[root@ract2 ~]#  /bin/dd if=/dev/zero skip=25 bs=4k count=2560 of=/dev/raw/raw3
[root@ract2 ~]#  /bin/dd if=/dev/zero skip=25 bs=4k count=2560 of=/dev/raw/raw4
[root@ract2 ~]#  /bin/dd if=/dev/zero skip=25 bs=4k count=2560 of=/dev/raw/raw5

Reference

  • How to Proceed From a Failed 10g or 11.1 Oracle Clusterware (CRS) Installation (Doc ID 239998.1)

Cleanup steps for failed/missing/deleted CRS installation ( 11.2.0.3)

Cleanup steps for cluster node removal in the following scenarios:

  • failed CRS install on a particular node
  • cluster node was deleted and can’t be rebooted
  • RAC/Clusterware installation files were deleted at OS level from a cluster node by running any or all of the following rm commands
rm -rf /u01/app/oraInventory/*
rm -rf /u01/app/11203/grid/*
rm -rf /u01/app/grid/*
rm -rf /u01/app/oracle/product/11203/racdb/* 
rm -rf /etc/oracle/* 
rm /etc/oraInst.loc 
rm /etc/oratab 
rm /var/tmp/.oracle/*

Cleanup steps to remove node grac3 from a three-node cluster

First delete the related cluster VIP: ora.grac3.vip
# $GRID_HOME/bin/crsctl stop resource ora.grac3.vip 
# $GRID_HOME/bin/crsctl delete resource ora.grac3.vip 

Update the installer repository ( remove grac3 node )
$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/11203/grid "CLUSTER_NODES={grac1,grac2}" CRS=TRUE

Verify current cluster nodes and delete the related cluster node
$ olsnodes
grac1
grac2
grac3
Drop grac3 from the cluster 
#  $GRID_HOME/bin/crsctl delete node -n grac3
CRS-4661: Node grac3 successfully deleted.
#  $GRID_HOME/bin/olsnodes -t -s
grac1    Active    Unpinned
grac2    Active    Unpinned
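
The visual check of the olsnodes output can be turned into a pass/fail test. A minimal sketch, where node_absent is a hypothetical helper name and the node name is expected in the first column:

```shell
# Hypothetical helper: succeed only if the node is absent from `olsnodes`
# output read on stdin.
# Usage: $GRID_HOME/bin/olsnodes -t -s | node_absent grac3 && echo "grac3 is gone"
node_absent() {
    ! awk -v node="$1" '
        $1 == node { found = 1 }
        END { exit (found ? 0 : 1) }'
}
```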

Verify node deletion with cluvfy 
$  cluvfy stage -post nodedel -n grac3
Performing post-checks for node removal 
Checking CRS integrity...
Clusterware version consistency passed
CRS integrity check passed
Node removal check passed
Post-check for node removal was successful.

 

Reference

  • How to Proceed from Failed 11gR2 Grid Infrastructure (CRS) Installation (Doc ID 942166.1)

Cleanup / Delete CRS manually

 

Detach the CRS ORACLE_HOME from Installer repository
$ ./runInstaller  -detachHome ORACLE_HOME="/u01/app/11203/grid"
Starting Oracle Universal Installer...
Checking swap space: must be greater than 500 MB.   Actual 6151 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory

Verify 
$ $ORACLE_HOME/OPatch/opatch lsinventory -all   
Oracle Interim Patch Installer version 11.2.0.3.4
Copyright (c) 2012, Oracle Corporation.  All rights reserved.
Oracle Home       : /u01/app/11203/grid
Central Inventory : /u01/app/oraInventory
   from           : /u01/app/11203/grid/oraInst.loc
OPatch version    : 11.2.0.3.4
OUI version       : 11.2.0.3.0
Log file location : /u01/app/11203/grid/cfgtoollogs/opatch/opatch2013-08-13_13-00-03PM_1.log
List of Homes on this system:
Inventory load failed... OPatch cannot load inventory for the given Oracle Home.
Possible causes are:
   Oracle Home dir. path does not exist in Central Inventory
   Oracle Home is a symbolic link
   Oracle Home inventory is corrupted
LsInventorySession failed: OracleHomeInventory gets null oracleHomeInfo
OPatch failed with error code 73

Delete unused entries from /etc/oratab and remove ORACLE_HOME files at OS level
#  rm -rf /u01/app/11203/grid
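
Removing the unused /etc/oratab entries can be done with a filter rather than by hand. The sketch below assumes the usual SID:ORACLE_HOME:flag layout of oratab; oratab_without_home is a hypothetical helper name.

```shell
# Hypothetical helper: read oratab content on stdin and print it without the
# entries whose ORACLE_HOME field (second ":"-separated field) equals $1.
# Comment lines and blank lines are passed through unchanged.
# Usage: oratab_without_home /u01/app/11203/grid < /etc/oratab > /tmp/oratab.new
oratab_without_home() {
    awk -F: -v home="$1" '/^#/ || $2 != home'
}
```

Writing to a new file first leaves the original /etc/oratab untouched until the result has been reviewed.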

Reference:
How to Deinstall Oracle Clusterware Home Manually (Doc ID 1364419.1)