Recovering OCR DG when missing a single disk

OCR disk group content

  • Storage for ASM SPFile
  • Storage for Voting Disk
  • Storage for OCR repository

Tested with GRID 11.2.0.4

Setup test case

Destroy the ASM disk header of one disk in our OCR disk group
[root@grac41 Desktop]# dd if=/dev/zero  of=/dev/asm_ocr_11204_2G_disk3 bs=1024 count=1024
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.00483026 s, 217 MB/s
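
Because the dd above is destructive, it can help to keep a copy of the same 1 MB region before zeroing it. The following is a minimal sketch; backup_header is a hypothetical helper name (not an Oracle tool) and the backup path is an assumption.

```shell
#!/bin/bash
# Sketch only: backup_header is a hypothetical helper, not part of Grid Infrastructure.
# It copies the first 1 MB of a device (the region zeroed in the test case) to a file.
backup_header() {
    local dev="$1" out="$2"
    # Same geometry as the destructive dd: 1024 blocks of 1024 bytes.
    dd if="$dev" of="$out" bs=1024 count=1024 2>/dev/null
}

# Usage (as root, before destroying the header; the /tmp path is an assumption):
#   backup_header /dev/asm_ocr_11204_2G_disk3 /tmp/disk3_header.bak
```

Writing the copy back with dd should return the disk to its earlier state only if the disk group has not changed in the meantime, so treat it as a convenience for test systems.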

Start CW and verify its status
[root@grac41 Desktop]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Monitor CRS startup 
[grid@grac41 ~]$ watch crsi
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.asm                        ONLINE     INTERMEDIATE    grac41       OCR not started
ora.cluster_interconnect.haip  ONLINE     ONLINE          grac41         
ora.crf                        ONLINE     ONLINE          grac41         
ora.crsd                       ONLINE     OFFLINE                        
ora.cssd                       ONLINE     ONLINE          grac41         
ora.cssdmonitor                ONLINE     ONLINE          grac41         
ora.ctssd                      ONLINE     ONLINE          grac41       OBSERVER  
ora.diskmon                    OFFLINE    OFFLINE                        
ora.drivers.acfs               ONLINE     OFFLINE                        
ora.evmd                       ONLINE     INTERMEDIATE    grac41         
ora.gipcd                      ONLINE     ONLINE          grac41         
ora.gpnpd                      ONLINE     ONLINE          grac41         
ora.mdnsd                      ONLINE     ONLINE          grac41   

RAC Alertlog 
[/u01/app/11204/grid/bin/oraagent.bin(15374)]CRS-5019:All OCR locations are on ASM disk groups [OCR], and 
                                             none of these disksgroups are mounted. 
Details are at "(:CLSN00100:)" in "/u01/app/11204/grid/log/grac41/agent/ohasd/oraagent_grid/oraagent_grid.log".

Agent Log
2014-07-05 09:23:32.627: [ora.asm][1002428160]{0:0:2} [start] ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "0" is missing from group number "1"

Check trace for errors:
[grid@grac41 log]$ fn.sh ORA- | egrep 'TraceFile|2014-07-05 09:'
TraceFileName: ./grac41/agent/ohasd/oraagent_grid/oraagent_grid.l01
2014-07-05 09:23:32.627: [ora.asm][1002428160]{0:0:2} [start] ORA-15032: not all alterations performed
2014-07-05 09:23:32.640: [ora.asm][1002428160]{0:0:2} [start] ORA-15100: invalid or missing diskgroup name
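
The fn.sh helper used above is not shown in this post. A rough stand-in, assuming it simply searches the log tree for ORA- messages, could look like the sketch below; find_ora_errors is a hypothetical name.

```shell
#!/bin/bash
# Sketch only: find_ora_errors is a hypothetical stand-in for the author's fn.sh.
# $1 = log directory to search recursively, $2 = timestamp prefix to keep.
find_ora_errors() {
    local logdir="$1" stamp="$2"
    # -r recurses into the directory, -H always prints the file name;
    # the second grep keeps only lines containing the timestamp prefix.
    grep -rH 'ORA-' "$logdir" | grep -F "$stamp"
}

# Usage (run from the Grid log directory):
#   find_ora_errors . '2014-07-05 09:'
```

Continuation lines such as ORA-15040 and ORA-15042 carry no timestamp in the agent log, so a timestamp filter like this one will not show them.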

Check voting disks    
[grid@grac41 log]$  crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a4452602e3be4f42bf5467f41d96e46a (/dev/asm_ocr_11204_2G_disk1) [OCR]
 2. ONLINE   293e6baa9cbc4f90bf01f52b1b9019fa (/dev/asm_ocr_11204_2G_disk2) [OCR]
 3. OFFLINE  69e2423e2cff4f64bf16c1346a900803 () []
Located 3 voting disk(s).
--> Voting Disk 3 is OFFLINE
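
To spot an OFFLINE voting disk without reading the table by eye, the output can be filtered. A minimal sketch, assuming the column layout shown above; offline_votedisks is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: offline_votedisks is a hypothetical helper.
# Reads "crsctl query css votedisk" output on stdin and prints
# "<number> <File Universal Id>" for every voting disk whose STATE is OFFLINE.
offline_votedisks() {
    awk '$2 == "OFFLINE" { sub(/\.$/, "", $1); print $1, $3 }'
}

# Usage:
#   crsctl query css votedisk | offline_votedisks
```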

[root@grac41 ~]# ocrcheck -local
Status of Oracle Local Registry is as follows :
     Version                  :          3
     Total space (kbytes)     :     262120
     Used space (kbytes)      :       2676
     Available space (kbytes) :     259444
     ID                       : 1855884304
     Device/File Name         : /u01/app/11204/grid/cdata/grac41.olr
                                    Device/File integrity check succeeded
     Local registry integrity check succeeded
     Logical corruption check succeeded
--> OLR ok

[root@grac41 ~]# ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage
[root@grac41 ~]# ocrcheck -config
Oracle Cluster Registry configuration is :
     Device/File Name         :       +OCR
--> Local OLR ok - cluster OCR not accessible because the OCR disk group is not mounted

Find the name of the missing third ASM disk
SQL> select dg.name dg_name,  dg.state dg_state,  dg.type, d.DISK_NUMBER dsk_no, d.MOUNT_STATUS, d.HEADER_STATUS, d.MODE_STATUS,
  2      d.STATE, d. PATH, d.FAILGROUP  FROM V$ASM_DISK d,  v$asm_diskgroup dg
  3   where dg.group_number(+)=d.group_number order by dg_name, dsk_no;

DG_NAME    DG_STATE   TYPE    DSK_NO MOUNT_S HEADER_STATU MODE_ST STATE    PATH               FAILGROUP
---------- ---------- ------ ------- ------- ------------ ------- -------- ------------------------------ ---------------
OCR       DISMOUNTED           1 CLOSED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk2
OCR       DISMOUNTED           2 CLOSED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk1
OCR       DISMOUNTED           3 CLOSED  CANDIDATE   ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk3
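
The later steps call a SQL*Plus script as @dg, whose contents are not shown. Assuming it holds the query above, the sketch below writes such a script; write_dg_script is a hypothetical helper name and the linesize setting is an assumption.

```shell
#!/bin/bash
# Sketch only: write_dg_script is a hypothetical helper. It assumes that the
# @dg script used in this post contains the disk/disk group query shown above.
write_dg_script() {
    local target="${1:-dg.sql}"
    # Quoted heredoc delimiter so the shell does not expand v$asm_disk.
    cat > "$target" <<'EOF'
set linesize 200
select dg.name dg_name, dg.state dg_state, dg.type, d.disk_number dsk_no,
       d.mount_status, d.header_status, d.mode_status, d.state, d.path, d.failgroup
  from v$asm_disk d, v$asm_diskgroup dg
 where dg.group_number(+) = d.group_number
 order by dg_name, dsk_no;
EOF
}

# Usage:
#   write_dg_script dg.sql     # then run @dg inside SQL*Plus
```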

Check the ASM disk header with kfed
[root@grac41 Desktop]# kfed read /dev/asm_ocr_11204_2G_disk3
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7F4C5C91B400 00000000 00000000 00000000 00000000  [................]
  Repeat 255 times
KFED-00322: Invalid content encountered during block traversal: [kfbtTraverseBlock][Invalid OSM block type][][0]
--> ASM disk header erased 

Verify the disk headers of the remaining disks
[root@grac41 Desktop]#  kfed read /dev/asm_ocr_11204_2G_disk1   | egrep 'type|name'
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:                OCR_0003 ; 0x028: length=8
kfdhdb.grpname:                     OCR ; 0x048: length=3
kfdhdb.fgname:                 OCR_0003 ; 0x068: length=8
kfdhdb.capname:                         ; 0x088: length=0
[root@grac41 Desktop]#  kfed read /dev/asm_ocr_11204_2G_disk2 | egrep 'type|name'
kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD
kfdhdb.dskname:                OCR_0000 ; 0x028: length=8
kfdhdb.grpname:                     OCR ; 0x048: length=3
kfdhdb.fgname:                 OCR_0000 ; 0x068: length=8
kfdhdb.capname:                         ; 0x088: length=0
--> Disks asm_ocr_11204_2G_disk1 and asm_ocr_11204_2G_disk2 are ok !
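
The kfed checks above come down to the kfbh.type line: KFBTYP_DISKHEAD for a valid header, KFBTYP_INVALID for the erased one. A minimal sketch that extracts just that value; kfed_block_type is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: kfed_block_type is a hypothetical helper.
# Reads "kfed read <device>" output on stdin and prints the block type name
# taken from the kfbh.type line (last field of that line).
kfed_block_type() {
    awk '$1 == "kfbh.type:" { print $NF; exit }'
}

# Usage:
#   kfed read /dev/asm_ocr_11204_2G_disk3 | kfed_block_type
```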

Add the failed disk back to the OCR DG

Mount the OCR DG with the force option and re-add the wiped disk

SQL>  alter diskgroup OCR mount force;
Diskgroup altered.

SQL>  @dg
DG_NAME    DG_STATE   TYPE    DSK_NO MOUNT_S HEADER_STATU MODE_ST STATE    PATH                           FAILGROUP
---------- ---------- ------ ------- ------- ------------ ------- -------- ------------------------------ ---------------
OCR       MOUNTED    NORMAL       0 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk2      OCR_0000
OCR       MOUNTED    NORMAL       1 MISSING UNKNOWN     OFFLINE NORMAL                                    OCR_0001
OCR       MOUNTED    NORMAL       3 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk1      OCR_0003                   
                                  3 CLOSED  CANDIDATE   ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk3

SQL>  alter diskgroup OCR add disk '/dev/asm_ocr_11204_2G_disk3';
Diskgroup altered.
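
The two statements used so far (force mount, then add the disk) can be generated for any disk group and device path. This is a sketch under assumptions: readd_disk_sql is a hypothetical helper, and the WHENEVER SQLERROR line is an addition of this sketch so that SQL*Plus stops if the mount fails.

```shell
#!/bin/bash
# Sketch only: readd_disk_sql is a hypothetical helper. It prints the SQL used
# in this post for a given disk group ($1) and device path ($2); it does not run it.
readd_disk_sql() {
    local dg="$1" path="$2"
    # Added in this sketch: abort the SQL*Plus session on the first error.
    printf 'whenever sqlerror exit failure\n'
    printf 'alter diskgroup %s mount force;\n' "$dg"
    printf "alter diskgroup %s add disk '%s';\n" "$dg" "$path"
}

# Usage (as the grid owner, with the ASM instance running):
#   readd_disk_sql OCR /dev/asm_ocr_11204_2G_disk3 | sqlplus -S / as sysasm
```

Printing the statements first makes it easy to review them before piping them into SQL*Plus.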

SQL> @dg

DG_NAME    DG_STATE   TYPE    DSK_NO MOUNT_S HEADER_STATU MODE_ST STATE    PATH               FAILGROUP
---------- ---------- ------ ------- ------- ------------ ------- -------- ------------------------------ ---------------
OCR       MOUNTED    NORMAL       0 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk2      OCR_0000
OCR       MOUNTED    NORMAL       1 MISSING UNKNOWN     OFFLINE NORMAL                  OCR_0001
OCR       MOUNTED    NORMAL       2 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk3      OCR_0002
OCR       MOUNTED    NORMAL       3 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk1      OCR_0003

Test dismount/mount before CW restart
SQL>  alter diskgroup OCR dismount force;
Diskgroup altered.
SQL>  alter diskgroup OCR mount;
Diskgroup altered.
SQL> @dg
DG_NAME    DG_STATE   TYPE    DSK_NO MOUNT_S HEADER_STATU MODE_ST STATE    PATH               FAILGROUP
---------- ---------- ------ ------- ------- ------------ ------- -------- ------------------------------ ---------------
OCR       MOUNTED    NORMAL       0 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk2      OCR_0000
OCR       MOUNTED    NORMAL       1 MISSING UNKNOWN      OFFLINE FORCING                  OCR_0001
OCR       MOUNTED    NORMAL       2 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk3      OCR_0002
OCR       MOUNTED    NORMAL       3 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk1      OCR_0003

Restart CW and verify voting disk status
- Note: the disk in FORCING state will be cleaned up
- The missing voting disk will be added automatically

DG_NAME    DG_STATE   TYPE    DSK_NO MOUNT_S HEADER_STATU MODE_ST STATE    PATH               FAILGROUP
---------- ---------- ------ ------- ------- ------------ ------- -------- ------------------------------ ---------------
OCR       MOUNTED    NORMAL       0 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk2      OCR_0000
OCR       MOUNTED    NORMAL       2 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk3      OCR_0002
OCR       MOUNTED    NORMAL       3 CACHED  MEMBER      ONLINE  NORMAL   /dev/asm_ocr_11204_2G_disk1      OCR_0003
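
Whether the cleanup has finished can be read from the @dg output: no row should show MISSING or FORCING any more. A minimal sketch that counts such rows; disks_not_clean is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: disks_not_clean is a hypothetical helper.
# Reads the @dg output on stdin and prints the number of rows that contain
# the word MISSING or FORCING in any column (each row is counted once).
disks_not_clean() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "MISSING" || $i == "FORCING") { n++; break }
    }
    END { print n + 0 }'
}

# Usage (feed it the saved output of @dg, for example from a spool file):
#   disks_not_clean < dg_output.txt
```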

[root@grac41 Desktop]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   a4452602e3be4f42bf5467f41d96e46a (/dev/asm_ocr_11204_2G_disk1) [OCR]
 2. ONLINE   293e6baa9cbc4f90bf01f52b1b9019fa (/dev/asm_ocr_11204_2G_disk2) [OCR]
 3. ONLINE   a0967baeb88b4f45bf3ae5da678fecc4 (/dev/asm_ocr_11204_2G_disk3) [OCR]
