Recover RAC database after loosing FRA diskgroup

FRA DG content

 

  • stroage for mulitplexed control file
  • storage for mulitplexed ONLINE REDO logs
  • storage for BCF file ( Block Change Tracking file )

Prepare test case

[grid@grac41 Desktop]$  asmcmd lsdg FRA
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576     40952    31513                0           31513              0             N  FRA/
[grid@grac41 Desktop]$ asmcmd lsdsk -k
Total_MB  Free_MB  OS_MB  Name      Failgroup  Failgroup_Type  Library  Label  UDID  Product  Redund   Path
   20473    15758  20473  FRA_0000  FRA_0000  REGULAR         System                         UNKNOWN  /dev/asmdisk_fra1
   20479    15755  20479  FRA_0001  FRA_0001  REGULAR         System                         UNKNOWN  /dev/asmdisk_fra2
--> FRA DG is using EXTERNAL redundany with 2 disks ( /dev/asmdisk_fra1, /dev/asmdisk_fra2 )

Find related device UUID
[root@grac41 ~]# ./check_uuid.sh
/dev/sda  WWID:   1ATA_VBOX_HARDDISK_VBcd7c99fa-dc59f9dd
..
/dev/sdj  WWID:   1ATA_VBOX_HARDDISK_VB1726c4c7-b3bcaccd
/dev/sdk  WWID:   1ATA_VBOX_HARDDISK_VB17a025ba-62aae810
/dev/sdl  WWID:   1ATA_VBOX_HARDDISK_VB0cba64ab-3d0e1451
[root@grac41 ~]#  scsi_id --whitelisted --replace-whitespace --device=/dev/asmdisk_fra2
1ATA_VBOX_HARDDISK_VB1726c4c7-b3bcaccd
--> /dev/sdj is the disk device for FRA partition /dev/asmdisk_fra2

Verify this by reading major/minior device numbers 
[root@grac41 Desktop]# ls -l /dev/sdj
brw-rw----. 1 root disk 8, 144 Jul 12 09:42 /dev/sdj
[root@grac41 Desktop]#  ls -l /dev/asmdisk_fra2
brw-rw----. 1 grid asmadmin 8, 145 Jul 12 09:50 /dev/asmdisk_fra2

Disable I/O to the 2.nd FRA disk  ( do this on all instances ) echo offline > /sys/block/sdj/device/state
--> Instance crash - Alert log
WARNING: Write Failed. group:2 disk:1 AU:4497 offset:49152 size:16384
Errors in file /u01/app/oracle/diag/rdbms/grac4/grac41/trace/grac41_ckpt_6406.trc:
ORA-15080: synchronous I/O operation to a disk failed
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 16384
WARNING: failed to write mirror side 1 of virtual extent 0 logical extent 0 of file 321 in group 2 on disk 1 allocation unit 4497 
Errors in file /u01/app/oracle/diag/rdbms/grac4/grac41/trace/grac41_ckpt_6406.trc:
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '+FRA/grac4/controlfile/current.321.852654927'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
Errors in file /u01/app/oracle/diag/rdbms/grac4/grac41/trace/grac41_ckpt_6406.trc:
ORA-00221: error on write to control file
ORA-00206: error in writing (block 3, # blocks 1) of control file
ORA-00202: control file: '+FRA/grac4/controlfile/current.321.852654927'
ORA-15081: failed to submit an I/O operation to a disk
ORA-15081: failed to submit an I/O operation to a disk
Sat Jul 12 09:52:20 2014
System state dump requested by (instance=1, osid=6406 (CKPT)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/oracle/diag/rdbms/grac4/grac41/trace/grac41_diag_6378_20140712095220.trc
CKPT (ospid: 6406): terminating the instance due to error 221

Note automatic Reboot failed with 
WARNING: Read Failed. group:2 disk:1 AU:4497 offset:16384 size:32768
WARNING: failed to read mirror side 1 of virtual extent 0 logical extent 0 of file 321 in group [2.3426318794] from disk FRA_0001  allocation unit 4497 reason error; if possible, will try another mirror side
NOTE: dependency between database grac4 and diskgroup resource ora.DATA.dg is established
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+FRA/grac4/controlfile/current.321.852654927'
ORA-15081: failed to submit an I/O operation to a disk
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 9209888
Additional information: -1
ORA-205 signalled during: ALTER DATABASE MOUNT /* db agent *//* {0:13:25} */...
NOTE: dependency between database grac4 and diskgroup resource ora.FRA.dg is established

Check clusterware status
[root@grac41 Desktop]# crs
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------       
ora.FRA.dg                    ONLINE     ONLINE          grac41        
ora.FRA.dg                    ONLINE     ONLINE          grac42        
ora.FRA.dg                    ONLINE     ONLINE          grac43        
..      
ora.grac4.db                   ONLINE     OFFLINE         grac41       Instance Shutdown
--> Note it may take some time until FRA DG is dropped 

[grid@grac41 ~]$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  1048576     40944    18711            10236            4237              0             N  DATA/
MOUNTED  EXTERN  N         512   4096  1048576     40952    31431                0           31431              0             N  FRA/
MOUNTED  NORMAL  N         512   4096  4194304      6132     4960             2044            1458              0             Y  OCR/

Connect to ASM instance and check for ASM diks after ASM has dismounted the FRA DG
SQL> select dg.name dg_name,  dg.state dg_state,  dg.type,d.name, d.DISK_NUMBER dsk_no, d.MOUNT_STATUS, d.HEADER_STATUS, d.MODE_STATUS,
        d.STATE, d. PATH, d.FAILGROUP  FROM V$ASM_DISK d,  v$asm_diskgroup dg
     where dg.group_number(+)=d.group_number order by dg_name, dsk_no;

DG_NAME    DG_STATE   TYPE   NAME     DSK_NO MOUNT_S HEADER_STATU MODE_ST STATE    PATH                 FAILGROUP
---------- ---------- ------ ---------- ------- ------- ------------ ------- -------- ------------------------------ ---------------
FRA       DISMOUNTED                  0 CLOSED    MEMBER       ONLINE  NORMAL   /dev/asmdisk_fra1
FRA       DISMOUNTED                  2 CLOSED    CANDIDATE    ONLINE  NORMAL   /dev/asmdisk_fra2

Verifying controlfile access :
[grid@grac41 ~]$ asmcmd ls +FRA/grac4/controlfile/current.321.852654927
ASMCMD-8002: entry 'current.321.852654927' does not exist in directory '+FRA/grac4/controlfile/'
[grid@grac41 ~]$ asmcmd ls
DATA/
OCR/
--> FRA top level directory was missing 

Recover database without FRA DG

Try to start the database manually

oracle@grac41 ~]$ sqlplus / as sysdba
Connected to an idle instance.

SQL> startup mount
ORACLE instance started.

Total System Global Area 1336176640 bytes
Fixed Size            2253024 bytes
Variable Size          469765920 bytes
Database Buffers      855638016 bytes
Redo Buffers            8519680 bytes
ORA-00205: error in identifying control file, check alert log for more info

SQL> show parameter control
NAME                           TYPE     VALUE
------------------------------ ----------- ------------------------------
control_files                  string     +DATA/grac4/controlfile/current.260.826111693, 
                                          +FRA/grac4/controlfile/current.321.8526549

As FRA DG isn't available anymore - remove the contolfile reference
SQL>  startup force nomount
SQL>  alter system set control_files='+DATA/grac4/controlfile/current.260.826111693' scope = spfile;

SQL> startup mount
ORA-01081: cannot start already-running ORACLE - shut it down first
SQL> startup force mount
ORA-01105: mount is incompatible with mounts by other instances
ORA-01104: number of control files (1) does not equal 2
--> As the remaining instance still have access to the FRA DG lets shutdown these instances
[oracle@grac41 trace]$  srvctl stop instance -d grac4 -i grac42
[oracle@grac41 trace]$  srvctl stop instance -d grac4 -i grac43

SQL> startup force mount 
SQL> show parameter control
NAME                     TYPE     VALUE
-------------------- ----------- ------------------------------
control_files         string     +DATA/grac4/controlfile/current.260.826111693

--> Mount status ok - let's try to open the database  
SQL> alter database open ;
alter database open
*
ERROR at line 1:
ORA-19751: could not create the change tracking file
ORA-19750: change tracking file: '+FRA/bct.dbf'
ORA-17502: ksfdcre:1 Failed to create file +FRA/bct.dbf
ORA-17501: logical block size 4294967295 is invalid
ORA-17503: ksfdopn:2 Failed to open file +FRA/bct.dbf
ORA-15001: diskgroup "FRA" does not exist or is not mounted
ORA-15001: diskgroup "FRA" does not exist or is not mounted

Disable  BLOCK CHANGE TRACKING and try to open the database 
SQL>  ALTER DATABASE DISABLE BLOCK CHANGE TRACKING; 
Database altered.
SQL>  alter database open;
 alter database open
*
ERROR at line 1:
ORA-16038: log 6 sequence# 21 cannot be archived
ORA-19504: failed to create file ""
ORA-00312: online log 6 thread 3: '+DATA/grac4/onlinelog/group_6.269.852652289'
ORA-00312: online log 6 thread 3: '+FRA/grac4/onlinelog/group_6.306.852652301'

Verify REDO logs:
   THREAD#     GROUP#  SEQUENCE# STATUS       MEMBER                         TYPE    IS_RDF
---------- ---------- ---------- ---------------- -------------------------------------------------- ------- ------
     1        1          41 INACTIVE      +FRA/grac4/onlinelog/group_1.285.852650241         ONLINE  YES
     1        1          41 INACTIVE      +DATA/grac4/onlinelog/group_1.274.852650227         ONLINE  NO
     1        2          42 CURRENT      +DATA/grac4/onlinelog/group_2.273.852651533         ONLINE  NO
     1        2          42 CURRENT      +FRA/grac4/onlinelog/group_2.298.852651537         ONLINE  YES
     2        3          20 INACTIVE      +DATA/grac4/onlinelog/group_3.272.852652849         ONLINE  NO
     2        3          20 INACTIVE      +FRA/grac4/onlinelog/group_3.318.852652859         ONLINE  YES
     2        4          19 INACTIVE      +DATA/grac4/onlinelog/group_4.266.852652635         ONLINE  NO
     2        4          19 INACTIVE      +FRA/grac4/onlinelog/group_4.294.852652647         ONLINE  YES
     3        5          22 INACTIVE      +FRA/grac4/onlinelog/group_5.305.852652263         ONLINE  YES
     3        5          22 INACTIVE      +DATA/grac4/onlinelog/group_5.270.852652251         ONLINE  NO
     3        6          21 INACTIVE      +FRA/grac4/onlinelog/group_6.306.852652301         ONLINE  YES
     3        6          21 INACTIVE      +DATA/grac4/onlinelog/group_6.269.852652289         ONLINE  NO
---> ONLINE REDO logs still reference +FRA DG

Disable FRA and open database 
SQL>  ALTER SYSTEM SET DB_RECOVERY_FILE_DEST=''  SCOPE=BOTH  SID='*';
System altered.

SQL> alter database open;
Database altered.
Verify REDO logs
   THREAD#     GROUP#  SEQUENCE# STATUS       MEMBER                         TYPE    IS_RDF
---------- ---------- ---------- ---------------- -------------------------------------------------- ------- ------
     1        1          41 INACTIVE      +FRA/grac4/onlinelog/group_1.285.852650241         ONLINE  YES
     1        1          41 INACTIVE      +DATA/grac4/onlinelog/group_1.274.852650227         ONLINE  NO
     1        2          42 CURRENT      +DATA/grac4/onlinelog/group_2.273.852651533         ONLINE  NO
     1        2          42 CURRENT      +FRA/grac4/onlinelog/group_2.298.852651537         ONLINE  YES
     2        3          20 INACTIVE      +DATA/grac4/onlinelog/group_3.272.852652849         ONLINE  NO
     2        3          20 INACTIVE      +FRA/grac4/onlinelog/group_3.318.852652859         ONLINE  YES
     2        4          19 INACTIVE      +DATA/grac4/onlinelog/group_4.266.852652635         ONLINE  NO
     2        4          19 INACTIVE      +FRA/grac4/onlinelog/group_4.294.852652647         ONLINE  YES
     3        5          22 INACTIVE      +FRA/grac4/onlinelog/group_5.305.852652263         ONLINE  YES
     3        5          22 INACTIVE      +DATA/grac4/onlinelog/group_5.270.852652251         ONLINE  NO
     3        6          21 INACTIVE      +FRA/grac4/onlinelog/group_6.306.852652301         ONLINE  YES
     3        6          21 INACTIVE      +DATA/grac4/onlinelog/group_6.269.852652289         ONLINE  NO

Database is open ---> lets recreate FRA DG

Restore FRA DG

Try to mount
SQL> alter diskgroup FRA mount force;
alter diskgroup FRA mount force
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "2" is missing from group number "2"
--> This is the expected error


Rereate FRA DG ( old DG +FRA - new DG : + FRA2 )
-> Completly erase our ASM disk headers so we can reuse disk !
[root@grac41 Desktop]# dd if=/dev/zero of=/dev/asmdisk_fra1 bs=1024 count=1024
[root@grac41 Desktop]# dd if=/dev/zero of=/dev/asmdisk_fra2 bs=1024 count=1024

Note drop with asmca only works after all related ASM disk header are cleanup.
As long disks are still members of the FRA DG you can't drop the DG.

Create and enable new FRA DG +FRA2 by using asmca and renable DB_RECOVERY_FILE_DEST

SQL> alter system set db_recovery_file_dest_size=40G  scope=both SID='*';
SQL> alter system set db_recovery_file_dest='+FRA2' scope=both SID='*';

Verify database  setup
SQL>  select log_mode from v$database;
LOG_MODE
------------
ARCHIVELOG


SQL>  archive log list
Database log mode             Archive Mode
Automatic archival            Enabled
Archive destination           USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     7
Next log sequence to archive   8
Current log sequence           8

SQL>  show parameter    db_recovery_file_dest
NAME                     TYPE     VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest             string     +FRA2
db_recovery_file_dest_size         big integer 40G

SQL> alter system set log_archive_dest_1='LOCATION=USE_DB_RECOVERY_FILE_DEST';
System altered.
SQL> alter system switch logfile; 
System altered.

Verify that archive logs are in FRA 
[grid@grac41 ~]$ asmcmd ls -l +FRA2/GRAC4/ARCHIVELOG/2014_07_10
Type        Redund  Striped  Time             Sys  Name
ARCHIVELOG  UNPROT  COARSE   JUL 10 17:00:00  Y    thread_1_seq_8.258.852572669
ARCHIVELOG  UNPROT  COARSE   JUL 10 17:00:00  Y    thread_1_seq_9.259.852573253

Drop the logfile members pointing to the old +FRA DG 
   THREAD#     GROUP# STATUS           MEMBER
---------- ---------- ---------------- --------------------------------------------------
     1       11 INACTIVE           +FRA/grac4/onlinelog/group_11.1102.852485687
     1       11 INACTIVE           +DATA/grac4/onlinelog/group_11.271.852485683
     1       12 CURRENT            +FRA/grac4/onlinelog/group_12.1103.852485693
     1       12 CURRENT            +DATA/grac4/onlinelog/group_12.272.852485689
..
SQL>  ALTER DATABASE DROP LOGFILE MEMBER '+FRA/grac4/onlinelog/group_11.1102.852485687';
Database altered.

SQL> ALTER DATABASE DROP LOGFILE MEMBER '+FRA/grac4/onlinelog/group_12.1103.852485693';
ALTER DATABASE DROP LOGFILE MEMBER '+FRA/grac4/onlinelog/group_12.1103.852485693'
*
ERROR at line 1:
ORA-01609: log 12 is the current log for thread 1 - cannot drop members
ORA-00312: online log 12 thread 1: '+DATA/grac4/onlinelog/group_12.272.852485689'
ORA-00312: online log 12 thread 1: '+FRA/grac4/onlinelog/group_12.1103.852485693'

SQL> alter system switch logfile;
System altered.
SQL>  ALTER DATABASE DROP LOGFILE MEMBER '+FRA/grac4/onlinelog/group_12.1103.852485693';
Database altered.
....

Backup database and Validate backup

RMAN> run
{
set until time "to_date('2014-13-07:10:25:00','yyyy-dd-mm:hh24:mi:ss')";
restore database preview;
}

Multiplex REDO and controlfile for usage of newly created FRA DG

Verify FRA status

SQL> @cf

STATUS    NAME                           IS_RDF
------- -------------------------------------------------- ------
    +DATA/grac4/controlfile/current.260.826111693       NO
    +FRA2/grac4/controlfile/current.321.852654927       YES

SQL> select l.thread#, group#, sequence#, l.status,    member,type,  IS_RECOVERY_DEST_FILE is_rdf from v$logfile inner join v$log l
          using (group#)   order by  l.thread#, group#;
   THREAD#     GROUP#  SEQUENCE# STATUS       MEMBER                         TYPE    IS_RDF
---------- ---------- ---------- ---------------- -------------------------------------------------- ------- ------
     1        1          37 CURRENT      +FRA2/grac4/onlinelog/group_1.285.852650241         ONLINE  YES
     1        1          37 CURRENT      +DATA/grac4/onlinelog/group_1.274.852650227         ONLINE  NO
     1        2          36 INACTIVE      +DATA/grac4/onlinelog/group_2.273.852651533         ONLINE  NO
     1        2          36 INACTIVE      +FRA2/grac4/onlinelog/group_2.298.852651537         ONLINE  YES
     2        3          18 CURRENT      +DATA/grac4/onlinelog/group_3.272.852652849         ONLINE  NO
     2        3          18 CURRENT      +FRA2/grac4/onlinelog/group_3.318.852652859         ONLINE  YES
     2        4          17 INACTIVE      +DATA/grac4/onlinelog/group_4.266.852652635         ONLINE  NO
     2        4          17 INACTIVE      +FRA2/grac4/onlinelog/group_4.294.852652647         ONLINE  YES
     3        5          18 CURRENT      +FRA2/grac4/onlinelog/group_5.305.852652263         ONLINE  YES
     3        5          18 CURRENT      +DATA/grac4/onlinelog/group_5.270.852652251         ONLINE  NO
     3        6          17 INACTIVE      +FRA2/grac4/onlinelog/group_6.306.852652301         ONLINE  YES
     3        6          17 INACTIVE      +DATA/grac4/onlinelog/group_6.269.852652289         ONLINE  NO

SQL>  select * from V$RECOVERY_AREA_USAGE;

FILE_TYPE            PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
CONTROL FILE                .05             0                       1
REDO LOG                    .75             0                       6
ARCHIVED LOG                .54             0                      41
BACKUP PIECE              20.99             0                      18
IMAGE COPY                21.22             0                       6
FLASHBACK LOG                  0            0                       0
FOREIGN ARCHIVED LOG           0            0                       0

Leave a Reply

Your email address will not be published. Required fields are marked *


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>