Node Eviction due to failed Voting Disks

Node Eviction due to failed DHB ( Disk Heartbeat )

Triggering a DHB Note Eviction on Node grac2 :

Disable the first voting  disk ( we still have 2 remaining votedisks - no eviction should occur ) )
# echo 1 > /sys/block/sdg/device/delete

Monitor $GRID_HOME/log/alertgrac2.log
2013-08-02 18:53:26.148
cssd(21694)]CRS-1649:An I/O error occured for voting file: /dev/oracleasm/disks/OCR1; details at (:CSSNM00059:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
[cssd(21694)]CRS-1615:No I/O has completed after 50% of the maximum interval. Voting file /dev/oracleasm/disks/OCR1 will be considered not functional in 99340 milliseconds
2013-08-02 18:55:55.848
[cssd(21694)]CRS-1614:No I/O has completed after 75% of the maximum interval. Voting file /dev/oracleasm/disks/OCR1 will be considered not functional in 49320 milliseconds
[cssd(21694)]CRS-1613:No I/O has completed after 90% of the maximum interval. Voting file /dev/oracleasm/disks/OCR1 will be considered not functional in 19310 milliseconds
2013-08-02 18:56:45.871
[cssd(21694)]CRS-1604:CSSD voting file is offline: /dev/oracleasm/disks/OCR1; details at (:CSSNM00058:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log.

Check voting disk status:
# $GRID_HOME/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. OFFLINE  5c9b3326e5a74f5abf9d300f3e76c6d0 (/dev/oracleasm/disks/OCR1) [OCR]
 2. ONLINE   a6a0becb42874faabf14a50ab74996c2 (/dev/oracleasm/disks/OCR2) [OCR]
 3. ONLINE   417df0660a424fb1bf2db116acfdb47b (/dev/oracleasm/disks/OCR3) [OCR]

Disable the 2.nd voting  disk ( eviction takes place ) 
# echo 1 > /sys/block/sdg/device/delete

Monitor $GRID_HOME/log/alertgrac2.log
[cssd(21694)]CRS-1649:An I/O error occured for voting file: /dev/oracleasm/disks/OCR3; details at (:CSSNM00059:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log.
2013-08-02 19:43:06.212
[cssd(21694)]CRS-1615:No I/O has completed after 50% of the maximum interval. Voting file /dev/oracleasm/disks/OCR3 will be considered not functional in 99760 milliseconds
2013-08-02 19:43:56.239
[cssd(21694)]CRS-1614:No I/O has completed after 75% of the maximum interval. Voting file /dev/oracleasm/disks/OCR3 will be considered not functional in 49740 milliseconds
[cssd(21694)]CRS-1613:No I/O has completed after 90% of the maximum interval. Voting file /dev/oracleasm/disks/OCR1 will be considered not functional in 19310 milliseconds
2013-08-02 18:56:45.871
[cssd(21694)]CRS-1604:CSSD voting file is offline: /dev/oracleasm/disks/OCR1; details at (:CSSNM00058:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log.
c2013-08-02 19:22:20.836
[ctssd(21782)]CRS-2409:The clock on host grac2 is not synchronous with the mean cluster time. No action has been taken as the Cluster Time Synchronization Service is running in observer mode.
2013-08-02 19:41:26.958
[cssd(21694)]CRS-1649:An I/O error occured for voting file: /dev/oracleasm/disks/OCR3; details at (:CSSNM00059:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log.
2013-08-02 19:43:06.212
[cssd(21694)]CRS-1615:No I/O has completed after 50% of the maximum interval. Voting file /dev/oracleasm/disks/OCR3 will be considered not functional in 99760 milliseconds
2013-08-02 19:43:56.239
[cssd(21694)]CRS-1614:No I/O has completed after 75% of the maximum interval. Voting file /dev/oracleasm/disks/OCR3 will be considered not functional in 49740 milliseconds
2013-08-02 19:44:26.252
[cssd(21694)]CRS-1613:No I/O has completed after 90% of the maximum interval. Voting file /dev/oracleasm/disks/OCR3 will be considered not functional in 19730 milliseconds
2013-08-02 19:44:46.265
[cssd(21694)]CRS-1604:CSSD voting file is offline: /dev/oracleasm/disks/OCR3; details at (:CSSNM00058:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log.
2013-08-02 19:44:46.265
[cssd(21694)]CRS-1606:The number of voting files available, 1, is less than the minimum number of voting files required, 2, resulting in CSSD termination to ensure data integrity; details at (:CSSNM00018:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
2013-08-02 19:44:46.265
[cssd(21694)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
2013-08-02 19:44:46.309
[cssd(21694)]CRS-1652:Starting clean up of CRSD resources.
2013-08-02 19:44:47.792
[/u01/app/11203/grid/bin/oraagent.bin(22056)]CRS-5016:Process "/u01/app/11203/grid/opmn/bin/onsctli" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/grac2/agent/crsd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:44:48.422
[/u01/app/11203/grid/bin/oraagent.bin(22056)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/grac2/agent/crsd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:44:48.423
[/u01/app/11203/grid/bin/oraagent.bin(22056)]CRS-5016:Process "/u01/app/11203/grid/bin/lsnrctl" spawned by agent "/u01/app/11203/grid/bin/oraagent.bin" for action "check" failed: details at "(:CLSN00010:)" in "/u01/app/11203/grid/log/grac2/agent/crsd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:44:48.430
[cssd(21694)]CRS-1654:Clean up of CRSD resources finished successfully.
2013-08-02 19:44:48.431
[cssd(21694)]CRS-1655:CSSD on node grac2 detected a problem and started to shutdown.
2013-08-02 19:44:48.487
[ohasd(21483)]CRS-2765:Resource 'ora.crsd' has failed on server 'grac2'.
2013-08-02 19:44:48.509
[/u01/app/11203/grid/bin/oraagent.bin(22056)]CRS-5822:Agent '/u01/app/11203/grid/bin/oraagent_grid' disconnected from server. Details at (:CRSAGF00117:) {0:1:5} in /u01/app/11203/grid/log/grac2/agent/crsd/oraagent_grid/oraagent_grid.log.
2013-08-02 19:44:48.510
[/u01/app/11203/grid/bin/orarootagent.bin(22073)]CRS-5822:Agent '/u01/app/11203/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:3:86} in /u01/app/11203/grid/log/grac2/agent/crsd/orarootagent_root/orarootagent_root.log.
2013-08-02 19:44:48.512
[/u01/app/11203/grid/bin/oraagent.bin(22296)]CRS-5822:Agent '/u01/app/11203/grid/bin/oraagent_oracle' disconnected from server. Details at (:CRSAGF00117:) {0:5:299} in /u01/app/11203/grid/log/grac2/agent/crsd/oraagent_oracle/oraagent_oracle.log.
2013-08-02 19:44:51.916
[crsd(26171)]CRS-0805:Cluster Ready Service aborted due to failure to communicate with Cluster Synchronization Service with error [3]. Details at (:CRSD00109:) in /u01/app/11203/grid/log/grac2/crsd/crsd.log.
2013-08-02 19:44:51.935
[ohasd(21483)]CRS-2765:Resource 'ora.ctssd' has failed on server 'grac2'.
2013-08-02 19:44:52.158
[ohasd(21483)]CRS-2765:Resource 'ora.evmd' has failed on server 'grac2'.
2013-08-02 19:44:52.340
[/u01/app/11203/grid/bin/oraagent.bin(21601)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11203/grid/log/grac2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:44:52.517
[ohasd(21483)]CRS-2765:Resource 'ora.asm' has failed on server 'grac2'.
2013-08-02 19:44:52.969
[ctssd(26187)]CRS-2402:The Cluster Time Synchronization Service aborted on host grac2. Details at (:ctss_css_init1:) in /u01/app/11203/grid/log/grac2/ctssd/octssd.log.
2013-08-02 19:44:56.940
[ohasd(21483)]CRS-2765:Resource 'ora.crsd' has failed on server 'grac2'.
2013-08-02 19:44:58.212
[ctssd(26221)]CRS-2402:The Cluster Time Synchronization Service aborted on host grac2. Details at (:ctss_css_init1:) in /u01/app/11203/grid/log/grac2/ctssd/octssd.log.
2013-08-02 19:45:03.254
[ctssd(26233)]CRS-2402:The Cluster Time Synchronization Service aborted on host grac2. Details at (:ctss_css_init1:) in /u01/app/11203/grid/log/grac2/ctssd/octssd.log.
2013-08-02 19:45:04.740
[ohasd(21483)]CRS-2765:Resource 'ora.cssd' has failed on server 'grac2'.
2013-08-02 19:45:04.761
[/u01/app/11203/grid/bin/oraagent.bin(21601)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11203/grid/log/grac2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:45:04.767
[ohasd(21483)]CRS-2765:Resource 'ora.cluster_interconnect.haip' has failed on server 'grac2'.
2013-08-02 19:45:04.941
[/u01/app/11203/grid/bin/oraagent.bin(21601)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11203/grid/log/grac2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:45:05.118
[/u01/app/11203/grid/bin/oraagent.bin(21601)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11203/grid/log/grac2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:45:06.593
[ohasd(21483)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'grac2'.
2013-08-02 19:45:07.277
[/u01/app/11203/grid/bin/oraagent.bin(21601)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11203/grid/log/grac2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:45:07.488
[/u01/app/11203/grid/bin/oraagent.bin(21601)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11203/grid/log/grac2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2013-08-02 19:45:13.144
[cssd(26283)]CRS-1713:CSSD daemon is started in clustered mode
2013-08-02 19:45:13.187
[cssd(26283)]CRS-1637:Unable to locate configured voting file with ID 5c9b3326-e5a74f5a-bf9d300f-3e76c6d0; details at (:CSSNM00020:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
2013-08-02 19:45:13.187
[cssd(26283)]CRS-1637:Unable to locate configured voting file with ID 417df066-0a424fb1-bf2db116-acfdb47b; details at (:CSSNM00020:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
2013-08-02 19:45:13.187
[cssd(26283)]CRS-1705:Found 1 configured voting files but 2 voting files are required, terminating to ensure data integrity; details at (:CSSNM00021:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
2013-08-02 19:45:13.188
[cssd(26283)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11203/grid/log/grac2/cssd/ocssd.log
2013-08-02 19:45:13.200
[cssd(26283)]CRS-1603:CSSD on node grac2 shutdown by user.

 

CRS errors reported in CRS alert.log

Node grac3 was triggered for voting disks errors:
CRS Alert Log Events:
2013-10-24 09:50:54    No I/O has completed after 50% of the maximum interval    files/alertgrac3.log    [cssd(3002)]CRS-1615:No I/O has completed after 50% of the maximum interval. Voting file /dev/oracleasm/disks/OCR3 will be considered not functional in 99010 milliseconds
2013-10-24 09:52:36    No I/O has completed after 50% of the maximum interval    files/alertgrac3.log    [cssd(3002)]CRS-1615:No I/O has completed after 50% of the maximum interval. Voting file /dev/oracleasm/disks/OCR2 will be considered not functional in 99550 milliseconds
                       --> here we have lost 2 of your 3 voting disks  
2013-10-24 09:54:16    The CSS daemon is terminating due to a fatal error    files/alertgrac3.log    [cssd(3002)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11203/grid/log/grac3/cssd/ocssd.log
2013-10-24 09:55:06    The CSS daemon is terminating due to a fatal error    files/alertgrac3.log    [cssd(2203)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11203/grid/log/grac3/cssd/ocssd.log
2013-10-24 09:55:19    CRS-1603:CSSD on node %s shutdown by user.    files/alertgrac3.log    [cssd(3395)]CRS-1603:CSSD on node grac3 shutdown by user.
2013-10-24 09:55:19    The CSS daemon is terminating due to a fatal error    files/alertgrac3.log    [cssd(3395)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11203/grid/log/grac3/cssd/ocssd.log

2013-10-24 09:56:08    Node %s, number %n, was manually shut down    grac1/alertgrac1.log    [cssd(3748)]CRS-1625:Node grac3, number 3, was manually shut down
2013-10-24 09:56:08    CRS-1601:CSSD Reconfiguration complete. Active nodes are %s    grac1/alertgrac1.log    [cssd(3748)]CRS-1601:CSSD Reconfiguration complete. Active nodes are grac1 grac2 .
2013-10-24 09:56:08    CRS-1601:CSSD Reconfiguration complete. Active nodes are %s    grac1/alertgrac1.log    [cssd(3748)]CRS-1601:CSSD Reconfiguration complete. Active nodes are grac1 grac2 .
2013-10-24 09:56:08    Node %s, number %n, was manually shut down    grac1/alertgrac1.log    [cssd(3748)]CRS-1625:Node grac3, number 3, was manually shut down
                       --> grac1 reports that we have lost node grac3

2013-10-24 09:56:35    CRS-1601:CSSD Reconfiguration complete. Active nodes are %s    files/alertgrac2.log    [cssd(2977)]CRS-1601:CSSD Reconfiguration complete. Active nodes are grac1 grac2 .
2013-10-24 09:56:35    Node %s, number %n, was manually shut down    files/alertgrac2.log    [cssd(2977)]CRS-1625:Node grac3, number 3, was manually shut down
2013-10-24 10:05:30    CRS-1603:CSSD on node %s shutdown by user.    files/alertgrac3.log    [cssd(3190)]CRS-1603:CSSD on node grac3 shutdown by user.
2013-10-24 10:05:30    The CSS daemon is terminating due to a fatal error    files/alertgrac3.log    [cssd(3190)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11203/grid/log/grac3/cssd/ocssd.log
                        --> grac2 reports that we have lost node grac3
Node Eviction Events:
2013-10-24 09:54:16    Voting disks unavailable    cssd/ocssd.log    Voting disks unavailable
2013-10-24 09:54:16.582: [ CSSD][2156365568](:CSSNM00018:)clssnmvDiskCheck: Aborting, 1 of 3 configured voting disks available, need 2
2013-10-24 09:54:16    Voting disks unavailable    cssd/ocssd.log    Voting disks unavailable
2013-10-24 09:54:16.582: [ CSSD][2156365568](:CSSNM00018:)clssnmvDiskCheck: Aborting, 1 of 3 configured voting disks available, need 2
2013-10-24 09:54:16    Voting disks unavailable    files/alertgrac3.log    Voting disks unavailable
 [cssd(3002)]CRS-1606:The number of voting files available, 1, is less than the minimum number of voting files required, 2, resulting in CSSD termination to ensure data integrity; details at (:CSSNM00018:) in /u01/app/11203/grid/log/grac3/cssd/ocssd.log

 

Leave a Reply

Your email address will not be published. Required fields are marked *