Common cluvfy errors and warnings including first debugging steps


 PRIF-10, PRVG-1060, PRCT-1011  [ cluvfy  stage -pre crsinst ]

Current Configuration :

  • Your CRS stack doesn’t come up and you want to verify your CRS stack
  • You are running cluvfy stage -pre crsinst against an already installed CRS stack

ERROR       : PRVG-1060 : Failed to retrieve the network interface classification information from an existing CRS home at path "/u01/app/121/grid" on the local node
              PRCT-1011 : Failed to run "oifcfg". Detailed error: PRIF-10: failed to initialize the cluster registry
Command     :   cluvfy stage -pre crsinst against an already installed CRS stack
Workaround 1: Try to start clusterware in exclusive mode 
               # crsctl start crs -excl 
                 Oracle High Availability Services is online 
                 CRS-4692: Cluster Ready Services is online in exclusive mode 
                 CRS-4529: Cluster Synchronization Services is online 
                 CRS-4533: Event Manager is online 
                $ bin/cluvfy  stage -pre crsinst -n gract1 
               Note: if you can start the CRS stack in exclusive mode, cluvfy stage -post crsinst should work too 
                 $  cluvfy  stage -post crsinst -n gract1 
Workaround 2: Needs to be used if you cannot start the CRS stack in exclusive mode  
               If you cannot start the CRS stack you may use the workaround from  
                  Bug 17505999 : CVU CHECKS FOR ACTIVEVERSION WHEN CRS STACK IS NOT UP. 
                  # mv /etc/oraInst.loc /etc/oraInst.loc_sav 
                  # mv /etc/oracle  /etc/oracle_sav 
                 
                $ bin/cluvfy  -version 
                   12.1.0.1.0 Build 112713x8664 
                 Now the command below should work. As said before, always download the latest cluvfy version! 
                 $  bin/cluvfy  stage -pre crsinst -n gract1 
                 .. Check for /dev/shm mounted as temporary file system passed 
                  Pre-check for cluster services setup was successful.
 Reference :    Bug 17505999 : CVU CHECKS FOR ACTIVEVERSION WHEN CRS STACK IS NOT UP.
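 Cleanup   :    After cluvfy has finished, restore the files that were moved aside for this workaround 
                ( a minimal sketch, simply reversing the mv commands shown above ): 
                  # mv /etc/oraInst.loc_sav /etc/oraInst.loc 
                  # mv /etc/oracle_sav /etc/oracle 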

PRVF-0002 : Could not retrieve local nodename

Command    : $ ./bin/cluvfy -h
Error      : PRVF-0002 : Could not retrieve local nodename
Root cause : Nameserver down, host not yet known in DNS 
             $   nslookup grac41   returns error
               Server:        192.135.82.44
               Address:    192.135.82.44#53
               ** server can't find grac41: NXDOMAIN
Fix         : Restart or configure the DNS server. nslookup must work in any case!
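              Quick sanity check after the DNS fix ( a sketch; the service name "named" and the IP used 
              for the reverse lookup are assumptions, adjust to your environment ):
              # service named status        # on the DNS server: the daemon should be running
              $ nslookup grac41             # forward lookup must now succeed
              $ nslookup 192.168.1.61       # reverse lookup should succeed as well ( example IP )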

PRVG-1013 : The path “/u01/app/11203/grid” does not exist or cannot be created

Command    : cluvfy stage -pre nodeadd -n grac3 -verbose
Error      : PRVG-1013 : The path "/u01/app/11203/grid" does not exist or cannot be created on the nodes to be added
             Shared resources check for node addition failed:
Logfile    : Check cluvfy log:  $GRID_HOME/cv/log/cvutrace.log.0
             [ 15025@grac1.example.com] [Worker 1] [ 2013-08-29 15:17:08.266 CEST ] [NativeSystem.isCmdScv:499]  isCmdScv: 
             cmd=[/usr/bin/ssh -o FallBackToRsh=no  -o PasswordAuthentication=no  -o StrictHostKeyChecking=yes  
             -o NumberOfPasswordPrompts=0  grac3 -n 
             /bin/sh -c "if [  -d /u01 -a -w /u01 ] ; then echo exists; fi"]
             ...
             [15025@grac1.example.com] [main] [ 2013-08-29 15:17:08.270 CEST ] [TaskNodeAddDelete.checkSharedPath:559]  
             PRVG-1013 : The path "/u01/app/11203/grid" does not exist or cannot be created on the nodes to be added
             [15025@grac1.example.com] [main] [ 2013-08-29 15:17:08.270 CEST ] [ResultSet.traceResultSet:359]
             Node Add/Delete ResultSet trace.
             Overall Status->VERIFICATION_FAILED
             grac3-->VERIFICATION_FAILED
Root cause:  the cluvfy command tries to check the /u01 directory for the write attribute and fails
             /bin/sh -c "if [  -d /u01 -a -w /u01 ] ; then echo exists; fi"
Code Fix     : drop the -w argument and we get the required output
              $  /bin/sh -c "if [  -d /u01 -a /u01 ] ; then echo exists; fi"
               exists
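              To double-check the directory manually on the node to be added ( a sketch; assumes 
              passwordless ssh as the grid owner is already configured ):
              $ ssh grac3 'ls -ld /u01 /u01/app'       # verify owner, group and permissions
              $ ssh grac3 'touch /u01/app/cvu_test && rm /u01/app/cvu_test && echo writable'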
Related BUG:
             Bug 13241453 : LNX64-12.1-CVU: "CLUVFY STAGE -POST NODEADD" COMMAND REPORTS PRVG-1013 ERROR

PRVF-5229 : GNS VIP is active before Clusterware installation

Command    : $ ./bin/cluvfy comp gns -precrsinst -domain grid.example.com -vip 192.168.1.50 -verbose -n grac121
              Verifying GNS integrity 
              Checking GNS integrity...
              Checking if the GNS subdomain name is valid...
              The GNS subdomain name "grid.example.com" is a valid domain name
              Checking if the GNS VIP is a valid address...
              GNS VIP "192.168.1.50" resolves to a valid IP address
              Checking the status of GNS VIP...
Error       : Error PRVF-5229 : GNS VIP is active before Clusterware installation
              GNS integrity check passed
Fix         : If your clusterware is already installed and up and running, ignore this error
              If this is a new install, use an unused TCP/IP address for your GNS VIP ( note: ping should fail ! )

PRVF-4007 : User equivalence check failed for user “oracle”

Command   : $ ./bin/cluvfy stage -pre crsinst -n grac1 
Error     : PRVF-4007 : User equivalence check failed for user "oracle" 
Fix       : Run  sshUserSetup.sh            
            $ ./sshUserSetup.sh -user grid -hosts "grac1 grac2"  -noPromptPassphrase            
            Verify SSH connectivity:            
            $ /usr/bin/ssh -x -l grid  grac1 date
              Tue Jul 16 12:14:17 CEST 2013
            $ /usr/bin/ssh -x -l grid  grac2 date
              Tue Jul 16 12:14:25 CEST 2013
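            A quick loop to confirm passwordless ssh works to all nodes without prompting 
            ( a sketch using the node names from above ):
            $ for h in grac1 grac2; do ssh -o BatchMode=yes -x grid@$h hostname; done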

PRVF-9992 : Group of device “/dev/oracleasm/disks/DATA1” did not match the expected group

Command    : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1
             Checking consistency of device group across all nodes... 
Error      : PRVF-9992 : Group of device "/dev/oracleasm/disks/DATA1" did not match the expected group. [Expected = "dba"; Found = "{asmadmin=[grac1]}"] 
Root cause : Cluvfy expects the default group "dba" and does not know that the ASM devices belong to a different group 
Fix:       : Run cluvfy with -asmgrp asmadmin to provide correct group mappings: 
             $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1 -asmgrp asmadmin
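             To confirm the OS-level ownership of the device before re-running cluvfy:
             $ ls -l /dev/oracleasm/disks/DATA1        # group should match the value passed via -asmgrp ( asmadmin here )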

PRVF-9802 : Attempt to get udev info from node “grac1” failed

 Command   : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1 
Error     : PRVF-9802 : Attempt to get udev info from node "grac1" failed
           UDev attributes check failed for ASM Disks
Bug       : Bug 12804811 : [11203-LIN64-110725] OUI PREREQUISITE CHECK FAILED IN OL6
Fix       : If using ASMLIB you can currently ignore this error
            If using UDEV you may read the following link ( see also the Debug PRVF-9802 section below ). 

PRVF-7539 – User “grid” does not belong to group “dba”

Error       : PRVF-7539 - User "grid" does not belong to group "dba"
Command     : $  ./bin/cluvfy comp sys -p crs -n grac1
Fix         :  Add grid owner to DBA group
Note        : CVU found following errors with Clusterware setup : User "grid" does not belong to group "dba" (Doc ID 1505586.1)
            : Cluster Verification Utility (CLUVFY) FAQ (Doc ID 316817.1)
Bug         :  Bug 12422324 : LNX64-112-CMT: HIT PRVF-7539 : GROUP "DBA" DOES NOT EXIST ON OUDA NODE ( Fixed : 11.2.0.4 )

PRVF-7617 : Node connectivity between “grac1 : 192.168.1.61” and “grac1 : 192.168.1.55” failed

Command     : $ ./bin/cluvfy comp nodecon -n grac1
Error       : PRVF-7617 : Node connectivity between "grac1 : 192.168.1.61" and "grac1 : 192.168.1.55" failed
Action 1    : Disable firewall / IP tables
             # service iptables stop 
             # chkconfig iptables off
             # iptables -F
             # service iptables status 
              If the firewall is enabled again after a reboot please read the following post.              
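              To make sure the firewall stays disabled across reboots ( quick check ):
              # chkconfig --list iptables       # all runlevels should show "off"
              # service iptables status         # firewall should not be running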
Action 2    : Checking ssh connectivity 
              $ id
              uid=501(grid) gid=54321(oinstall) groups=54321(oinstall),504(asmadmin),506(asmdba),507(asmoper),54322(dba)
              $ ssh grac1 date 
                Sat Jul 27 13:42:19 CEST 2013
Fix         : It seems that cluvfy comp nodecon needs to be run with at least 2 nodes
              Working Command: $ ./bin/cluvfy comp nodecon -n grac1,grac2 
                -> Node connectivity check passed
              Failing Command: $ ./bin/cluvfy comp nodecon -n grac1
                -> Verification of node connectivity was unsuccessful. 
                   Checks did not pass for the following node(s):
               grac1 : 192.168.1.61
            :  Ignore this error if running with a single RAC node  - rerun later when both nodes are available 
            :  Verify that ping is working with all involved IP addresses

Action 3    : 2 or more network interfaces are using the same network address
              Test your node connectivity by running:
              $ /u01/app/11203/grid/bin//cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose

              Interface information for node "grac32"
              Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU   
              ------ --------------- --------------- --------------- --------------- ----------------- ------
              eth0   10.0.2.15       10.0.2.0        0.0.0.0         10.0.2.2        08:00:27:88:32:F3 1500  
              eth1   192.168.1.122   192.168.1.0     0.0.0.0         10.0.2.2        08:00:27:EB:39:F1 1500  
              eth3   192.168.1.209   192.168.1.0     0.0.0.0         10.0.2.2        08:00:27:69:AE:D2 1500  

              Verify current settings via ifconfig
              eth1     Link encap:Ethernet  HWaddr 08:00:27:5A:61:E3  
                       inet addr:192.168.1.121  Bcast:192.168.1.255  Mask:255.255.255.0
              eth3     Link encap:Ethernet  HWaddr 08:00:27:69:AE:D2  
                       inet addr:192.168.1.209  Bcast:192.168.1.255  Mask:255.255.255.0

              --> Both eth1 and eth3 are using the same network address 192.168.1.0 
Fix           : Set up your network devices and provide a different network address ( e.g. 192.168.3.x ) for eth3 
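               A minimal sketch of the fix, assuming eth3 should move to the 192.168.3.0 subnet 
               ( temporary test only; on OL6/RHEL6 a permanent change belongs in /etc/sysconfig/network-scripts/ifcfg-eth3 ):
               # ifconfig eth3 192.168.3.209 netmask 255.255.255.0 up
               $ /u01/app/11203/grid/bin/cluvfy comp nodecon -i eth1,eth3 -n grac31,grac32,grac33 -verbose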

Action 4      :Intermittent PRVF-7617 error with cluvfy 11.2.0.3 ( cluvfy Bug )     
               $  /u01/app/11203/grid/bin/cluvfy -version
               11.2.0.3.0 Build 090311x8664
               $ /u01/app/11203/grid/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
               --> Fails intermittently with the following error: 
               PRVF-7617 : Node connectivity between "grac31 : 192.168.1.121" and "grac33 : 192.168.1.220" failed

               $  /home/grid/cluvfy_121/bin/cluvfy -version
               12.1.0.1.0 Build 062813x8664
               $  /home/grid/cluvfy_121/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
               --> Works for each run
    Fix      : Always use the latest 12.1 cluvfy utility to test node connectivity 

References:  
               PRVF-7617: TCP connectivity check failed for subnet (Doc ID 1335136.1)
               Bug 16176086 - SOLX64-12.1-CVU:CVU REPORT NODE CONNECTIVITY CHECK FAIL FOR NICS ON SAME NODE 
               Bug 17043435 : EM 12C: SPORADIC INTERRUPTION WITHIN RAC-DEPLOYMENT AT THE STEP INSTALL/CLONE OR

PRVG-1172 : The IP address “192.168.122.1” is on multiple interfaces “virbr0” on nodes “grac42,grac41”

Command    :  $ ./bin/cluvfy stage -pre crsinst -asm -presence local -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3,/dev/oracleasm/disks/DATA4 -n grac41,grac42
Error      :  PRVG-1172 : The IP address "192.168.122.1" is on multiple interfaces "virbr0,virbr0" on nodes "grac42,grac41"
Root cause :  There are multiple networks ( eth0,eth1,eth2,virbr0  ) defined
Fix        :  use cluvfy with  -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect -n grac41,grac42
Sample     :  $ ./bin/cluvfy stage -pre crsinst -asm -presence local -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3,/dev/oracleasm/disks/DATA4 -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect -n grac41,grac42
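Hint       :  To pick the right values for the -networks option, list all interfaces and their subnets 
              first ( a simple check using ifconfig as elsewhere in this post ):
              $ /sbin/ifconfig -a | egrep 'Link encap|inet addr'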

Cluvfy Warnings:

PRVG-1101 : SCAN name “grac4-scan.grid4.example.com” failed to resolve  ( PRVF-4664, PRVF-4657 )

Warning:      PRVG-1101 : SCAN name "grac4-scan.grid4.example.com" failed to resolve  
Cause:        An attempt to resolve specified SCAN name to a list of IP addresses failed because SCAN could not be resolved in DNS or GNS using 'nslookup'.
Action:       Verify your GNS/SCAN setup using ping, nslookup and cluvfy
              $  ping -c 1  grac4-scan.grid4.example.com
              PING grac4-scan.grid4.example.com (192.168.1.168) 56(84) bytes of data.
              64 bytes from 192.168.1.168: icmp_seq=1 ttl=64 time=0.021 ms
              --- grac4-scan.grid4.example.com ping statistics ---
              1 packets transmitted, 1 received, 0% packet loss, time 1ms
               rtt min/avg/max/mdev = 0.021/0.021/0.021/0.000 ms

              $  ping -c 1  grac4-scan.grid4.example.com
              PING grac4-scan.grid4.example.com (192.168.1.170) 56(84) bytes of data.
              64 bytes from 192.168.1.170: icmp_seq=1 ttl=64 time=0.031 ms 
              --- grac4-scan.grid4.example.com ping statistics ---
              1 packets transmitted, 1 received, 0% packet loss, time 2ms
              rtt min/avg/max/mdev = 0.031/0.031/0.031/0.000 ms

             $  ping -c 1  grac4-scan.grid4.example.com
             PING grac4-scan.grid4.example.com (192.168.1.165) 56(84) bytes of data.
             64 bytes from 192.168.1.165: icmp_seq=1 ttl=64 time=0.143 ms
             --- grac4-scan.grid4.example.com ping statistics ---
             1 packets transmitted, 1 received, 0% packet loss, time 0ms
             rtt min/avg/max/mdev = 0.143/0.143/0.143/0.000 ms

             $ nslookup grac4-scan.grid4.example.com
             Server:        192.168.1.50
             Address:    192.168.1.50#53
             Non-authoritative answer:
             Name:    grac4-scan.grid4.example.com
             Address: 192.168.1.168
             Name:    grac4-scan.grid4.example.com
             Address: 192.168.1.165
             Name:    grac4-scan.grid4.example.com
             Address: 192.168.1.170

            $ $GRID_HOME/bin/cluvfy comp scan
            Verifying scan 
            Checking Single Client Access Name (SCAN)...
            Checking TCP connectivity to SCAN Listeners...
            TCP connectivity to SCAN Listeners exists on all cluster nodes
            Checking name resolution setup for "grac4-scan.grid4.example.com"...
            Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
            Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed
            Verification of SCAN VIP and Listener setup passed
            Verification of scan was successful. 

 Fix:       As nslookup, ping and cluvfy work as expected you can ignore this warning   

Reference:  PRVF-4664 PRVF-4657: Found inconsistent name resolution entries for SCAN name (Doc ID 887471.1)

WARNING    : Could not find a suitable set of interfaces for the private interconnect

Root cause : public ( 192.168.1.60 ) and private interface ( 192.168.1.61 ) use the same network address
Fix             : provide a separate network address ( e.g. 192.168.2.x ) for the private interconnect 
                  After fix cluvfy reports : 
                  Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
                  grac1 eth0:192.168.1.60
                  Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
                  grac1 eth1:192.168.2.101
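                  A sketch of a matching interface configuration on OL6/RHEL6 ( illustrative values, 
                  matching the cluvfy output above ):
                  # cat /etc/sysconfig/network-scripts/ifcfg-eth1
                  DEVICE=eth1
                  BOOTPROTO=static
                  IPADDR=192.168.2.101
                  NETMASK=255.255.255.0
                  ONBOOT=yes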

WARNING: Could not find a suitable set of interfaces for VIPs

WARNING: Could not find a suitable set of interfaces for VIPs
             Checking subnet mask consistency...
             Subnet mask consistency check passed for subnet "192.168.1.0".
             Subnet mask consistency check passed for subnet "192.168.2.0".
             Subnet mask consistency check passed.
Fix        : Ignore this warning 
Root Cause : Per BUG:4437727, cluvfy makes an incorrect assumption based on RFC 1918 that any IP address/subnet that 
            begins with any of the following octets is private and hence may not be fit for use as a VIP:
            172.16.x.x  through 172.31.x.x
            192.168.x.x
            10.x.x.x
            However, this assumption does not take into account that it is possible to use these IPs as Public IP's on an
            internal network  (or intranet).   Therefore, it is very common to use IP addresses in these ranges as 
            Public IP's and as Virtual IP(s), and this is a supported configuration.  
Reference:
Note:       CLUVFY Fails With Error: Could not find a suitable set of interfaces for VIPs or Private Interconnect [ID 338924.1]

PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option “-x”

Error        :PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option "-x"
Solution     : Change /etc/sysconfig/ntpd from
                 OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid"
               to 
                 OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
               Restart NTPD daemon
               [root@ract1 ~]#  service ntpd  restart
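               Verify that ntpd picked up the slewing option after the restart ( a quick check ):
               $ ps -ef | grep ntpd | grep -v grep      # the ntpd command line should now contain -x
               $ cluvfy comp clocksync -n all           # optionally re-check clock synchronization with cluvfy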

PRVF-5217 : An error occurred while trying to look up IP address for “grac1cl.grid2.example.com”

WARNING:    PRVF-5217 : An error occurred while trying to look up IP address for "grac1cl.grid2.example.com"
Action    : Verify with dig and nslookup that the VIP IP address is working:
            $  dig grac1cl-vip.grid2.example.com
             ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6 <<>> grac1cl-vip.grid2.example.com
             ;; global options: +cmd
             ;; Got answer:
             ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23546
             ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1
             ;; QUESTION SECTION:
             ;grac1cl-vip.grid2.example.com.    IN    A
             ;; ANSWER SECTION:
             grac1cl-vip.grid2.example.com. 120 IN    A    192.168.1.121
             ;; AUTHORITY SECTION:
             grid2.example.com.    3600    IN    NS    ns1.example.com.
             grid2.example.com.    3600    IN    NS    gns2.grid2.example.com.
            ;; ADDITIONAL SECTION:
            ns1.example.com.    3600    IN    A    192.168.1.50
           ;; Query time: 12 msec
           ;; SERVER: 192.168.1.50#53(192.168.1.50)
           ;; WHEN: Mon Aug 12 09:39:24 2013
           ;; MSG SIZE  rcvd: 116
          $  nslookup grac1cl-vip.grid2.example.com
           Server:        192.168.1.50
           Address:    192.168.1.50#53
           Non-authoritative answer:
           Name:    grac1cl-vip.grid2.example.com
           Address: 192.168.1.121
Fix:      Ignore this warning.
          The DNS server on this system has stripped the authoritative flag. This results in an 
          UnknownHostException being thrown when CVU calls InetAddress.getAllByName(..). That's why cluvfy returns a WARNING.
Reference: Bug 12826689 : PRVF-5217 FROM CVU WHEN VALIDATING GNS 

Running cluvfy comp dns -server fails silently – Cluvfy logs show PRCZ-2090 error


Command: runcluvfy.sh comp dns -server ... just exits with SUCCESS, which is not what we expect. This command should start a local test DNS server and block until runcluvfy.sh comp dns -client -last is executed

[grid@ractw21 linuxx64_12201_grid_home]$ runcluvfy.sh comp dns -server -domain grid122.example.com -vipaddress 192.168.1.59/255.255.255.0/enp0s8 -verbose -method root
Enter "ROOT" password:

Verifying Task DNS configuration check ...
Waiting for DNS client requests...
Verifying Task DNS configuration check ...PASSED

Verification of DNS Check was successful. 

CVU operation performed:      DNS Check
Date:                         Apr 11, 2017 3:23:56 PM
CVU home:                     /media/sf_kits/Oracle/122/linuxx64_12201_grid_home/
User:                         grid

Reviewing the CVU traces shows that the cluvfy command fails with error PRCZ-2090:
PRCZ-2090 : failed to create host key repository from file "/home/grid/.ssh/known_hosts" to establish SSH connection to node "ractw21"
[main] [ 2017-04-14 17:38:09.204 CEST ] [ExecCommandNoUserEqImpl.runCmd:374]  Final CompositeOperationException: PRCZ-2009 : Failed to execute command "/media/sf_kits/Oracle/122/linuxx64_12201_grid_home//cv/admin/odnsdlite" as root within 0 seconds on nodes "ractw21"

Fix: log in once as user grid via ssh to create the proper ssh environment ( this populates /home/grid/.ssh/known_hosts )
[grid@ractw21 linuxx64_12201_grid_home]$  ssh grid@ractw21.example.com
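After the first interactive login the host key is stored in known_hosts; a quick way to verify and re-run the check ( re-using the command from above ):

[grid@ractw21 ~]$ grep ractw21 /home/grid/.ssh/known_hosts
[grid@ractw21 linuxx64_12201_grid_home]$ runcluvfy.sh comp dns -server -domain grid122.example.com -vipaddress 192.168.1.59/255.255.255.0/enp0s8 -verbose -method root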

 

PRVF-5636, PRVF-5637 : The DNS response time for an unreachable node exceeded “15000” ms

Problem 1: 
Command   : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1
Error     : PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: grac1
Root Cause: nslookup returns a wrong exit status
            # nslookup hugo.example.com
            Server:        192.168.1.50
            Address:    192.168.1.50#53
            ** server can't find hugo.example.com: NXDOMAIN
            #  echo $?
            1
            --> Note: the error "can't find hugo.example.com" is ok - but the exit status of 1 is not
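            A quick way to see the exit status difference ( a sketch; grac41 resolves, hugo.example.com does not ):
            # nslookup grac41 > /dev/null ; echo "exit status: $?"              # expected: 0
            # nslookup hugo.example.com > /dev/null ; echo "exit status: $?"    # returns 1 on NXDOMAIN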
 Note:      PRVF-5637 : DNS response time could not be checked on following nodes [ID 1480242.1]
 Bug :      Bug 16038314 : PRVF-5637 : DNS RESPONSE TIME COULD NOT BE CHECKED ON FOLLOWING NODES

 Problem 2:
 Version   : 12.1.0.2
 Command   : $GRID_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={gract3}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={auto}" "CLUSTER_NEW_NODE_ROLES={hub}"
 Error     : SEVERE: [FATAL] [INS-13013] Target environment does not meet some mandatory requirements.
             FINE: [Task.perform:594]
             sTaskResolvConfIntegrity:Task resolv.conf Integrity[STASKRESOLVCONFINTEGRITY]:TASK_SUMMARY:FAILED:CRITICAL:VERIFICATION_FAILED
             PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: gract1,gract3
Verify     : Run ping against the SCAN address for a long time to check node connectivity
             $ ping -v gract-scan.grid12c.example.com
             $ nslookup gract-scan.grid12c.example.com
             Note you may need to run above commands a long time until error comes up
Root Cause : The intermittent hang of the above OS commands pointed to a firewall issue
Fix        : Disable firewall
Reference  : PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes (Doc ID 1356975.1) 
             PRVF-5637 : DNS response time could not be checked on following nodes (Doc ID 1480242.1)
             Note: using the 11.2 workaround ( $ export IGNORE_PREADDNODE_CHECKS=Y ) did not help

PRVF-4037 : CRS is not installed on any of the nodes

Error     : PRVF-4037 : CRS is not installed on any of the nodes
            PRVF-5447 : Could not verify sharedness of Oracle Cluster Voting Disk configuration
Command   : $ cluvfy stage -pre crsinst -upgrade -n grac41,grac42,grac43 -rolling -src_crshome $GRID_HOME 
           -dest_crshome /u01/app/grid_new -dest_version 12.1.0.1.0  -fixup -fixupdir /tmp -verbose
Root Cause:  /u01/app/oraInventory/ContentsXML/inventory.xml was corrupted ( missing node_list for GRID HOME )
            <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11204/grid" TYPE="O" IDX="1" CRS="true"/>
            <HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11204/racdb" TYPE="O" IDX="2">
              <NODE_LIST>
               <NODE NAME="grac41"/>
               <NODE NAME="grac42"/>
               <NODE NAME="grac43"/>
              ....
Fix: Correct entry in inventory.xml
            <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11204/grid" TYPE="O" IDX="1" CRS="true">
               <NODE_LIST>
                  <NODE NAME="grac41"/>
                  <NODE NAME="grac42"/>
                  <NODE NAME="grac43"/>
               </NODE_LIST>
               ...
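Alternative: instead of hand-editing inventory.xml you can usually rebuild the node list with the OUI
             updateNodeList option ( a sketch; verify the syntax against your OUI version before running it ):
             $ $GRID_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=/u01/app/11204/grid "CLUSTER_NODES={grac41,grac42,grac43}" CRS=true -silent -local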

Reference : CRS is not installed on any of the nodes (Doc ID 1316815.1)
            CRS is not installed on any of the nodes. Inventory.xml is changed even when no problem with TMP files. (Doc ID 1352648.1)

avahi-daemon is running

Cluvfy report : 
     Checking daemon "avahi-daemon" is not configured and running
     Daemon not configured check failed for process "avahi-daemon"
     Check failed on nodes: 
        ract2,ract1
     Daemon not running check failed for process "avahi-daemon"
     Check failed on nodes: 
        ract2,ract1

Verify whether the avahi-daemon is running
     $ ps -elf | grep avahi-daemon
     5 S avahi     4159     1  0  80   0 -  5838 poll_s Apr02 ?        00:00:00 avahi-daemon: running [ract1.local]
     1 S avahi     4160  4159  0  80   0 -  5806 unix_s Apr02 ?        00:00:00 avahi-daemon: chroot helper

Fix it ( run on all nodes ) :
      To shut it down, as root
      # /etc/init.d/avahi-daemon stop
      To disable it, as root:
      # /sbin/chkconfig  avahi-daemon off
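Verify afterwards ( on both nodes ):
      # /sbin/chkconfig --list avahi-daemon          # every runlevel should show "off"
      $ ps -elf | grep avahi-daemon | grep -v grep   # should return no output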

Reference: 
    Cluster After Private Network Recovered if avahi Daemon is up and Running (Doc ID 1501093.1)

Reference data is not available for verifying prerequisites on this operating system distribution

Command    : ./bin/cluvfy stage -pre crsinst -upgrade -n gract3 -rolling -src_crshome $GRID_HOME 
                -dest_crshome /u01/app/12102/grid -dest_version 12.1.0.2.0 -verbose
Error      :  Reference data is not available for verifying prerequisites on this operating system distribution
              Verification cannot proceed
              Pre-check for cluster services setup was unsuccessful on all the nodes.
Root cause:  cluvfy runs: rpm -qa | grep  release
             --> if this command fails the above error is thrown
             Working Node 
             [root@gract1 log]# rpm -qa | grep  release
             oraclelinux-release-6Server-4.0.4.x86_64
             redhat-release-server-6Server-6.4.0.4.0.1.el6.x86_64
             oraclelinux-release-notes-6Server-9.x86_64
             Failing Node
             [root@gract1 log]#  rpm -qa | grep  release
             rpmdb: /var/lib/rpm/__db.003: No such file or directory
             error: db3 error(2) from dbenv->open: No such file or directory 
             ->  Due to space pressure /var/lib/rpm was partially deleted on a specific RAC node
 Fix        : Restore the RPM packages from a REMOTE RAC node or from backup
             [root@gract1 lib]# pwd
             /var/lib
             [root@gract1 lib]#  scp -r gract3:/var/lib/rpm .
             Verify RPM database
             [root@gract1 log]#   rpm -qa | grep  release
             oraclelinux-release-6Server-4.0.4.x86_64
             redhat-release-server-6Server-6.4.0.4.0.1.el6.x86_64
             oraclelinux-release-notes-6Server-9.x86_64
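             If the copied database still reports db3/__db errors, rebuilding the indices may help 
             ( a sketch; the __db.* files are Berkeley DB region files and can safely be recreated ):
             # cd /var/lib/rpm && rm -f __db.*
             # rpm --rebuilddb
             # rpm -qa | grep release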
Related Notes:
             - Oracle Secure Enterprise Search 11.2.2.2 Installation Problem On RHEL 6 - [INS-75028] 
               Environment Does Not Meet Minimum Requirements: Unsupported OS Distribution (Doc ID 1568473.1)
             - RHEL6: 12c OUI INS-13001: CVU Fails: Reference data is not available for verifying prerequisites on 
               this operating system distribution (Doc ID 1567127.1)

Cluvfy Debug : PRVG-11049

Create a problem - shut down the cluster interconnect:
$ ifconfig eth1 down

Verify error with cluvfy
$ cluvfy comp nodecon -n all -i eth1
Verifying node connectivity 
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
ERROR: 
PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
...

Step 1 - check cvutrace.log.0 trace:
# grep PRVG /home/grid/cluvfy112/cv/log/cvutrace.log.0
[21684@grac1.example.com] [main] [ 2013-07-29 18:32:46.429 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1394]  Found Bad node(s): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
          ERRORMSG(grac2): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"

Step 2: Create a script and set trace level:  SRVM_TRACE_LEVEL=2
rm -rf /tmp/cvutrace
mkdir /tmp/cvutrace
export CV_TRACELOC=/tmp/cvutrace
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=2
./bin/cluvfy comp nodecon -n all -i eth1 -verbose
ls /tmp/cvutrace

Run script and check cluvfy trace file:
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.125 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1367]  getting interface eth1 on node grac2
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.126 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1374]  Node: grac2 has no 'eth1' interfaces!
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.126 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1367]  getting interface eth1 on node grac1
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.127 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1394]  Found Bad node(s): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"

Verify problem with ifconfig on grac2 ( eth1 is not up )
# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 08:00:27:8E:6D:24  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:413623 errors:0 dropped:0 overruns:0 frame:0
          TX packets:457739 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:226391378 (215.9 MiB)  TX bytes:300565159 (286.6 MiB)
          Interrupt:16 Base address:0xd240 
Fix : 
Restart eth1 and restart crs 
# ifconfig eth1 up
#  $GRID_HOME/bin/crsctl stop  crs -f
#  $GRID_HOME/bin/crsctl start  crs 
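Re-run the node connectivity check after the restart ( should now pass ):
$ ./bin/cluvfy comp nodecon -n all -i eth1 -verbose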

Debug PRVF-9802

From the cluvfy log the following command is failing: 
$  /tmp/CVU_12.1.0.1.0_grid/exectask.sh -getudevinfo oracleasm/disks/DATA1
<CV_ERR><SLOS_LOC>CVU00310</SLOS_LOC><SLOS_OP></SLOS_OP><SLOS_CAT>OTHEROS</SLOS_CAT><SLOS_OTHERINFO>No UDEV rule found for device(s) specified</SLOS_OTHERINFO></CV_ERR>
<CV_VRES>1</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG><CV_CMDLOG>
<CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo oracleasm/disks/DATA1 </CV_INITCMD>
<CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT> Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' 
| awk '{if ("oracleasm/disks/DATA1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>
--> No Output

Failing Command
$ /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' 
| awk '{if ("oracleasm/disks/DATA1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'
Diagnostics : cluvfy is scanning the directory /etc/udev/rules.d/ for udev rules for device oracleasm/disks/DATA1 - but couldn't find a rule for that device

Fix: setup udev rules.
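A minimal example rule ( taken from the working configuration verified below; the kernel device name
sdb1 and the ASM disk name are specific to this environment ):
# cat /etc/udev/rules.d/99-oracle-asmdevices.rules
KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"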

After fixing the udev rules the above command works fine and cluvfy doesn't complain anymore 
$ /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/'
kvm @ /etc/udev/rules.d/80-kvm.rules: KERNEL=="kvm", GROUP="kvm", MODE="0666"
fuse @ /etc/udev/rules.d/99-fuse.rules: KERNEL=="fuse", MODE="0666",OWNER="root",GROUP="root"
Verify: $ /tmp/CVU_12.1.0.1.0_grid/exectask.sh -getudevinfo  /dev/asmdisk1_udev_sdb1
<CV_VAL><USMDEV><USMDEV_LINE>/etc/udev/rules.d/99-oracle-asmdevices.rules KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"    
</USMDEV_LINE><USMDEV_NAME>sdb1</USMDEV_NAME><USMDEV_OWNER>grid</USMDEV_OWNER><USMDEV_GROUP>asmadmin</USMDEV_GROUP><USMDEV_PERMS>0660</USMDEV_PERMS></USMDEV></CV_VAL><CV_VRES>0</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG><CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo /dev/asmdisk1_udev_sdb1 </CV_INITCMD><CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT> Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("/dev/asmdisk1_udev_sdb1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT> /etc/udev/rules.d/99-oracle-asmdevices.rules KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"    
</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>

Debug and Fix  PRVG-13606 Error

Reference:
