Set up DNS, NTP and DHCP for mixed RAC/Internet usage

Note:

Install your RAC nameserver on a separate VirtualBox image/system, because a non-functional nameserver can lead to a RAC hang scenario.

Install BIND / DHCP RPMs and learn the needed configuration commands

Install and verify the BIND installation [ bind-libs and bind-utils should already be installed ]
[root@hract21 Desktop]#  yum install bind bind-utils bind-libs
[root@hract21 Desktop]# rpm -qa |grep '^bind'
bind-utils-9.8.2-0.30.rc1.el6_6.1.x86_64
bind-libs-9.8.2-0.30.rc1.el6_6.1.x86_64
bind-9.8.2-0.30.rc1.el6_6.1.x86_64

Install and verify DHCP setup 
Download and install the dhcping utility.
Download location: http://pkgs.repoforge.org/dhcping , package:
    dhcping-1.2-2.2.el6.rf.x86_64.rpm  11-Nov-2010 07:31   16K  RHEL6 and CentOS-6 x86 64bit
[root@ns1 ~]# rpm -i Downloads/dhcping-1.2-2.2.el6.rf.x86_64.rpm
 
[root@hract21 Desktop]# yum install dhcp.x86_64 
Total download size: 1.2 M
Is this ok [y/N]: y
Downloading Packages:
(1/3): dhclient-4.1.1-43.P1.0.1.el6_6.1.x86_64.rpm                                                           | 318 kB     00:00     
(2/3): dhcp-4.1.1-43.P1.0.1.el6_6.1.x86_64.rpm                                                               | 819 kB     00:00     
(3/3): dhcp-common-4.1.1-43.P1.0.1.el6_6.1.x86_64.rpm                                                        | 142 kB     00:00  

[root@hract21 Desktop]#  rpm -qa | grep -i DHCP
dhcp-common-4.1.1-43.P1.0.1.el6_6.1.x86_64
dhcp-4.1.1-43.P1.0.1.el6_6.1.x86_64

Setup Files needed: 
: /etc/named.conf
: /var/named/example.com.db
: /var/named/192.168.2.db
: /var/named/192.168.5.db
: /etc/dhcp/dhcpd.conf
: /etc/sysconfig/dhcpd  
--> For details on how to configure DNS and DHCP, see the sections below.

Set up, test and configure the BIND service
# service named restart 
# nslookup google.de
Server:        192.168.5.50
Address:    192.168.5.50#53

Non-authoritative answer:
Name:    google.de
Address: 173.194.112.152
Name:    google.de
Address: 173.194.112.159
Name:    google.de
Address: 173.194.112.143
Name:    google.de
Address: 173.194.112.151
# chkconfig named on
# chkconfig named --list
named              0:off    1:off    2:on    3:on    4:on    5:on    6:off

Set up, test and configure the DHCP service
# service dhcpd start
Starting dhcpd:                                            [  OK  ]
# chkconfig  dhcpd on
# chkconfig  dhcpd --list
dhcpd              0:off    1:off    2:on    3:on    4:on    5:on    6:off
Verify DHCP setup with  dhcping
[root@hract21 Desktop]#  dhcping -s 192.168.5.50 -c 192.168.5.197 
Got answer from: 192.168.5.50

DNS Server Setup

The DNS server should have the following VirtualBox network devices configured:
eth0  -> Bridged Network  : inet addr:192.168.1.XXX  Bcast:192.168.1.255  [ Internet Access ]
eth1  -> Internal Network : inet addr:192.168.5.50   Bcast:192.168.5.255  [ Public RAC Interface ]

eth0      Link encap:Ethernet  HWaddr 08:00:27:E6:71:54  
          inet addr:192.168.1.X  Bcast:192.168.1.255  Mask:255.255.255.0

eth1      Link encap:Ethernet  HWaddr 08:00:27:8D:8A:93  
          inet addr:192.168.5.50  Bcast:192.168.5.255  Mask:255.255.255.0   

Setup files used by  DNS : 
  /etc/named.conf  
  /var/named/example.com.db 
  /var/named/192.168.2.db
  /var/named/192.168.5.db


/etc/named.conf :
options {
    listen-on port 53 {  192.168.5.50; 127.0.0.1; };
    directory     "/var/named";
    dump-file     "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
    allow-query     {  any; };
    allow-recursion     {  any; };
    recursion yes;
    dnssec-enable no;
    dnssec-validation no;
};

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

zone "." IN {
    type hint;
    file "named.ca";
};
zone    "5.168.192.in-addr.arpa" IN { // Reverse zone
    type master;
    file "192.168.5.db";
        allow-transfer { any; };
    allow-update { none; };
};
zone    "2.168.192.in-addr.arpa" IN { // Reverse zone
    type master;
    file "192.168.2.db";
        allow-transfer { any; };
    allow-update { none; };
};
zone  "example.com" IN {
      type master;
       notify no;
       file "example.com.db";
};
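
The reverse zone names in named.conf follow from the network addresses: for a /24 network the first three octets are written in reverse order and in-addr.arpa is appended. A minimal sketch of that naming rule, assuming /24 networks only; reverse_zone is a hypothetical helper name used for illustration, not a BIND tool:

```shell
# reverse_zone NETWORK: print the in-addr.arpa zone name for a /24 network.
# Hypothetical helper for illustration; it is not part of BIND.
reverse_zone() {
    # Split the dotted quad into its four octets.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # A /24 reverse zone uses the first three octets in reverse order.
    echo "$3.$2.$1.in-addr.arpa"
}

reverse_zone 192.168.5.0
reverse_zone 192.168.2.0
```

For the two networks used here this prints 5.168.192.in-addr.arpa and 2.168.192.in-addr.arpa, the zone names declared above.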

/var/named/example.com.db: 
$TTL 1H         ; Time to live
$ORIGIN example.com.
@       IN      SOA     ns1.example.com.  hostmaster.example.com.  (
                        2009011202      ; serial (today's date + today's serial #)
                        3H              ; refresh 3 hours
                        1H              ; retry 1 hour
                        1W              ; expire 1 week
                        1D )            ; minimum 24 hour
;
             IN     NS        ns1  ; name server for example.com
ns1          IN     A        192.168.5.50
grac41       IN     A        192.168.5.101  
grac42       IN     A        192.168.5.102  
grac43       IN     A        192.168.5.103  
grac41int    IN     A        192.168.2.101  
grac42int    IN     A        192.168.2.102  
grac43int    IN     A        192.168.2.103 
;
$ORIGIN grid4.example.com.
@       IN          NS        gns4.grid4.example.com. ; NS  grid4.example.com
        IN          NS        ns1.example.com.      ; NS example.com
gns4    IN          A         192.168.5.54 ; glue record
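
The serial comment in the SOA record describes a YYYYMMDDNN scheme: the date followed by a two-digit change counter. The serial has to increase whenever the zone file is edited. A minimal sketch of how the next serial could be derived; next_serial is a hypothetical helper name, and the scheme is a convention rather than something BIND enforces:

```shell
# next_serial CURRENT TODAY: print the next SOA serial in YYYYMMDDNN form.
# CURRENT is the serial now in the zone file, TODAY is a date as YYYYMMDD.
# Hypothetical helper for illustration; the scheme is only a convention.
next_serial() {
    current=$1
    today=$2
    base=$((today * 100))
    if [ "$current" -ge "$((base + 1))" ]; then
        # The zone was already changed on this date (or later): bump the counter.
        echo $((current + 1))
    else
        # First change on this date: start again at revision 01.
        echo $((base + 1))
    fi
}

next_serial 2009011202 20090112            # another change on the same day
next_serial 2009011202 "$(date +%Y%m%d)"   # first change today
```

The first call prints 2009011203; the second prints today's date followed by 01.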



/var/named/192.168.5.db :
$TTL 1H
@       IN      SOA     ns1.example.com.  hostmaster.example.com.  (
                        2009011201      ; serial (today's date + today's serial #)
                        3H              ; refresh 3 hours
                        1H              ; retry 1 hour
                        1W              ; expire 1 week
                        1D )            ; minimum 24 hour
      IN    NS    ns1
ns1     IN       A      192.168.5.50
;
50            PTR       ns1.example.com.
54            PTR       gns4.grid4.example.com. ; reverse mapping for GNS
101           PTR       grac41.example.com. 
102           PTR       grac42.example.com. 
103           PTR       grac43.example.com. 
201           PTR       wls1.example.com. 

/var/named/192.168.2.db :
$TTL 1H
@       IN      SOA     ns1.example.com. hostmaster.example.com.  (
                        2009011201      ; serial (today's date + today's serial #)
                        3H              ; refresh 3 hours
                        1H              ; retry 1 hour
                        1W              ; expire 1 week
                        1D )            ; minimum 24 hour
        IN      NS      ns1
ns1     IN       A         192.168.5.50
; 
101          PTR       grac41int.example.com. 
102          PTR       grac42int.example.com. 
103          PTR       grac43int.example.com.


Verify zone files and restart the named daemon
[root@ns1 named]#  named-checkconf /etc/named.conf
[root@ns1 named]#  named-checkzone example.com example.com.db
zone example.com/IN: grid.example.com/NS 'gns.grid.example.com' (out of zone) has no addresses records (A or AAAA)
zone example.com/IN: grid12c.example.com/NS 'gns12c.grid12c.example.com' (out of zone) has no addresses records (A or AAAA)
zone example.com/IN: grid2.example.com/NS 'gns2.grid2.example.com' (out of zone) has no addresses records (A or AAAA)
zone example.com/IN: grid3.example.com/NS 'gns3.grid3.example.com' (out of zone) has no addresses records (A or AAAA)
zone example.com/IN: grid4.example.com/NS 'gns4.grid4.example.com' (out of zone) has no addresses records (A or AAAA)
zone example.com/IN: loaded serial 2009011202
OK
[root@ns1 named]# named-checkzone example.com  192.168.5.db
zone example.com/IN: loaded serial 2009011201
OK
[root@ns1 named]# named-checkzone example.com  192.168.2.db
zone example.com/IN: loaded serial 2009011201
OK
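
Note that the transcript above passes example.com as the zone name for the two reverse zone files as well. named-checkzone takes the zone name as its first argument, so checking each file under the name declared in /etc/named.conf is probably closer to what named actually loads. A minimal sketch that only prints the commands; check_cmds is a hypothetical helper name, and the zone/file pairs are copied from the zone statements shown earlier:

```shell
# check_cmds: print one named-checkzone command per zone statement.
# The pairs below are "zone name" and "zone file" as declared in /etc/named.conf.
check_cmds() {
    while read -r zone file; do
        printf 'named-checkzone %s /var/named/%s\n' "$zone" "$file"
    done <<'EOF'
example.com example.com.db
5.168.192.in-addr.arpa 192.168.5.db
2.168.192.in-addr.arpa 192.168.2.db
EOF
}

check_cmds
```

To run the checks on the nameserver instead of printing them, the output can be piped to sh: check_cmds | sh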

Verify DNS Setup

[root@ns1 ~]# nslookup google.de
Server:        192.168.5.50
Address:    192.168.5.50#53

Non-authoritative answer:
Name:    google.de
Address: 173.194.67.94

[root@ns1 ~]# nslookup grac41 
Server:        192.168.5.50
Address:    192.168.5.50#53

Name:    grac41.example.com
Address: 192.168.5.101

[root@ns1 ~]# ping -c 2  google.de
PING google.de (173.194.67.94) 56(84) bytes of data.
64 bytes from wi-in-f94.1e100.net (173.194.67.94): icmp_seq=1 ttl=38 time=66.3 ms
64 bytes from wi-in-f94.1e100.net (173.194.67.94): icmp_seq=2 ttl=38 time=62.3 ms
--- google.de ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1064ms
rtt min/avg/max/mdev = 62.373/64.344/66.316/1.987 ms

[root@ns1 ~]# ping -c 2  grac41 
PING grac41.example.com (192.168.5.101) 56(84) bytes of data.
64 bytes from grac41.example.com (192.168.5.101): icmp_seq=1 ttl=64 time=0.200 ms
 64 bytes from grac41.example.com (192.168.5.101): icmp_seq=2 ttl=64 time=0.293 ms
--- grac41.example.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.200/0.246/0.293/0.049 ms

[root@ns1 ~]#  cat /etc/resolv.conf
# Generated by NetworkManager
search example.com grid4.example.com
nameserver 192.168.5.50
[root@ns1 ~]#  netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG        0 0          0 eth0
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
192.168.3.0     0.0.0.0         255.255.255.0   U         0 0          0 eth2
192.168.5.0     0.0.0.0         255.255.255.0   U         0 0          0 eth1

If the GNS server is running, the following commands should work as well:
[root@ns1 ~]# nslookup grac4-scan
Server:        192.168.5.50
Address:    192.168.5.50#53

Non-authoritative answer:
Name:    grac4-scan.grid4.example.com
Address: 192.168.5.167
Name:    grac4-scan.grid4.example.com
Address: 192.168.5.156
Name:    grac4-scan.grid4.example.com
Address: 192.168.5.153

[root@ns1 ~]# ping -c 2  grac4-scan
PING grac4-scan.grid4.example.com (192.168.5.167) 56(84) bytes of data.
64 bytes from 192.168.5.167: icmp_seq=1 ttl=64 time=0.176 ms
64 bytes from 192.168.5.167: icmp_seq=2 ttl=64 time=0.203 ms
--- grac4-scan.grid4.example.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.176/0.189/0.203/0.019 ms
[root@ns1 ~]# dig @192.168.5.50 grac4-scan.grid4.example.com
; <<>> DiG 9.9.3-RedHat-9.9.3-P1.el6 <<>> @192.168.5.50 grac4-scan.grid4.example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18529
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 2

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;grac4-scan.grid4.example.com.    IN    A

;; ANSWER SECTION:
grac4-scan.grid4.example.com. 94 IN    A    192.168.5.167
grac4-scan.grid4.example.com. 94 IN    A    192.168.5.156
grac4-scan.grid4.example.com. 94 IN    A    192.168.5.153

;; AUTHORITY SECTION:
grid4.example.com.    3600    IN    NS    gns4.grid4.example.com.
grid4.example.com.    3600    IN    NS    ns1.example.com.

;; ADDITIONAL SECTION:
ns1.example.com.    3600    IN    A    192.168.5.50

;; Query time: 1 msec
;; SERVER: 192.168.5.50#53(192.168.5.50)
;; WHEN: Sun Jan 11 17:17:51 CET 2015
;; MSG SIZE  rcvd: 158

[root@ns1 ~]#  dig @192.168.5.54 grac4-scan.grid4.example.com
; <<>> DiG 9.9.3-RedHat-9.9.3-P1.el6 <<>> @192.168.5.54 grac4-scan.grid4.example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5071
;; flags: qr aa; QUERY: 1, ANSWER: 3, AUTHORITY: 1, ADDITIONAL: 1

;; QUESTION SECTION:
;grac4-scan.grid4.example.com.    IN    A

;; ANSWER SECTION:
grac4-scan.grid4.example.com. 120 IN    A    192.168.5.153
grac4-scan.grid4.example.com. 120 IN    A    192.168.5.156
grac4-scan.grid4.example.com. 120 IN    A    192.168.5.167

;; AUTHORITY SECTION:
grid4.example.com.    10800    IN    SOA    grac4-gns-vip.grid4.example.com. grac4-gns-vip.grid4.example.com. 264601876 10800 10800 30 120

;; ADDITIONAL SECTION:
grac4-gns-vip.grid4.example.com. 10800 IN A    192.168.5.54

;; Query time: 2 msec
;; SERVER: 192.168.5.54#53(192.168.5.54)
;; WHEN: Sun Jan 11 17:17:59 CET 2015
;; MSG SIZE  rcvd: 160

If GNS is not configured or not running, the lookup fails with:  can't find grac4-scan: NXDOMAIN
[grid@grac41 ~]$  srvctl stop gns
[root@ns1 ~]# ping 192.168.5.54
PING 192.168.5.54 (192.168.5.54) 56(84) bytes of data.
From 192.168.5.50 icmp_seq=2 Destination Host Unreachable
From 192.168.5.50 icmp_seq=3 Destination Host Unreachable
From 192.168.5.50 icmp_seq=4 Destination Host Unreachable
^C
--- 192.168.5.54 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3944ms
pipe 3
[root@ns1 ~]#  nslookup grac4-scan
Server:        192.168.5.50
Address:    192.168.5.50#53

** server can't find grac4-scan: NXDOMAIN

Verify subdomain delegation with cluvfy

Starting with Oracle Database 11g release 2 (11.2.0.2), use the cluvfy comp dns component verification 
command to verify that the Grid Naming Service (GNS) subdomain delegation has been properly set up in 
the Domain Name Service (DNS) server.

Run cluvfy comp dns -server on one node of the cluster. On each node of the cluster run 
cluvfy comp dns -client to verify the DNS server setup for the cluster.

On grac41:
[root@grac41 ~]# cluvfy comp dns -server -domain  grid4.example.com -vipaddress 192.168.5.54/255.255.255.0/eth1 -verbose
Verifying DNS Check 
Starting the test DNS server on IP "192.168.5.54/255.255.255.0/eth1" listening on port 53
Started the IP address "192.168.5.54/255.255.255.0/eth1" on node "grac41"

On grac42: 
[root@grac42 ~]#  cluvfy comp dns -client -domain  grid4.example.com -vip 192.168.5.54
Verifying DNS Check 
Checking if the IP address "192.168.5.54" is reachable
The IP address "192.168.5.54" is reachable from local node
Successfully connected to test DNS server
Checking if the test DNS server started on address "192.168.5.54", listening on port 53 can be queried
Check output of command "cluvfy comp dns -server" to see if it received IP address lookup for name "grac42.grid4.example.com"
Successfully connected to the test DNS server started on address "192.168.5.54", listening on port 53
Checking DNS delegation for the GNS subdomain "grid4.example.com"...
Check output of command "cluvfy comp dns -server" to see if it received IP address lookup for name "grac42.grid4.example.com"
Successfully verified DNS delegation of the GNS subdomain "grid4.example.com"

Verification of DNS Check was successful. 

--> Server should report 
Received IP address lookup query for name "grac42.grid4.example.com"
Received IP address lookup query for name "grac42.grid4.example.com"

On grac43:
[root@grac43 ~]# cluvfy comp dns -client -domain  grid4.example.com -vip 192.168.5.54
..
Verification of DNS Check was successful. 
--> Server should report 
Received IP address lookup query for name "grac43.grid4.example.com"
Received IP address lookup query for name "grac43.grid4.example.com"

On grac41:
[root@grac41 Desktop]#  cluvfy comp dns -client -domain  grid4.example.com -vip 192.168.5.54 
..
Verification of DNS Check was successful. 
--> Server should report 
Received IP address lookup query for name "grac41.grid4.example.com"
Received IP address lookup query for name "grac41.grid4.example.com"

 

Setup DHCP server

DHCP configuration file 
/etc/dhcp/dhcpd.conf :
ddns-update-style interim;
ignore client-updates;

subnet 192.168.5.0 netmask 255.255.255.0 {
        option routers                  192.168.5.1;                    # Default gateway to be used by DHCP clients
        option subnet-mask              255.255.255.0;                  # Default subnet mask to be used by DHCP clients.
        option ip-forwarding            off;                            # Tell DHCP clients not to enable IP forwarding.
        option broadcast-address        192.168.5.255;                  # Default broadcast address to be used by DHCP client.
        option domain-name-servers      192.168.5.50;                   # IP address of the DNS server. 
        option time-offset              -19000;                           # Offset from UTC in seconds; note that Central Standard Time (UTC-6) would be -21600
        option ntp-servers              192.168.5.50;                   # Default NTP server to be used by DHCP clients
        range                           192.168.5.150 192.168.5.254;    # Range of IP addresses that can be issued to DHCP client
        default-lease-time              21600;                            # Amount of time in seconds that a client may keep the IP address
        max-lease-time                  43200;
} 
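
The range and lease settings above can be sanity-checked with a little arithmetic: the pool has to be large enough for every SCAN and node VIP that requests an address, and the lease times are given in seconds. A minimal sketch, assuming both range endpoints are plain IPv4 addresses; ip_to_int and pool_size are hypothetical helper names:

```shell
# ip_to_int ADDRESS: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# pool_size FIRST LAST: number of addresses in an inclusive DHCP range.
pool_size() {
    echo $(( $(ip_to_int "$2") - $(ip_to_int "$1") + 1 ))
}

echo "addresses in pool : $(pool_size 192.168.5.150 192.168.5.254)"
echo "default lease     : $((21600 / 3600)) hours"
echo "max lease         : $((43200 / 3600)) hours"
```

With the values from dhcpd.conf this reports 105 addresses, a default lease of 6 hours and a maximum lease of 12 hours.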

/etc/sysconfig/dhcpd
# Command line options here
DHCPDARGS="eth1"

Restart the DHCP server :
[root@ns1 network-scripts]# service dhcpd restart

 

Verify  DHCP setup with cluvfy

[root@grac41 ~]#  $GRID_HOME/bin/cluvfy comp dhcp -clustername grac4 
Verifying DHCP Check 
Checking if any DHCP server exists on the network...
PRVG-5723 : Network CRS resource is configured to use DHCP provided IP addresses

Verification of DHCP Check was unsuccessful on all the specified nodes. 

From the Oracle documentation:
- You must run this command as root.
- Do not run this check while the default network Oracle Clusterware resource, configured to use a 
   DHCP-provided IP address, is online (because the VIPs get released and, since the cluster is online, 
   DHCP has provided IP, so there is no need to double the load on the DHCP server).
- Before running this command, ensure that the network resource is offline. Use the srvctl stop nodeapps command 
   to bring the network resource offline, if necessary.

As this is a test cluster, go ahead and stop the nodeapps:
[root@grac41 Desktop]#  srvctl stop nodeapps -f

[root@grac41 ~]# $GRID_HOME/bin/cluvfy comp dhcp -clustername grac4 -verbose
Verifying DHCP Check 
Checking if any DHCP server exists on the network...
Checking if network CRS resource is configured and online
Network CRS resource is offline or not configured. Proceeding with DHCP checks.
CRS-10009: DHCP server returned server: 192.168.5.50, loan address : 192.168.5.165/255.255.255.0, lease time: 21600

At least one DHCP server exists on the network and is listening on port 67
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "grac4-scan1-vip"
CRS-10009: DHCP server returned server: 192.168.5.50, loan address : 192.168.5.165/255.255.255.0, lease time: 21600
...
CRS-10012: released DHCP server lease for client ID grac4-scan3-vip on port 67
CRS-10012: released DHCP server lease for client ID grac4-grac41-vip on port 67

DHCP server was able to provide sufficient number of IP addresses
The DHCP server response time is within acceptable limits
Verification of DHCP Check was successful. 

Note: you can track lease operations with the following OS command:
[root@ns1 ~]# tail -f  /var/lib/dhcpd/dhcpd.leases
}
lease 192.168.5.164 {
  starts 0 2015/01/11 17:29:10;
  ends 0 2015/01/11 17:29:10;
  tstp 0 2015/01/11 17:29:10;
  cltt 0 2015/01/11 17:29:10;
  binding state free;
  hardware ethernet 00:00:00:00:00:00;
  uid "\000grac4-grac41-vip";
}
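
Instead of following the file with tail, the current lease states can also be summarized. dhcpd.leases is an append-only log, so when an address appears more than once the last entry is the current one. A minimal sketch using awk; lease_states is a hypothetical helper name, and the second lease in the sample is made up for illustration:

```shell
# lease_states FILE: count leases per current binding state in a dhcpd.leases file.
lease_states() {
    awk '
        $1 == "lease" { ip = $2 }              # remember which lease block we are in
        # Match "binding state X;" but not "next binding state X;".
        $1 == "binding" && $2 == "state" {
            sub(/;$/, "", $3)                  # strip the trailing semicolon
            state[ip] = $3                     # later entries overwrite earlier ones
        }
        END {
            for (ip in state) count[state[ip]]++
            for (s in count) print s, count[s]
        }
    ' "$1"
}

# On the DHCP server this would be: lease_states /var/lib/dhcpd/dhcpd.leases
# Here a small sample file stands in for the real one.
sample=$(mktemp)
cat > "$sample" <<'EOF'
lease 192.168.5.164 {
  binding state free;
  uid "\000grac4-grac41-vip";
}
lease 192.168.5.165 {
  binding state active;
  next binding state free;
}
EOF
lease_states "$sample" | sort
rm -f "$sample"
```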

 

Configure NTP


Configuration file:
/etc/ntp.conf
restrict default nomodify notrap noquery
restrict 127.0.0.1 
# -- CLIENT NETWORK -------
restrict 192.168.5.0 mask 255.255.255.0 nomodify notrap
# --- OUR TIMESERVERS -----  can't reach NTP servers - build my own server 
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
server 127.127.1.0
# --- NTP MULTICASTCLIENT ---
# --- GENERAL CONFIGURATION ---
# Undisciplined Local Clock.
fudge   127.127.1.0 stratum 9
# Drift file.
driftfile /var/lib/ntp/drift
broadcastdelay  0.008
# Keys file.
keys /etc/ntp/keys

Restart NTP daemon
[root@ns1 network-scripts]# service ntpd restart
Shutting down ntpd:                                        [  OK  ]
Starting ntpd:                                             [  OK  ]

Verify setup
[root@ns1 network-scripts]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 foxtrot.zq1.de  122.227.206.195  3 u    2   64    1   68.504  4608.38   1.115
 der.beste.tiger 159.173.11.127   3 u    1   64    1   38.195  4603.43  11.063
 LOCAL(0)        .LOCL.           9 l    2   64    1    0.000    0.000   0.000
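
The ntpq output above was taken right after the restart and shows offsets of about 4.6 seconds (the offset column is in milliseconds). A minimal sketch that flags peers with a large offset; large_offsets is a hypothetical helper name and the 1000 ms limit is an arbitrary assumption:

```shell
# large_offsets LIMIT_MS: read "ntpq -p" output on stdin and print every peer
# whose absolute offset (column 9, in milliseconds) exceeds LIMIT_MS.
large_offsets() {
    awk -v limit="$1" '
        NR > 2 {                       # skip the two header lines
            off = $9 + 0
            if (off < 0) off = -off    # compare the absolute value
            if (off > limit) print $1, $9
        }
    '
}

# On a live system: ntpq -p | large_offsets 1000
large_offsets 1000 <<'EOF'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 foxtrot.zq1.de  122.227.206.195  3 u    2   64    1   68.504  4608.38   1.115
 der.beste.tiger 159.173.11.127   3 u    1   64    1   38.195  4603.43  11.063
 LOCAL(0)        .LOCL.           9 l    2   64    1    0.000    0.000   0.000
EOF
```

For the sample this lists the two pool servers and skips the local clock.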

 

Verify NTP setup with cluvfy

[grid@grac41 ~]$   cluvfy comp clocksync
Verifying Clock Synchronization across the cluster nodes 
Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed
Checking if CTSS Resource is running on all nodes...
CTSS resource check passed
Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed
Check CTSS state started...
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
NTP Configuration file check passed
Checking daemon liveness...
Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes
NTP daemon slewing option check passed
NTP daemon's boot time configuration check for slewing option passed
NTP common Time Server Check started...
Check of common NTP Time Server passed
Clock time offset check from NTP Time Server started...
Clock time offset check passed
Clock synchronization check using Network Time Protocol(NTP) passed
Oracle Cluster Time Synchronization Services check passed
Verification of Clock Synchronization across the cluster nodes was successful.

Reference

Troubleshooting Clusterware startup problems with DTRACE

First Steps which may avoid setting up DTRACE at all

Clean up the special socket files in /var/tmp/.oracle

Either reboot your OS, or clean up the socket files and restart the CRS stack:
[root@hract21 Desktop]# crsctl stop crs -f
[root@hract21 Desktop]# rm -rf /var/tmp/.oracle/*
[root@hract21 Desktop]# crsctl start crs
 CRS-4123: Oracle High Availability Services has been started.

Note: A complete OS reboot may be needed to fix hanging processes waiting on DISKWAIT.
      If possible, always try an OS reboot.
      An OS reboot will always clean up /var/tmp/.oracle/*

Quickly verify your OS with a simple bash script: chk_os.sh

#!/bin/bash 
NS=ns1.example.com
HOSTNAME1=hract21.example.com
HOSTNAME2=hract22.example.com
PRIV_IP1=192.168.2.121
PRIV_IP2=192.168.2.122
PUBLIC_IF=eth1
PRIVATE_IF=eth2

echo ""
echo "Disk Space : "
df

echo ""
echo "Major Clusterware Executable Protections : "
ls -l $GRID_HOME/bin/ohasd*
ls -l $GRID_HOME/bin/orarootagent*
ls -l $GRID_HOME/bin/oraagent*
ls -l $GRID_HOME/bin/mdnsd*
ls -l $GRID_HOME/bin/evmd*
ls -l $GRID_HOME/bin/gpnpd*
ls -l $GRID_HOME/bin/evmlogger*
ls -l $GRID_HOME/bin/osysmond.*
ls -l $GRID_HOME/bin/gipcd*
ls -l $GRID_HOME/bin/cssdmonitor*
ls -l $GRID_HOME/bin/cssdagent*
ls -l $GRID_HOME/bin/ocssd*
ls -l $GRID_HOME/bin/octssd*
ls -l $GRID_HOME/bin/crsd
ls -l $GRID_HOME/bin/crsd.bin
ls -l $GRID_HOME/bin/tnslsnr


echo ""
echo "Ping Nameserver: "
ping -c 2  $NS 

echo ""
echo "Test your PUBLIC interface and your nameserver setup"
nslookup $HOSTNAME1
ping -I $PUBLIC_IF -c 2   $HOSTNAME1
ping -I $PUBLIC_IF -c 2   $HOSTNAME2
 
ping -I $PRIVATE_IF -c 2   $PRIV_IP1 
ping -I $PRIVATE_IF -c 2   $PRIV_IP2

echo ""
echo "Verify protections for HOSTNAME.pid files should be : 644"
find $GRID_HOME -name hract21.pid  -exec ls -l {} \; 

echo ""
echo "Service iptables and avahi-daemon should not run - avahi-daemon uses CW port 5353 "
service iptables status
ps -elf | grep avahi | grep -v grep

echo ""
echo "Ports :53 :5353 :42424 :8888 should not be used by NON-Clusterware processes "
echo "  - OC4J reports : tcp   0 0 ::ffff:127.0.0.1:8888  :::*  LISTEN   501 67433979  2580/java"           
netstat -taupen | egrep ":53 |:5353 |:42424 |:8888 "

echo ""
echo "Compare profile.xml the IP Address of PUBLIC and PRIVATE Interfaces "
echo " - Devices should report UP BROADCAST RUNNING MULTICAST "
echo " - Double check NETWORK addresses matches profile.xml settings   "
echo ""
$GRID_HOME/bin/gpnptool get 2>/dev/null  |  xmllint --format - | egrep 'CSS-Profile|ASM-Profile|Network id'
echo ""
ifconfig $PUBLIC_IF | egrep 'eth|inet addr|MTU'
echo ""
ifconfig $PRIVATE_IF | egrep 'eth|inet addr|MTU'

echo "Checking ASM disk status for disk named /dev/asm ...  - you may need to change this "
ls -l  /dev/asm*

echo ""
echo "Verify ASM disk "
su - grid -c "ssh $HOSTNAME2 ocrcheck"
su - grid -c "ssh $HOSTNAME2  asmcmd lsdsk -k"
echo ""
su - grid -c "kfed read /dev/asmdisk1_10G | grep name"
echo ""
su - grid -c "kfed read /dev/asmdisk2_10G | grep name"
echo ""
su - grid -c "kfed read /dev/asmdisk3_10G | grep name"
echo ""
su - grid -c "kfed read /dev/asmdisk4_10G | grep name"
echo ""


Output:
..
Ports :53 :5353 :42424 :8888 should not be used by NON-Clusterware processes 
  - OC4J reports : tcp   0 0 ::ffff:127.0.0.1:8888  :::*  LISTEN   501 67433979  2580/java
udp        0      0 0.0.0.0:5353                0.0.0.0:*    501        54383580   28618/mdnsd.bin     
udp        0      0 0.0.0.0:5353                0.0.0.0:*    501        54383565   28618/mdnsd.bin     
udp        0      0 0.0.0.0:5353                0.0.0.0:*    501        54383564   28618/mdnsd.bin     
udp        0      0 0.0.0.0:5353                0.0.0.0:*    501        54383563   28618/mdnsd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*    0          54429417   28502/ohasd.bin     
udp        0      0 230.0.1.0:42424             0.0.0.0:*    0          54429416   28502/ohasd.bin     
udp        0      0 224.0.0.251:42424           0.0.0.0:*    0          54429415   28502/ohasd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*    501        54412444   28827/ocssd.bin     
udp        0      0 230.0.1.0:42424             0.0.0.0:*    501        54412443   28827/ocssd.bin     
udp        0      0 224.0.0.251:42424           0.0.0.0:*    501        54412442   28827/ocssd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*    501        54406273   28742/gipcd.bin     
udp        0      0 230.0.1.0:42424             0.0.0.0:*    501        54406272   28742/gipcd.bin     
udp        0      0 224.0.0.251:42424           0.0.0.0:*    501        54406271   28742/gipcd.bin     
udp        0      0 192.168.5.58:53             0.0.0.0:*    0          67400781   2472/gnsd.bin 
tcp        0      0 ::ffff:127.0.0.1:8888        LISTEN      501        67433979   2580/java  
--> mdnsd.bin is using port 5353
    ohasd.bin, ocssd.bin, gipcd.bin are using port 42424
    oc4j is using port 8888           
    GNS is using port 53 

Compare profile.xml the IP Address of PUBLIC and PRIVATE Interfaces 
 - Devices should report UP BROADCAST RUNNING MULTICAST 
 - Double check NETWORK addresses matches profile.xml settings   
    <gpnp:HostNetwork id="gen" HostName="*">
      <gpnp:Network id="net1" IP="192.168.5.0" Adapter="eth1" Use="public"/>
      <gpnp:Network id="net2" IP="192.168.2.0" Adapter="eth2" Use="asm,cluster_interconnect"/>
  <orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
  <orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/ract2/ASMPARAMETERFILE/registry.253.870352347" Mode="remote"/>

eth1      Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.121  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 08:00:27:4E:C9:BF  
          inet addr:192.168.2.121  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

  --> IP="192.168.5.0" Adapter="eth1" should match --> eth1 : inet addr:192.168.5.121  Bcast:192.168.5.255  Mask:255.255.255.0 
      IP="192.168.2.0" Adapter="eth2" should match --> eth2 : inet addr:192.168.2.121  Bcast:192.168.2.255  Mask:255.255.255.0
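
The comparison above boils down to applying the netmask to the interface address and checking that the result equals the IP= value in profile.xml. A minimal sketch of that calculation; network_of is a hypothetical helper name:

```shell
# network_of IP MASK: print the network address of IP under the netmask MASK.
network_of() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # eight fields: four IP octets, then four mask octets
    IFS=$old_ifs
    # AND each address octet with the matching mask octet.
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
}

network_of 192.168.5.121 255.255.255.0   # compare with IP= for Adapter="eth1"
network_of 192.168.2.121 255.255.255.0   # compare with IP= for Adapter="eth2"
```

This prints 192.168.5.0 and 192.168.2.0, which match the net1 and net2 entries in profile.xml.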

Java Tutorial: Annotation and Reflection

Annotation and Reflection

  • Annotations have a number of uses, among them:
    • Information for the compiler: annotations can be used by the compiler to detect errors or suppress warnings.
    • Compile-time and deployment-time processing: software tools can process annotation information to generate code, XML files, and so forth.
    • Runtime processing: some annotations are available to be examined at runtime.
  • Annotation type declarations are similar to normal interface declarations.
  • An at-sign (@) precedes the interface keyword. Sample: public @interface ClassInfo
  • Annotations are not method calls and will not, by themselves, do anything. Rather, a tool such as a JPA provider needs to read the annotations at runtime and carry out the intended action, for example generating an object-relational mapping.
  • The following Java language constructs, among others, can be annotated: class, constructor, field, method, and package.
  • The Java compiler stores annotation metadata in the class files if the annotation has a RetentionPolicy of CLASS or RUNTIME. Only RUNTIME annotations remain visible to the JVM and to other programs via the Reflection API, which can then use the metadata to decide how to interact with the program elements or change their behavior.
  • Annotations can also be used to record who changed a component [ Java class, Java method ].

Custom Annotations Details for Runtime processing with Reflection

ClassInfo - @Target(value = ElementType.TYPE) 
  - Provides information about who changed a specific Java source file
  - Use the reflection code below to display the annotation info
     if(annotation instanceof ClassInfo)
     {
         ClassInfo myAnnotation = (ClassInfo) annotation;
         System.out.println(" -> author          : " + myAnnotation.author());
         System.out.println(" -> date            : " + myAnnotation.date());
         System.out.println(" -> comment         : " + myAnnotation.comments());
     }

CanRun - @Target(value = ElementType.METHOD) 
   - Indicates that we can/should run a certain Java method via reflection
   - Run that method using reflection code:  
         method.invoke(runner);

CanChange - @Target(value = ElementType.FIELD)  
   - Indicates that we can/should modify a certain Java field via reflection 
   - Change an int field using reflection code:  
        f.setInt(runner,k);
 
CanConstruct - @Target(value = ElementType.CONSTRUCTOR) 
   - Indicates that we can/should run a certain Java constructor via reflection
   - Construct a new AnnotationRunner instance and print the int field id1 using reflection code: 
       ctor.setAccessible(true);
       AnnotationRunner  r = (AnnotationRunner)ctor.newInstance();
       Field f = r.getClass().getDeclaredField("id1");
       f.setAccessible(true); 

Note: all of the above annotations are used at runtime, and thus
@Retention(value = RetentionPolicy.RUNTIME) is mandatory

Java source: ClassInfo.java

package utils;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Target(value = ElementType.TYPE)
@Retention(value = RetentionPolicy.RUNTIME)
public @interface ClassInfo 
{
    String author() default "Helmut";
    String date();
    String comments();
}

Java source: CanChange.java

package utils;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Target(value = ElementType.FIELD)
@Retention(value = RetentionPolicy.RUNTIME)
public @interface CanChange 
    {
    }

Java source: CanRun.java

package utils;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Target(value = ElementType.METHOD)
@Retention(value = RetentionPolicy.RUNTIME)
public @interface CanRun {
}

Java source: CanConstruct.java

package utils;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
@Target(value = ElementType.CONSTRUCTOR)
@Retention(value = RetentionPolicy.RUNTIME)
public @interface CanConstruct
    {
    }

Java source – The helper class: ReflectionUtils.java

package utils;
import java.lang.annotation.Annotation;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
/**
 * The class <code>ReflectionUtils</code> is a utility class that provides
 * various helper methods to simplify working with reflection.
 * 
 * @author Michael Inden
 * 
 * Copyright 2011 by Michael Inden
 */
public final class ReflectionUtils
    {

    public static String modifierToString(final int modifier)
      {
        String modifiers = "";
        if (Modifier.isPublic(modifier))
          {
            modifiers += "public ";
          }
        if (Modifier.isProtected(modifier))
          {
            modifiers += "protected ";
          }
        if (Modifier.isPrivate(modifier))
          {
            modifiers += "private ";
          }
        if (Modifier.isStatic(modifier))
          {
            modifiers += "static ";
          }
        if (Modifier.isAbstract(modifier))
          {
            modifiers += "abstract ";
          }
        if (Modifier.isFinal(modifier))
          {
            modifiers += "final ";
          }
        if (Modifier.isVolatile(modifier))
          {
            modifiers += "volatile ";
          }
        if (Modifier.isSynchronized(modifier))
          {
            modifiers += "synchronized ";
          }
        return modifiers;
      }

    public static Field findField(final Class<?> clazz, final String fieldName)
      {
        // end of recursion
        if (clazz == null)
          {
            return null;
          }
        try
          {
            return clazz.getDeclaredField(fieldName);
          } catch (final NoSuchFieldException ex)
          {
            // recursive search in the superclass
            return findField(clazz.getSuperclass(), fieldName);
          }
      }

    public static Method findMethod(final Class<?> clazz, final String methodName, final Class<?>... parameterTypes)
      {
        // end of recursion
        if (clazz == null)
          {
            return null;
          }
        try
          {
            return clazz.getDeclaredMethod(methodName, parameterTypes);
          } catch (final NoSuchMethodException ex)
          {
            // recursive search in the superclass
            return findMethod(clazz.getSuperclass(), methodName, parameterTypes);
          }
      }

    
   public static Method[] getAllMethods(final Class<?> clazz)
      {
        final List<Method> methods = new ArrayList<Method>();
        methods.addAll(Arrays.asList(clazz.getDeclaredMethods()));
        /*
        if (clazz.getSuperclass() != null)
          {
              // recursive search in the superclass
            methods.addAll(Arrays.asList(getAllMethods(clazz.getSuperclass())));
          }
                */
        return methods.toArray(new Method[0]);
      }
       
    public static Field[] getAllFields(final Class<?> clazz)
      {
        final List<Field> fields = new ArrayList<Field>();
        //   Field[] fields = cls.getDeclaredFields();
        fields.addAll(Arrays.asList(clazz.getDeclaredFields()));
       
        return fields.toArray(new Field[0]);
      }
    
     public static Constructor[] getAllConstructors(final Class<?> clazz)
      {
        final List<Constructor> constructors = new ArrayList<Constructor>();
        // getConstructors() returns only the public constructors of the class
        constructors.addAll(Arrays.asList(clazz.getConstructors()));
        return constructors.toArray(new Constructor[0]);
      }
    
    public static void printCtorInfos(final Constructor<?> ctor)
      {
        System.out.println(modifierToString(ctor.getModifiers()) + " " + ctor.getName()
                + buildParameterTypeString(ctor.getParameterTypes()));
        printAnnotations(ctor.getAnnotations());
      }

    public static void printMethodInfos(final Method method)
      {
        System.out.println(modifierToString(method.getModifiers()) + method.getReturnType() + " " + method.getName()
                + buildParameterTypeString(method.getParameterTypes()));
        printAnnotations(method.getAnnotations());
      }

    public static void printFieldInfos(final Field field)
      {
        System.out.println(ReflectionUtils.modifierToString(field.getModifiers()) + field.getType() + " "
                + field.getName());
        printAnnotations(field.getAnnotations());
      }

    public static String buildParameterTypeString(final Class<?>[] parameterTypes)
      {
        if (parameterTypes.length > 0)
          {
            return "(" + Arrays.toString(parameterTypes) + ")";
          }
        return "()";
      }

    private static void printAnnotations(final Annotation[] annotations)
      {
        if (annotations.length > 0)
          {
            System.out.println("Annotations: " + Arrays.toString(annotations));
          }
      }
    
    public static void printClassInfo(final Class<?> clazz )
      {
        System.out.println("Canonical Class Name: " + clazz.getCanonicalName() );
        System.out.println("Class Name          : "  + clazz.getName() ); 
        System.out.println("Superclass Name     : "  + clazz.getSuperclass() ); 
        System.out.println("Interfaces          : "  + Arrays.toString(clazz.getInterfaces())); 
      //  Class c  = runner.getClass();
        Annotation[] annotations = clazz.getAnnotations();
         for(Annotation annotation : annotations)
           {
           if(annotation instanceof ClassInfo)
              {
              ClassInfo myAnnotation = (ClassInfo) annotation;
              System.out.println(" -> autor           : " + myAnnotation.author());
              System.out.println(" -> date            : " + myAnnotation.date());
              System.out.println(" -> comment         : " + myAnnotation.comments());
              }
           }
      }
            
    private ReflectionUtils()
      {
      }
    }
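
The superclass walk used by findField() can be tried out in isolation. The following is a minimal sketch, not the original helper: the class FindFieldDemo, the nested classes Base and Derived and the method readInt() are hypothetical names used only for this illustration, and the recursion is written as a loop.

```java
import java.lang.reflect.Field;

class FindFieldDemo {
    // Hypothetical class hierarchy used only for this illustration
    static class Base { private int baseId = 7; }
    static class Derived extends Base { public int ownId = 1; }

    // Iterative variant of the recursive search: walk up the class hierarchy
    static Field findField(Class<?> clazz, String fieldName) {
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(fieldName);
            } catch (NoSuchFieldException ex) {
                // not declared in this class, continue with the superclass
            }
        }
        return null;
    }

    // Reads an int field via reflection, returns the fallback if the field is missing
    static int readInt(Object target, String fieldName, int fallback) {
        Field f = findField(target.getClass(), fieldName);
        if (f == null) {
            return fallback;
        }
        try {
            f.setAccessible(true);   // needed for the private field in Base
            return f.getInt(target);
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println("baseId  = " + readInt(d, "baseId", -1));
        System.out.println("ownId   = " + readInt(d, "ownId", -1));
        System.out.println("missing = " + readInt(d, "missing", -1));
    }
}
```

getDeclaredField() only looks at the class itself, so without the walk up the hierarchy the private field declared in Base would not be found from Derived.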

Java source: AnnotationRunner.java

package AnnotationTest;
import utils.CanChange;
import utils.CanRun;
import utils.CanConstruct;
import utils.ClassInfo;

@ClassInfo( author="Helmut Hutzer", 
            date="8-Feb-2014",
            comments="Intial Class Creation for Testing annotations" )
public class AnnotationRunner {
    @CanChange
    public int id1 = 1;
    public int id2= 2;
    
    @CanConstruct
    public AnnotationRunner()
        {
        }
    
    public AnnotationRunner(int v_id1, int v_id2)
    {
        id1 = v_id1;
        id2 = v_id2;
    }
    
    public void method1() 
    {
        System.out.println("Hello from method1 : " + id1);
    }

    @CanRun
    public void method2() 
    {
        System.out.println("Hello from method2 !");
    }
    
    public void set_id1(int v_id) 
    {
        id1 = v_id;
    }
    
    public void set_id2(int v_id) 
    {
        id2 = v_id;
    }
}

Java source: MyTest.java

package AnnotationTest;

import java.lang.annotation.Annotation;
import utils.CanChange;
import utils.CanRun;
import utils.CanConstruct;
import java.lang.reflect.Method;
import java.lang.reflect.Field;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
public class MyTest 
{
    public static void main(String[] args) 
    {
        AnnotationRunner runner = new AnnotationRunner();     
        
        System.out.println("\n-->Inspect Class:");
        utils.ReflectionUtils.printClassInfo( runner.getClass());
        
          System.out.println("\n-->Inspect Class:");
        utils.ReflectionUtils.printClassInfo(CanChange.class);
        
        System.out.println("\n--> Exploring Constructors  ");
        Constructor[] constructors =  utils.ReflectionUtils.getAllConstructors(runner.getClass());
        Constructor ctor =null;
        for (  Constructor  constructor : constructors) 
        {
            utils.ReflectionUtils.printCtorInfos(constructor);
                // Find the constructor without any parameter
            if (constructor.getGenericParameterTypes().length == 0)
                ctor = constructor;
        } 
        
          // See: http://docs.oracle.com/javase/tutorial/reflect/member/ctorInstance.html 
        
        if ( ctor != null )
        {
            System.out.println("--> Found a constructor without parameters ");
            Annotation annos = ctor.getAnnotation(CanConstruct.class);
            if (annos != null) 
            {
                try
                {
                    System.out.println("--> Found Annotation CanConstruct   ");
                    utils.ReflectionUtils.printCtorInfos(ctor);
                    ctor.setAccessible(true);
                    AnnotationRunner  r = (AnnotationRunner)ctor.newInstance();
                    Field f = r.getClass().getDeclaredField("id1");
                    f.setAccessible(true);
                    System.out.println("--> Created new instance of class  AnnotationRunner");
                    System.out.println("--> print Field id1 via reflecion : "+   f.get(r));
                } catch ( InstantiationException | IllegalAccessException 
                            | NoSuchFieldException | InvocationTargetException ex ) 
                     { ex.printStackTrace(); }
            }
        }
        
        System.out.println("\n --> Exploring method annotations ");
         // utils.ReflectionUtils.getAllMethods returns only the methods declared in this class;
         // the superclass scan is commented out in ReflectionUtils
        Method[] methods =  utils.ReflectionUtils.getAllMethods(runner.getClass());
        
        for (Method method : methods) 
          {
            utils.ReflectionUtils.printMethodInfos(method);
            Annotation annos = method.getAnnotation(CanRun.class);
            if (annos != null) 
            {
                System.out.println("--> Found CanRun.class Annotation for method: " + method.getName() );
                try {
                    System.out.println("--> Invoking this method via reflection ");
                    method.invoke(runner);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
            else
              {
              System.out.println("--> Found NO Annotation for method: " + method.getName() );  
              }
          }
//        Field[] fields   = runner.getClass().getFields();
        
        System.out.println("\n --> Exploring Attributes:");
        Field[] fields =  utils.ReflectionUtils.getAllFields(runner.getClass());
        for (Field f : fields) 
        {
            utils.ReflectionUtils.printFieldInfos(f);
            Annotation annos = f.getAnnotation(CanChange.class);
            if (annos != null) 
            {
                System.out.println("--> Found Annotation for field: " + f.getName() + " " + f.getType()  );  
                try 
                {                  
                    if (  f.getType().equals(int.class) )
                    {
                        int k = 99 ;
                        System.out.println("--> Current value for id1 : " + runner.id1 );
                        f.setInt(runner,k);
                        System.out.println("--> Changing int value for attr " +f.getName() + 
                                 " via reflection - New value: " + runner.id1 );
                    }
                } 
                catch (IllegalAccessException ex) 
                {
                    ex.printStackTrace();
                }
            }
        }
    }
} 

Program Output running MyTest.java

-->Inspect Class:
Canonical Class Name: AnnotationTest.AnnotationRunner
Class Name          : AnnotationTest.AnnotationRunner
Superclass Name     : class java.lang.Object
Interfaces          : []
 -> autor           : Helmut Hutzer
 -> date            : 8-Feb-2014
 -> comment         : Intial Class Creation for Testing annotations

-->Inspect Class:
Canonical Class Name: utils.CanChange
Class Name          : utils.CanChange
Superclass Name     : null
Interfaces          : [interface java.lang.annotation.Annotation]

--> Exploring Constructors  
public  AnnotationTest.AnnotationRunner([int, int])
public  AnnotationTest.AnnotationRunner()
Annotations: [@utils.CanConstruct()]
--> Found a constructor without parameters 
--> Found Annotation CanConstruct   
public  AnnotationTest.AnnotationRunner()
Annotations: [@utils.CanConstruct()]
--> Created new instance of class  AnnotationRunner
--> print Field id1 via reflecion : 1

--> Exploring method annotations 
public void method1()
--> Found NO Annotation for method: method1
public void method2()
Annotations: [@utils.CanRun()]
--> Found CanRun.class Annotation for method: method2
--> Invoking this method via reflection 
Hello from method2 !
public void set_id1([int])
--> Found NO Annotation for method: set_id1
public void set_id2([int])
--> Found NO Annotation for method: set_id2

 --> Exploring Attributes:
public int id1
Annotations: [@utils.CanChange()]
--> Found Annotation for field: id1 int
--> Current value for id1 : 1
--> Changing int value for attr id1 via reflection - New value: 99
public int id2
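
The pattern used by MyTest (scan the methods, check for a marker annotation, invoke via reflection) can be condensed into one helper. This is only a sketch under assumptions: the class MarkerRunnerDemo, the annotation Invocable and the class Sample are hypothetical and stand in for CanRun and AnnotationRunner.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MarkerRunnerDemo {
    // Hypothetical marker annotation, comparable to CanRun
    @Target(ElementType.METHOD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface Invocable {}

    // Hypothetical target class, comparable to AnnotationRunner
    static class Sample {
        @Invocable public String alpha() { return "alpha ran"; }
        public String beta()  { return "beta ran"; }
        @Invocable public String gamma() { return "gamma ran"; }
    }

    // Invokes every parameterless method marked with @Invocable and collects the results
    static List<String> invokeMarked(Object target) {
        List<String> results = new ArrayList<>();
        for (Method m : target.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(Invocable.class) && m.getParameterCount() == 0) {
                try {
                    results.add(String.valueOf(m.invoke(target)));
                } catch (ReflectiveOperationException ex) {
                    throw new IllegalStateException(ex);
                }
            }
        }
        // getDeclaredMethods() returns methods in no particular order, so sort the results
        Collections.sort(results);
        return results;
    }

    public static void main(String[] args) {
        for (String line : invokeMarked(new Sample())) {
            System.out.println(line);
        }
    }
}
```

The method beta() carries no marker and is therefore skipped, just like method1() in the output above.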


Pitfalls changing Public IP address in a RAC cluster env with detailed debugging steps

Overview

  • Changing the PUBLIC interface in a RAC env is not that simple; you need to take into account:
    • Nameserver changes
    • DHCP server changes including VIPs
    • /etc/hosts changes
    • GNS VIP changes
    • PUBLIC interface changes
      #  oifcfg getif  ->  eth1  192.168.5.0  global  public
  • In any case you should read : How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)

If you still run into problems, here are some debugging details:

  • Note: this tutorial uses the 12.1.0.2 CW logfile structure, which simplifies using the grep command
    a lot as all traces can be found at:  $GRID_HOME/diag/crs/hract21/crs/trace
  • Download the script crsi and run it with the watch utility while booting your CRS stack.
    This gives you a good idea which component is failing or gets restarted and finally switches
    to status OFFLINE
  • As said again and again, cluvfy is your friend to quickly identify the root problem
  • If the network adapter info in profile.xml doesn't match the ifconfig data, GIPCD will not start ( this is true for PUBLIC and CLUSTERINTERCONNECT info )

In this tutorial we will debug the following scenarios by reading logfiles, running OS commands and running cluvfy:

  • Case I   : Nameserver not responding –  GIPCD not starting
  • Case II  : Different  IP address in /etc/hosts and NameServer Lookup  – GIPCD not starting
  • Case III : Wrong Cluster Interconnect Address – GIPCD not starting
  • Case IV  : DHCP server sends wrong IP address – VIPs not starting
  • Case V   : Wrong GNS VIP address – GNS not starting

Potential Errors and Error types

In general we have 2 types of network related errors:

  • OS related errors ( either the bind() or the getaddrinfo() system call was failing )

    • If you want to find GIPCD related errors between 2015-02-03 12:00:00 and 2015-02-03 12:09:59 you may run :     $ grep "2015-02-03 12:0" *  | grep " slos "
    • In this tutorial we handle bind() and getaddrinfo() OS system calls but you may check your traces for:
      send(), recv(), listen() and connect() system call failures too !
    • Note – Only GIPCD prints OS errors with an slos printout like :  slos loc :  getaddrinfo
    • For other components like the MDNSD daemon you may grep your CW traces
      for error strings: "Address already in use" , "Error Connection timed out", "Cannot assign requested address"
  • Logical errors
    • These are not easy to debug as we need to read and understand the CW logs in more detail.
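
If grep is not at hand, the same filtering can be sketched in Java. This is only an illustration under assumptions: the class SlosDepFilter and its method errnosInWindow() are hypothetical names, and the regular expression is derived from the "slos dep" trace lines shown in this article, not from any documented trace format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SlosDepFilter {
    // Matches e.g. "slos dep :  Connection timed out (110)" at the end of a trace line
    private static final Pattern SLOS_DEP =
            Pattern.compile("slos dep\\s*:\\s*(.+?)\\s*\\((\\d+)\\)\\s*$");

    // Returns "<errno> <message>" for every slos dep line starting with the timestamp prefix
    static List<String> errnosInWindow(List<String> traceLines, String timestampPrefix) {
        List<String> hits = new ArrayList<>();
        for (String line : traceLines) {
            if (!line.startsWith(timestampPrefix)) {
                continue;   // outside the time window
            }
            Matcher m = SLOS_DEP.matcher(line);
            if (m.find()) {
                hits.add(m.group(2) + " " + m.group(1));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample lines copied from the gipcd traces quoted in this article
        List<String> lines = Arrays.asList(
            "2015-02-03 09:20:14.952373 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos dep :  Connection timed out (110)",
            "2015-02-03 09:20:14.952381 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos loc :  getaddrinfo(",
            "2015-02-03 15:35:02.928333 :GIPCXCPT:937420544:  gipcmodNetworkProcessBind: slos dep :  Cannot assign requested address (99)");
        for (String hit : errnosInWindow(lines, "2015-02-03 09:2")) {
            System.out.println(hit);
        }
    }
}
```

The timestamp prefix plays the same role as the first grep in the command above, the pattern the role of the second one.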

Error Details

Error I :  Name Server related errors – getaddrinfo() was failing

 OS system call:  getaddrinfo() is failing with errno 110:   Error Connection timed out (110)
 --> see Case I
 Search all CW traces with TS 2015-02-03 09:20:00 --> 2015-02-03 09:29:59 for failed OS call: getaddrinfo
 [grid@hract21 trace]$  grep "2015-02-03 09:2" *  | grep " getaddrinfo"
 gipcd_2.trc:2015-02-03 09:20:09.946273 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos loc :  getaddrinfo(
 gipcd_2.trc:2015-02-03 09:20:14.952381 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos loc :  getaddrinfo

Error II : bind() fails as the local IP address is not available on your system ( verify with ifconfig )

OS system call:  bind() is failing with errno 99 : Error: Cannot assign requested address (99)
 --> see Case II,III
 Search all CW traces with TS 2015-02-03 15:30:00 --> 2015-02-03 15:39:59 for failed OS call: bind
 [grid@hract21 trace]$  grep "2015-02-03 15:3" *  | grep " bind"
 gipcd_2.trc:2015-02-03 15:34:47.898380 :GIPCXCPT:2106038016:  gipcmodNetworkProcessBind: slos loc :  bind
 gipcd_2.trc:2015-02-03 16:39:43.587972 :GIPCXCPT:1288218368:  gipcmodNetworkProcessBind: slos loc :  bind

--> If the OS system call bind() is failing with errno 98 ( Error: Address already in use (98) )
please read :  
Troubleshooting Clusterware and Clusterware component error : Address already in use
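
Whether an address can be bound at all depends on whether it is assigned to a local interface. The following minimal sketch checks this precondition from Java. The class LocalAddressCheck and the method isAssignedLocally() are hypothetical names used only here; the sketch is an alternative to reading the ifconfig output and is not what GIPCD does internally.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

class LocalAddressCheck {
    // Returns true if a local network interface currently carries this IP address.
    // A bind() to an address for which this returns false is expected to fail
    // with "Cannot assign requested address".
    static boolean isAssignedLocally(String ipLiteral) {
        try {
            InetAddress addr = InetAddress.getByName(ipLiteral);
            return NetworkInterface.getByInetAddress(addr) != null;
        } catch (UnknownHostException | SocketException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the addresses to verify as arguments, e.g. 192.168.5.121 192.168.6.121
        for (String ip : args) {
            System.out.println(ip + " assigned locally: " + isAssignedLocally(ip));
        }
    }
}
```

Run on the affected node with the address from the "slos info" trace line as argument.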

Error III: Logical errors ( not related to OS errors )

  • Wrong DHCP Server response : see Case IV
  • Wrong GNS Server address     : see Case V

Case I:  Nameserver not responding –  GIPCD not starting

[root@hract21 Desktop]#  watch crsi
*****  Local Resources: *****
Resource NAME               INST   TARGET    STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.evmd                       1   ONLINE    INTERMEDIATE hract21         STABLE
ora.gipcd                      1   ONLINE    OFFLINE      -               STABLE
ora.gpnpd                      1   ONLINE    ONLINE       hract21         STABLE
ora.mdnsd                      1   ONLINE    ONLINE       hract21         STABLE
ora.storage                    1   ONLINE    OFFLINE      -               STABLE
--> ora.gipcd stays OFFLINE and ora.evmd stays in state INTERMEDIATE

As GIPCD doesn't come up, review tracefile :  gipcd.trc
2015-02-03 09:20:14.952363 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos op  :  sgipcnPopulateAddrInfo
2015-02-03 09:20:14.952373 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos dep :  Connection timed out (110)
2015-02-03 09:20:14.952381 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos loc :  getaddrinfo(
2015-02-03 09:20:14.952391 :GIPCXCPT:2157598464:  gipcmodNetworkResolve: slos info:  server not available,try again
2015-02-03 09:20:14.952455 :GIPCXCPT:2157598464:  gipcResolveF [gipcInternalBind : gipcInternal.c : 537]: EXCEPTION[ ret gipcretFail (1) ]  failed to resolve address 0x7f035c033c10 [0000000000000311] { gipcAddress : name 'tcp://hract21.example.com', objFlags 0x0, addrFlags 0x8 }, flags 0x4000
2015-02-03 09:20:14.952486 :GIPCXCPT:2157598464:  gipcBindF [gipcInternalEndpoint : gipcInternal.c : 468]: EXCEPTION[ ret gipcretFail (1) ]  failed to bind endp 0x7f035c033070 [000000000000030f] { gipcEndpoint : localAddr 'tcp://hract21.example.com', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp (nil) status 13flags 0x40008000, flags-2 0x0, usrFlags 0x240a0 }, addr 0x7f035c034890 [0000000000000316] { gipcAddress : name 'tcp://hract21.example.com', objFlags 0x0, addrFlags 0x8 }, flags 0x200a0
2015-02-03 09:20:14.952552 :GIPCXCPT:2157598464:  gipcInternalEndpoint: failed to bind address to endpoint name 'tcp://hract21.example.com', ret gipcretFail (1)
--> The getaddrinfo() system call is failing -> Nameserver lookup issue

Verify Error with OS commands
[grid@hract21 trace]$  nslookup hract21
;; connection timed out; trying next origin
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

Verify Error with cluvfy 
[grid@hract21 CLUVFY]$  cluvfy comp nodeapp -n hract21
PRVF-0002 : could not retrieve local node name

Fix -> Verify the Nameserver is up and running 
1) Is your nameserver running ?
[root@ns1 ~]# service named status
version: 9.9.3-RedHat-9.9.3-P1.el6
CPUs found: 4
worker threads: 4
UDP listeners per interface: 4
number of zones: 101
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running
named (pid  9193) is running...

2) Can you ping your nameserver ?
[oracle@hract21 JAVA]$ ping ns1.example.com
PING ns1.example.com (192.168.5.50) 56(84) bytes of data.
64 bytes from ns1.example.com (192.168.5.50): icmp_seq=1 ttl=64 time=0.124 ms
64 bytes from ns1.example.com (192.168.5.50): icmp_seq=2 ttl=64 time=0.293 ms

3) Verify that the nameserver is listening on the required IP address and port 
[root@ns1 ~]# netstat -auen  | grep ":53 "
udp        0      0 192.168.5.50:53             0.0.0.0:*                               25         56734      
udp        0      0 127.0.0.1:53                0.0.0.0:*                               25         56732  

Case II  : Different  IP address in /etc/hosts and NameServer Lookup – GIPCD not starting

*****  Local Resources: *****
Resource NAME               INST   TARGET    STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE    OFFLINE      -               STABLE
ora.cluster_interconnect.haip  1   ONLINE    OFFLINE      -               STABLE
ora.crf                        1   ONLINE    ONLINE       hract21         STABLE
ora.crsd                       1   ONLINE    OFFLINE      -               STABLE
ora.cssd                       1   ONLINE    OFFLINE      -               STABLE
ora.cssdmonitor                1   ONLINE    ONLINE       hract21         STABLE
ora.ctssd                      1   ONLINE    OFFLINE      -               STABLE
ora.diskmon                    1   ONLINE    OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE    ONLINE       hract21         STABLE
ora.evmd                       1   ONLINE    INTERMEDIATE hract21         STABLE
ora.gipcd                      1   ONLINE    OFFLINE      -               STABLE
ora.gpnpd                      1   ONLINE    ONLINE       hract21         STABLE
ora.mdnsd                      1   ONLINE    ONLINE       hract21         STABLE
ora.storage                    1   ONLINE    OFFLINE      -               STABLE
--> CSSD and GIPCD remain OFFLINE - STATE_DETAILS switches from STABLE to STARTING but the resources don't come up

gipcd.trc:
2015-02-03 15:35:02.928327 :GIPCXCPT:937420544:  gipcmodNetworkProcessBind: slos op  :  sgipcnTcpBind
2015-02-03 15:35:02.928333 :GIPCXCPT:937420544:  gipcmodNetworkProcessBind: slos dep :  Cannot assign requested address (99)
2015-02-03 15:35:02.928337 :GIPCXCPT:937420544:  gipcmodNetworkProcessBind: slos loc :  bind
2015-02-03 15:35:02.928342 :GIPCXCPT:937420544:  gipcmodNetworkProcessBind: slos info:  addr '192.168.6.121:0'
2015-02-03 15:35:02.928391 :GIPCXCPT:937420544:  gipcBindF [gipcInternalEndpoint : gipcInternal.c : 468]: EXCEPTION[ ret gipcretAddressNotAvailable (39) ]  failed to bind endp 0x7f4624027990 [0000000000000306] { gipcEndpoint : localAddr 'tcp://192.168.6.121', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7f4624033be0 status 13flags 0x20008000, flags-2 0x0, usrFlags 0x20020 }, addr 0x7f4624033070 [000000000000030d] { gipcAddress : name 'tcp://hract21.example.com', objFlags 0x0, addrFlags 0x4 }, flags 0x20020
2015-02-03 15:35:02.928405 :GIPCXCPT:937420544:  gipcInternalEndpoint: failed to bind address to endpoint name 'tcp://hract21.example.com', ret gipcretAddressNotAvailable (39)
2015-02-03 15:35:02.928419 :GIPCXCPT:937420544:  gipchaDaemonThread: gipcEndpointPtr failed (tcp://hract21.example.com), ret gipcretAddressNotAvailable (39)
2015-02-03 15:35:02.928429 :GIPCHDEM:937420544:  gipchaDaemonThreadEntry: EXCEPTION[ ret gipcretAddressNotAvailable (39) ]  terminating daemon thread due to exception
2015-02-03 15:35:02.928455 :GIPCXCPT:1281627904:  gipchaInternalRegister: daemon thread state invalid gipchaThreadStateFailed (5), ret gipcretFail (1)
2015-02-03 15:35:02.928477 :GIPCHGEN:1281627904:  gipchaRegisterF [gipchaInternalResolve : gipchaInternal.c : 1204]: EXCEPTION[ ret gipcretFail (1) ]  failed to register ctx 0xfd09b0 [0000000000000011] { gipchaContext : host 'hract21', name 'gipcd_ha_name', luid 'a94decf7-00000000', name2 5132-2561-c03c-e03e, numNode 0, numInf 0, maxPriority 0, clientMode 1, nodeIncarnation 00000000-00000000 usrFlags 0x0, flags 0xd68 }, name '(null)', flags 0x4000
2015-02-03 15:35:02.928544 :GIPCHGEN:1281627904:  gipchaResolveF [gipcmodGipcResolve : gipcmodGipc.c : 863]: EXCEPTION[ ret gipcretFail (1) ]  failed to resolve ctx 0xfd09b0 [0000000000000011] { gipchaContext : host 'hract21', name 'gipcd_ha_name', luid 'a94decf7-00000000', name2 5132-2561-c03c-e03e, numNode 0, numInf 0, maxPriority 0, clientMode 1, nodeIncarnation 00000000-00000000 usrFlags 0x0, flags 0xd68 }, host 'hract21', port 'gipcdha_hract21_', flags 0x0
2015-02-03 15:35:02.928569 :GIPCXCPT:1281627904:  gipcInternalResolve: failed to resolve addr 0x7f4638099680 [000000000000016a] { gipcAddress : name 'gipcha://hract21:gipcdha_hract21_', objFlags 0x0, addrFlags 0x4 }, ret gipcretFail (1)
 
Verify Error with OS commands
[grid@hract21 trace]$ nslookup hract21
Server:        192.168.5.50
Address:    192.168.5.50#53
Name:    hract21.example.com
Address: 192.168.5.121

[grid@hract21 trace]$ ping hract21
PING hract21 (192.168.6.121) 56(84) bytes of data.
--> Oops, why two different results for nslookup and ping ?
Verify the IP address from  /etc/hosts
[grid@hract21 trace]$ grep hract21 /etc/hosts
192.168.6.121 hract21 hract21.example.com

Verify Error with cluvfy  
[grid@hract21 CLUVFY]$ cluvfy comp nodereach -n  hract21
Verifying node reachability 
Checking node reachability...
PRVF-6006 : unable to reach the IP addresses "hract21" from the local node
PRKC-1071 : Nodes "hract21" did not respond to ping in "3" seconds, 
PRKN-1035 : Host "hract21" is unreachable
Verification of node reachability was unsuccessful on all the specified nodes. 

-> Fix : Keep your /etc/hosts and your BIND server in sync 
         When changing the BIND server always verify the change in /etc/hosts too
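
The comparison between /etc/hosts and the nameserver answer can be automated. The following is a minimal sketch under assumptions: the class HostsDnsCompare and its methods are hypothetical names, the hosts file content is passed in as a list of lines, and the DNS address is the value you got from nslookup (the sketch does not query the nameserver itself).

```java
import java.util.Arrays;
import java.util.List;

class HostsDnsCompare {
    // Returns the first IP address mapped to the host in /etc/hosts style lines, or null
    static String lookupInHosts(List<String> hostsLines, String host) {
        for (String raw : hostsLines) {
            String line = raw;
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);   // strip comments
            }
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] tokens = line.split("\\s+");
            for (int i = 1; i < tokens.length; i++) {
                if (tokens[i].equalsIgnoreCase(host)) {
                    return tokens[0];
                }
            }
        }
        return null;
    }

    // True if /etc/hosts has no entry for the host or the entry matches the DNS address
    static boolean inSync(List<String> hostsLines, String host, String dnsAddress) {
        String hostsIp = lookupInHosts(hostsLines, host);
        return hostsIp == null || hostsIp.equals(dnsAddress);
    }

    public static void main(String[] args) {
        // Values taken from Case II: /etc/hosts entry versus nslookup answer
        List<String> hosts = Arrays.asList(
            "127.0.0.1 localhost",
            "192.168.6.121 hract21 hract21.example.com");
        System.out.println("/etc/hosts : " + lookupInHosts(hosts, "hract21"));
        System.out.println("in sync    : " + inSync(hosts, "hract21", "192.168.5.121"));
    }
}
```

To check a real system, read the lines from /etc/hosts instead of using the hard-coded sample.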

 

Case III : Wrong Cluster Interconnect Address – GIPCD not starting

[root@hract21 Desktop]#  watch crsi
*****  Local Resources: *****
Resource NAME               INST   TARGET    STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE    OFFLINE      -               STABLE
ora.cluster_interconnect.haip  1   ONLINE    OFFLINE      -               STABLE
ora.crf                        1   ONLINE    ONLINE       hract21         STABLE
ora.crsd                       1   ONLINE    OFFLINE      -               STABLE
ora.cssd                       1   ONLINE    OFFLINE      hract21         STARTING
ora.cssdmonitor                1   ONLINE    ONLINE       hract21         STABLE
ora.ctssd                      1   ONLINE    OFFLINE      -               STABLE
ora.diskmon                    1   ONLINE    OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE    ONLINE       hract21         STABLE
ora.evmd                       1   ONLINE    INTERMEDIATE hract21         STABLE
ora.gipcd                      1   ONLINE    OFFLINE      -               STABLE
ora.gpnpd                      1   ONLINE    INTERMEDIATE hract21         STABLE
ora.mdnsd                      1   ONLINE    ONLINE       hract21         STABLE
ora.storage                    1   ONLINE    OFFLINE      -               STABLE
--> GPNPD remains in status INTERMEDIATE GIPCD is in state OFFLINE

gipcd.trc:
2015-02-03 16:39:18.324221 :GIPCHDEM:20907776:  gipchaDaemonThread: starting daemon thread hctx 0x22d39b0 [0000000000000011] { gipchaContext : host 'hract21', name 'gipcd_ha_name', luid 'df31173e-00000000', name2 02ff-37da-c08f-50b4, numNode 0, numInf 0, maxPriority 0, clientMode 1, nodeIncarnation 00000000-00000000 usrFlags 0x0, flags 0xcd60 }
2015-02-03 16:39:23.327691 :GIPCXCPT:20907776:  gipcmodNetworkProcessBind: failed to bind endp 0x7fa3dc027990 [0000000000000306] { gipcEndpoint : localAddr 'tcp://192.168.5.121', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7fa3dc033c80 status 13flags 0x20008000, flags-2 0x0, usrFlags 0x20020 }, addr 0x7fa3dc032310 [0000000000000308] { gipcAddress : name 'tcp://192.168.5.121', objFlags 0x0, addrFlags 0x5 }
2015-02-03 16:39:23.327721 :GIPCXCPT:20907776:  gipcmodNetworkProcessBind: slos op  :  sgipcnTcpBind
2015-02-03 16:39:23.327727 :GIPCXCPT:20907776:  gipcmodNetworkProcessBind: slos dep :  Cannot assign requested address (99)
2015-02-03 16:39:23.327732 :GIPCXCPT:20907776:  gipcmodNetworkProcessBind: slos loc :  bind
2015-02-03 16:39:23.327736 :GIPCXCPT:20907776:  gipcmodNetworkProcessBind: slos info:  addr '192.168.5.121:0'
2015-02-03 16:39:23.327806 :GIPCXCPT:20907776:  gipcBindF [gipcInternalEndpoint : gipcInternal.c : 468]: EXCEPTION[ ret gipcretAddressNotAvailable (39) ]  failed to bind endp 0x7fa3dc027990 [0000000000000306] { gipcEndpoint : localAddr 'tcp://192.168.5.121', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7fa3dc033c80 status 13flags 0x20008000, flags-2 0x0, usrFlags 0x20020 }, addr 0x7fa3dc033070 [000000000000030d] { gipcAddress : name 'tcp://hract21.example.com', objFlags 0x0, addrFlags 0x4 }, flags 0x20020
2015-02-03 16:39:23.327823 :GIPCXCPT:20907776:  gipcInternalEndpoint: failed to bind address to endpoint name 'tcp://hract21.example.com', ret gipcretAddressNotAvailable (39)
2015-02-03 16:39:23.327838 :GIPCXCPT:20907776:  gipchaDaemonThread: gipcEndpointPtr failed (tcp://hract21.example.com), ret gipcretAddressNotAvailable (39)
2015-02-03 16:39:23.327851 :GIPCHDEM:20907776:  gipchaDaemonThreadEntry: EXCEPTION[ ret gipcretAddressNotAvailable (39) ]  terminating daemon thread due to exception
2015-02-03 16:39:23.327943 : GIPCNET:20907776:  gipcmodNetworkUnprepare: failed to unprepare waits for endp 0x7fa3dc027990 [0000000000000306] { gipcEndpoint : localAddr 'tcp://192.168.5.121', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x8, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7fa3dc033c80 status 13flags 0x26008000, flags-2 0x0, usrFlags 0x20020 }
--> Here the bind system call fails with errno 99, which means the IP address 192.168.5.121 is not available on this host ! 
[root@hract21 Desktop]# cat /usr/include/asm-generic/errno.h | grep 99
#define    EADDRNOTAVAIL    99    /* Cannot assign requested address */

Verify Error with OS commands:
[root@hract21 Desktop]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.6.121  Bcast:192.168.6.255  Mask:255.255.255.0
[root@hract21 Desktop]#  ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 08:00:27:4E:C9:BF  
          inet addr:192.168.2.121  Bcast:192.168.2.255  Mask:255.255.255.0
[root@hract21 Desktop]#   $GRID_HOME/bin/gpnptool get 2>/dev/null  |  xmllint --format - | egrep 'CSS-Profile|ASM-Profile|Network id'
    <gpnp:HostNetwork id="gen" HostName="*">
      <gpnp:Network id="net1" IP="192.168.5.0" Adapter="eth1" Use="public"/>
      <gpnp:Network id="net2" IP="192.168.2.0" Adapter="eth2" Use="asm,cluster_interconnect"/>
  <orcl:CSS-Profile id="css" DiscoveryString="+asm" LeaseDuration="400"/>
  <orcl:ASM-Profile id="asm" DiscoveryString="/dev/asm*" SPFile="+DATA/ract2/ASMPARAMETERFILE/registry.253.870352347" Mode="remote"/>
--> GPnPD expects the PUBLIC interface eth1 to be bound to IP address 192.168.5.121 (network 192.168.5.0) and not to 192.168.6.121
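The comparison between the interface address and the network stored in the GPnP profile can be sketched as a small shell helper. This is only an illustration: the function names ip_to_int and ip_in_network are made up for this note, and the netmask 255.255.255.0 is taken from the ifconfig output above.

```shell
# Hypothetical helpers (not part of any Oracle tool).
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    _old_ifs=$IFS
    IFS=.
    set -- $1          # split the dotted quad into $1..$4
    IFS=$_old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if address $1 lies in network $2 with netmask $3, else 1.
ip_in_network() {
    _ip=$(ip_to_int "$1")
    _net=$(ip_to_int "$2")
    _mask=$(ip_to_int "$3")
    [ $(( $_ip & $_mask )) -eq $(( $_net & $_mask )) ]
}

# Sample values taken from the ifconfig and gpnptool output above.
for addr in 192.168.5.121 192.168.6.121; do
    if ip_in_network "$addr" 192.168.5.0 255.255.255.0; then
        echo "$addr matches public network 192.168.5.0"
    else
        echo "$addr does NOT match public network 192.168.5.0"
    fi
done
```

The same check works for any address that must live on the public network, as long as the netmask is known.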

Verify Error with cluvfy:
[grid@hract21 CLUVFY]$  cluvfy comp gpnp -n hract21
Verifying GPNP integrity 
--> cluvfy comp gpnp hangs 

Fix: Change interface eth1 back to 192.168.5.121 and restart the cluster stack

 

Case IV   :  DHCP server returns wrong IP address – VIPs not starting

  • Multiple DHCP servers
  • DHCP server not available
Lower CRS stack starts 
*****  Local Resources: *****
Resource NAME               INST   TARGET    STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE    ONLINE       hract21         STABLE
ora.cluster_interconnect.haip  1   ONLINE    ONLINE       hract21         STABLE
ora.crf                        1   ONLINE    ONLINE       hract21         STABLE
ora.crsd                       1   ONLINE    ONLINE       hract21         STABLE
ora.cssd                       1   ONLINE    ONLINE       hract21         STABLE
ora.cssdmonitor                1   ONLINE    ONLINE       hract21         STABLE
ora.ctssd                      1   ONLINE    ONLINE       hract21         OBSERVER,STABLE
ora.diskmon                    1   OFFLINE    OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE    ONLINE       hract21         STABLE
ora.evmd                       1   ONLINE    ONLINE       hract21         STABLE
ora.gipcd                      1   ONLINE    ONLINE       hract21         STABLE
ora.gpnpd                      1   ONLINE    ONLINE       hract21         STABLE
ora.mdnsd                      1   ONLINE    ONLINE       hract21         STABLE
ora.storage                    1   ONLINE    ONLINE       hract21         STABLE
--> Lower CRS stack is up and running 

VIPs are in state STARTING 
ora.hract21.vip                1   ONLINE       OFFLINE      hract21         STARTING  
ora.hract22.vip                1   ONLINE       ONLINE       hract22         STABLE  
ora.hract23.vip                1   ONLINE       ONLINE       hract23         STABLE  
ora.mgmtdb                     1   ONLINE       ONLINE       hract23         Open,STABLE  
ora.oc4j                       1   ONLINE       ONLINE       hract22         STABLE  
ora.scan1.vip                  1   ONLINE       OFFLINE      hract21         STARTING 

crsd_orarootagent_root.trc
2015-02-03 12:06:42.065910 :CLSDYNAM:2822174464: [ora.hract21.vip]{1:35451:9} [start] DHCP client id = hract21-vip
2015-02-03 12:06:42.065929 :CLSDYNAM:2822174464: [ora.hract21.vip]{1:35451:9} [start] DHCP Server Port = 67
2015-02-03 12:06:42.065940 :CLSDYNAM:2822174464: [ora.hract21.vip]{1:35451:9} [start] DHCP sending packet from = 192.168.5.121
2015-02-03 12:06:42.065949 :CLSDYNAM:2822174464: [ora.hract21.vip]{1:35451:9} [start] DHCP sending packet to = 255.255.255.255
2015-02-03 12:06:47.068966 :GIPCXCPT:2822174464:  gipcWaitF [clsdhcp_sendmessage : clsdhcp.c : 616]: 
       EXCEPTION[ ret (uknown) (910) ]  failed to wait on obj 0x7fcb8c04d770 [0000000000000ddf]
      { gipcEndpoint : localAddr 'udp://0.0.0.0:68', remoteAddr '', numPend 5, numReady 0, numDone 0, numDead 0, numTransfer 0, 
     objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj 0x7fcb8c037e70, sendp 0x7fcb8c037cb0 status 13flags 0x20000002, flags-2 0x0, usrFlags 0x8000 }, reqList 0x7fcba8364658, nreq 1, creq 0x7fcba8364b20 timeout 5000 ms, flags 0x4000
--> After sending a DHCP request we fail in gipcWaitF, which means we have trouble contacting our DHCP server
    or getting the required DHCP address 

Verify Error with OS commands
Download and Install dhcping:
Download location:  http://pkgs.repoforge.org/dhcping  following package : dhcping-1.2-2.2.el6.rf.x86_64.rpm
[root@hract21 Desktop]# rpm -i  /media/sf_kits/Linux/dhcping-1.2-2.2.el6.rf.x86_64.rpm
[root@hract21 Desktop]# dhcping -i eth1
Got answer from: 192.168.3.50
received from 192.168.3.50, expected from 0.0.0.0 
Got answer from: 192.168.3.50
received from 192.168.3.50, expected from 0.0.0.0
no answer
--> Here we see that the answer comes from a wrong DHCP server address (192.168.3.50)
[root@ns1 dhcp]# dhcping -h   08:00:27:7D:8E:49 -s 192.168.5.50 -c 192.168.5.199
no answer
--> This confirms that our DHCP server is running on the wrong IP address (192.168.3.50) and
    cannot serve a DHCP request for a 192.168.5.xx address

Working dhcping output - just for reference :
[root@hract21 Desktop]#  dhcping -h   08:00:27:7D:8E:49 -s 192.168.5.50 -c 192.168.5.199
Got answer from: 192.168.5.50
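
The comparison of the failing and the working dhcping output can be automated with a small filter. This is a sketch only: check_dhcp_answers is a hypothetical helper name, and it assumes dhcping prints "Got answer from: <address>" lines exactly as shown above.

```shell
# Hypothetical helper: read dhcping output on stdin and succeed only if
# at least one answer arrived and every answer came from the expected
# DHCP server address ($1).
check_dhcp_answers() {
    awk -v want="$1" '
        /^Got answer from:/ { seen++; if ($4 != want) bad++ }
        END {
            if (seen == 0) { print "no answer from any DHCP server"; exit 1 }
            if (bad > 0)   { print bad " answer(s) from unexpected server(s)"; exit 1 }
            print "all " seen " answer(s) from " want
        }'
}

# Possible usage on a cluster node (not executed here):
#   dhcping -i eth1 | check_dhcp_answers 192.168.5.50
```

A non-zero exit status indicates either no DHCP answer at all or an answer from a server other than the expected one.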

Verify Error with cluvfy  commands
[root@hract21 CLUVFY]#  cluvfy comp dhcp -clustername ract2 -verbose
Verifying DHCP Check 
Checking if any DHCP server exists on the network...
Checking if network CRS resource is configured and online
Network CRS resource is offline or not configured. Proceeding with DHCP checks.
PRVG-5726 : Failed to discover DHCP servers on public network listening on port "67" using command "/u01/app/121/grid/bin/crsctl discover dhcp -clientid ract2-scan1-vip "
CRS-10010: unable to discover DHCP server in the network listening on port 67 for client ID ract2-scan1-vip
CRS-4000: Command discover failed, or completed with errors.
PRVF-5704 : No DHCP server were discovered on the public network listening on port 67
Verification of DHCP Check was unsuccessful on all the specified nodes. 

Additional info about DHCP setup  
- I always look at /etc/dhcpd.conf, which is wrong - use the /etc/dhcp/dhcpd.conf file instead!
- Note: if changing /etc/dhcpd.conf you may need to change /etc/sysconfig/dhcpd as well
DHCP config files: 
/etc/dhcp/dhcpd.conf 
/etc/sysconfig/dhcpd

 

Case V   : Wrong GNS VIP address – GNS not starting

[root@hract21 network-scripts]#  watch 'crs | grep gns'
ora.gns                        1   ONLINE       OFFLINE      -               STABLE
ora.gns.vip                    1   ONLINE       ONLINE       hract21         STABLE
--> GNS VIP is ONLINE but GNS doesn't start 

gnsd.trc
Oracle Database 12c Clusterware Release 12.1.0.2.0 - Production Copyright 1996, 2014 Oracle. All rights reserved.
    CLSB:489064000: Argument count (argc) for this daemon is 7
    CLSB:489064000: Argument 0 is: /u01/app/121/grid/bin/gnsd.bin
    CLSB:489064000: Argument 1 is: -trace-level
    CLSB:489064000: Argument 2 is: 1
    CLSB:489064000: Argument 3 is: -ip-address
    CLSB:489064000: Argument 4 is: 192.168.6.58
    CLSB:489064000: Argument 5 is: -startup-endpoint
    CLSB:489064000: Argument 6 is: ipc://GNS_hract21_4625_9fe54b1833d5fbd2
2015-02-03 17:29:15.339039 :   CLSNS:489064000: main::clsns_SetTraceLevel:trace level set to 1.
2015-02-03 17:29:16.226261 :     GNS:489064000: main::clsgndmain: ##########################################
2015-02-03 17:29:16.226283 :     GNS:489064000: main::clsgndmain: GNS starting on hract21. Process ID: 29196
2015-02-03 17:29:16.226299 :     GNS:489064000: main::clsgndmain: ##########################################
2015-02-03 17:29:16.226338 :     GNS:489064000: main::clsgnSetTraceLevel: trace level set to 1.
..
2015-02-03 17:29:17.490335 :     GNS:489064000: main::clsgndGetInstanceInfo: version: 12.1.0.2.0 (0xc100200) 
                                 endpoints: tcp://192.168.6.58:63806 process ID: "29196" state: "Initializing".
2015-02-03 17:29:17.491219 :     GNS:489064000: main::clsgndadvAdvertise: Listening for commands on endpoint(s): tcp://192.168.6.58:63806.
2015-02-03 17:29:17.496441 :     GNS:349841152: Resolve::clsgndnsCreateContainerCallback: listening on port 53 address "192.168.6.58"
2015-02-03 17:29:17.499552 :  CLSDMT:351942400: PID for the Process [29196], connkey 12
2015-02-03 17:29:17.505626 :     GNS:343537408: Command #0::clsgndcpRunProcessor: Waiting for client command
2015-02-03 17:29:17.512072 :     GNS:4160747264: Command #1::clsgndcpRunProcessor: Waiting for client command
2015-02-03 17:29:17.516675 :     GNS:4156544768: Command #2::clsgndcpRunProcessor: Waiting for client command
2015-02-03 17:29:17.518326 :     GNS:4154443520: Command #3::clsgndcpRunProcessor: Waiting for client command
2015-02-03 17:29:17.747693 :     GNS:4152342272: Self-check::clsgndscRun: Name: "GNSTESTHOST.grid12c.example.com" Address: 1.2.3.4.
2015-02-03 17:29:53.882538 :     GNS:351942400: main::clsgndCLSDMExit: CLSDM request to quit received - requester: agent.
2015-02-03 17:29:53.882610 :     GNS:351942400: main::clsgndCLSDMExit: terminating GNSD on behalf of CLSDM - requester: agent.
--> Here we are in trouble: GNSD was terminated on request of the agent shortly after startup

crsd_orarootagent_root.trc:
2015-02-03 17:29:24.470729 :   CLSNS:292816640: main::clsnsgFind:(:CLSNS00230:):query to find 
     GNS using service name "_Oracle-GNS._tcp" failed.: 1: clskec:has:CLSNS:5 3 args[has:CLSNS:5][mod=clsns_DNSSD_FindServers][loc=(:CLSNS00152:)]
2015-02-03 17:29:24.470771 :     
     GNS:292816640: main::clsgnctrGetGNSAddressUsingCLSNS: (:CLSGN01053:) GNS address retrieval failed with 
     error CLSNS-00025 (GNS_SERV_FIND_FAIL) - throwing CLSGN-00070. 1: clskec:has:CLSNS:25 3 args[has:CLSNS:25][mod=clsnsgFind][loc=(:CLSNS00216:)]

Verify Error with OS commands:
Check GNS and PUBLIC network interface 
[root@hract21 Desktop]# srvctl config gns
GNS is enabled.
GNS VIP addresses: 192.168.6.58
Domain served by GNS: grid12c.example.com
Check the PUBLIC network interface 
[root@hract21 network-scripts]# ifconfig
eth1:1    Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.156  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:2    Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.157  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:3    Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.153  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:4    Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.151  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:5    Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.152  Bcast:192.168.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1:6    Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.6.58  Bcast:192.168.6.255  Mask:255.255.255.0
--> VIPs are using 192.168.5.X as base address whereas our GNS VIP is using 192.168.6.58
    This is not correct: VIPs and the GNS VIP should have the same network address!

Let's investigate whether somebody changed the GNS base address:
[grid@hract21 trace]$ grep clsgndadvAdvertise gnsd.trc
2015-02-02 12:32:09.447471 : GNS:3141969472: main::clsgndadvAdvertise: 
                             Listening for commands on endpoint(s): tcp://192.168.5.58:46453.
2015-02-03 17:22:00.410829 : GNS:4114409024: main::clsgndadvAdvertise: 
                             Listening for commands on endpoint(s): tcp://192.168.5.58:25702.
2015-02-03 17:24:51.165609 : GNS:2221307456: main::clsgndadvAdvertise: 
                              Listening for commands on endpoint(s):tcp://192.168.6.58:27105.
2015-02-03 17:29:17.491219 : GNS:489064000:  main::clsgndadvAdvertise: 
                             Listening for commands on endpoint(s): tcp://192.168.6.58:63806.
--> GNS base address was changed from  192.168.5.58 to 192.168.6.58 ! 
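
The address change can also be extracted from the trace automatically. The following is a sketch under assumptions: gns_vip_history is a made-up helper name, and each clsgndadvAdvertise entry is assumed to be a single line in gnsd.trc (the entries above are wrapped only for readability).

```shell
# Hypothetical helper: print timestamp and IP of the GNS command endpoint
# each time the IP differs from the previous clsgndadvAdvertise entry.
# Reads the trace file(s) given as arguments, or stdin if none is given.
gns_vip_history() {
    sed -n 's|^\([0-9-]* [0-9:.]*\) .*clsgndadvAdvertise: Listening for commands on endpoint(s): *tcp://\([0-9.]*\):.*|\1 \2|p' "$@" |
        awk '$3 != prev { print; prev = $3 }'
}

# Possible usage in the trace directory (not executed here):
#   gns_vip_history gnsd.trc
```

Each output line marks the first GNS start on a new address, which narrows down the time window in which the GNS VIP was modified.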

Verify Error with cluvfy
[grid@hract21 CLUVFY]$  cluvfy comp gns -postcrsinst  -verbose
Verifying GNS integrity 
Checking GNS integrity...
Checking if the GNS subdomain name is valid...
The GNS subdomain name "grid12c.example.com" is a valid domain name
Checking if the GNS VIP belongs to same subnet as the public network...
PRVF-5213 : GNS resource configuration check failed
PRCI-1156 : The GNS VIP 192.168.6.58 does not match any of the available subnets 192.168.5.0, 192.168.2.0.
Checking if the GNS VIP is a valid address...
GNS VIP "192.168.6.58" resolves to a valid IP address
Checking the status of GNS VIP...
Checking if FDQN names for domain "grid12c.example.com" are reachable
WARNING: 
PRVF-5218 : "hract21-vip.grid12c.example.com" did not resolve into any IP address
PRVF-5827 : The response time for name lookup for name "hract21-vip.grid12c.example.com" exceeded 15 seconds
Checking status of GNS resource...
  Node          Running?                  Enabled?                
  ------------  ------------------------  ------------------------
  hract21       no                        yes                     
  hract22       no                        yes                     
  hract23       no                        yes                     
PRVF-5211 : GNS resource is not running on any node of the cluster
Checking status of GNS VIP resource...
  Node          Running?                  Enabled?                
  ------------  ------------------------  ------------------------
  hract21       yes                       yes                     
  hract22       no                        yes                     
  hract23       no                        yes                     
GNS integrity check failed
Verification of GNS integrity was unsuccessful. 
Checks did not pass for the following node(s):
    hract21
--> Cluvfy is very helpful here as it compares the network addresses with the GNS VIP address.
    If the GNS VIP and the network addresses don't match, cluvfy throws the PRVF-5213 and PRCI-1156 errors.

Fix -> Change GNS VIP back to the original address  and restart GNS
[root@hract21 network-scripts]# srvctl modify gns -vip 192.168.5.58
[root@hract21 network-scripts]# srvctl config gns 
  GNS is enabled.
  GNS VIP addresses: 192.168.5.58
  Domain served by GNS: grid12c.example.com
[root@hract21 network-scripts]# srvctl start gns
[root@hract21 network-scripts]# srvctl config gns -a -l
  GNS is enabled.
  GNS is listening for DNS server requests on port 53
  GNS is using port 5353 to connect to mDNS
  GNS status: OK
  Domain served by GNS: grid12c.example.com
  GNS version: 12.1.0.2.0
  Globally unique identifier of the cluster where GNS is running: 3d7c30fc9a0eeff3ff12b79970a14c12
  Name of the cluster where GNS is running: ract2
  Cluster type: server.
  GNS log level: 1.
  GNS listening addresses: tcp://192.168.5.58:30218.
  GNS is individually enabled on nodes: 
  GNS is individually disabled on nodes: 

Reference

Recreate GNS 12102

Backup profile.xml and OCR and gather data of current GNS setup

As of 11.2/12.1 Grid Infrastructure, the private network configuration is stored not only in OCR but also in the 
GPnP profile. Please take a backup of profile.xml on all cluster nodes before proceeding, as grid user:

[root@hract21 ~]# cd $GRID_HOME/gpnp/hract21/profiles/peer/
[root@hract21 peer]#  cp profile.xml profile.xml_bk-2-FEB-2015
[root@hract21 peer]#  ocrconfig -local -manualbackup
hract21     2015/02/02 09:04:23     /u01/app/121/grid/cdata/hract21/backup_20150202_090423.olr     0     
hract21     2015/01/30 12:40:51     /u01/app/121/grid/cdata/hract21/backup_20150130_124051.olr     0     
[root@hract21 peer]#  ocrconfig -local -showbackup
hract21     2015/02/02 09:04:23     /u01/app/121/grid/cdata/hract21/backup_20150202_090423.olr     0     
hract21     2015/01/30 12:40:51     /u01/app/121/grid/cdata/hract21/backup_20150130_124051.olr     0  

[root@hract21 peer]# oifcfg getif
eth1  192.168.5.0  global  public
eth2  192.168.2.0  global  cluster_interconnect,asm

[root@hract21 peer]# crsctl status resource ora.gns.vip -f | grep USR_ORA_VIP
USR_ORA_VIP=192.168.5.58

[root@hract21 peer]#  ifconfig eth1 | egrep 'eth|inet addr'
eth1      Link encap:Ethernet  HWaddr 08:00:27:7D:8E:49  
          inet addr:192.168.5.121  Bcast:192.168.5.255  Mask:255.255.255.0
[root@hract21 peer]# ifconfig eth2  | egrep 'eth|inet addr'
eth2      Link encap:Ethernet  HWaddr 08:00:27:4E:C9:BF  
          inet addr:192.168.2.121  Bcast:192.168.2.255  Mask:255.255.255.0
[root@hract21 peer]#  ifconfig eth3   | egrep 'eth|inet addr'
eth3      Link encap:Ethernet  HWaddr 08:00:27:3B:89:BF  
          inet addr:192.168.3.121  Bcast:192.168.3.255  Mask:255.255.255.0

[root@hract21 peer]#  srvctl config gns -a -l
GNS is enabled.
GNS is listening for DNS server requests on port 53
GNS is using port 5353 to connect to mDNS
GNS status: OK
Domain served by GNS: grid12c.example.com
GNS version: 12.1.0.2.0
Globally unique identifier of the cluster where GNS is running: 3d7c30fc9a0eeff3ff12b79970a14c12
Name of the cluster where GNS is running: ract2
Cluster type: server.
GNS log level: 1.
GNS listening addresses: tcp://192.168.5.58:39839.
GNS is individually enabled on nodes: 
GNS is individually disabled on nodes: 

Stop resources and recreate  gns, nodeapps

[root@hract21 peer]#  srvctl stop scan_listener 
[root@hract21 peer]#  srvctl stop scan
[root@hract21 peer]#  srvctl stop nodeapps -f
PRCC-1016 : ons was already stopped
PRCR-1005 : Resource ora.ons is already stopped

[root@hract21 peer]#  srvctl stop gns
[root@hract21 Desktop]#  srvctl remove gns 
Remove GNS? (y/[n]) y


[root@hract21 Desktop]# srvctl remove nodeapps
Please confirm that you intend to remove node-level applications on all nodes of the cluster (y/[n]) y
[root@hract21 Desktop]# srvctl  add gns -i 192.168.5.58 -d  grid12c.example.com
[root@hract21 Desktop]# srvctl config gns
GNS is enabled.
GNS VIP addresses: 192.168.5.58
Domain served by GNS: grid12c.example.com
[root@hract21 Desktop]# srvctl config gns -list
CLSNS-00005: operation timed out
  CLSNS-00025: unable to locate GNS
    CLSGN-00070: Service location failed.
[root@hract21 Desktop]# srvctl start gns
[root@hract21 Desktop]# srvctl config gns -list
Oracle-GNS A 192.168.5.58 Unique Flags: 0x115
ract2.Oracle-GNS SRV Target: Oracle-GNS Protocol: tcp Port: 46453 Weight: 0 Priority: 0 Flags: 0x115
ract2.Oracle-GNS TXT CLUSTER_NAME="ract2", CLUSTER_GUID="3d7c30fc9a0eeff3ff12b79970a14c12", NODE_NAME="hract21", SERVER_STATE="RUNNING", VERSION="12.1.0.2.0", DOMAIN="grid12c.example.com" Flags: 0x115
--> No VIPs there  

Recreate Nodeapps
[root@hract21 Desktop]#  srvctl add nodeapps -S 192.168.5.0/255.255.255.0/eth1 
[root@hract21 Desktop]#  srvctl config gns -list
Oracle-GNS A 192.168.5.58 Unique Flags: 0x115
hract21-vip A 192.168.5.246 Unique Flags: 0x1
hract22-vip A 192.168.5.239 Unique Flags: 0x1
hract23-vip A 192.168.5.244 Unique Flags: 0x1
ract2.Oracle-GNS SRV Target: Oracle-GNS Protocol: tcp Port: 46453 Weight: 0 Priority: 0 Flags: 0x115
ract2.Oracle-GNS TXT CLUSTER_NAME="ract2", CLUSTER_GUID="3d7c30fc9a0eeff3ff12b79970a14c12", NODE_NAME="hract21", SERVER_STATE="RUNNING", VERSION="12.1.0.2.0", DOMAIN="grid12c.example.com" Flags: 0x115
--> Now VIPs should be ONLINE 
*****  Cluster Resources: *****
Resource NAME               INST   TARGET       STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.hract21.vip                1   ONLINE       ONLINE       hract21         STABLE  
ora.hract22.vip                1   ONLINE       ONLINE       hract22         STABLE  
ora.hract23.vip                1   ONLINE       ONLINE       hract23         STABLE 

Restart SCAN and SCAN Listeners
[root@hract21 Desktop]#  srvctl start scan
--> Now SCANs should be ONLINE
ora.scan1.vip                  1   ONLINE       ONLINE       hract22         STABLE  
ora.scan2.vip                  1   ONLINE       ONLINE       hract23         STABLE  
ora.scan3.vip                  1   ONLINE       ONLINE       hract21         STABLE  

[root@hract21 Desktop]# srvctl start scan_listener
--> Now SCAN_LISTENER should be ONLINE
*****  Cluster Resources: *****
Resource NAME               INST   TARGET       STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.LISTENER_SCAN1.lsnr        1   ONLINE       ONLINE       hract22         STABLE  
ora.LISTENER_SCAN2.lsnr        1   ONLINE       ONLINE       hract23         STABLE  
ora.LISTENER_SCAN3.lsnr        1   ONLINE       ONLINE       hract21         STABLE  

Verify GNS
[root@hract21 Desktop]#   srvctl config gns -list
Oracle-GNS A 192.168.5.58 Unique Flags: 0x115
hract21-vip A 192.168.5.246 Unique Flags: 0x1
hract22-vip A 192.168.5.239 Unique Flags: 0x1
hract23-vip A 192.168.5.244 Unique Flags: 0x1
ract2-scan A 192.168.5.238 Unique Flags: 0x1
ract2-scan A 192.168.5.243 Unique Flags: 0x1
ract2-scan A 192.168.5.245 Unique Flags: 0x1
ract2-scan1-vip A 192.168.5.243 Unique Flags: 0x1
ract2-scan2-vip A 192.168.5.245 Unique Flags: 0x1
ract2-scan3-vip A 192.168.5.238 Unique Flags: 0x1
ract2.Oracle-GNS SRV Target: Oracle-GNS Protocol: tcp Port: 46453 Weight: 0 Priority: 0 Flags: 0x115
ract2.Oracle-GNS TXT CLUSTER_NAME="ract2", CLUSTER_GUID="3d7c30fc9a0eeff3ff12b79970a14c12", NODE_NAME="hract21", 
   SERVER_STATE="RUNNING", VERSION="12.1.0.2.0", DOMAIN="grid12c.example.com" Flags: 0x115
--> VIPS, SCAN and SCAN VIPS should be ONLINE 
    Congrats you have successfully reconfigured GNS on 12.1.0.2 !

Potential problem : PRCN-2065, PRCN-2067 while recreating nodeapps

Note: stopping nodeapps should also stop ONS!
[grid@hract21 trace]$  srvctl stop nodeapps -n hract21 -f
*****  Local Resources: *****
Resource NAME                  TARGET     STATE           SERVER       STATE_DETAILS                       
-------------------------      ---------- ----------      ------------ ------------------                  
ora.ons                        OFFLINE    OFFLINE         hract21      STABLE   
ora.ons                        ONLINE     ONLINE          hract22      STABLE   
ora.ons                        ONLINE     ONLINE          hract23      STABLE   
[root@hract21 Desktop]# netstat -tapen | egrep '6100|6200'
--> ONS is stopped - ports 6100 and 6200 are not active!
Sometimes during my testing the remote ONS port was still active after:
  srvctl stop nodeapps -f
If we later try to create the nodeapps we get the following error:
[root@hract21 Desktop]#  srvctl add nodeapps -S 192.168.5.0/255.255.255.0/eth1
PRCN-2065 : Ports 6200 are not available on the nodes given
PRCN-2067 : Port 6200 is not available on nodes: hract21,hract22,hract23

Verify TCP port status:
[root@hract22 ~]# netstat -taupen | grep 6200
tcp        0      0 :::6200                    ..  LISTEN      501        441704     21856/ons           
tcp        0      0 ::ffff:192.168.5.122:6200  ..  ESTABLISHED 501        67450915   21856/ons           
tcp        0      0 ::ffff:192.168.5.122:6200  ..  ESTABLISHED 501        72457163   21856/ons 
--> ONS was still running and occupied port 6200. This creates the above error!
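
To see quickly which process holds a given port, the netstat output can be filtered by local port. This is a sketch only: port_owners is a hypothetical helper name, and it assumes the column layout of "netstat -taupen" shown above (local address in column 4, PID/program name in the last column).

```shell
# Hypothetical helper: read "netstat -taupen" output on stdin and print
# the PID/program of every socket whose local address ends in :PORT ($1).
port_owners() {
    awk -v port="$1" '
        $1 ~ /^(tcp|udp)/ {
            n = split($4, parts, ":")      # handles IPv4 and IPv6 notation
            if (parts[n] == port) print $NF
        }' | sort -u
}

# Possible usage on a cluster node (not executed here):
#   netstat -taupen | port_owners 6200
```

An empty result means that no socket is bound to the port on this node, so the port check of srvctl add nodeapps should not complain about it.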

Workaround: use the -skip parameter (for details please read Bug 18317414) 
What does this parameter really do?
[root@hract21 Desktop]# srvctl add nodeapps -skip -help
    -skip        Skip reachability check of VIP address and port validation for ONS

Now recreate the nodeapps with the -skip parameter
[root@hract21 Desktop]#   srvctl add nodeapps  -skip  -S 192.168.5.0/255.255.255.0/eth1
--> Worked !!

Reference

  • Bug 18317414 : LNX64-12.1-INSTALL-SCC:RERUN ROOT.SH FAILED AT ADD NODEAPPS

Troubleshooting Clusterware and Clusterware component error : Address already in use

Generic RAC Portnumber Information

                                                  Default Port   Port Range  Protocol  Used for 
                                                  Number                               CI only? 
Cluster Synchronization Service daemon (CSSD)     42424          Dynamic     UDP       Yes
Oracle Grid Interprocess Communication (GIPCD)    42424          Dynamic     UDP       Yes
Oracle HA Services daemon (OHASD)                 42424          Dynamic     UDP       Yes
Multicast Domain Name Service (MDNSD)              5353          Dynamic     UDP/TCP    No 
Oracle Grid Naming Service (GNSD)                    53          53 (public) UDP/TCP    No
Oracle Notification Services (ONS)                 6100 (local)  Configured  TCP        No
                                                   6200 (remote)   manually
    
Port 42424 :
CSSD  : The Cluster Synchronization Service (CSS) daemon uses a fixed port for node restart 
        advisory messages. This port is used on all interfaces that have broadcast capability. 
        Broadcast  occurs only when a node  eviction restart is imminent.
OHASD : The Oracle High Availability Services (OHAS) daemon starts the Oracle Clusterware 
         stack.
GIPCD : A support daemon that enables Redundant Interconnect Usage.

Port 5353 :
MDNSD : The mDNS process is a background process on Linux and UNIX, and a service on Windows, 
        and is necessary  for Grid Plug and Play and GNS.

Port 53: 
GNSD  : The Oracle Grid Naming Service daemon provides a gateway between the cluster mDNS and 
        external DNS servers. 
        The gnsd process performs name resolution within the cluster.

Port 6100/6200 :
ONS   : Port for ONS, used to publish and subscribe service for communicating information about 
        Fast Application Notification (FAN) events. The FAN notification process uses system 
        events that Oracle Database publishes  when cluster servers become unreachable or if 
        network interfaces fail.
        Use srvctl to modify ONS port

Verify port usage at OS level
As GNS runs on only a single node of the cluster, we need to relocate GNS to the local node first:
[root@hract21 ~]# srvctl relocate gns -n hract21 

[root@hract21 Desktop]#  netstat -taupen |grep ":42424 "
udp        0      0 192.168.2.255:42424         0.0.0.0:*  0          10361774   11545/ohasd.bin     
udp        0      0 230.0.1.0:42424             0.0.0.0:*  0          10361773   11545/ohasd.bin     
udp        0      0 224.0.0.251:42424           0.0.0.0:*  0          10361772   11545/ohasd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*  501        10361732   11764/gipcd.bin     
udp        0      0 230.0.1.0:42424             0.0.0.0:*  501        10361731   11764/gipcd.bin     
udp        0      0 224.0.0.251:42424           0.0.0.0:*  501        10361730   11764/gipcd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*  501        10361722   11825/ocssd.bin     
udp        0      0 230.0.1.0:42424             0.0.0.0:*  501        10361721   11825/ocssd.bin     
udp        0      0 224.0.0.251:42424           0.0.0.0:*  501        10361720   11825/ocssd.bin 

[root@hract21 Desktop]# netstat -taupen |grep ":53 "
udp        0      0 192.168.5.58:53             0.0.0.0:*   0          46593880   5261/gnsd.bin  

[root@hract21 Desktop]#  netstat -taupen |grep ":5353 "
udp        0      0 0.0.0.0:5353            0.0.0.0:*  501        1378331    11724/mdnsd.bin     
udp        0      0 0.0.0.0:5353            0.0.0.0:*  501        1378210    11724/mdnsd.bin     
udp        0      0 0.0.0.0:5353            0.0.0.0:*  501        1378209    11724/mdnsd.bin     
udp        0      0 0.0.0.0:5353            0.0.0.0:*  501        1378208    11724/mdnsd.bin 

[root@hract21 Desktop]#  netstat -taupen |grep ":6100 "
tcp        0      0 127.0.0.1:6100     0.0.0.0:*     LISTEN  501  10419706   31762/ons    
..

 

Prepare Test program JavaUDPServer.java

Source can be found here : Simple Java example of UDP Client/Server communication

[root@hract21 JAVA]#  javac JavaUDPServer.java

Testing when a port is free and our program can successfully listen on that port: 
[root@hract21 JAVA]# java  JavaUDPServer 59
Listening on UDP Port: 59
--> press <Ctrl>-C to terminate the program

Testing program  when port is already in use
[root@hract21 JAVA]# java  JavaUDPServer  53
Listening on UDP Port: 53
Jan 31, 2015 4:57:29 PM JavaUDPServer main
SEVERE: null
java.net.BindException: Address already in use
    at java.net.PlainDatagramSocketImpl.bind0(Native Method)
    at java.net.PlainDatagramSocketImpl.bind(PlainDatagramSocketImpl.java:125)
    at java.net.DatagramSocket.bind(DatagramSocket.java:372)

 

Case I: Clusterware startup fails as port number 42424 is in use

Start our test program to block UDP port 42424
[root@hract21 JAVA]#  java  JavaUDPServer  42424
Listening on UDP Port: 42424

Start CRS and monitor local CRS stack
[root@hract21 Desktop]# crsctl start crs
*****  Local Resources: *****
Resource NAME               INST   TARGET    STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE    OFFLINE      -               STABLE
ora.cluster_interconnect.haip  1   ONLINE    OFFLINE      -               STABLE
ora.crf                        1   ONLINE    OFFLINE      -               STABLE
ora.crsd                       1   ONLINE    OFFLINE      -               STABLE
ora.cssd                       1   ONLINE    OFFLINE      hract21         STARTING
ora.cssdmonitor                1   ONLINE     ONLINE       hract21         STABLE
ora.ctssd                      1   ONLINE    OFFLINE      -               STABLE
ora.diskmon                    1   OFFLINE    OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE    ONLINE       hract21         STABLE
ora.evmd                       1   ONLINE    INTERMEDIATE hract21         STABLE
ora.gipcd                      1   ONLINE    ONLINE       hract21         STABLE
ora.gpnpd                      1   ONLINE    ONLINE       hract21         STABLE
ora.mdnsd                      1   ONLINE    ONLINE       hract21         STABLE
ora.storage                    1   ONLINE    OFFLINE      -               STABLE

--> The evmd process remains in state INTERMEDIATE and ora.cssd stays in STARTING. The local cluster stack doesn't come up!
Investigate Trace files:
alert.log: 
2015-01-31 17:14:54.492 [CSSDAGENT(22642)]CRS-5818: Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:9:3} in /u01/app/grid/diag/crs/hract21/crs/trace/ohasd_cssdagent_root.trc.
Sat Jan 31 17:14:59 2015
Errors in file /u01/app/grid/diag/crs/hract21/crs/trace/ocssd.trc  (incident=1):
CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []

gipcd.trc:
2015-01-31 17:20:27.606277 :GIPCHTHR:812046080:  gipchaWorkerCreateInterface: created local interface for node 'hract21', haName 'gipcd_ha_name', inf 'udp://192.168.2.121:28764' inf 0x7fef0c190b30
2015-01-31 17:20:27.606350 :GIPCXCPT:812046080:  gipcmodNetworkProcessBind: failed to bind endp 0x7fef182d8230 [000000000001e71a] { gipcEndpoint : localAddr 'mcast://224.0.0.251:42424/192.168.2.121', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7fef182da320 status 13flags 0x20000000, flags-2 0x0, usrFlags 0xc000 }, addr 0x7fef182d8cf0 [000000000001e71c] { gipcAddress : name 'mcast://224.0.0.251:42424/192.168.2.121', objFlags 0x0, addrFlags 0x5 }
2015-01-31 17:20:27.606358 :GIPCXCPT:812046080:  gipcmodNetworkProcessBind: slos op  :  sgipcnMctBind
2015-01-31 17:20:27.606360 :GIPCXCPT:812046080:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
2015-01-31 17:20:27.606361 :GIPCXCPT:812046080:  gipcmodNetworkProcessBind: slos loc :  bind
2015-01-31 17:20:27.606363 :GIPCXCPT:812046080:  gipcmodNetworkProcessBind: slos info:  Invalid argument
2015-01-31 17:20:27.606399 :GIPCXCPT:812046080:  gipcBindF [gipcInternalEndpoint : gipcInternal.c : 468]: EXCEPTION[ ret gipcretAddressInUse (20) ]  failed to bind endp 0x7fef182d8230 [000000000001e71a] { gipcEndpoint : localAddr 'mcast://224.0.0.251:42424/192.168.2.121', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7fef182da320 status 13flags 0x20000000, flags-2 0x0, usrFlags 0xc000 }, addr 0x7fef182d9a20 [000000000001e721] { gipcAddress : name 'mcast://224.0.0.251:42424/192.168.2.121', objFlags 0x0, addrFlags 0x4 }, flags 0x8000
2015-01-31 17:20:27.606408 :GIPCXCPT:812046080:  gipcInternalEndpoint: failed to bind address to endpoint name 'mcast://224.0.0.251:42424/192.168.2.121', ret gipcretAddressInUse (20)
2015-01-31 17:20:27.606426 :GIPCHTHR:812046080:  gipchaWorkerUpdateInterface: EXCEPTION[ ret gipcretAddressInUse (20) ]  failed to create local interface 'udp://192.168.2.121', 0x7fef0c190b30 { host '', haName 'gipcd_ha_name', local (nil), ip '192.168.2.121', subnet '192.168.2.0', mask '255.255.255.0', mac '08-00-27-4e-c9-bf', ifname 'eth2', numRef 0, numFail 0, idxBoot 0, flags 0x1841 }, hctx 0x10639b0 [0000000000000011] { gipchaContext : host 'hract21', name 'gipcd_ha_name', luid '8c45d6e7-00000000', name2 3aca-bf27-17d5-691e, numNode 0, numInf 1, maxPriority 0, clientMode 1, nodeIncarnation d64c9b7c-06451148 usrFlags 0x0, flags 0x2d65 }
2015-01-31 17:20:27.606432 :GIPCHGEN:812046080:  gipchaInterfaceDisable: disabling interface 0x7fef0c190b30 { host '', haName 'gipcd_ha_name', local (nil), ip '192.168.2.121', subnet '192.168.2.0', mask '255.255.255.0', mac '08-00-27-4e-c9-bf', ifname 'eth2', numRef 0, numFail 0, idxBoot 0, flags 0x1841 }
2015-01-31 17:20:27.606438 :GIPCHDEM:812046080:  gipchaWorkerCleanInterface: performing cleanup of disabled interface 0x7fef0c190b30 { host '', haName 'gipcd_ha_name', local (nil), ip '192.168.2.121', subnet '192.168.2.0', mask '255.255.255.0', mac '08-00-27-4e-c9-bf', ifname 'eth2', numRef 0, numFail 0, idxBoot 0, flags 0x1861 }
2015-01-31 17:20:27.60

Investigate the error in more detail:
gipcmodNetworkProcessBind: slos dep :  Address already in use (98) 
[root@hract21 Desktop]# cat  /usr/include/asm-generic/errno.h | grep 98
#define    EADDRINUSE    98    /* Address already in use */
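
A plain "grep 98" also matches any other line that happens to contain the digits 98. The sketch below is a stricter lookup that only compares the numeric value of a #define. The function name errno_name is our own invention; the default header path is the one used above.

```shell
# Hypothetical helper (not part of any Oracle or Linux tool):
# print the symbolic name for a numeric errno value.
#   $1 = errno number
#   $2 = header file (optional, defaults to the path used in this article)
errno_name() {
    awk -v n="$1" '$1 == "#define" && $3 == n { print $2 }' \
        "${2:-/usr/include/asm-generic/errno.h}"
}

# Usage on the RAC node:
#   errno_name 98
```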

Locate the port number :
gipcEndpoint : localAddr 'mcast://224.0.0.251:42424/192.168.2.121' -->   42424 is the port
--> CW can't bind to port 42424!
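
Reading the port out of these long trace lines by eye is error prone. As a minimal sketch, assuming the endpoint always appears as mcast://, udp:// or tcp:// followed by IP:port (as in the excerpts above), a small filter can pull the port numbers out. The name gipc_ports is hypothetical.

```shell
# Hypothetical helper: read trace lines on stdin and print every port that
# appears in a gipc endpoint URL such as 'mcast://224.0.0.251:42424/...'.
# Output is one port per line, sorted numerically, duplicates removed.
gipc_ports() {
    grep -oE "(mcast|udp|tcp)://[0-9.]+:[0-9]+" \
        | sed -E 's/.*:([0-9]+)$/\1/' \
        | sort -un
}

# Usage on the RAC node:
#   grep "failed to bind" gipcd.trc | gipc_ports
```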

Locate the  blocking process at OS level
[root@hract21 Desktop]#    netstat -taupen |grep ":42424 "
udp        0      0 :::42424         ....    22338/java          
[root@hract21 Desktop]# ps -elf | grep 22338
0 S root     22338 26783  0  80   0 - 438331 futex_ 17:04 pts/12  00:00:01 java JavaUDPServer 42424

--> Yes, our Java program blocks CW from coming up! Kill the Java program and restart CW:
[root@hract21 Desktop]# kill -9 22338
[root@hract21 Desktop]# crsctl stop crs -f
[root@hract21 Desktop]# crsctl start crs
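
The netstat/grep lookup used above can be wrapped in a small filter that prints only the protocol, the local address and the owning PID/program. This is a sketch under the assumption that the input has the column layout of "netstat -taupen" shown in this article; port_owner is a name we made up.

```shell
# Hypothetical helper: read `netstat -taupen` output on stdin and print
# "proto local-address PID/program" for every socket bound to port $1.
port_owner() {
    awk -v port="$1" '
        # field 4 is the local address; require an exact ":<port>" suffix
        # so that port 53 does not match 53498
        ($1 ~ /^(tcp|udp)/) && ($4 ~ (":" port "$")) { print $1, $4, $NF }
    '
}

# Usage on the RAC node:
#   netstat -taupen | port_owner 42424
```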

 

Case II: Clusterware startup fails as Portnumber  5353  is in use

Start our test program and block the MDNSD port 5353
[root@hract21 JAVA]#  java  JavaUDPServer 5353
Listening on UDP Port: 5353

*****  Local Resources: *****
Resource NAME               INST   TARGET       STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.asm                        1   ONLINE       ONLINE       hract21         STABLE
ora.cluster_interconnect.haip  1   ONLINE       ONLINE       hract21         STABLE
ora.crf                        1   ONLINE       ONLINE       hract21         STABLE
ora.crsd                       1   ONLINE       ONLINE       hract21         STABLE
ora.cssd                       1   ONLINE       ONLINE       hract21         STABLE
ora.cssdmonitor                1   ONLINE       ONLINE       hract21         STABLE
ora.ctssd                      1   ONLINE       ONLINE       hract21         OBSERVER,STABLE
ora.diskmon                    1   OFFLINE      OFFLINE      -               STABLE
ora.drivers.acfs               1   ONLINE       ONLINE       hract21         STABLE
ora.evmd                       1   ONLINE       ONLINE       hract21         STABLE
ora.gipcd                      1   ONLINE       ONLINE       hract21         STABLE
ora.gpnpd                      1   ONLINE       ONLINE       hract21         STABLE
ora.mdnsd                      1   ONLINE       INTERMEDIATE hract21         STABLE
ora.storage                    1   ONLINE       ONLINE       hract21         STABLE
--> The MDNSD daemon doesn't start

mdnsd.trc :
Oracle Database 12c Clusterware Release 12.1.0.2.0 - Production Copyright 1996, 2014 Oracle. All rights reserved.
    CLSB:2559100480: Argument count (argc) for this daemon is 1
    CLSB:2559100480: Argument 0 is: /u01/app/121/grid/bin/mdnsd.bin
2015-01-31 17:40:17.131516 :  CLSDMT:2554820352: PID for the Process [9863], connkey 9
2015-01-31 17:40:18.042329 :    MDNS:2559100480:  mdnsd interface eth0 (0x2 AF=2 f=0x1043 mcast=-1) 192.168.1.9 mask 255.255.255.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.043191 :    MDNS:2559100480:  mdnsd interface eth1 (0x3 AF=2 f=0x1043 mcast=-1) 192.168.5.121 mask 255.255.255.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.046952 :    MDNS:2559100480:  mdnsd interface eth1:1 (0x3 AF=2 f=0x1043 mcast=-1) 192.168.5.241 mask 255.255.255.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.047574 :    MDNS:2559100480:  mdnsd interface eth1:2 (0x3 AF=2 f=0x1043 mcast=-1) 192.168.5.242 mask 255.255.255.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.047597 :    MDNS:2559100480:  mdnsd interface eth2 (0x4 AF=2 f=0x1043 mcast=-1) 192.168.2.121 mask 255.255.255.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.047612 :    MDNS:2559100480:  mdnsd interface eth2:1 (0x4 AF=2 f=0x1043 mcast=-1) 169.254.213.86 mask 255.255.0.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.049171 :    MDNS:2559100480:  mdnsd interface eth3 (0x5 AF=2 f=0x1043 mcast=-1) 192.168.3.121 mask 255.255.255.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.049222 :    MDNS:2559100480:  mdnsd interface lo (0x1 AF=2 f=0x49 mcast=-1) 127.0.0.1 mask 255.0.0.0 FAILED. Error 98 (Address already in use)
2015-01-31 17:40:18.049236 :    MDNS:2559100480:  Error! No valid netowrk interfaces found to setup mDNS.
2015-01-31 17:40:18.049240 :    MDNS:2559100480:  Oracle mDNSResponder ver. mDNSResponder-1076 (Jun 30 2014 19:39:45) , init_rv=-65537
2015-01-31 17:40:18.049335 :    MDNS:2559100480:  stopping

--> Here we only get the error "Address already in use" but no info about the port number.
    We need to refer to the list above and remember that MDNSD is running on port 5353

Now we can locate the blocking process, kill that process and restart Clusterware
[root@hract21 Desktop]#  netstat -taupen |grep ":5353 "
udp        0      0 :::5353         ...            50111629   7252/java   
Again our Java program prevents CW from starting up. Kill that process and restart CW.
[root@hract21 Desktop]# kill -9 7252

 

Case III: Investigate GNS startup problem due to Error:  Address already in use

Relocate GNS to a different host
[root@hract21 Desktop]# srvctl relocate gns -n hract23
ora.gns                        1   ONLINE       ONLINE       hract23         STABLE
ora.gns.vip                    1   ONLINE       ONLINE       hract23         STABLE

Now occupy port 53 by running our Java program:
[root@hract21 JAVA]# java  JavaUDPServer 53
Listening on UDP Port: 53

Now try to bring back the GNS 
[root@hract21 Desktop]# srvctl relocate gns -n hract21
*****  Cluster Resources: *****
Resource NAME               INST   TARGET       STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.gns                        1   ONLINE       OFFLINE      hract21         STARTING
ora.gns.vip                    1   ONLINE       ONLINE       hract21         STABLE
--> GNS is in status STARTING but doesn't come up

gnsd.trc :
2015-01-31 18:09:13.518516 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos op  :  sgipcnTcpBind
2015-01-31 18:09:13.518518 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
2015-01-31 18:09:13.518520 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos loc :  bind
2015-01-31 18:09:13.518521 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos info:  addr '192.168.5.58:53'
2015-01-31 18:09:13.518577 :GIPCXCPT:255158016:  gipcBindF [gipcInternalEndpoint : gipcInternal.c : 468]: EXCEPTION[ ret gipcretAddressInUse (20) ]  failed to bind endp 0x7ff7000034c0 [0000000000001fc6] { gipcEndpoint : localAddr 'udp://192.168.5.58:53', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7ff7000050f0 status 13flags 0x20008000, flags-2 0x0, usrFlags 0x24000 }, addr 0x7ff7000047f0 [0000000000001fcd] { gipcAddress : name 'udp://192.168.5.58:53', objFlags 0x0, addrFlags 0x4 }, flags 0x20000
2015-01-31 18:09:13.518589 :GIPCXCPT:255158016:  gipcInternalEndpoint: failed to bind address to endpoint name 'udp://192.168.5.58:53', ret gipcretAddressInUse (20)
2015-01-31 18:09:13.518608 :GIPCXCPT:255158016:  gipcEndpointF [clsgngipcCreateEndpointInternal : clsgngipc.c : 2008]: EXCEPTION[ ret gipcretAddressInUse (20) ]  failed endp create ctx 0x7ff7196f3c80 [0000000000001e99] { gipcContext : traceLevel 2, fieldLevel 0x0, numDead 0, numPending 0, numZombie 0, numObj 4, numWait 0, numReady 0, wobj 0x7ff7196f1c10, hgid 0000000000001e9a, flags 0x1a, objFlags 0x0 }, name 'udp://192.168.5.58:53', flags 0x24000
2015-01-31 18:09:13.518728 :     GNS:255158016: Resolve::clsgndnsCreateContainerCallback: (:CLSGN01163:) Error - Address in use: port 53 address "192.168.5.58". 1: clskec:has:CLSGN:208 2 args[has:CLSGN:208][udp://192.168.5.58:53]
2: clskec:has:gipc:20 1 args[has:gipc:20]
3: clskec:has:CLSU:910 4 args[has][mod=gipcInternalEndpoint][loc=473][msg=failed to bind address to endpoint name 'udp://192.168.5.58:53']
2015-01-31 18:09:13.518769 :     GNS:255158016: Resolve::clsgndnsCreateContainer: (:CLSGN00927:) failed to listen on all addresses - throwing error.
default:255158016: listen failed with 1 errors
1: clskec:has:CLSGN:208 3 args[has:CLSGN:208][192.168.5.58][53]

The following error messages tell us the Linux errno code and the related port number:
2015-01-31 18:09:13.518518 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
2015-01-31 18:09:13.518521 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos info:  addr '192.168.5.58:53'

Again locate the process that holds the port
[root@hract21 Desktop]#  netstat -taupen |grep ":53 "
udp    16128      0 :::53          ...          51417680   23723/java
Kill the process which holds the port, then retry the GNS relocation
[root@hract21 Desktop]# kill -9  23723

Now test whether relocating GNS works again
[root@hract21 ~]#   srvctl relocate gns -n hract21
*****  Cluster Resources: *****
Resource NAME               INST   TARGET       STATE        SERVER          STATE_DETAILS
--------------------------- ----   ------------ ------------ --------------- -----------------------------------------
ora.gns                        1   ONLINE       ONLINE       hract21         STABLE
ora.gns.vip                    1   ONLINE       ONLINE       hract21         STABLE

Complete Portnumber Usage of a working RAC system

[root@hract21 ~]#   netstat -taupen |grep 192.168
tcp        0      0 192.168.5.242:1521          0.0.0.0:*                   LISTEN      501        50803141   17310/tnslsnr       
tcp        0      0 192.168.5.241:1521          0.0.0.0:*                   LISTEN      501        50793916   17258/tnslsnr       
tcp        0      0 192.168.5.121:1521          0.0.0.0:*                   LISTEN      501        50793894   17258/tnslsnr       
tcp        0      0 192.168.2.121:1522          0.0.0.0:*                   LISTEN      501        50790436   17212/tnslsnr       
tcp        0      0 192.168.2.121:61020         0.0.0.0:*                   LISTEN      0          50773311   16994/osysmond.bin  
tcp        0      0 192.168.5.121:42942         0.0.0.0:*                   LISTEN      501        50724207   16454/gipcd.bin     
tcp        0      0 192.168.5.58:39839          0.0.0.0:*                   LISTEN      0          51856456   27381/gnsd.bin      
tcp        0      0 192.168.5.232:36063         0.0.0.0:*                   LISTEN      502        8145376    596/exectask        
tcp        0      0 192.168.5.121:15043         0.0.0.0:*                   LISTEN      0          50730332   16281/ohasd.bin     
tcp        0      0 192.168.5.121:42942         192.168.5.123:28657         ESTABLISHED 501        50841598   16454/gipcd.bin     
tcp        0      0 192.168.5.241:1521          192.168.5.121:55119         ESTABLISHED 501        50829166   17258/tnslsnr       
tcp        0      0 192.168.2.121:46847         192.168.2.122:1522          ESTABLISHED 0          50774509   17012/crsd.bin      
tcp        0      0 192.168.2.121:1522          192.168.2.123:60331         ESTABLISHED 501        50795614   17212/tnslsnr       
tcp        0      0 192.168.2.121:1522          192.168.2.121:16025         ESTABLISHED 501        50829535   17212/tnslsnr       
tcp        0      0 192.168.2.121:1522          192.168.2.122:54611         ESTABLISHED 501        50796842   17212/tnslsnr       
tcp        0      0 192.168.2.121:46865         192.168.2.122:1522          ESTABLISHED 501        50829527   17468/asm_lreg_+ASM 
tcp        0      0 192.168.5.242:1521          192.168.5.121:61101         ESTABLISHED 501        50838159   17310/tnslsnr       
tcp        0      0 192.168.5.121:42942         192.168.5.122:32304         ESTABLISHED 501        50841582   16454/gipcd.bin     
tcp        1      0 192.168.1.9:39471           80.150.192.73:80            CLOSE_WAIT  0          50900520   4786/clock-applet   
tcp        0      0 192.168.2.121:1522          192.168.2.121:16024         ESTABLISHED 501        50829534   17212/tnslsnr       
tcp        0      0 192.168.2.121:16024         192.168.2.121:1522          ESTABLISHED 501        50829529   17468/asm_lreg_+ASM 
tcp        0      0 192.168.2.121:16025         192.168.2.121:1522          ESTABLISHED 501        50829531   17468/asm_lreg_+ASM 
tcp        0      0 192.168.2.121:28139         192.168.2.123:1522          ESTABLISHED 501        50829525   17468/asm_lreg_+ASM 
tcp        0      0 192.168.5.121:64227         192.168.5.122:35547         ESTABLISHED 501        50790718   16454/gipcd.bin     
tcp        0      0 192.168.5.121:21046         192.168.5.123:6200          ESTABLISHED 501        50787900   17215/ons           
tcp        0      0 192.168.5.121:59844         192.168.5.50:22             ESTABLISHED 0          44509382   13726/ssh           
tcp        0      0 192.168.5.121:61101         192.168.5.242:1521          ESTABLISHED 502        50838158   17721/ora_lreg_bank 
tcp        0      0 192.168.5.58:39839          192.168.5.121:34266         TIME_WAIT   0          0          -                   
tcp        0      0 192.168.5.121:16432         192.168.5.122:6200          ESTABLISHED 501        50787901   17215/ons           
tcp        0      0 192.168.2.121:39861         192.168.2.123:61021         ESTABLISHED 0          50769440   16994/osysmond.bin  
tcp        0      0 192.168.5.121:55125         192.168.5.241:1521          ESTABLISHED 502        50837652   17721/ora_lreg_bank 
tcp        0      0 192.168.5.121:55119         192.168.5.241:1521          ESTABLISHED 501        50829165   17468/asm_lreg_+ASM 
tcp        0      0 192.168.5.241:1521          192.168.5.121:55125         ESTABLISHED 501        50837653   17258/tnslsnr       
tcp        0      0 192.168.5.121:10242         192.168.5.123:17701         ESTABLISHED 501        50790723   16454/gipcd.bin     
tcp        0      0 192.168.5.121:55728         192.168.5.123:22            ESTABLISHED 0          27679552   25184/ssh           
udp        0      0 192.168.2.121:35570         0.0.0.0:*                               0          50731287   16281/ohasd.bin     
udp        0      0 192.168.2.121:51962         0.0.0.0:*                               0          50751183   16922/octssd.bin    
udp        0      0 192.168.2.255:42424         0.0.0.0:*                               501        50734962   16537/ocssd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*                               0          50731290   16281/ohasd.bin     
udp        0      0 192.168.2.255:42424         0.0.0.0:*                               501        50725223   16454/gipcd.bin     
udp        0      0 192.168.2.121:15891         0.0.0.0:*                               501        50734959   16537/ocssd.bin     
udp        0      0 192.168.2.121:12075         0.0.0.0:*                               501        50782505   16408/evmd.bin      
udp        0      0 192.168.5.58:53             0.0.0.0:*                               0          51856599   27381/gnsd.bin      
udp        0      0 192.168.5.58:123            0.0.0.0:*                               38         51843931   1291/ntpd           
udp        0      0 192.168.5.242:123           0.0.0.0:*                               38         50803109   1291/ntpd           
udp        0      0 192.168.5.241:123           0.0.0.0:*                               38         50793859   1291/ntpd           
udp        0      0 192.168.3.121:123           0.0.0.0:*                               0          43573989   1291/ntpd           
udp        0      0 192.168.2.121:123           0.0.0.0:*                               0          43573987   1291/ntpd           
udp        0      0 192.168.5.121:123           0.0.0.0:*                               0          43573984   1291/ntpd           
udp        0      0 192.168.1.9:123             0.0.0.0:*                               0          43573983   1291/ntpd           
udp        0      0 192.168.2.121:53498         0.0.0.0:*                               0          50776026   17012/crsd.bin      
udp        0      0 192.168.2.121:45379         0.0.0.0:*                               501        50725220   16454/gipcd.bin

Troubleshooting hints: CW startup problems due to "Address already in use" errors

Before CW startup verify that the following ports are not in use at all
[root@hract21 Desktop]#    netstat -taupen |grep ":42424 "
[root@hract21 Desktop]#    netstat -taupen |grep ":5353 "
[root@hract21 Desktop]#    netstat -taupen |grep ":53 "
[root@hract21 Desktop]#    netstat -taupen |egrep ":6100 |:6200 "
If you find any processes not belonging to the Oracle Clusterware stack you need to kill/stop 
these processes


If you have problems with Clusterware startup or with the startup of CW components ( GSD, VIPs ) you may
check your Clusterware trace files for the "Address already in use" error.

Note: the trace file location has changed for RAC 12.1.0.2 ( on this system: /u01/app/grid/diag/crs/hract21/crs/trace ):
[grid@hract21 trace]$  grep -l "Address already in use" *
gipcd.trc
gnsd.trc
mdnsd.trc
ocssd.trc
ohasd.trc

Now find details:
# grep "Address already in use" ohasd.trc  mdnsd.trc  ocssd.trc  gipcd.trc  gnsd.trc | grep "2015-01-31 1[78]"
ohasd.trc:2015-01-31 17:30:16.613432 :GIPCXCPT:2420897536:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
mdnsd.trc:2015-01-31 17:40:18.049222 :    MDNS:2559100480:  mdnsd interface lo (0x1 AF=2 f=0x49 mcast=-1) 127.0.0.1 mask 255.0.0.0 FAILED. Error 98 (Address already in use)
ocssd.trc:2015-01-31 17:04:56.013085 :GIPCXCPT:3986515712:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
gipcd.trc:2015-01-31 17:06:25.204775 :GIPCXCPT:812046080:   gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
gnsd.trc:2015-01-31 18:09:13.518518 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
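
To get a quick overview of which trace files are affected and how often, the two grep steps can be condensed into a per-file count. This is a minimal sketch; addr_in_use_summary is a hypothetical name and the function only looks at *.trc files in the given directory.

```shell
# Hypothetical helper: for each *.trc file in directory $1 (default: current
# directory) print "<file> <count>" if it contains "Address already in use".
addr_in_use_summary() {
    dir="${1:-.}"
    for f in "$dir"/*.trc; do
        [ -f "$f" ] || continue
        # grep -c exits non-zero when the count is 0, hence the "|| true"
        n=$(grep -c "Address already in use" "$f" || true)
        if [ "$n" -gt 0 ]; then
            echo "$(basename "$f") $n"
        fi
    done
    return 0
}

# Usage on the RAC node, inside the trace directory:
#   addr_in_use_summary .
```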

For mdnsd.trc we already know the port number          :   5353   
For ohasd.trc, ocssd.trc, gipcd.trc the port number is :  42424
For GNS the trace file provides details about the problematic port number and IP address
[grid@hract21 trace]$  grep gipcmodNetworkProcessBind  gnsd.trc  
2015-01-31 18:09:13.518483 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: failed to bind endp 0x7ff7000034c0 [0000000000001fc6] { gipcEndpoint : localAddr 'udp://192.168.5.58:53', remoteAddr '', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, readyRef (nil), ready 0, wobj (nil), sendp 0x7ff7000050f0 status 13flags 0x20008000, flags-2 0x0, usrFlags 0x24000 }, addr 0x7ff7000038b0 [0000000000001fc8] { gipcAddress : name 'udp://192.168.5.58:53', objFlags 0x0, addrFlags 0x5 }
2015-01-31 18:09:13.518516 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos op  :  sgipcnTcpBind
2015-01-31 18:09:13.518518 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos dep :  Address already in use (98)
2015-01-31 18:09:13.518520 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos loc :  bind
2015-01-31 18:09:13.518521 :GIPCXCPT:255158016:  gipcmodNetworkProcessBind: slos info:  addr '192.168.5.58:53'
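
The mapping from trace file to port described above can be kept in a small lookup function, so the port does not have to be remembered. This is a sketch that only encodes what was observed on this test system: cw_port_for_trace is a hypothetical name, and for gnsd.trc the value 53 is what the trace itself reported here.

```shell
# Hypothetical helper: print the port to check for a given trace file name,
# based on the observations in this article.
cw_port_for_trace() {
    case "$(basename "$1")" in
        mdnsd.trc)                      echo 5353 ;;
        ohasd.trc|ocssd.trc|gipcd.trc)  echo 42424 ;;
        gnsd.trc)                       echo 53 ;;     # also printed in the trace (slos info)
        *)                              echo unknown ;;
    esac
}

# Usage on the RAC node:
#   netstat -taupen | grep ":$(cw_port_for_trace mdnsd.trc) "
```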

 

Reference