How to Validate Network and Name Resolution Setup for the Clusterware and RAC

Author: Maclean Liu, posted on April 7th, 2010, English Version
[Unless marked as a repost, all articles on this site are original compilations by this site]
When reposting, please credit the source: Oracle Clinic – Maclean Liu's personal technical blog [http://www.oracledatabase12g.com/]
Article title: How to Validate Network and Name Resolution Setup for the Clusterware and RAC
Permanent link: http://www.oracledatabase12g.com/archives/how-to-validate-network-and-name-resolution-setup-for-the-clusterware-and-rac.html

Applies to:

Oracle Server – Enterprise Edition – Version: 10.1.0.2 to 11.2.0.0 – Release: 10.1 to 11.2
Information in this document applies to any platform.

Goal

Cluster Verification Utility (CVU, invoked as runcluvfy.sh or cluvfy) checks the network and name resolution setup thoroughly, but it may not catch every issue. If the network and name resolution are not set up properly before installation, the installation is likely to fail; if the network or name resolution malfunctions afterwards, the clusterware and/or RAC are likely to have issues. The goal of this note is to provide a list of things to verify regarding the network and name resolution setup for Grid Infrastructure (clusterware) and RAC.

Solution

A. Requirement

Network ping with a packet size of the network adapter (NIC) MTU should work on all public and private networks.

IP address 127.0.0.1 should only map to localhost and/or localhost.localdomain, not anything else.

NIC name should be the same for the corresponding network on all nodes.

MTU should be the same for the corresponding network on all nodes.

Network size should be the same for the corresponding network on all nodes.

As the private network needs to be directly attached, traceroute with a packet size of NIC MTU should complete in 1 hop on all private networks, without fragmentation and without going through the routing table.

Firewall needs to be turned off on the private network.

For 10.1 to 11.1, name resolution should work for the public, private and virtual names.

For 11.2 without Grid Naming Service (aka GNS), name resolution should work for all public, virtual, and SCAN names; and if SCAN is configured in DNS, it shouldn’t be in local hosts file.

For 11.2.0.2 and above, multicast group 230.0.1.0 should work on private network; with patch 9974223, both group 230.0.1.0 and 224.0.0.251 are supported.

 

OS-level bonding is recommended for the private network for pre-11.2.0.2 releases. Depending on the platform, you may implement bonding, teaming, EtherChannel, IPMP, MultiPrivNIC, etc.; please consult your OS vendor for details. Starting with 11.2.0.2, Redundant Interconnect and HAIP are introduced to provide native support for multiple private networks; see note 1210883.1 for details.

B. Example of what we expect

The example below shows what we expect while validating the network and name resolution setup. As the network setup is slightly different for 11gR2 and for 11gR1 or below, both cases are covered. The difference is that for 11gR1 or below we need a public name, a VIP name and a private hostname, and we rely on the private name to find the private IP for cluster communication. For 11gR2, we no longer rely on the private name; instead, the private network is selected based on the GPnP profile while the clusterware comes up. Assume a 3-node cluster with the following node information:

11gR1 or below cluster:

Nodename |Public IP |VIP name |VIP        |Private |Private IP1 |Private IP2
         |NIC/MTU   |         |           |Name1   |NIC/MTU     |
---------|----------|---------|-----------|--------|------------|------------
rac1     |120.0.0.1 |rac1v    |120.0.0.11 |rac1p   |10.0.0.1    |
         |eth0/1500 |         |           |        |eth1/1500   |
---------|----------|---------|-----------|--------|------------|------------
rac2     |120.0.0.2 |rac2v    |120.0.0.12 |rac2p   |10.0.0.2    |
         |eth0/1500 |         |           |        |eth1/1500   |
---------|----------|---------|-----------|--------|------------|------------
rac3     |120.0.0.3 |rac3v    |120.0.0.13 |rac3p   |10.0.0.3    |
         |eth0/1500 |         |           |        |eth1/1500   |
---------|----------|---------|-----------|--------|------------|------------

11gR2 cluster:

Nodename |Public IP |VIP name |VIP        |Private IP1 |
         |NIC/MTU   |         |           |NIC/MTU     |
---------|----------|---------|-----------|------------|----------
rac1     |120.0.0.1 |rac1v    |120.0.0.11 |10.0.0.1    |
         |eth0/1500 |         |           |eth1/1500   |
---------|----------|---------|-----------|------------|----------
rac2     |120.0.0.2 |rac2v    |120.0.0.12 |10.0.0.2    |
         |eth0/1500 |         |           |eth1/1500   |
---------|----------|---------|-----------|------------|----------
rac3     |120.0.0.3 |rac3v    |120.0.0.13 |10.0.0.3    |
         |eth0/1500 |         |           |eth1/1500   |
---------|----------|---------|-----------|------------|----------

SCAN name |SCAN IP1   |SCAN IP2   |SCAN IP3
----------|-----------|-----------|-----------
scancl1   |120.0.0.21 |120.0.0.22 |120.0.0.23
----------|-----------|-----------|-----------

Below is what needs to be verified on each node. Note that the example is from a Linux platform:

1. To find out the MTU
/bin/netstat -in
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0   203273      0      0      0     2727      0      0      0 BMRU

# In above example MTU is set to 1500 for eth0
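
To compare MTU values across nodes without reading each listing by eye, the `netstat -in` output can be parsed. The following is a minimal sketch, assuming the Linux column layout shown above (interface name in the first column, MTU in the second); the function names `parse_mtus` and `mtu_mismatches` are illustrative and not part of any Oracle tool.

```python
def parse_mtus(netstat_output):
    """Return {interface: mtu} from Linux `netstat -in` text."""
    mtus = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: eth0 1500 0 203273 ... BMRU
        # Header rows have a non-numeric second field and are skipped.
        if len(fields) >= 2 and fields[1].isdigit():
            mtus[fields[0]] = int(fields[1])
    return mtus


def mtu_mismatches(per_node):
    """Given {node: {iface: mtu}}, list interfaces whose MTU differs between nodes."""
    seen = {}
    for mtus in per_node.values():
        for iface, mtu in mtus.items():
            seen.setdefault(iface, set()).add(mtu)
    return sorted(iface for iface, values in seen.items() if len(values) > 1)


sample = """Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0   203273      0      0      0     2727      0      0      0 BMRU
"""
print(parse_mtus(sample))
```

Collecting the parsed result from every node into one dictionary and passing it to `mtu_mismatches` gives the list of interfaces that violate the "same MTU on all nodes" requirement.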

2. To find out the IP address and subnet, compare Broadcast and Mask on all nodes
/sbin/ifconfig
eth0 Link encap:Ethernet  HWaddr 00:16:3E:11:11:11
inet addr:120.0.0.1  Bcast:120.0.0.127  Mask:255.255.255.128
inet6 addr: fe80::216:3eff:fe11:1111/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500 Metric:1
RX packets:203245 errors:0 dropped:0 overruns:0 frame:0
TX packets:2681 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:63889908 (60.9 MiB)  TX bytes:319837 (312.3 KiB)
..

In the above example, the IP address for eth0 is 120.0.0.1, the broadcast address is 120.0.0.127, and the netmask is 255.255.255.128, which corresponds to subnet 120.0.0.0/25 with a maximum of 126 usable host addresses. You may use a third-party subnet calculator to work out the subnet from the ifconfig output.
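
As an alternative to a web-based calculator, the subnet can be derived locally. This is a small sketch using Python's standard `ipaddress` module; the helper name `describe_subnet` is made up for illustration, and the "usable hosts" figure assumes an ordinary subnet (prefix length 30 or shorter) where the network and broadcast addresses are reserved.

```python
import ipaddress


def describe_subnet(ip, netmask):
    """Summarize the subnet that an interface address and netmask belong to."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    net = iface.network
    return {
        "network": str(net.network_address),
        "prefix": net.prefixlen,
        "broadcast": str(net.broadcast_address),
        # Network and broadcast addresses are not assignable to hosts.
        "usable_hosts": net.num_addresses - 2,
    }


info = describe_subnet("120.0.0.1", "255.255.255.128")
print(info)
```

Running the same function with the address and mask from each node and comparing the `network` and `prefix` values shows whether all nodes sit in the same subnet.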

3. Run all ping commands twice to make sure the result is consistent

Below is an example ping output from node1 public IP to node2 public hostname:

PING rac2 (120.0.0.2) from 120.0.0.1 : 1500(1528) bytes of data.
1508 bytes from rac2 (120.0.0.2): icmp_seq=1 ttl=64 time=0.742 ms
1508 bytes from rac2 (120.0.0.2): icmp_seq=2 ttl=64 time=0.415 ms

--- rac2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.415/0.578/0.742/0.165 ms
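
When many ping runs have to be checked, the statistics line can be parsed instead of read manually. The sketch below assumes the Linux `ping` summary format shown above; `parse_ping_stats` is a hypothetical helper name, and other platforms print the summary differently.

```python
import re

# Matches e.g. "2 packets transmitted, 2 received, 0% packet loss, time 1000ms"
STATS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, .*?(\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_stats(output):
    """Return (transmitted, received, loss_percent), or None if no summary is found."""
    match = STATS_RE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


summary = "2 packets transmitted, 2 received, 0% packet loss, time 1000ms"
print(parse_ping_stats(summary))
```

A run can be treated as clean when the function returns a loss percentage of 0 and the transmitted and received counts are equal.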

3.1 Ping all public nodenames from the local public IP with packet size of MTU
/bin/ping -s 1500 -c 2 -I 120.0.0.1 rac1
/bin/ping -s 1500 -c 2 -I 120.0.0.1 rac1
/bin/ping -s 1500 -c 2 -I 120.0.0.1 rac2
/bin/ping -s 1500 -c 2 -I 120.0.0.1 rac2
/bin/ping -s 1500 -c 2 -I 120.0.0.1 rac3
/bin/ping -s 1500 -c 2 -I 120.0.0.1 rac3

3.2.1 Ping all private IP(s) from all local private IP(s) with packet size of MTU
# applies to 11gR2 example, private name is optional
/bin/ping -s 1500 -c 2 -I 10.0.0.1  10.0.0.1
/bin/ping -s 1500 -c 2 -I 10.0.0.1  10.0.0.1
/bin/ping -s 1500 -c 2 -I 10.0.0.1  10.0.0.2
/bin/ping -s 1500 -c 2 -I 10.0.0.1  10.0.0.2
/bin/ping -s 1500 -c 2 -I 10.0.0.1  10.0.0.3
/bin/ping -s 1500 -c 2 -I 10.0.0.1  10.0.0.3

3.2.2 Ping all private nodenames from the local private IP with packet size of MTU
# applies to 11gR1 and earlier example
/bin/ping -s 1500 -c 2 -I 10.0.0.1  rac1p
/bin/ping -s 1500 -c 2 -I 10.0.0.1  rac1p
/bin/ping -s 1500 -c 2 -I 10.0.0.1  rac2p
/bin/ping -s 1500 -c 2 -I 10.0.0.1  rac2p
/bin/ping -s 1500 -c 2 -I 10.0.0.1  rac3p
/bin/ping -s 1500 -c 2 -I 10.0.0.1  rac3p

4. Traceroute private network

Example below shows traceroute from node1 private IP to node2 private hostname

traceroute to rac2p (10.0.0.2), 30 hops max, 1472 byte packets
1  rac2p (10.0.0.2)  0.626 ms  0.567 ms  0.529 ms

A traceroute with an MTU-sized packet should complete in 1 hop without going through the routing table.

Note: traceroute option "-F" may not work on RHEL3/4 and OEL4 due to an OS bug; refer to note 752844.1 for details.
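
The "1 hop" expectation can be checked mechanically by counting hop lines in the traceroute output. This is a minimal sketch assuming the Linux output format shown above, where each hop line starts with its hop number; `hop_count` and `is_directly_attached` are illustrative names.

```python
def hop_count(traceroute_output):
    """Count hop lines (lines whose first field is a hop number)."""
    hops = 0
    for line in traceroute_output.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            hops += 1
    return hops


def is_directly_attached(traceroute_output):
    """True when the destination was reached in exactly one hop."""
    return hop_count(traceroute_output) == 1


sample_trace = """traceroute to rac2p (10.0.0.2), 30 hops max, 1472 byte packets
1  rac2p (10.0.0.2)  0.626 ms  0.567 ms  0.529 ms
"""
print(is_directly_attached(sample_trace))
```

The check only counts hops; it does not verify that the single hop line actually names the intended destination, so the output should still be inspected when a result looks suspicious.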

4.1 Traceroute all private IP(s) from all local private IP(s) with packet size of MTU
# on Linux, packet length needs to be MTU-28
# applies to 11gR2
/bin/traceroute -s 10.0.0.1  -r -F 10.0.0.1  1472
/bin/traceroute -s 10.0.0.1  -r -F 10.0.0.2  1472
/bin/traceroute -s 10.0.0.1  -r -F 10.0.0.3  1472

4.2 Traceroute all private nodenames from the local private IP with packet size of MTU
# applies to 11gR1 and earlier example
/bin/traceroute -s 10.0.0.1 -r -F rac1p 1472
/bin/traceroute -s 10.0.0.1 -r -F rac2p 1472
/bin/traceroute -s 10.0.0.1 -r -F rac3p 1472
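
The packet length of 1472 used above is the MTU minus 28 bytes of headers: a 20-byte IPv4 header (without options) plus an 8-byte UDP or ICMP header. The sketch below derives that value and builds the Linux traceroute argument lists for a set of targets; the function names are illustrative, and the command layout simply mirrors the Linux commands in this section.

```python
IPV4_HEADER_BYTES = 20        # IPv4 header without options
UDP_OR_ICMP_HEADER_BYTES = 8  # UDP and ICMP echo headers are both 8 bytes


def traceroute_packet_len(mtu):
    """Largest probe length that fits in one unfragmented packet of size `mtu`."""
    return mtu - IPV4_HEADER_BYTES - UDP_OR_ICMP_HEADER_BYTES


def traceroute_commands(source_ip, targets, mtu):
    """Build Linux traceroute argument lists, one per target."""
    length = str(traceroute_packet_len(mtu))
    return [
        ["/bin/traceroute", "-s", source_ip, "-r", "-F", target, length]
        for target in targets
    ]


for cmd in traceroute_commands("10.0.0.1", ["rac1p", "rac2p", "rac3p"], 1500):
    print(" ".join(cmd))
```

With jumbo frames (for example an MTU of 9000), the same calculation gives the packet length to pass instead of 1472.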

5. Ping VIP hostname
# Ping of all VIP nodename should resolve to correct IP
# Before the clusterware is installed, ping should be able to resolve VIP nodename but
# should fail as VIP is managed by the clusterware
# After the clusterware is up and running, ping should succeed
/bin/ping -c 2 rac1v
/bin/ping -c 2 rac1v
/bin/ping -c 2 rac2v
/bin/ping -c 2 rac2v
/bin/ping -c 2 rac3v
/bin/ping -c 2 rac3v

6. Ping SCAN name
# applies to 11gR2
# Ping of SCAN name should resolve to correct IP
# Before the clusterware is installed, ping should be able to resolve SCAN name but
# should fail as SCAN VIP is managed by the clusterware
# After the clusterware is up and running, ping should succeed
/bin/ping -s 1500 -c 2 -I 120.0.0.1 scancl1
/bin/ping -s 1500 -c 2 -I 120.0.0.1 scancl1
/bin/ping -s 1500 -c 2 -I 120.0.0.1 scancl1

7. Nslookup VIP hostname and SCAN name
# applies to 11gR2
# To check whether VIP nodename and SCAN name are setup properly in DNS
/usr/bin/nslookup rac1v
/usr/bin/nslookup rac2v
/usr/bin/nslookup rac3v
/usr/bin/nslookup scancl1
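
The expected name-to-IP mapping can also be compared against what the operating system resolver returns. This is a sketch under stated assumptions: it uses Python's `socket.gethostbyname_ex`, which follows the host's configured resolution order (hosts file and/or DNS) rather than querying DNS alone as nslookup does, so it complements nslookup rather than replacing it. The helper name `check_names` and the injectable `resolve` parameter are illustrative.

```python
import socket


def check_names(expected, resolve=socket.gethostbyname_ex):
    """Compare resolved addresses with expected ones.

    expected: {hostname: iterable of expected IPv4 addresses}
    Returns {hostname: description of the problem} for every mismatch.
    """
    problems = {}
    for name, want in expected.items():
        try:
            _, _, addrs = resolve(name)
        except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
            problems[name] = f"lookup failed: {exc}"
            continue
        if set(addrs) != set(want):
            problems[name] = f"got {sorted(addrs)}, expected {sorted(want)}"
    return problems


# Expected values taken from the 11gR2 example tables in this note.
EXPECTED = {
    "rac1v": {"120.0.0.11"},
    "rac2v": {"120.0.0.12"},
    "rac3v": {"120.0.0.13"},
    "scancl1": {"120.0.0.21", "120.0.0.22", "120.0.0.23"},
}
```

Calling `check_names(EXPECTED)` on a cluster node returns an empty dictionary when every name resolves to exactly the expected addresses.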

8. To check name resolution order
# /etc/nsswitch.conf on Linux, Solaris and hp-ux, /etc/netsvc.conf on AIX
/bin/grep ^hosts /etc/nsswitch.conf
hosts:      files dns
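
The resolution order can be extracted programmatically so that it is easy to compare across nodes. The following sketch parses the `hosts:` line of nsswitch.conf text; `hosts_lookup_order` is a hypothetical helper, and any action items such as `[NOTFOUND=return]` are returned as plain tokens without interpretation.

```python
def hosts_lookup_order(nsswitch_text):
    """Return the list of sources on the `hosts:` line, in order."""
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            return line[len("hosts:"):].split()
    return []


sample_conf = """# /etc/nsswitch.conf
passwd:     files
hosts:      files dns
"""
print(hosts_lookup_order(sample_conf))
```

If `files` appears in the returned list, the local hosts file takes part in name resolution and the checks in the next step apply.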

9. To check local hosts file
# If local files are in the name service switch setting (nsswitch.conf), grep all nodenames and IPs
# to make sure the hosts file doesn't have a typo or misconfiguration
# 127.0.0.1 should not map to the SCAN name or to any public, private or VIP hostname

/bin/grep rac1       /etc/hosts
/bin/grep rac2       /etc/hosts
/bin/grep rac3       /etc/hosts
/bin/grep rac1v      /etc/hosts
/bin/grep rac2v      /etc/hosts
/bin/grep rac3v      /etc/hosts
/bin/grep 120.0.0.1  /etc/hosts
/bin/grep 120.0.0.2  /etc/hosts
/bin/grep 120.0.0.3  /etc/hosts
/bin/grep 120.0.0.11 /etc/hosts
/bin/grep 120.0.0.12 /etc/hosts
/bin/grep 120.0.0.13 /etc/hosts

# For 11gR2 example
/bin/grep 10.0.0.1   /etc/hosts
/bin/grep 10.0.0.2   /etc/hosts
/bin/grep 10.0.0.3   /etc/hosts
/bin/grep 10.0.0.11  /etc/hosts
/bin/grep 10.0.0.12  /etc/hosts
/bin/grep 10.0.0.13  /etc/hosts

# For 11gR1 and earlier example
/bin/grep rac1p      /etc/hosts
/bin/grep rac2p      /etc/hosts
/bin/grep rac3p      /etc/hosts
/bin/grep 10.0.0.1   /etc/hosts
/bin/grep 10.0.0.2   /etc/hosts
/bin/grep 10.0.0.3   /etc/hosts

# For 11gR2 example
# If the SCAN name is set up in DNS, it should not be in the local hosts file
/bin/grep scancl1      /etc/hosts
/bin/grep 120.0.0.21 /etc/hosts
/bin/grep 120.0.0.22 /etc/hosts
/bin/grep 120.0.0.23 /etc/hosts
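
The grep checks above can be complemented by parsing the hosts file and testing the two rules directly: 127.0.0.1 may only map to localhost and/or localhost.localdomain, and a SCAN name that lives in DNS should not appear in the file. This is a minimal sketch operating on the file's text; the function names are illustrative.

```python
ALLOWED_LOOPBACK_NAMES = {"localhost", "localhost.localdomain"}


def parse_hosts(text):
    """Return a list of (ip, [names]) entries from hosts-file text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        entries.append((fields[0], fields[1:]))
    return entries


def loopback_violations(text):
    """Names mapped to 127.0.0.1 other than the allowed localhost names."""
    bad = []
    for ip, names in parse_hosts(text):
        if ip == "127.0.0.1":
            bad.extend(n for n in names if n not in ALLOWED_LOOPBACK_NAMES)
    return bad


def names_present(text, names):
    """Which of the given names appear anywhere in the hosts file."""
    found = {n for _, entry_names in parse_hosts(text) for n in entry_names}
    return sorted(set(names) & found)


sample_hosts = """127.0.0.1   localhost.localdomain localhost rac1
120.0.0.1   rac1
120.0.0.11  rac1v
120.0.0.21  scancl1   # should not be here if SCAN is in DNS
"""
print(loopback_violations(sample_hosts))
print(names_present(sample_hosts, ["scancl1"]))
```

On a real node the text would come from reading /etc/hosts; both functions returning an empty list corresponds to a clean result for these two rules.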

C. Syntax reference

Refer to the following for command syntax on different platforms:

Linux:
/bin/netstat -in
/sbin/ifconfig
/bin/ping -s MTU -c 2 -I source_IP nodename
/bin/traceroute -s source_IP -r -F  nodename-priv MTU-28
/usr/bin/nslookup

Solaris:
/bin/netstat -in
/usr/sbin/ifconfig -a
/usr/sbin/ping -i source_IP -s nodename MTU 2
/usr/sbin/traceroute -s source_IP -r -F nodename-priv MTU
/usr/sbin/nslookup

HP-UX:
/usr/bin/netstat -in
/usr/sbin/ifconfig NIC
/usr/sbin/ping -i source_IP nodename MTU -n 2
/usr/contrib/bin/traceroute -s source_IP -r -F nodename-priv MTU
/bin/nslookup

AIX:
/bin/netstat -in
/usr/sbin/ifconfig -a
/usr/sbin/ping -i source_IP -s MTU -c 2 nodename
/bin/traceroute -s source_IP -r nodename-priv MTU
/bin/nslookup
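
For scripted checks across mixed platforms, the per-platform ping syntax listed above can be kept in one lookup table. The sketch below copies the four ping forms verbatim from this section and fills in the placeholders; `PING_TEMPLATES` and `ping_command` are illustrative names, and the templates should be verified against the ping man page of the actual OS release before use.

```python
# Templates copied from the syntax reference in this section.
PING_TEMPLATES = {
    "linux": "/bin/ping -s {mtu} -c 2 -I {source_ip} {nodename}",
    "solaris": "/usr/sbin/ping -i {source_ip} -s {nodename} {mtu} 2",
    "hp-ux": "/usr/sbin/ping -i {source_ip} {nodename} {mtu} -n 2",
    "aix": "/usr/sbin/ping -i {source_ip} -s {mtu} -c 2 {nodename}",
}


def ping_command(platform, source_ip, nodename, mtu):
    """Return the ping command line for the given platform."""
    template = PING_TEMPLATES[platform.lower()]
    return template.format(source_ip=source_ip, nodename=nodename, mtu=mtu)


print(ping_command("Linux", "120.0.0.1", "rac2", 1500))
```

An unknown platform name raises `KeyError`, which is intentional here so that a missing template is noticed instead of silently falling back to the wrong syntax.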

D. Multicast/Broadcast

Starting with 11.2.0.2, multicast group 230.0.1.0 should work on the private network. Patch 9974223 introduces support for another group, 224.0.0.251.

Please refer to note 1212703.1 to verify whether multicast is working fine.

On HP-UX, if 10 Gigabit Ethernet is used as the private network adapter, multicast may not work without driver revision B.11.31.1009.01 or later of the 10GigEthr-02 software bundle. Run the "swlist 10GigEthr-02" command to identify the current version on your HP server.
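
To illustrate what "multicast working on the private network" involves at the socket level, the sketch below shows how a program joins one of the two groups on a specific local interface address using the standard `IP_ADD_MEMBERSHIP` socket option. It is not the verification procedure from note 1212703.1 and does not send or receive any test traffic; the function names are made up for illustration.

```python
import ipaddress
import socket

# Groups named in this note for 11.2.0.2 and above.
CLUSTER_GROUPS = ["230.0.1.0", "224.0.0.251"]


def membership_request(group, local_ip):
    """Build the 8-byte ip_mreq structure: group address + local interface address."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def join_group(sock, group, local_ip):
    """Ask the kernel to join `group` on the interface that owns `local_ip`."""
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(group, local_ip),
    )
```

In a real test, a UDP socket bound on one node would call `join_group` with its private IP (for example 10.0.0.1), and another node would send a datagram to the group address; whether the datagram arrives is what the check in note 1212703.1 is designed to establish.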

E. Runtime network issues

OSWatcher or Cluster Health Monitor(IPD/OS) can be deployed to capture runtime network issues.

F. Symptoms of a network issue

ping doesn’t work

traceroute doesn’t work

name resolution doesn’t work

2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: [network] failed connect attempt endp 0xc7c5590 [0000000000000356] { gipcEndpoint : localAddr 'gipc://racnode3:08b1-c475-a88e-8387#10.10.10.23#27573', remoteAddr 'gipc://racnode2:nm_rac-cluster#192.168.0.22#26869', numPend 0, numReady 1, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 0, flags 0x80612, usrFlags 0x0 }, req 0xc7c5310 [000000000000035f] { gipcConnectRequest : addr 'gipc://racnode2:nm_rac-cluster#192.168.0.22#26869', parentEn
2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: slos op : sgipcnTcpConnect
2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: slos dep : No route to host (113)

2010-11-29 10:52:38.603: [GIPCHALO][2314] gipchaLowerProcessNode: no valid interfaces found to node for 2614824036 ms, node 111ea99b0 { host 'aixprimusrefdb1', haName '1e0b-174e-37bc-a515', srcLuid 2612fa8e-3db4fcb7, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [55 : 55], createTime 2614768983, flags 0x4 }
2010-11-29 10:52:42.299: [ CRSMAIN][515] Policy Engine is not initialized yet!
2010-11-29 10:52:43.554: [ OCRMAS][3342]proath_connect_master:1: could not yet connect to master retval1 = 203, retval2 = 203
2010-11-29 10:52:43.554: [ OCRMAS][3342]th_master:110': Could not yet connect to new master [1]

2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos op :  sgipcnUdpSend
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos dep :  Message too long (59)
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos loc :  sendto
2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos info :  dwRet 4294967295, addr '19

2010-02-03 23:26:25.804: [GIPCXCPT][1206540320]gipcmodGipcPassInitializeNetwork: failed to find any interfaces in clsinet, ret gipcretFail (1)
2010-02-03 23:26:25.804: [GIPCGMOD][1206540320]gipcmodGipcPassInitializeNetwork: EXCEPTION[ ret gipcretFail (1) ]  failed to determine host from clsinet, using default
..
2010-02-03 23:26:25.810: [    CSSD][1206540320]clsssclsnrsetup: gipcEndpoint failed, rc 39
2010-02-03 23:26:25.811: [    CSSD][1206540320]clssnmOpenGIPCEndp: failed to listen on gipc addr gipc://rac1:nm_eotcs- ret 39
2010-02-03 23:26:25.811: [    CSSD][1206540320]clssscmain: failed to open gipc endp

2010-09-20 11:52:54.014: [    CSSD][1103055168]clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 180441784, wrtcnt, 453, LATS 328297844, lastSeqNo 452, uniqueness 1284979488, timestamp 1284979973/329344894
2010-09-20 11:52:54.016: [    CSSD][1078421824]clssgmWaitOnEventValue: after CmInfo State  val 3, eval 1 waited 0
..  >>>> after a long delay
2010-09-20 12:02:39.578: [    CSSD][1103055168]clssnmvDHBValidateNCopy: node 1, racnode1, has a disk HB, but no network HB, DHB has rcfg 180441784, wrtcnt, 1037, LATS 328883434, lastSeqNo 1036, uniqueness 1284979488, timestamp 1284980558/329930254
2010-09-20 12:02:39.895: [    CSSD][1107286336]clssgmExecuteClientRequest: MAINT recvd from proc 2 (0xe1ad870)

2009-12-10 06:28:31.974: [  OCRMAS][20]proath_connect_master:1: could not connect to master  clsc_ret1 = 9, clsc_ret2 = 9
2009-12-10 06:28:31.974: [  OCRMAS][20]th_master:11: Could not connect to the new master
2009-12-10 06:29:01.450: [ CRSMAIN][2] Policy Engine is not initialized yet!
2009-12-10 06:29:31.489: [ CRSMAIN][2] Policy Engine is not initialized yet!

2009-12-31 00:42:08.110: [ COMMCRS][10]clsc_receive: (102b03250) Error receiving, ns (12535, 12560), transport (505, 145, 0)
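
Clusterware logs can be scanned for the message fragments that appear in the excerpts above. The sketch below counts occurrences of a few such fragments; the pattern list is taken only from the sample log lines in this section and is an assumption about what is worth searching for, not an exhaustive or official list. `count_symptoms` is an illustrative name.

```python
# Fragments copied from the sample log excerpts in this section.
SYMPTOM_PATTERNS = [
    "No route to host",
    "Message too long",
    "no valid interfaces found",
    "failed to find any interfaces in clsinet",
    "has a disk HB, but no network HB",
    "could not connect to master",
    "could not yet connect to master",
]


def count_symptoms(lines):
    """Return {pattern: number of lines containing it} for patterns that occur."""
    counts = {}
    for line in lines:
        for pattern in SYMPTOM_PATTERNS:
            if pattern in line:
                counts[pattern] = counts.get(pattern, 0) + 1
    return counts


sample_log = [
    "2010-11-21 13:00:44.455: [ GIPCNET][1252870464]gipcmodNetworkProcessConnect: slos dep : No route to host (113)",
    "2010-11-04 12:33:22.133: [ GIPCNET][2314] gipcmodNetworkProcessSend: slos dep :  Message too long (59)",
    "2010-11-29 10:52:42.299: [ CRSMAIN][515] Policy Engine is not initialized yet!",
]
print(count_symptoms(sample_log))
```

Because the function accepts any iterable of lines, an open log file object can be passed to it directly; a non-empty result is only a hint that the network checks in this note are worth repeating.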

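© 2010-2011, www.oracledatabase12g.com. All rights reserved. This article may be reposted, but the source address must be credited with a link; otherwise legal action may be pursued.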
