Download location for 12c cluvfy
http://www.oracle.com/technetwork/database/options/clustering/downloads/index.html
- Cluster Verification Utility Download for Oracle Grid Infrastructure 12c
- Always download the newest cluvfy version from the link above (a short install example follows this list)
- The latest CVU version (July 2013) can be used with all currently supported Oracle RAC versions, including Oracle RAC 10g, Oracle RAC 11g and Oracle RAC 12c.
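A typical way to install the standalone tool is to unpack it into its own directory and check the reported build. This is only a sketch; the archive name below is an example and depends on the version you actually download:

# unpack the standalone cluvfy kit into its own directory and check the build
mkdir -p /home/grid/CLUVFY-JAN-2015
cd /home/grid/CLUVFY-JAN-2015
unzip /tmp/cvupack_Linux_x86_64.zip    # example archive name - use the name of your download
bin/cluvfy -version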
Impact of latest Cluvfy version
There is nothing more annoying than debugging a RAC problem that finally turns out to be a cluvfy bug. The latest download from January 2015 reports the following version:
[grid@gract1 ~/CLUVFY-JAN-2015]$ bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
whereas my current 12.1 installation reports an older build:
[grid@gract1 ~/CLUVFY-JAN-2015]$ cluvfy -version
12.1.0.1.0 Build 100213x866
Cluvfy trace Location
If you have installed cluvfy in /home/grid/CLUVFY-JAN-2015, the related cluvfy traces can be found in the cv/log subdirectory:
[root@gract1 CLUVFY-JAN-2015]# ls /home/grid/CLUVFY-JAN-2015/cv/log
cvutrace.log.0  cvutrace.log.0.lck
Note that some cluvfy commands, like
# cluvfy comp dhcp -clustername gract -verbose
must be run as root. In that case the default trace location may not have the correct permissions; use the script below to set the trace level and trace location.
Setting the Cluvfy Trace File Location and Trace Level in a bash script
The following bash script sets the cluvfy trace location and the cluvfy trace level
#!/bin/bash
# recreate an empty trace directory under /tmp so that traces can also be
# written when cluvfy is run as root
rm -rf /tmp/cvutrace
mkdir /tmp/cvutrace

# point cluvfy to the new trace location and enable SRVM tracing at level 2
export CV_TRACELOC=/tmp/cvutrace
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=2
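A minimal usage sketch, assuming the script above was saved as set_cvu_trace.sh (any file name will do): source it so that the variables are exported into the current shell, then run the root-only check and look for the traces in /tmp/cvutrace.

# source the script (do not run it in a subshell), then run the check as root
. ./set_cvu_trace.sh
./bin/cluvfy comp dhcp -clustername gract -verbose
ls -l /tmp/cvutrace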
Why the cluvfy version matters
Yesterday I debugged a DHCP problem starting with cluvfy:
[grid@gract1 ~]$ cluvfy -version
12.1.0.1.0 Build 100213x8664
[root@gract1 network-scripts]# cluvfy comp dhcp -clustername gract -verbose
Verifying DHCP Check
Checking if any DHCP server exists on the network...
<null>
At least one DHCP server exists on the network and is listening on port 67
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
<null>
Sending DHCP "REQUEST" packets for client ID "gract-scan1-vip"
<null>
..
DHCP server was able to provide sufficient number of IP addresses
The DHCP server response time is within acceptable limits
Verification of DHCP Check was unsuccessful on all the specified nodes.
--> As the verification was unsuccessful I started network tracing with tcpdump. The network traces looked fine, so I got a bad feeling about cluvfy itself. What to do next? Install the newest cluvfy version and rerun the test.
[grid@gract1 ~/CLUVFY-JAN-2015]$ bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
Now rerun the test:
[root@gract1 CLUVFY-JAN-2015]# bin/cluvfy comp dhcp -clustername gract -verbose
Verifying DHCP Check
Checking if any DHCP server exists on the network...
DHCP server returned server: 192.168.5.50, loan address: 192.168.5.150/255.255.255.0, lease time: 21600
At least one DHCP server exists on the network and is listening on port 67
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
DHCP server returned server: 192.168.5.50, loan address: 192.168.5.150/255.255.255.0, lease time: 21600
Sending DHCP "REQUEST" packets for client ID "gract-scan1-vip"
..
released DHCP server lease for client ID "gract-gract1-vip" on port "67"
DHCP server was able to provide sufficient number of IP addresses
The DHCP server response time is within acceptable limits
Verification of DHCP Check was successful.
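If you want to cross-check such a cluvfy DHCP result at OS level, a packet trace of the DHCP ports is usually sufficient; the interface name eth1 below is only an example for the public network:

# run as root on the node executing the cluvfy DHCP check;
# DHCP/BOOTP uses UDP port 67 (server) and port 68 (client)
tcpdump -i eth1 -n 'udp port 67 or udp port 68'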
Why you should always review your cluvfy logs
Per default cluvfy logs are written to CV_HOME/cv/log:
[grid@gract1 ~/CLUVFY-JAN-2015]$ cluvfy stage -pre crsinst -n gract1
Performing pre-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "gract1"
Checking user equivalence...
User equivalence check passed for user "grid"
ERROR:
An error occurred in creating a TaskFactory object or in generating a task list
PRCT-1011 : Failed to run "oifcfg". Detailed error: []
PRCT-1011 : Failed to run "oifcfg". Detailed error: []
This error is not very helpful at all! Reviewing the cluvfy logfiles for details:
[root@gract1 log]# cd $GRID_HOME/cv/log
Cluvfy log cvutrace.log.0:
[Thread-49] [ 2015-01-22 08:51:25.283 CET ] [StreamReader.run:65] OUTPUT>PRIF-10: failed to initialize the cluster registry
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:144] runCommand: process returns 1
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:161] RunTimeExec: output>
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:164] PRIF-10: failed to initialize the cluster registry
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:170] RunTimeExec: error>
[main] [ 2015-01-22 08:51:25.286 CET ] [RuntimeExec.runCommand:192] Returning from RunTimeExec.runCommand
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:884] retval = 1
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:885] exitval = 1
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:886] rtErrLength = 0
[main] [ 2015-01-22 08:51:25.286 CET ] [CmdToolUtil.doexecuteLocally:892] Failed to execute command. Command = [/u01/app/121/grid/bin/oifcfg, getif, -from, gpnp] env = null error = []
[main] [ 2015-01-22 08:51:25.287 CET ] [ClusterNetworkInfo.getNetworkInfoFromOifcfg:152] INSTALLEXCEPTION: occured while getting cluster network info. messagePRCT-1011 : Failed to run "oifcfg". Detailed error: []
[main] [ 2015-01-22 08:51:25.287 CET ] [TaskFactory.getNetIfFromOifcfg:4352] Exception occured while getting network information. msg=PRCT-1011 : Failed to run "oifcfg". Detailed error: []
Here we get a better error message (PRIF-10: failed to initialize the cluster registry) and we can extract the failing command: /u01/app/121/grid/bin/oifcfg getif
Now we can retry that command at OS level:
[grid@gract1 ~/CLUVFY-JAN-2015]$ /u01/app/121/grid/bin/oifcfg getif
PRIF-10: failed to initialize the cluster registry
Btw, if you have installed the new cluvfy version you get a much better error output:
[grid@gract1 ~/CLUVFY-JAN-2015]$ bin/cluvfy stage -pre crsinst -n gract1
ERROR:
PRVG-1060 : Failed to retrieve the network interface classification information from an existing CRS home at path "/u01/app/121/grid" on the local node
PRCT-1011 : Failed to run "oifcfg". Detailed error:
PRIF-10: failed to initialize the cluster registry
For fixing the PRVG-1060, PRCT-1011 and PRIF-10 errors returned by the above cluvfy commands, please read the following article: Common cluvfy errors and warnings
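When digging through a large cvutrace.log.0, a simple grep helps to locate the failing command quickly. A sketch, using the trace location from this example (adjust the path for a standalone cluvfy install):

cd /u01/app/121/grid/cv/log        # or <CV_HOME>/cv/log for the standalone cluvfy
grep -niE 'error|fail|PRIF|PRCT|PRVG' cvutrace.log.0 | less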
Run cluvfy before CRS installation by passing network connections for PUBLIC and CLUSTER_INTERCONNECT
$ ./bin/cluvfy stage -pre crsinst -n grac121,grac122 -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect
Run cluvfy before doing an UPGRADE
[grid@grac41 /]$ cluvfy stage -pre crsinst -upgrade -n grac41,grac42,grac43 -rolling -src_crshome $GRID_HOME \
      -dest_crshome /u01/app/grid_new -dest_version 12.1.0.1.0 -fixup -fixupdir /tmp -verbose
Run cluvfy 12.1 for preparing a 10gR2 CRS installation
Always install newest cluvfy version even for 10gR2 CRS validations!
[root@ract1 ~]$ ./bin/cluvfy -version
12.1.0.1.0 Build 112713x8664
Verify OS setup on ract1
[root@ract1 ~]$ ./bin/cluvfy comp sys -p crs -r 10gR2 -n ract1 -verbose -fixup
--> Run required scripts
[root@ract1 ~]# /tmp/CVU_12.1.0.1.0_oracle/runfixup.sh
All Fix-up operations were completed successfully.
Repeat this step on ract2
[root@ract2 ~]$ ./bin/cluvfy comp sys -p crs -r 10gR2 -n ract2 -verbose -fixup
--> Run required scripts
[root@ract2 ~]# /tmp/CVU_12.1.0.1.0_oracle/runfixup.sh
All Fix-up operations were completed successfully.
Now verify System requirements on both nodes
[oracle@ract1 cluvfy12]$ ./bin/cluvfy comp sys -p crs -r 10gR2 -n ract1 -verbose -fixup
Verifying system requirement
..
NOTE: No fixable verification failures to fix
Finally run cluvfy to test CRS installation readiness
$ cluvfy12/bin/cluvfy stage -pre crsinst -r 10gR2 \
    -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect \
    -n ract1,ract2 -verbose
..
Pre-check for cluster services setup was successful.
Run cluvfy comp software to check file protections for GRID and RDBMS installations
- Note: not all files are checked (shell scripts like ohasd are missing) - Bug 18407533 - CLUVFY DOES NOT VERIFY ALL FILES
- Config File : $GRID_HOME/cv/cvdata/ora_software_cfg.xml
Run cluvfy comp software to verify GRID stack:
[grid@grac41 ~]$ cluvfy comp software -r 11gR2 -n grac41 -verbose
Verifying software
Check: Software
  1178 files verified
Software check passed
Verification of software was successful.

Run cluvfy comp software to verify RDBMS stack:
[oracle@grac43 ~]$ cluvfy comp software -d $ORACLE_HOME -r 11gR2 -verbose
Verifying software
Check: Software
  1780 files verified
Software check passed
Verification of software was successful.
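Since the reference file listed above drives this check, it can be useful to see how many file entries it actually contains; a rough count (assuming $GRID_HOME is set):

# count the <File .../> entries in the reference file used by cluvfy comp software
grep -c '<File ' $GRID_HOME/cv/cvdata/ora_software_cfg.xml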
Run cluvfy before CRS installation on a single node and create a script for fixable errors
$ ./bin/cluvfy comp sys -p crs -n grac121 -verbose -fixup
Verifying system requirement
Check: Total memory
  Node Name     Available                Required                 Status
  ------------  ------------------------ ------------------------ ----------
  grac121       3.7426GB (3924412.0KB)   4GB (4194304.0KB)        failed
Result: Total memory check failed
...
*****************************************************************************************
Following is the list of fixable prerequisites selected to fix in this session
******************************************************************************************
--------------                 ---------------   ----------------
Check failed.                  Failed on nodes   Reboot required?
--------------                 ---------------   ----------------
Hard Limit: maximum open       grac121           no
file descriptors
Execute "/tmp/CVU_12.1.0.1.0_grid/runfixup.sh" as root user on nodes "grac121" to perform the fix up operations manually
--> Now run runfixup.sh as root on nodes "grac121"
Press ENTER key to continue after execution of "/tmp/CVU_12.1.0.1.0_grid/runfixup.sh" has completed on nodes "grac121"
Fix: Hard Limit: maximum open file descriptors
  Node Name                             Status
  ------------------------------------  ------------------------
  grac121                               successful
Result: "Hard Limit: maximum open file descriptors" was successfully fixed on all the applicable nodes
Fix up operations were successfully completed on all the applicable nodes
Verification of system requirement was unsuccessful on all the specified nodes.

Note: errors like too low memory/swap need manual intervention:
Check: Total memory
  Node Name     Available                Required                 Status
  ------------  ------------------------ ------------------------ ----------
  grac121       3.7426GB (3924412.0KB)   4GB (4194304.0KB)        failed
Result: Total memory check failed
Fix that error at OS level and rerun the above cluvfy command.
Performing post-checks for hardware and operating system setup
- cluvfy stage -post hwos tests multicast communication with multicast group "230.0.1.0"
[grid@grac42 ~]$ cluvfy stage -post hwos -n grac42,grac43 -verbose
Performing post-checks for hardware and operating system setup

Checking node reachability...
Check: Node reachability from node "grac42"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  grac42                                yes
  grac43                                yes
Result: Node reachability check passed from node "grac42"

Checking user equivalence...
Check: User equivalence for user "grid"
  Node Name                             Status
  ------------------------------------  ------------------------
  grac43                                passed
  grac42                                passed
Result: User equivalence check passed for user "grid"

Checking node connectivity...
Checking hosts config file...
  Node Name                             Status
  ------------------------------------  ------------------------
  grac43                                passed
  grac42                                passed
Verification of the hosts config file successful

Interface information for node "grac43"
  Name   IP Address      Subnet         Gateway   Def. Gateway  HW Address         MTU
  ------ --------------- -------------- --------- ------------- ------------------ ----
  eth0   10.0.2.15       10.0.2.0       0.0.0.0   10.0.2.2      08:00:27:38:10:76  1500
  eth1   192.168.1.103   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:F6:18:43  1500
  eth1   192.168.1.59    192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:F6:18:43  1500
  eth1   192.168.1.170   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:F6:18:43  1500
  eth1   192.168.1.177   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:F6:18:43  1500
  eth2   192.168.2.103   192.168.2.0    0.0.0.0   10.0.2.2      08:00:27:1C:30:DD  1500
  eth2   169.254.125.13  169.254.0.0    0.0.0.0   10.0.2.2      08:00:27:1C:30:DD  1500
  virbr0 192.168.122.1   192.168.122.0  0.0.0.0   10.0.2.2      52:54:00:ED:19:7C  1500

Interface information for node "grac42"
  Name   IP Address      Subnet         Gateway   Def. Gateway  HW Address         MTU
  ------ --------------- -------------- --------- ------------- ------------------ ----
  eth0   10.0.2.15       10.0.2.0       0.0.0.0   10.0.2.2      08:00:27:6C:89:27  1500
  eth1   192.168.1.102   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth1   192.168.1.165   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth1   192.168.1.178   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth1   192.168.1.167   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth2   192.168.2.102   192.168.2.0    0.0.0.0   10.0.2.2      08:00:27:DF:79:B9  1500
  eth2   169.254.96.101  169.254.0.0    0.0.0.0   10.0.2.2      08:00:27:DF:79:B9  1500
  virbr0 192.168.122.1   192.168.122.0  0.0.0.0   10.0.2.2      52:54:00:ED:19:7C  1500

Check: Node connectivity for interface "eth1"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac43[192.168.1.103]           grac43[192.168.1.59]            yes
  grac43[192.168.1.103]           grac43[192.168.1.170]           yes
  ..
  grac42[192.168.1.165]           grac42[192.168.1.167]           yes
  grac42[192.168.1.178]           grac42[192.168.1.167]           yes
Result: Node connectivity passed for interface "eth1"

Check: TCP connectivity of subnet "192.168.1.0"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac42:192.168.1.102            grac43:192.168.1.103            passed
  grac42:192.168.1.102            grac43:192.168.1.59             passed
  grac42:192.168.1.102            grac43:192.168.1.170            passed
  grac42:192.168.1.102            grac43:192.168.1.177            passed
  grac42:192.168.1.102            grac42:192.168.1.165            passed
  grac42:192.168.1.102            grac42:192.168.1.178            passed
  grac42:192.168.1.102            grac42:192.168.1.167            passed
Result: TCP connectivity check passed for subnet "192.168.1.0"

Check: Node connectivity for interface "eth2"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac43[192.168.2.103]           grac42[192.168.2.102]           yes
Result: Node connectivity passed for interface "eth2"

Check: TCP connectivity of subnet "192.168.2.0"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac42:192.168.2.102            grac43:192.168.2.103            passed
Result: TCP connectivity check passed for subnet "192.168.2.0"

Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed.
Result: Node connectivity check passed

Checking multicast communication...
Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed.
Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of multicast communication passed.

Checking for multiple users with UID value 0
Result: Check for multiple users with UID value 0 passed

Check: Time zone consistency
Result: Time zone consistency check passed

Checking shared storage accessibility...
  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdb                              grac43
  /dev/sdk                              grac42
  ..
  Disk                                  Sharing Nodes (2 in count)
  ------------------------------------  ------------------------
  /dev/sdp                              grac43 grac42
Shared storage check was successful on nodes "grac43,grac42"

Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
Checking if "hosts" entry in file "/etc/nsswitch.conf" is consistent across nodes...
Checking file "/etc/nsswitch.conf" to make sure that only one "hosts" entry is defined
More than one "hosts" entry does not exist in any "/etc/nsswitch.conf" file
All nodes have same "hosts" entry defined in file "/etc/nsswitch.conf"
Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed

Post-check for hardware and operating system setup was successful.
Debugging Voting disk problems with: cluvfy comp vdisk
As your CRS stack may not be up, run these commands from a node which is up and running:
[grid@grac42 ~]$ cluvfy comp ocr -n grac41
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ERROR:
PRVF-4194 : Asm is not running on any of the nodes. Verification cannot proceed.
OCR integrity check failed
Verification of OCR integrity was unsuccessful on all the specified nodes.

[grid@grac42 ~]$ cluvfy comp vdisk -n grac41
Verifying Voting Disk:
Checking Oracle Cluster Voting Disk configuration...
ERROR:
PRVF-4194 : Asm is not running on any of the nodes. Verification cannot proceed.
ERROR:
PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdf1"
ERROR:
PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdg1"
ERROR:
PRVF-5157 : Could not verify ASM group "OCR" for Voting Disk location "/dev/asmdisk1_udev_sdh1"
PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
UDev attributes check for Voting Disk locations started...
UDev attributes check passed for Voting Disk locations
Verification of Voting Disk was unsuccessful on all the specified nodes.

Debugging steps at OS level: verify the disk protections and use kfed to read the disk header
[grid@grac41 ~/cluvfy]$ ls -l /dev/asmdisk1_udev_sdf1 /dev/asmdisk1_udev_sdg1 /dev/asmdisk1_udev_sdh1
b---------. 1 grid asmadmin 8,  81 May 14 09:51 /dev/asmdisk1_udev_sdf1
b---------. 1 grid asmadmin 8,  97 May 14 09:51 /dev/asmdisk1_udev_sdg1
b---------. 1 grid asmadmin 8, 113 May 14 09:51 /dev/asmdisk1_udev_sdh1
[grid@grac41 ~/cluvfy]$ kfed read /dev/asmdisk1_udev_sdf1
KFED-00303: unable to open file '/dev/asmdisk1_udev_sdf1'
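As a quick test at OS level you can restore the protections you expect on the ASM devices and rerun kfed. The ownership grid:asmadmin and mode 0660 below are assumptions based on a typical udev setup; the permanent fix belongs in the udev rules that create these devices.

# as root: temporary fix only - make the change permanent in your udev rule file
chown grid:asmadmin /dev/asmdisk1_udev_sdf1
chmod 660 /dev/asmdisk1_udev_sdf1
# as grid: verify that the ASM disk header is readable again
kfed read /dev/asmdisk1_udev_sdf1 | head -5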
Debugging file protection problems with: cluvfy comp software
- Related BUG: 18350484 : 112042GIPSU: "CLUVFY COMP SOFTWARE" FAILED IN 112042GIPSU IN HPUX
Investigate file protection problems with cluvfy comp software. Cluvfy checks file protections against ora_software_cfg.xml:
[grid@grac41 cvdata]$ cd /u01/app/11204/grid/cv/cvdata
[grid@grac41 cvdata]$ grep gpnp ora_software_cfg.xml
<File Path="bin/" Name="gpnpd.bin" Permissions="0755"/>
<File Path="bin/" Name="gpnptool.bin" Permissions="0755"/>
Change the protections and verify with cluvfy:
[grid@grac41 cvdata]$ chmod 444 /u01/app/11204/grid/bin/gpnpd.bin
[grid@grac41 cvdata]$ cluvfy comp software -verbose | grep gpnpd
/u01/app/11204/grid/bin/gpnpd.bin..."Permissions" did not match reference
Permissions of file "/u01/app/11204/grid/bin/gpnpd.bin" did not match the expected value. [Expected = "0755" ; Found = "0444"]
Now correct the problem and verify again:
[grid@grac41 cvdata]$ chmod 755 /u01/app/11204/grid/bin/gpnpd.bin
[grid@grac41 cvdata]$ cluvfy comp software -verbose | grep gpnpd
--> No errors were reported anymore
Debugging CTSSD/NTP problems with: cluvfy comp clocksync
[grid@grac41 ctssd]$ cluvfy comp clocksync -n grac41,grac42,grac43 -verbose
Verifying Clock Synchronization across the cluster nodes

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
  Node Name                             Status
  ------------------------------------  ------------------------
  grac43                                passed
  grac42                                passed
  grac41                                passed
Result: CTSS resource check passed

Querying CTSS for time offset on all nodes...
Result: Query of CTSS for time offset passed

Check CTSS state started...
Check: CTSS state
  Node Name                             State
  ------------------------------------  ------------------------
  grac43                                Observer
  grac42                                Observer
  grac41                                Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP

Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP Configuration file check passed

Checking daemon liveness...
Check: Liveness for "ntpd"
  Node Name                             Running?
  ------------------------------------  ------------------------
  grac43                                yes
  grac42                                yes
  grac41                                yes
Result: Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes

Checking NTP daemon command line for slewing option "-x"
Check: NTP daemon command line
  Node Name                             Slewing Option Set?
  ------------------------------------  ------------------------
  grac43                                yes
  grac42                                yes
  grac41                                yes
Result: NTP daemon slewing option check passed

Checking NTP daemon's boot time configuration, in file "/etc/sysconfig/ntpd", for slewing option "-x"
Check: NTP daemon's boot time configuration
  Node Name                             Slewing Option Set?
  ------------------------------------  ------------------------
  grac43                                yes
  grac42                                yes
  grac41                                yes
Result: NTP daemon's boot time configuration check for slewing option passed

Checking whether NTP daemon or service is using UDP port 123 on all nodes
Check for NTP daemon or service using UDP port 123
  Node Name                             Port Open?
  ------------------------------------  ------------------------
  grac43                                yes
  grac42                                yes
  grac41                                yes

NTP common Time Server Check started...
NTP Time Server ".LOCL." is common to all nodes on which the NTP daemon is running
Check of common NTP Time Server passed

Clock time offset check from NTP Time Server started...
Checking on nodes "[grac43, grac42, grac41]"...
Check: Clock time offset from NTP Time Server
Time Server: .LOCL.
Time Offset Limit: 1000.0 msecs
  Node Name     Time Offset               Status
  ------------  ------------------------  ------------------------
  grac43        0.0                       passed
  grac42        0.0                       passed
  grac41        0.0                       passed
Time Server ".LOCL." has time offsets that are within permissible limits for nodes "[grac43, grac42, grac41]".
Clock time offset check passed
Result: Clock synchronization check using Network Time Protocol(NTP) passed

Oracle Cluster Time Synchronization Services check passed
Verification of Clock Synchronization across the cluster nodes was successful.

At OS level you can run ntpq -p:
[root@grac41 dev]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*ns1.example.com LOCAL(0)        10 u   90  256  377    0.072  -238.49 205.610
 LOCAL(0)        .LOCL.          12 l  15h   64    0    0.000    0.000   0.000
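If one of the NTP checks complains about a missing slewing option, a quick look at the boot time configuration usually shows the problem. A sketch for OEL/RHEL; the OPTIONS line is only an example of what you would expect to see:

# check whether ntpd is started with the slewing option -x (run on every node)
grep -i options /etc/sysconfig/ntpd
# expected something like: OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
# after adding -x, restart the daemon as root
service ntpd restart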
Running cluvfy stage -post crsinst after a failed Clusterware startup
- Note: you should run cluvfy from a node which is up and running to get the best results
CRS resource status:
[grid@grac41 ~]$ my_crs_stat_init
NAME                           TARGET     STATE      SERVER       STATE_DETAILS
-----------------------------  ---------- ---------- ------------ ------------------
ora.asm                        ONLINE     OFFLINE                 Instance Shutdown
ora.cluster_interconnect.haip  ONLINE     OFFLINE
ora.crf                        ONLINE     ONLINE     grac41
ora.crsd                       ONLINE     OFFLINE
ora.cssd                       ONLINE     OFFLINE                 STARTING
ora.cssdmonitor                ONLINE     ONLINE     grac41
ora.ctssd                      ONLINE     OFFLINE
ora.diskmon                    OFFLINE    OFFLINE
ora.drivers.acfs               ONLINE     OFFLINE
ora.evmd                       ONLINE     OFFLINE
ora.gipcd                      ONLINE     ONLINE     grac41
ora.gpnpd                      ONLINE     ONLINE     grac41
ora.mdnsd                      ONLINE     ONLINE     grac41

Verify the CRS status with cluvfy (CRS on grac42 is up and running):
[grid@grac42 ~]$ cluvfy stage -post crsinst -n grac41,grac42 -verbose
Performing post-checks for cluster services setup

Checking node reachability...
Check: Node reachability from node "grac42"
  Destination Node                      Reachable?
  ------------------------------------  ------------------------
  grac42                                yes
  grac41                                yes
Result: Node reachability check passed from node "grac42"

Checking user equivalence...
Check: User equivalence for user "grid"
  Node Name                             Status
  ------------------------------------  ------------------------
  grac42                                passed
  grac41                                passed
Result: User equivalence check passed for user "grid"

Checking node connectivity...
Checking hosts config file...
  Node Name                             Status
  ------------------------------------  ------------------------
  grac42                                passed
  grac41                                passed
Verification of the hosts config file successful

Interface information for node "grac42"
  Name   IP Address      Subnet         Gateway   Def. Gateway  HW Address         MTU
  ------ --------------- -------------- --------- ------------- ------------------ ----
  eth0   10.0.2.15       10.0.2.0       0.0.0.0   10.0.2.2      08:00:27:6C:89:27  1500
  eth1   192.168.1.102   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth1   192.168.1.59    192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth1   192.168.1.178   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth1   192.168.1.170   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:63:08:07  1500
  eth2   192.168.2.102   192.168.2.0    0.0.0.0   10.0.2.2      08:00:27:DF:79:B9  1500
  eth2   169.254.96.101  169.254.0.0    0.0.0.0   10.0.2.2      08:00:27:DF:79:B9  1500
  virbr0 192.168.122.1   192.168.122.0  0.0.0.0   10.0.2.2      52:54:00:ED:19:7C  1500

Interface information for node "grac41"
  Name   IP Address      Subnet         Gateway   Def. Gateway  HW Address         MTU
  ------ --------------- -------------- --------- ------------- ------------------ ----
  eth0   10.0.2.15       10.0.2.0       0.0.0.0   10.0.2.2      08:00:27:82:47:3F  1500
  eth1   192.168.1.101   192.168.1.0    0.0.0.0   10.0.2.2      08:00:27:89:E9:A2  1500
  eth2   192.168.2.101   192.168.2.0    0.0.0.0   10.0.2.2      08:00:27:6B:E2:BD  1500
  virbr0 192.168.122.1   192.168.122.0  0.0.0.0   10.0.2.2      52:54:00:ED:19:7C  1500

Check: Node connectivity for interface "eth1"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac42[192.168.1.102]           grac42[192.168.1.59]            yes
  grac42[192.168.1.102]           grac42[192.168.1.178]           yes
  grac42[192.168.1.102]           grac42[192.168.1.170]           yes
  grac42[192.168.1.102]           grac41[192.168.1.101]           yes
  grac42[192.168.1.59]            grac42[192.168.1.178]           yes
  grac42[192.168.1.59]            grac42[192.168.1.170]           yes
  grac42[192.168.1.59]            grac41[192.168.1.101]           yes
  grac42[192.168.1.178]           grac42[192.168.1.170]           yes
  grac42[192.168.1.178]           grac41[192.168.1.101]           yes
  grac42[192.168.1.170]           grac41[192.168.1.101]           yes
Result: Node connectivity passed for interface "eth1"

Check: TCP connectivity of subnet "192.168.1.0"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac42:192.168.1.102            grac42:192.168.1.59             passed
  grac42:192.168.1.102            grac42:192.168.1.178            passed
  grac42:192.168.1.102            grac42:192.168.1.170            passed
  grac42:192.168.1.102            grac41:192.168.1.101            passed
Result: TCP connectivity check passed for subnet "192.168.1.0"

Check: Node connectivity for interface "eth2"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac42[192.168.2.102]           grac41[192.168.2.101]           yes
Result: Node connectivity passed for interface "eth2"

Check: TCP connectivity of subnet "192.168.2.0"
  Source                          Destination                     Connected?
  ------------------------------  ------------------------------  ----------------
  grac42:192.168.2.102            grac41:192.168.2.101            passed
Result: TCP connectivity check passed for subnet "192.168.2.0"

Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed.
Result: Node connectivity check passed

Checking multicast communication...
Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed.
Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of multicast communication passed.

Check: Time zone consistency
Result: Time zone consistency check passed

Checking Oracle Cluster Voting Disk configuration...
ERROR:
PRVF-4193 : Asm is not running on the following nodes. Proceeding with the remaining nodes.
--> Expected error as the lower CRS stack is not completely up and running
  grac41
Oracle Cluster Voting Disk configuration check passed

Checking Cluster manager integrity...
Checking CSS daemon...
  Node Name                             Status
  ------------------------------------  ------------------------
  grac42                                running
  grac41                                not running
ERROR:
PRVF-5319 : Oracle Cluster Synchronization Services do not appear to be online.
Cluster manager integrity check failed
--> Expected error as the lower CRS stack is not completely up and running

UDev attributes check for OCR locations started...
Result: UDev attributes check passed for OCR locations
UDev attributes check for Voting Disk locations started...
Result: UDev attributes check passed for Voting Disk locations

Check default user file creation mask
  Node Name     Available                 Required                  Comment
  ------------  ------------------------  ------------------------  ----------
  grac42        22                        0022                      passed
  grac41        22                        0022                      passed
Result: Default user file creation mask check passed

Checking cluster integrity...
  Node Name
  ------------------------------------
  grac41
  grac42
  grac43
Cluster integrity check failed
This check did not run on the following node(s): grac41

Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ERROR:
PRVF-4193 : Asm is not running on the following nodes. Proceeding with the remaining nodes.
  grac41
--> Expected error as the lower CRS stack is not completely up and running

Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
ERROR:
PRVF-4195 : Disk group for ocr location "+OCR" not available on the following nodes:
  grac41
--> Expected error as the lower CRS stack is not completely up and running
NOTE: This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check failed

Checking CRS integrity...
Clusterware version consistency passed
The Oracle Clusterware is healthy on node "grac42"
ERROR:
PRVF-5305 : The Oracle Clusterware is not healthy on node "grac41"
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
CRS-4534: Cannot communicate with Event Manager
CRS integrity check failed
--> Expected error as the lower CRS stack is not completely up and running

Checking node application existence...
Checking existence of VIP node application (required)
  Node Name     Required                  Running?                  Comment
  ------------  ------------------------  ------------------------  ----------
  grac42        yes                       yes                       passed
  grac41        yes                       no                        exists
VIP node application is offline on nodes "grac41"

Checking existence of NETWORK node application (required)
  Node Name     Required                  Running?                  Comment
  ------------  ------------------------  ------------------------  ----------
  grac42        yes                       yes                       passed
  grac41        yes                       no                        failed
PRVF-4570 : Failed to check existence of NETWORK node application on nodes "grac41"
--> Expected error as the lower CRS stack is not completely up and running

Checking existence of GSD node application (optional)
  Node Name     Required                  Running?                  Comment
  ------------  ------------------------  ------------------------  ----------
  grac42        no                        no                        exists
  grac41        no                        no                        exists
GSD node application is offline on nodes "grac42,grac41"

Checking existence of ONS node application (optional)
  Node Name     Required                  Running?                  Comment
  ------------  ------------------------  ------------------------  ----------
  grac42        no                        yes                       passed
  grac41        no                        no                        failed
PRVF-4576 : Failed to check existence of ONS node application on nodes "grac41"
--> Expected error as the lower CRS stack is not completely up and running

Checking Single Client Access Name (SCAN)...
  SCAN Name                      Node      Running?   ListenerName     Port   Running?
  ----------------               --------  ---------  ---------------  -----  ---------
  grac4-scan.grid4.example.com   grac43    true       LISTENER_SCAN1   1521   true
  grac4-scan.grid4.example.com   grac42    true       LISTENER_SCAN2   1521   true

Checking TCP connectivity to SCAN Listeners...
  Node          ListenerName              TCP connectivity?
  ------------  ------------------------  ------------------------
  grac42        LISTENER_SCAN1            yes
  grac42        LISTENER_SCAN2            yes
TCP connectivity to SCAN Listeners exists on all cluster nodes

Checking name resolution setup for "grac4-scan.grid4.example.com"...
Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
Checking if "hosts" entry in file "/etc/nsswitch.conf" is consistent across nodes...
Checking file "/etc/nsswitch.conf" to make sure that only one "hosts" entry is defined
More than one "hosts" entry does not exist in any "/etc/nsswitch.conf" file
All nodes have same "hosts" entry defined in file "/etc/nsswitch.conf"
Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed

  SCAN Name                      IP Address                Status                    Comment
  ------------                   ------------------------  ------------------------  ----------
  grac4-scan.grid4.example.com   192.168.1.165             passed
  grac4-scan.grid4.example.com   192.168.1.168             passed
  grac4-scan.grid4.example.com   192.168.1.170             passed
Verification of SCAN VIP and Listener setup passed

Checking OLR integrity...
Checking OLR config file...
ERROR:
PRVF-4184 : OLR config file check failed on the following nodes:
  grac41
  grac41:Group of file "/etc/oracle/olr.loc" did not match the expected value. [Expected = "oinstall" ; Found = "root"]
Fix:
[grid@grac41 ~]$ ls -l /etc/oracle/olr.loc
-rw-r--r--. 1 root root 81 May 11 14:02 /etc/oracle/olr.loc
[root@grac41 Desktop]# chown root:oinstall /etc/oracle/olr.loc
Checking OLR file attributes...
OLR file check successful
OLR integrity check failed

Checking GNS integrity...
Checking if the GNS subdomain name is valid...
The GNS subdomain name "grid4.example.com" is a valid domain name
Checking if the GNS VIP belongs to same subnet as the public network...
Public network subnets "192.168.1.0" match with the GNS VIP "192.168.1.0"
Checking if the GNS VIP is a valid address...
GNS VIP "192.168.1.59" resolves to a valid IP address
Checking the status of GNS VIP...
Checking if FDQN names for domain "grid4.example.com" are reachable
PRVF-5216 : The following GNS resolved IP addresses for "grac4-scan.grid4.example.com" are not reachable: "192.168.1.168"
PRKN-1035 : Host "192.168.1.168" is unreachable
--> GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
Checking status of GNS resource...
  Node          Running?                  Enabled?
  ------------  ------------------------  ------------------------
  grac42        yes                       yes
  grac41        no                        yes
GNS resource configuration check passed
Checking status of GNS VIP resource...
  Node          Running?                  Enabled?
  ------------  ------------------------  ------------------------
  grac42        yes                       yes
  grac41        no                        yes
GNS VIP resource configuration check passed.
GNS integrity check passed

OCR detected on ASM. Running ACFS Integrity checks...
Starting check to see if ASM is running on all cluster nodes...
PRVF-5110 : ASM is not running on nodes: "grac41,"
--> Expected error as the lower CRS stack is not completely up and running
Starting Disk Groups check to see if at least one Disk Group configured...
Disk Group Check passed. At least one Disk Group configured
Task ACFS Integrity check failed

Checking to make sure user "grid" is not in "root" group
  Node Name     Status                    Comment
  ------------  ------------------------  ------------------------
  grac42        passed                    does not exist
  grac41        passed                    does not exist
Result: User "grid" is not part of "root" group. Check passed

Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed

Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
  Node Name                             Status
  ------------------------------------  ------------------------
  grac42                                passed
  grac41                                failed
PRVF-9671 : CTSS on node "grac41" is not in ONLINE state, when checked with command "/u01/app/11204/grid/bin/crsctl stat resource ora.ctssd -init"
--> Expected error as the lower CRS stack is not completely up and running
Result: Check of CTSS resource passed on all nodes

Querying CTSS for time offset on all nodes...
Result: Query of CTSS for time offset passed

Check CTSS state started...
Check: CTSS state
  Node Name                             State
  ------------------------------------  ------------------------
  grac42                                Observer
CTSS is in Observer state. Switching over to clock synchronization checks using NTP

Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP Configuration file check passed

Checking daemon liveness...
Check: Liveness for "ntpd"
  Node Name                             Running?
  ------------------------------------  ------------------------
  grac42                                yes
Result: Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes

Checking NTP daemon command line for slewing option "-x"
Check: NTP daemon command line
  Node Name                             Slewing Option Set?
  ------------------------------------  ------------------------
  grac42                                yes
Result: NTP daemon slewing option check passed

Checking NTP daemon's boot time configuration, in file "/etc/sysconfig/ntpd", for slewing option "-x"
Check: NTP daemon's boot time configuration
  Node Name                             Slewing Option Set?
  ------------------------------------  ------------------------
  grac42                                yes
Result: NTP daemon's boot time configuration check for slewing option passed

Checking whether NTP daemon or service is using UDP port 123 on all nodes
Check for NTP daemon or service using UDP port 123
  Node Name                             Port Open?
  ------------------------------------  ------------------------
  grac42                                yes

NTP common Time Server Check started...
NTP Time Server ".LOCL." is common to all nodes on which the NTP daemon is running
Check of common NTP Time Server passed

Clock time offset check from NTP Time Server started...
Checking on nodes "[grac42]"...
Check: Clock time offset from NTP Time Server
Time Server: .LOCL.
Time Offset Limit: 1000.0 msecs
  Node Name     Time Offset               Status
  ------------  ------------------------  ------------------------
  grac42        0.0                       passed
Time Server ".LOCL." has time offsets that are within permissible limits for nodes "[grac42]".
Clock time offset check passed
Result: Clock synchronization check using Network Time Protocol(NTP) passed
PRVF-9652 : Cluster Time Synchronization Services check failed
--> Expected error as the lower CRS stack is not completely up and running

Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.

Post-check for cluster services setup was unsuccessful.
Checks did not pass for the following node(s):
  grac41
Verify your DHCP setup (only if using GNS)
[root@gract1 Desktop]# cluvfy comp dhcp -clustername gract -verbose
Checking if any DHCP server exists on the network...
PRVG-5723 : Network CRS resource is configured to use DHCP provided IP addresses
Verification of DHCP Check was unsuccessful on all the specified nodes.
--> If the network resource is ONLINE you are not allowed to run this command
DESCRIPTION:
Checks if DHCP server exists on the network and is capable of providing required number of IP addresses. This check also verifies the response time for the DHCP server. The checks are all done on the local node. For port values less than 1024 CVU needs to be run as root user. If -networks is specified and it contains a PUBLIC network then DHCP packets are sent on the public network. By default the network on which the host IP is specified is used. This check must not be done while default network CRS resource configured to use DHCP provided IP address is online.
In my case even stopping the nodeapps did not help. Only after a full cluster shutdown did the command actually query the DHCP server!
[root@gract1 Desktop]# cluvfy comp dhcp -clustername gract -verbose
Verifying DHCP Check
Checking if any DHCP server exists on the network...
Checking if network CRS resource is configured and online
Network CRS resource is offline or not configured. Proceeding with DHCP checks.
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.170/255.255.255.0, lease time: 21600
At least one DHCP server exists on the network and is listening on port 67
Checking if DHCP server has sufficient free IP addresses for all VIPs...
Sending DHCP "DISCOVER" packets for client ID "gract-scan1-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.170/255.255.255.0, lease time: 21600
Sending DHCP "REQUEST" packets for client ID "gract-scan1-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.170/255.255.255.0, lease time: 21600
Sending DHCP "DISCOVER" packets for client ID "gract-scan2-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.169/255.255.255.0, lease time: 21600
Sending DHCP "REQUEST" packets for client ID "gract-scan2-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.169/255.255.255.0, lease time: 21600
Sending DHCP "DISCOVER" packets for client ID "gract-scan3-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.168/255.255.255.0, lease time: 21600
Sending DHCP "REQUEST" packets for client ID "gract-scan3-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.168/255.255.255.0, lease time: 21600
Sending DHCP "DISCOVER" packets for client ID "gract-gract1-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.174/255.255.255.0, lease time: 21600
Sending DHCP "REQUEST" packets for client ID "gract-gract1-vip"
CRS-10009: DHCP server returned server: 192.168.1.50, loan address : 192.168.1.174/255.255.255.0, lease time: 21600
CRS-10012: released DHCP server lease for client ID gract-scan1-vip on port 67
CRS-10012: released DHCP server lease for client ID gract-scan2-vip on port 67
CRS-10012: released DHCP server lease for client ID gract-scan3-vip on port 67
CRS-10012: released DHCP server lease for client ID gract-gract1-vip on port 67
DHCP server was able to provide sufficient number of IP addresses
The DHCP server response time is within acceptable limits
Verification of DHCP Check was successful.
On the nameserver, /var/log/messages shows the following:
Jan 21 14:42:53 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPOFFER on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPOFFER on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:54 ns1 dhcpd: DHCPOFFER on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:55 ns1 dhcpd: Wrote 6 leases to leases file.
Jan 21 14:42:55 ns1 dhcpd: DHCPREQUEST for 192.168.1.170 (192.168.1.50) from 00:00:00:00:00:00 via eth2
Jan 21 14:42:55 ns1 dhcpd: DHCPACK on 192.168.1.170 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:55 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
Jan 21 14:42:56 ns1 dhcpd: DHCPOFFER on 192.168.1.169 to 00:00:00:00:00:00 via eth2
Jan 21 14:42:56 ns1 dhcpd: DHCPDISCOVER from 00:00:00:00:00:00 via eth2
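One way to get the check to run, sketched here with the Grid home path used earlier in this article, is to shut down the clusterware stack completely, rerun the DHCP check, and restart the stack afterwards:

# run as root on every node to bring the complete stack down
/u01/app/121/grid/bin/crsctl stop crs
# then, on one node, rerun the check and restart the stack on every node afterwards
cluvfy comp dhcp -clustername gract -verbose
/u01/app/121/grid/bin/crsctl start crs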