RAC Wait Events Summary

GC Current Block 2-Way/3-Way

  • For a current block requested in read mode a KJUSERPR ( Protected Read ) lock is requested
  • Wait event can occur for READ and WRITE activities
  • If accessing a locally mastered block on the local instance no GCS lock is needed ( FG can access this block without any GCS support – Affinity locking )
  • Excessive waits for gc current block are either related to an inefficient execution plan ( QEP ) leading to numerous block visits or to application affinity not being in play ( see the query sketch below )
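
A quick first check for excessive waits is the system-wide wait statistics. The following is a minimal sketch ( assuming the standard GV$SYSTEM_EVENT view ) reporting totals and average latency per instance for the gc block and grant events discussed in this summary:

  select inst_id, event, total_waits,
         round(time_waited_micro/greatest(total_waits,1)/1000,2) avg_ms
    from gv$system_event
   where event like 'gc current block%'
      or event like 'gc cr block%'
      or event like 'gc%grant 2-way'
   order by time_waited_micro desc;

For the grant events the average latency is a reasonable proxy for the interconnect round-trip time, as LMS does very little work for a grant.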

GC CR Block 2-Way/3-Way ( for details please read the following article )

  • CR block transfers are requested for read-only access and are specific to a certain session and SQL statement
  • The next execution of the same statement will trigger this wait event again, as the SCN has increased and the CR block needs to be refreshed
  • No locks are maintained by GCS for CR blocks
  • A long-pending transaction on a highly accessed object can lead to a CR storm ( run this sort of transaction in a less busy timeframe – see the query sketch below )
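
To spot such long-pending transactions, the following sketch ( assuming the standard GV$TRANSACTION and GV$SESSION views ) lists open transactions ordered by the amount of undo they hold:

  select s.inst_id, s.sid, s.serial#, s.username, t.start_time, t.used_ublk, t.used_urec
    from gv$transaction t, gv$session s
   where t.inst_id = s.inst_id
     and t.addr    = s.taddr
   order by t.used_ublk desc;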

GC CR Grant 2-Way/GC Current Grant 2-Way

  • If a block is not resident in any buffer cache, LMS grants the FG process permission to read the block from disk
  • Excessive waits for the gc cr/current grant 2-way wait events can be caused by an undersized buffer cache or by SQL statements flushing the buffer cache
  • These messages can be used to measure your network performance, as very little processing is done for this event

GC CR Block Busy/GC Current Block Busy

 

  • LMS performed additional work due to concurrency-related issues ( e.g. building a CR block and applying UNDO to reconstruct a block consistent with the query SCN )

GC CR Block Congested/GC Current Block Congested

  • If the LMS process did not process a request within 1 ms, LMS marks the response to that block request with the congestion wait event
  • Root cause: LMS is suffering from CPU scheduling delays or is short on resources such as memory ( paging )
  • As LMS processes are real-time ( RT ) processes, OS scheduling delays should be minimal

GC current request/GC CR request

  • These are placeholder events which should be mapped to one of the above wait events once the LMS responds
  • If there is no response from LMS within 0.5 s ( 6 s on Windows ) the accounted time is added to the gc lost block wait event

GC Log Flush Sync

  • For details please read the following link
  • In a healthy database 90 % of the gcs log flush sync requests should finish in 2 ms or less
  • Always check the related instances for the LOG FILE SYNC wait event, as this event also reduces the available redo I/O bandwidth
  • Monitor the wait distribution/histograms with the following script : @event_histogram_from_awr.sql ( a minimal alternative query is sketched below )
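
The referenced script is not reproduced here; as a minimal interactive alternative ( assuming the standard GV$EVENT_HISTOGRAM view ) the wait distribution can be checked with:

  select inst_id, wait_time_milli, wait_count
    from gv$event_histogram
   where event = 'gcs log flush sync'
   order by inst_id, wait_time_milli;

On a healthy system most of the wait count should fall into the buckets up to 2 ms.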

 

GC Current Block Busy/GC CR block busy

  • Busy events indicate that LMS needs to perform additional work before sending a block ( e.g. applying redo to build a CR block )

GC Buffer Busy Acquire/Release

  • For right-hand side ( RHS ) index growth you see gc buffer busy acquire/release events and gc current request waits for the same block
  • If the block is currently accessed on instance 1, the other sessions on instance 1 wait on gc buffer busy acquire, whereas sessions on the other instances wait on gc buffer busy release
  • Always monitor GV$SESSION P1, P2, P3 to get detailed information about which blocks are involved ( see the sketch below )
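
A minimal sketch of such a check ( note: the exact meaning of P1/P2/P3 differs per event – verify it against V$EVENT_NAME PARAMETER1-3; for the gc buffer busy events they typically map to file#, block# and class# ):

  select inst_id, sid, event, p1 file#, p2 block#, p3 class#
    from gv$session
   where event like 'gc buffer busy%'
      or event like 'gc current request%'
      or event like 'gc cr request%';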

 

Relocate OCR and Voting Disks to a different ASM diskgroup ( 11.2.0.4 )

Create a new diskgroup for OCR and Voting disk

Use asmca to create a new diskgroup named OCR and verify that this diskgroup is mounted
$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  1048576     40944    25698            10236            7731              0             Y  DATA/
MOUNTED  NORMAL  N         512   4096  1048576      6141     5730             2047            1841              0             N  OCR/
Attention:
  To avoid error CRS-4602: Failed 27 to add voting file .. while running $GRID_HOME/bin/crsctl  replace votedisk, double check that
  the newly created diskgroup is mounted on all cluster instances by running 
$ asmcmd lsdg
$ asmcmd lsdsk 
on each instance.
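
With the new diskgroup mounted on all instances, the actual relocation is a short sequence. A sketch ( run as root; assuming the new diskgroup is named +OCR and the OCR/voting files currently reside in +DATA ):

# $GRID_HOME/bin/crsctl replace votedisk +OCR
# $GRID_HOME/bin/ocrconfig -add +OCR
# $GRID_HOME/bin/ocrconfig -delete +DATA

Afterwards verify the result with crsctl query css votedisk and ocrcheck.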

Trace File Analyzer Collector – TFA Collector Usage and installation

Best TFA practices

  • To clean up the TFA Berkeley DB ( BDB ) you need to uninstall/reinstall TFA
  • Always use at least 4 h for the diag collection period ( 4 h is the default )
  • Check the TFA repository for free space
  • Add your OSWatcher files  by running : tfactl directory add /home/OSW/oswbb/archive
  • Always provide graphical output for OSWatcher data for each node
  •  Sample:   # java -jar oswbba.jar -i  ./new_archive  -P grac1 -6 -7 -8  -B  Oct 23 10:00:00 2013 -E Oct 23 12:00:00 2013
  • Check that zip files for all nodes are created – rerun the trace collection command if not
# ls /u01/app/grid/tfa/repository/collection_Wed_Oct_23_12_26_11_CEST_2013_node_all/*.zip
/u01/app/grid/tfa/repository/collection_Wed_Oct_23_12_26_11_CEST_2013_node_all/grac1.tfa_Wed_Oct_23_12_26_11_CEST_2013.zip
/u01/app/grid/tfa/repository/collection_Wed_Oct_23_12_26_11_CEST_2013_node_all/grac2.tfa_Wed_Oct_23_12_26_11_CEST_2013.zip
/u01/app/grid/tfa/repository/collection_Wed_Oct_23_12_26_11_CEST_2013_node_all/grac3.tfa_Wed_Oct_23_12_26_11_CEST_2013.zip
  • Check that the above zip files contain the major trace files ( RDBMS alert.log / CRS alert.log / ocssd.log )
# unzip -l grac1.tfa_Wed_Oct_23_12_26_11_CEST_2013.zip | egrep 'alert|ocssd.log'
  2019952  10-23-2013 12:26   grac1//u01/app/11203/grid/log/grac1/cssd/ocssd.log
    10802  10-23-2013 12:27   grac1/rdbms/grace2/grac21/trace/alert_grac21.log
      353  10-23-2013 12:27   grac1//u01/app/11203/grid/log/grac1/alertgrac1.log

Reinstall TFA ( > 11.2.0.4 ) on a specific node

Check status 
[root@grac42 ~]# $GRID_HOME/bin/tfactl print status
.----------------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+--------+---------------+-------+------+------------+----------------------+------------------+
| grac42 | RUNNING       |  1015 | 5000 | 12.1.2.0.0 | 12120020140619094932 | COMPLETE         |
| grac41 | RUNNING       | 31453 | 5000 | 12.1.2.0.0 | 12120020140619094932 | COMPLETE         |
'--------+---------------+-------+------+------------+----------------------+------------------'

Uninstall TFA on grac42
[root@grac42 ~]#  $GRID_HOME/bin/tfactl  uninstall
TFA will be Uninstalled on Node grac42: 
Removing TFA from grac42 only
Please remove TFA locally on any other configured nodes
.
[root@grac41 Desktop]#  $GRID_HOME/bin/tfactl print status
.----------------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+--------+---------------+-------+------+------------+----------------------+------------------+
| grac41 | RUNNING       | 31453 | 5000 | 12.1.2.0.0 | 12120020140619094932 | COMPLETE         |
'--------+---------------+-------+------+------------+----------------------+------------------'
--> TFA removed on node grac42

Re-install TFA 
[root@grac42 ~]# $GRID_HOME/crs/install/tfa_setup.sh -silent -crshome /u01/app/11204/grid
Starting TFA installation
Using JAVA_HOME : /u01/app/11204/grid/jdk/jre
..
Installing TFA on grac42:
HOST: grac42    TFA_HOME: /u01/app/11204/grid/tfa/grac42/tfa_home

.--------------------------------------------------------------------------.
| Host   | Status of TFA | PID  | Port | Version    | Build ID             |
+--------+---------------+------+------+------------+----------------------+
| grac42 | RUNNING       | 8457 | 5000 | 12.1.2.0.0 | 12120020140619094932 |
'--------+---------------+------+------+------------+----------------------'

Verify TFA status
[root@grac42 ~]#    $GRID_HOME/bin/tfactl print status
.----------------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+--------+---------------+-------+------+------------+----------------------+------------------+
| grac42 | RUNNING       |  8457 | 5000 | 12.1.2.0.0 | 12120020140619094932 | COMPLETE         |
| grac41 | RUNNING       | 31453 | 5000 | 12.1.2.0.0 | 12120020140619094932 | COMPLETE         |
'--------+---------------+-------+------+------------+----------------------+------------------'
--> TFA reinstalled on node grac42

Installation of TFA Collector on top of 11.2.0.3

#  ./installTFALite.sh
Starting TFA installation
Enter a location for installing TFA (/tfa will be appended if not supplied) [/home/oracle/RAC/TFA]: 
  /u01/app/grid
Enter a Java Home that contains Java 1.5 or later : 
  /usr/lib/jvm/jre-1.6.0-openjdk.x86_64
Would you like to do a [L]ocal only or [C]lusterwide installation ? [L|l|C|c] [C] : C
TFA Will be Installed on the Following Nodes:
++++++++++++++++++++++++++++++++++++++++++++
Install Nodes
=============
grac1
grac2
grac3
Do you wish to make changes to the Node List ? [Y/y/N/n] [N] N
Installing TFA on grac1
HOST: grac1    TFA_HOME: /u01/app/grid/tfa/grac1/tfa_home
Installing TFA on grac2
HOST: grac2    TFA_HOME: /u01/app/grid/tfa/grac2/tfa_home
Installing TFA on grac3
HOST: grac3    TFA_HOME: /u01/app/grid/tfa/grac3/tfa_home
Host grac2 is part of TFA cluster
Host grac3 is part of TFA cluster
.-----------------------------------------------.
| Host  | Status of TFA | PID  | Port | Version |
+-------+---------------+------+------+---------+
| grac1 | RUNNING       | 9241 | 5000 | 2.5.1.5 |
| grac2 | RUNNING       | 6462 | 5000 | 2.5.1.5 |
| grac3 | RUNNING       | 6555 | 5000 | 2.5.1.5 |
'-------+---------------+------+------+---------'
Summary of TFA Installation:
.--------------------------------------------------------.
|                          grac1                         |
+---------------------+----------------------------------+
| Parameter           | Value                            |
+---------------------+----------------------------------+
| Install location    | /u01/app/grid/tfa/grac1/tfa_home |
| Repository location | /u01/app/grid/tfa/repository     |
| Repository usage    | 0 MB out of 2629 MB              |
'---------------------+----------------------------------'
.--------------------------------------------------------.
|                          grac2                         |
+---------------------+----------------------------------+
| Parameter           | Value                            |
+---------------------+----------------------------------+
| Install location    | /u01/app/grid/tfa/grac2/tfa_home |
| Repository location | /u01/app/grid/tfa/repository     |
| Repository usage    | 0 MB out of 2629 MB              |
'---------------------+----------------------------------'
.--------------------------------------------------------.
|                          grac3                         |
+---------------------+----------------------------------+
| Parameter           | Value                            |
+---------------------+----------------------------------+
| Install location    | /u01/app/grid/tfa/grac3/tfa_home |
| Repository location | /u01/app/grid/tfa/repository     |
| Repository usage    | 0 MB out of 2629 MB              |
'---------------------+----------------------------------'
Removed ssh equivalency setup on grac3
Removed ssh equivalency setup on grac2
TFA is successfully installed..

Start , stop and shutdown TFA

Stop TFA
# /etc/init.d/init.tfa stop
Stopping TFA
TFA is running  - Will wait 5 seconds (up to 3 times)  
TFA is running  - Will wait 5 seconds (up to 3 times)  
TFA is running  - Will wait 5 seconds (up to 3 times)  
TFAmain Force Stopped Successfully
. . . 
Successfully stopped TFA..

Start TFA
# /etc/init.d/init.tfa  start
Starting TFA..
start: Job is already running: oracle-tfa
Waiting up to 100 seconds for TFA to be started..
. . . . . 
.
. . . . . 
Successfully started TFA Process..
. . . . . 
TFA Started and listening for commands

Other useful  commands 
Restart TFA
# /etc/init.d/init.tfa  restart

Stop the TFAMain process and remove related inittab entries
# /etc/init.d/init.tfa shutdown

Verify TFA status

Verify TFA runtime status
# /u01/app/grid/tfa/bin/tfactl print status
.-----------------------------------------------.
| Host  | Status of TFA | PID  | Port | Version |
+-------+---------------+------+------+---------+
| grac1 | RUNNING       | 9241 | 5000 | 2.5.1.5 |
| grac2 | RUNNING       | 6462 | 5000 | 2.5.1.5 |
| grac3 | RUNNING       | 6555 | 5000 | 2.5.1.5 |
'-------+---------------+------+------+---------'

Show current TFA configuration

# /u01/app/grid/tfa/bin/tfactl print config
.---------------------------------------------------.
| Configuration Parameter                 | Value   |
+-----------------------------------------+---------+
| TFA version                             | 2.5.1.5 |
| Automatic diagnostic collection         | OFF     |
| Trimming of files during diagcollection | ON      |
| Repository current size (MB) in grac1   | 6       |
| Repository maximum size (MB) in grac1   | 2629    |
| Trace level                             | 1       |
'-----------------------------------------+---------'

Use tfactl to check for outstanding actions ( for example during diagcollect )

#   $GRID_HOME/tfa/bin/tfactl  print  actions
.------------------------------------------------------------------------.
| HOST   | TIME         | ACTION         | STATUS  | COMMENTS            |
+--------+--------------+----------------+---------+---------------------+
| grac42 | Mar 04 08:56 | Collect traces | RUNNING | Collection details: |
|        |              | & zip          |         |                     |
|        |              |                |         | Zip file:           |
|        |              |                |         | tfa_Tue_Mar_4_08_56 |
|        |              |                |         | _25_CET_2014.zip    |
|        |              |                |         | Tag:                |
|        |              |                |         | collection_Tue_Mar_ |
|        |              |                |         | 4_08_56_25_CET_2014 |
|        |              |                |         | _node_all           |
+--------+--------------+----------------+---------+---------------------+
| grac42 | Mar 04 08:56 | Run inventory  | RUNNING | RDBMS               |
|        |              |                |         | all:ASM:CRS:DBWLM:A |
|        |              |                |         | CFS:CRS:ASM:OS:INST |
|        |              |                |         | ALL:TNS:CHMOS       |
+--------+--------------+----------------+---------+---------------------+
| grac41 | Mar 04 08:56 | Run inventory  | RUNNING | -c:RDBMS            |
|        |              |                |         | all:ASM:CRS:DBWLM:A |
|        |              |                |         | CFS:CRS:ASM:OS:INST |
|        |              |                |         | ALL:TNS:CHMOS       |
+--------+--------------+----------------+---------+---------------------+
| grac41 | Mar 04 08:56 | Collect traces | RUNNING | Collection details: |
|        |              | & zip          |         |                     |
|        |              |                |         | Zip file:           |
|        |              |                |         | tfa_Tue_Mar_4_08_56 |
|        |              |                |         | _25_CET_2014.zip    |
|        |              |                |         | Tag:                |
|        |              |                |         | collection_Tue_Mar_ |
|        |              |                |         | 4_08_56_25_CET_2014 |
|        |              |                |         | _node_all           |
+--------+--------------+----------------+---------+---------------------+
| grac43 | Mar 04 08:56 | Collect traces | RUNNING | Collection details: |
|        |              | & zip          |         |                     |
|        |              |                |         | Zip file:           |
|        |              |                |         | tfa_Tue_Mar_4_08_56 |
|        |              |                |         | _25_CET_2014.zip    |
|        |              |                |         | Tag:                |
|        |              |                |         | collection_Tue_Mar_ |
|        |              |                |         | 4_08_56_25_CET_2014 |
|        |              |                |         | _node_all           |
+--------+--------------+----------------+---------+---------------------+
| grac43 | Mar 04 08:56 | Run inventory  | RUNNING | RDBMS               |
|        |              |                |         | all:ASM:CRS:DBWLM:A |
|        |              |                |         | CFS:CRS:ASM:OS:INST |
|        |              |                |         | ALL:TNS:CHMOS       |
'--------+--------------+----------------+---------+---------------------'

Use tfactl to check for Errors and Startups

Check for errors:
# /u01/app/grid/tfa/bin/tfactl print errors 
++++++ Error Start +++++
Event Id          : GRACE232lppjmk62mrsu7si7ltbau98j
File Name         : /u01/app/oracle/diag/rdbms/grace2/GRACE21/trace/alert_GRACE21.log
Error Code        : ORA-1109
Error Description : ORA-1109 signalled during: ALTER DATABASE CLOSE NORMAL...
Error Time        : Mon Jul 22 12:37:06 CEST 2013
Startup Time          : Mon Jul 22 12:36:39 CEST 2013
Trace File Name   : NONE
++++++ Error End +++++
..
Check for database startups:
# /u01/app/grid/tfa/bin/tfactl print startups
++++++ Startup Start +++++
Event Id     : GRACE27rpp25v1c83sago9ohf1s6gn8s
File Name    : /u01/app/oracle/diag/rdbms/grace2/GRACE21/trace/alert_GRACE21.log
Startup Time : Mon Jul 22 12:32:34 CEST 2013
Dummy        : FALSE
++++++ Startup End +++++
++++++ Startup Start +++++
Event Id     : GRACE2nahm9evlmrtssl7etl36u7n25f
File Name    : /u01/app/oracle/diag/rdbms/grace2/GRACE21/trace/alert_GRACE21.log
Startup Time : Mon Jul 22 12:36:39 CEST 2013
Dummy        : FALSE
++++++ Startup End +++++

Collect tracefiles using TFA

# /u01/app/grid/tfa/bin/tfactl diagcollect -all -since 1h
Collecting data for all components using above parameters...
Running an inventory clusterwide ...
Run inventory completed locally ...
Collection name tfa_Thu_Sep_26_11_58_47_CEST_2013.zip
Sending diagcollect request to host : grac2
Sending diagcollect request to host : grac3
Getting list of files satisfying time range [Thu Sep 26 10:59:30 CEST 2013, Thu Sep 26 11:59:30 CEST 2013]
grac1: Zipping File: /u01/app/oracle/diag/rdbms/grace2/GRACE2_1/trace/GRACE2_1_dbrm_4375.trc
...
grac1: Zipping File: /u01/app/oracle/diag/rdbms/race2/RACE21/trace/alert_RACE21.log
Trimming file : /u01/app/oracle/diag/rdbms/race2/RACE21/trace/alert_RACE21.log with original file size : 255kB
Collecting extra files...
Total Number of Files checked : 2181
Total Size of all Files Checked : 1.7GB
Number of files containing required range : 41
Total Size of Files containing required range : 106MB
Number of files trimmed : 16
Total Size of data prior to zip : 43MB
Saved 77MB by trimming files
Zip file size : 1.9MB
Total time taken : 101s
Completed collection of zip files.

Logs are collected to:
/u01/app/grid/tfa/repository/collection_Thu_Sep_26_11_58_47_CEST_2013_node_all/grac1.tfa_Thu_Sep_26_11_58_47_CEST_2013.zip
/u01/app/grid/tfa/repository/collection_Thu_Sep_26_11_58_47_CEST_2013_node_all/grac2.tfa_Thu_Sep_26_11_58_47_CEST_2013.zip
/u01/app/grid/tfa/repository/collection_Thu_Sep_26_11_58_47_CEST_2013_node_all/grac3.tfa_Thu_Sep_26_11_58_47_CEST_2013.zip

Collect traces for a specific time range
# /u01/app/grid/tfa/bin/tfactl diagcollect -all -from "Oct/18/2013 00:00:00" -to "Oct/18/2013 06:00:00" 

Collecting data for all components using above parameters...
Scanning files from Oct/18/2013 00:00:00 to Oct/18/2013 06:00:00
Running an inventory clusterwide ...

Collecting diagnostic data for a specific day
# $GRID_HOME/bin/tfactl diagcollect -all -for "Mar/22/2014"
Collecting data for all components using above parameters...
Collecting data for all nodes
Scanning files for Mar/22/2014
Repository Location in grac41 : /u01/app/grid/tfa/repository
2014/03/22 14:03:35 CET : Running an inventory clusterwide ...
2014/03/22 14:03:36 CET : Collection Name : tfa_Sat_Mar_22_14_03_29_CET_2014.zip
2014/03/22 14:03:43 CET : Sending diagcollect request to host : grac42
2014/03/22 14:03:43 CET : Sending diagcollect request to host : grac43 
....
Logs are collected to:
/u01/app/grid/tfa/repository/collection_Sat_Mar_22_14_03_29_CET_2014_node_all/grac41.tfa_Sat_Mar_22_14_03_29_CET_2014.zip
/u01/app/grid/tfa/repository/collection_Sat_Mar_22_14_03_29_CET_2014_node_all/grac42.tfa_Sat_Mar_22_14_03_29_CET_2014.zip
/u01/app/grid/tfa/repository/collection_Sat_Mar_22_14_03_29_CET_2014_node_all/grac43.tfa_Sat_Mar_22_14_03_29_CET_2014.zip

 

Add and remove directory

  • TFA directories may change due to switching from admin-managed to policy-managed databases
  • or when you run add database / add instance
  • Keep your TFA directory configuration in sync with any such changes
Remove directories:
# /u01/app/grid/tfa/bin/tfactl directory remove  /u01/app/oracle/diag/rdbms/grace2/GRACE2_1/trace
# /u01/app/grid/tfa/bin/tfactl directory remove /u01/app/oracle/diag/rdbms/grace2/GRACE21/trace      

Add a directory:
# /u01/app/grid/tfa/bin/tfactl directory  add  /u01/app/oracle/diag/rdbms/grace2/grac21/trace

 

Add OSWatcher archive for TFA processing

Check the OSWatcher archive location:
$  ps -ef | grep OSWatcher  | grep -v grep
root     11018     1  0 14:51 ?        00:00:00 /bin/sh ./OSWatcher.sh 10 60 gzip /home/OSW/oswbb/archive
root     12028 11018  0 14:51 ?        00:00:00 /bin/sh ./OSWatcherFM.sh 60 /home/OSW/oswbb/archive
-->  /home/OSW/oswbb/archive is OSWatcher tracefile location

Run now on each instance:
# /u01/app/grid/tfa/bin/tfactl directory add /home/OSW/oswbb/archive

Add CHM directory

  • Note: you should not need to add the CHM directory as this directory should already be available
  • Check with  $GRID_HOME/tfa/bin/tfactl  print directories  | grep crf
First check whether a crf directory is already available
$  $GRID_HOME/tfa/bin/tfactl  print directories  | grep crf
| /u01/app/11204/grid/log/grac41/crf | CRS                                    | public     | root     |
If you get the above output, stop here as the directory is already listed

Find CHM repository directory
Note : ologgerd is only running on a single node - Read following link to find this node. 

$ ps -ef | grep ologgerd | grep -v grep
root     16996     1  0 01:06 ?        00:04:19 /u01/app/11203/grid/bin/ologgerd -m grac2 -r -d /u01/app/11203/grid/crf/db/grac1

Add directory to TFA
# /u01/app/grid/tfa//bin/tfactl directory add /u01/app/11203/grid/crf/db/grac1
Failed to add directory to TFA. Unable to determine parameters for directory: /u01/app/11203/grid/crf/db/grac1

Please enter component for this Directory [RDBMS|CRS|ASM|INSTALL|OS|CFGTOOLS] : OS
Running Inventory ...
.----------------------------------------------------------------------------------------------------------------------------.
| Trace Directory                                                   | Component                      | Permission | Added By |
+-------------------------------------------------------------------+--------------------------------+------------+----------+
| /etc/oracle                                                       | CRS                            | public     | root     |
+-------------------------------------------------------------------+--------------------------------+------------+----------+
| /home/OSW/oswbb/archive                                           | OS                             | public     | root     |
+-------------------------------------------------------------------+--------------------------------+------------+----------+
| /u01/app/11203/grid/cfgtoollogs/opatch                            | INSTALL                        | public     | root     |
+-------------------------------------------------------------------+--------------------------------+------------+----------+
| /u01/app/11203/grid/crf/db/grac1                                  | OS                             | public     | root     |
+-------------------------------------------------------------------+--------------------------------+------------+----------+

 

Verify TFA repository space

# /u01/app/grid/tfa/bin/tfactl print repository
.-----------------------------------------------------.
|                        grac1                        |
+----------------------+------------------------------+
| Repository Parameter | Value                        |
+----------------------+------------------------------+
| Location             | /u01/app/grid/tfa/repository |
| Maximum Size (MB)    | 2092                         |
| Current Size (MB)    | 156                          |
| Status               | OPEN                         |
'----------------------+------------------------------'

.-----------------------------------------------------.
|                        grac3                        |
+----------------------+------------------------------+
| Repository Parameter | Value                        |
+----------------------+------------------------------+
| Location             | /u01/app/grid/tfa/repository |
| Maximum Size (MB)    | 2273                         |
| Current Size (MB)    | 67                           |
| Status               | OPEN                         |
'----------------------+------------------------------'

.-----------------------------------------------------.
|                        grac2                        |
+----------------------+------------------------------+
| Repository Parameter | Value                        |
+----------------------+------------------------------+
| Location             | /u01/app/grid/tfa/repository |
| Maximum Size (MB)    | 1981                         |
| Current Size (MB)    | 60                           |
| Status               | OPEN                         |
'----------------------+------------------------------'

Setting the TFA trace level and reviewing the trace file

# /u01/app/grid/tfa/bin/tfactl set tracelevel=4 -c
Running on Host : grac1 
Running Check through Java CLI
Opening Port file /u01/app/grid/tfa/grac1/tfa_home/internal/port.txt
Opening Parameter file /u01/app/grid/tfa/grac1/tfa_home/tfa_setup.txt
We got : CheckOK
...
'-----------------------------------------+---------'
| Configuration Parameter                 | Value   |
+-----------------------------------------+---------+
| TFA version                             | 2.5.1.5 |
| Automatic diagnostic collection         | OFF     |
| Trimming of files during diagcollection | ON      |
| Repository current size (MB) in grac1   | 131     |
| Repository maximum size (MB) in grac1   | 2629    |
| Trace level                             | 4       |
'-----------------------------------------+---------'
#### Done ####

Tracefile name and location
# pwd
/u01/app/grid/tfa/grac3/tfa_home/log
#  ls -rlt
-rw-r--r-- 1 root root     3641 Oct 23 10:26 syserrorout
-rw-r--r-- 1 root root   134679 Oct 23 10:28 diagcollect.log
-rw-r--r-- 1 root root 25108979 Oct 23 10:32 tfa.10.22.2013-19.54.37.log

With level 4 tracing we see the following message for a tracefile added to the ZIP file
10.23.2013-10.26.30 -- MESSAGE : ZipTracesForDatesThread Transferring from /u01/app/oracle/diag/rdbms/grace2/grac23/trace/alert_grac23.log to /u01/app/grid/tfa/repository/temp_1382516788278/collections/alert_grac23.log

Uninstall TFA

# cd  /u01/app/grid/tfa/grac1/tfa_home/bin
# ./uninstalltfa.sh
Stopping TFA in grac1...
Shutting down TFA
oracle-tfa stop/waiting
Killing TFA running with pid 15860
Successfully shutdown TFA..
Stopping TFA in grac2 and removing /u01/app/grid/tfa/grac2/tfa_home...
Removing TFA from grac2...
Stopping TFA in grac2...
Shutting down TFA
oracle-tfa stop/waiting
Killing TFA running with pid 3251
Successfully shutdown TFA..
Deleting TFA support files on grac2:
Removing /etc/init.d/init.tfa...
Removing /u01/app/grid/tfa/bin...
Removing /u01/app/grid/tfa/grac2...
Stopping TFA in grac3 and removing /u01/app/grid/tfa/grac3/tfa_home...
Removing TFA from grac3...
Stopping TFA in grac3...
Shutting down TFA
oracle-tfa stop/waiting
Killing TFA running with pid 1615
Successfully shutdown TFA..
Deleting TFA support files on grac3:
Removing /etc/init.d/init.tfa...
Removing /u01/app/grid/tfa/bin...
Removing /u01/app/grid/tfa/grac3...
Deleting TFA support files on grac1:
Removing /etc/init.d/init.tfa...
Removing /u01/app/grid/tfa/bin...
Removing /u01/app/grid/tfa/grac1...

Upgrade TFA 2.5 to 3.1

  • If you install SupportBundle_v1_3 you will get TFA 3.1
  • 11.2.0.4 installs TFA 2.5.1.5 by default
  • To reinstall the TFA version bundled with 11.2.0.4 in an 11.2.0.4 GRID_HOME after a downgrade, run:
     /u01/app/grid/11.2.0.4/crs/install/tfa_setup.sh -silent -crshome /u01/app/grid/11.2.0.4
Extract zip file from SupportBundle_v1_3 and run 
# ./installTFALite
Starting TFA installation
TFA is already installed. Patching /u01/app/11204/grid/tfa/grac41/tfa_home...
TFA will be Patched on: 
grac41
grac42
grac43
Do you want to continue with patching TFA? [Y|N] [Y]: 
Checking for ssh equivalency in grac42
grac42 is configured for ssh user equivalency for root user
Checking for ssh equivalency in grac43
grac43 is configured for ssh user equivalency for root user
Auto patching is enabled in grac41
Key Stores are already updated in grac42
Key Stores are already updated in grac43
Shutting down TFA for Patching...
Shutting down TFA
oracle-tfa stop/waiting
. . . . . 
Killing TFA running with pid 18946
. . . 
Successfully shutdown TFA..
Renaming /u01/app/11204/grid/tfa/grac41/tfa_home/jar to /u01/app/11204/grid/tfa/grac41/tfa_home/jlib
Adding INSTALL_TYPE = GI to tfa_setup.txt
Copying /u01/app/11204/grid/tfa/grac41/tfa_home/output/ to /u01/app/grid/tfa/grac41/
The current version of Berkeley DB is 4.0.103
Copying je-4.1.27.jar to /u01/app/11204/grid/tfa/grac41/tfa_home/jlib/
Copying je-5.0.84.jar to /u01/app/11204/grid/tfa/grac41/tfa_home/jlib/
Running DbPreUpgrade_4_1 utility
Output of upgrade : Pre-upgrade succeeded
Creating ZIP: /u01/app/11204/grid/tfa/grac41/tfa_home/internal/tfapatch.zip
Running commands to fix init.tfa and tfactl in localhost
Starting TFA in grac41...
Starting TFA..
oracle-tfa start/running, process 9222
Waiting up to 100 seconds for TFA to be started..
. . . . . 
Successfully started TFA Process..
. . . . . 
TFA Started and listening for commands
Removing /u01/app/11204/grid/tfa/grac41/tfa_home/jlib/je-4.0.103.jar
Enabling Access for Non-root Users on grac41...
Adding default users and groups to TFA Access list...
Using SSH to patch TFA to remote nodes:

Applying Patch on grac42:
...

Applying Patch on grac43:
..

When running tfactl diagcollect as shown below, you will get the following Inventory Status message
# $GRID_HOME/tfa/bin/tfactl diagcollect -all -since 1h
# $GRID_HOME/tfa/bin/tfactl  print status
.-----------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port | Version | Build ID           | Inventory Status |
+--------+---------------+-------+------+---------+--------------------+------------------+
| grac42 | RUNNING       |  3951 | 5000 |     3.1 | 310020140205043544 | RUNNING          |
| grac41 | RUNNING       |  4201 | 5000 |     3.1 | 310020140205043544 | RUNNING          |
| grac43 | RUNNING       | 22284 | 5000 |     3.1 | 310020140205043544 | RUNNING          |
'--------+---------------+-------+------+---------+--------------------+------------------'
  • Note it will take some time until the Inventory Status changes to COMPLETE
  • You can monitor outstanding actions by running:   #  $GRID_HOME/tfa/bin/tfactl  print actions

Check TFA for integration of CHM and OSWatcher Output

Script check_tools.sh
#!/bin/sh
#
# OSWatcher location: /u01/app/11204/grid/oswbb -> search string oswbb
#     CHM   location: /u01/app/11204/grid/crf   -> search string crf
#
OSWatcher_loc_string="oswbb"
CHM_loc_string="crf"

echo  "-> OSWatcher status grac41:"
sudo ssh grac41 "ps -elf | grep OSWatcher  | grep -v grep"
echo  "-> OSWatcher status grac42:"
ssh grac42 "ps -elf | grep OSWatcher  | grep -v grep"
echo  "-> OSWatcher status grac43:"
ssh grac43 "ps -elf | grep OSWatcher  | grep -v grep"

echo "-> CHM Master/Replica info"
ssh grac41 "$GRID_HOME/bin/oclumon manage -get MASTER REPLICA"
echo "-> CHM state "
ssh grac41 "$GRID_HOME/bin/crsctl status res ora.crf -init | grep STATE"
ssh grac42 "$GRID_HOME/bin/crsctl status res ora.crf -init | grep STATE"
ssh grac43 "$GRID_HOME/bin/crsctl status res ora.crf -init | grep STATE"

echo "-> TFA print status and open actions  "
ssh grac41 $GRID_HOME/tfa/bin/tfactl print status
ssh grac41 $GRID_HOME/tfa/bin/tfactl print actions  
echo "-> TFA directories for OSWatcher/CHM  grac41"
echo "-> TFA directory search string: $OSWatcher_loc_string|$CHM_loc_string"
ssh grac41 $GRID_HOME/tfa/bin/tfactl  print directories  | egrep "$OSWatcher_loc_string|$CHM_loc_string"
echo "-> TFA directories for OSWatcher/CHM  grac42"
ssh grac42 $GRID_HOME/tfa/bin/tfactl  print directories  | egrep "$OSWatcher_loc_string|$CHM_loc_string"
echo "-> TFA directories for OSWatcher/CHM  grac43"
ssh grac43 $GRID_HOME/tfa/bin/tfactl  print directories  | egrep "$OSWatcher_loc_string|$CHM_loc_string"

Output from check_tools.sh script running on a 11.2.0.4 RAC system
[grid@grac41 ~]$ ./check_tools.sh
-> OSWatcher status grac41:
[sudo] password for grid: 
0 S root     15448     1  0  80   0 - 26670 wait   13:36 ?        00:00:00 /bin/sh ./OSWatcher.sh 48 10 gzip /u01/app/11204/grid/oswbb/archive
0 S root     16274 15448  0  80   0 - 26535 wait   13:36 ?        00:00:00 /bin/sh ./OSWatcherFM.sh 10 /u01/app/11204/grid/oswbb/archive
-> OSWatcher status grac42:
0 S root     15230     1  0  80   0 - 26669 wait   13:38 ?        00:00:00 /bin/sh ./OSWatcher.sh 48 10 gzip /u01/app/11204/grid/oswbb/archive
0 S root     15970 15230  0  80   0 - 26535 wait   13:38 ?        00:00:00 /bin/sh ./OSWatcherFM.sh 10 /u01/app/11204/grid/oswbb/archive
-> OSWatcher status grac43:
0 S root     21210     1  0  80   0 - 26668 wait   13:23 ?        00:00:01 /bin/sh ./OSWatcher.sh 48 10 gzip /u01/app/11204/grid/oswbb/archive
0 S root     21860 21210  0  80   0 - 26535 wait   13:24 ?        00:00:00 /bin/sh ./OSWatcherFM.sh 10 /u01/app/11204/grid/oswbb/archive
-> CHM Master/Replica info
Master = grac43
Replica = grac42
 Done 
-> CHM state 
STATE=ONLINE on grac41
STATE=ONLINE on grac42
STATE=ONLINE on grac43
-> TFA print status and open actions  

.-----------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID   | Port | Version | Build ID           | Inventory Status |
+--------+---------------+-------+------+---------+--------------------+------------------+
| grac41 | RUNNING       |  4201 | 5000 |     3.1 | 310020140205043544 | COMPLETE         |
| grac42 | RUNNING       |  3951 | 5000 |     3.1 | 310020140205043544 | COMPLETE         |
| grac43 | RUNNING       | 22284 | 5000 |     3.1 | 310020140205043544 | COMPLETE         |
'--------+---------------+-------+------+---------+--------------------+------------------'
.------------------------------------------.
| HOST | TIME | ACTION | STATUS | COMMENTS |
+------+------+--------+--------+----------+
'------+------+--------+--------+----------'

-> TFA directories for OSWatcher/CHM  grac41
-> TFA directory search string: oswbb|crf
| /u01/app/11204/grid/log/grac41/crf | CRS                                    | public     | root     |
| /u01/app/11204/grid/log/grac41/crf | CRS                                    | public     | root     |
| /u01/app/11204/grid/oswbb/archive  | OS                                     | private    | root     |
-> TFA directories for OSWatcher/CHM  grac42
| /u01/app/11204/grid/log/grac42/crf | CRS                  | public     | root     |
| /u01/app/11204/grid/log/grac42/crf | CRS                  | public     | root     |
| /u01/app/11204/grid/oswbb/archive  | OS                   | private    | root     |
-> TFA directories for OSWatcher/CHM  grac43
| /u01/app/11204/grid/log/grac43/crf | CRS                  | public     | root     |
| /u01/app/11204/grid/log/grac43/crf | CRS                  | public     | root     |
| /u01/app/11204/grid/oswbb/archive  | OS                   | private    | root     |

OSWatcher – Installation and Usage

Install OSWatcher

  • Download OSWatcher from OTN ( see Note Doc ID 301137.1 )
  • Untar the related TAR archive : tar xvf oswbb601.tar
  • Before starting, OSWatcher checks for a running process with the OSWatcher string in its full path
  • Don't install OSWatcher in a directory named OSWatcher and don't start the tool via a full PATH containing that string. Even a running gedit session ( $ gedit OSWatcher.dat ) will signal that OSWatcher is running and you can't start OSWatcher
  • In short :  ps -e | grep OSWatch  should not return any results before starting OSWatcher ( a start example follows the untar listing below )
# tar xvf oswbb601.tar 
oswbb/
oswbb/src/
oswbb/src/tombody.gif
oswbb/src/Thumbs.db
oswbb/src/missing_graphic.gif
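
Once the archive is extracted ( and ps -e | grep OSWatch returns nothing ) the collector can be started from the oswbb directory. A sketch, assuming a 60-second sample interval, 10 hours of archive retention, gzip compression and the archive location used later in this article:

# cd oswbb
# nohup ./startOSWbb.sh 60 10 gzip /home/OSW/oswbb/archive &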

 

Create file private.net for monitoring the cluster interconnect



Create the file private.net based on Exampleprivate.net - here is the Linux version 

#####################################################################
# This file contains examples of how to monitor private networks. To
# monitor your private networks create an executable file in this same
# directory named private.net. Use the example for your host os below.
# Make sure not to remove the last line in this file. Your file
# private.net MUST contain the rm lock.file line.
######################################################################
#Linux Example
######################################################################
echo "zzz ***"`date`
traceroute -r -F grac1int.example.com
traceroute -r -F grac2int.example.com
traceroute -r -F grac3int.example.com
######################################################################
rm locks/lock.file
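
As the header comment states, the file must be executable for OSWatcher to pick it up, e.g.:

$ chmod 744 private.net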

ORA-15040, ORA-15042 errors mounting a diskgroup

Check current status

Try manually mount diskgroup
ASMCMD> mount -a
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "1" 

Check ASM alert.log
SQL> alter diskgroup ACFS mount 
NOTE: cache registered group ACFS number=1 incarn=0x09884abc
NOTE: cache began mount (first) of group ACFS number=1 incarn=0x09884abc
NOTE: Assigning number (1,0) to disk (/dev/oracleasm/disks/ACFS_DATA)
Mon Aug 19 10:29:24 2013
NOTE: GMON heartbeating for grp 1
GMON querying group 1 at 334 for pid 31, osid 8882
NOTE: Assigning number (1,1) to disk ()
GMON querying group 1 at 335 for pid 31, osid 8882
NOTE: cache dismounting (clean) group 1/0x09884ABC (ACFS) 
NOTE: messaging CKPT to quiesce pins Unix process pid: 8882, image: oracle@grac1.example.com (TNS V1-V3)
NOTE: dbwr not being msg'd to dismount
NOTE: lgwr not being msg'd to dismount
NOTE: cache dismounted group 1/0x09884ABC (ACFS) 
NOTE: cache ending mount (fail) of group ACFS number=1 incarn=0x09884abc
NOTE: cache deleting context for group ACFS 1/0x09884abc
GMON dismounting group 1 at 336 for pid 31, osid 8882
NOTE: Disk  in mode 0x8 marked for de-assignment
NOTE: Disk  in mode 0x8 marked for de-assignment
ERROR: diskgroup ACFS was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing from group number "1" 
ERROR: alter diskgroup ACFS mount
Mon Aug 19 10:29:27 2013
ASM Health Checker found 1 new failures

Display diskgroups including dismounted ones
$  asmcmd lsdg --discovery
State       Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
DISMOUNTED          N           0   4096        0         0        0                0               0              0             N  ACFS/
MOUNTED     NORMAL  N         512   4096  1048576     15342     9932             5114            2409              0             N  DATA/
MOUNTED     NORMAL  N         512   4096  1048576      6141     5217             2047            1585              0             Y  OCR/
--> diskgroup ACFS is still dismounted  
Try to mount the ASM diskgroup with the force option and check the available disks.
As we have only a single disk available for NORMAL redundancy, the normal mount fails and we need to use the force option.  
SQL> alter diskgroup ACFS mount force;
Diskgroup altered.

Verify the disk status after mount force command. 
$ asmcmd lsdsk -p -G ACFS
Group_Num  Disk_Num      Incarn  Mount_Stat  Header_Stat  Mode_Stat  State   Path
        1         1  3915954601  MISSING     UNKNOWN      OFFLINE    NORMAL  
        1         0  3915954600  CACHED      MEMBER       ONLINE     NORMAL  /dev/oracleasm/disks/ACFS_DATA
$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  1048576      1019      122                0              61              1             N  ACFS/
MOUNTED  NORMAL  N         512   4096  1048576     15342     9932             5114            2409              0             N  DATA/
MOUNTED  NORMAL  N         512   4096  1048576      6141     5217             2047            1585              0             Y  OCR/
Try to read and verify ASM disk header ( on all instances )
# $GRID_HOME/bin/kfed read  /dev/sdj1
kfbh.endian:                          0 ; 0x000: 0x00
kfbh.hard:                            0 ; 0x001: 0x00
kfbh.type:                            0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt:                          0 ; 0x003: 0x00
kfbh.block.blk:                       0 ; 0x004: blk=0
kfbh.block.obj:                       0 ; 0x008: file=0
kfbh.check:                           0 ; 0x00c: 0x00000000
kfbh.fcn.base:                        0 ; 0x010: 0x00000000
kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000
kfbh.spare1:                          0 ; 0x018: 0x00000000
kfbh.spare2:                          0 ; 0x01c: 0x00000000
7FB74EC83400 00000000 00000000 00000000 00000000  [................]
        Repeat 63 times
7FB74EC83800 0000FF00 0003FBB8 000032FC 0003DA6D  [.........2..m...]
7FB74EC83810 0000FEF5 00000000 00000002 00000002  [................]
7FB74EC83940 00000000 00000000 00000000 01000000  [................]
7FB74EC83950 00000000 00000000 00000000 001C001C  [................]
7FB74EC83960 00000001 00000000 00000000 00000000  [................]
7FB74EC83970 00000000 00000004 00008196 00000000  [................]
7FB74EC83980 00000000 00000000 00000000 00000000  [................]
  Repeat 167 times
KFED-00322: file not found; arguments: [kfbtTraverseBlock] [Invalid OSM block type] [] [0]
--> This is not a valid ASM header - let's create a new ASM disk ( see the sketch below ):
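
The disk recreation itself is not shown above. A hedged sketch of one possible recovery path, assuming ASMLIB is in use, /dev/sdj1 is the partition that should back the lost disk, and using placeholder names ( ACFS_DISK2 for the new ASMLIB label, ACFS_0001 for the missing disk name as reported in V$ASM_DISK ):

# oracleasm createdisk ACFS_DISK2 /dev/sdj1
SQL> alter diskgroup ACFS add disk '/dev/oracleasm/disks/ACFS_DISK2';
SQL> alter diskgroup ACFS drop disk ACFS_0001 force;

After the rebalance completes the diskgroup should report no offline disks in asmcmd lsdg.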

Common cluvfy errors and warnings including first debugging steps

 PRIF-10, PRVG-1060, PRCT-1011  [ cluvfy  stage -pre crsinst ]

Current Configuration :

  • Your CRS stack doesn't come up and you want to verify your CRS stack
  • You are running  cluvfy  stage -pre crsinst  against an already installed CRS stack

ERROR       : PRVG-1060 : Failed to retrieve the network interface classification information from an existing CRS home at path "/u01/app/121/grid" on the local node
              PRCT-1011 : Failed to run "oifcfg". Detailed error: PRIF-10: failed to initialize the cluster registry
Command     :   cluvfy  stage -pre crsinst  against an already installed CRS stack
Workaround 1: Try to start clusterware in exclusive mode 
               # crsctl start crs -excl 
                 Oracle High Availability Services is online 
                 CRS-4692: Cluster Ready Services is online in exclusive mode 
                 CRS-4529: Cluster Synchronization Services is online 
                 CRS-4533: Event Manager is online 
                $ bin/cluvfy  stage -pre crsinst -n gract1 
               Note: if you can start the CRS stack in exclusive mode, cluvfy  stage -post crsinst should work too 
                 $  cluvfy  stage -post crsinst -n gract1 
Workaround 2: Needs to be used if you cannot start the CRS stack in exclusive mode  
               If you cannot start the CRS stack you may use the WA from  
                  Bug 17505999 : CVU CHECKS FOR ACTIVEVERSION WHEN CRS STACK IS NOT UP. 
                  # mv /etc/oraInst.loc /etc/oraInst.loc_sav 
                  # mv /etc/oracle  /etc/oracle_sav 
                 
                $ bin/cluvfy  -version 
                   12.1.0.1.0 Build 112713x8664 
                Now the command below should work and as said before always download the latest cluvfy version ! 
                 $  bin/cluvfy  stage -pre crsinst -n gract1 
                 .. Check for /dev/shm mounted as temporary file system passed 
                  Pre-check for cluster services setup was successful.
 Reference :    Bug 17505999 : CVU CHECKS FOR ACTIVEVERSION WHEN CRS STACK IS NOT UP.

PRVF-0002 : Could not retrieve local nodename

Command    : $ ./bin/cluvfy -h
Error      : PRVF-0002 : Could not retrieve local nodename
Root cause : Nameserver down, or host not yet known in DNS 
             $   nslookup grac41   returns error
               Server:        192.135.82.44
               Address:    192.135.82.44#53
               ** server can't find grac41: NXDOMAIN
Fix         : Restart or reconfigure DNS. nslookup should work in any case!

PRVG-1013 : The path “/u01/app/11203/grid” does not exist or cannot be created

Command    : cluvfy stage -pre nodeadd -n grac3 -verbose
Error      : PRVG-1013 : The path "/u01/app/11203/grid" does not exist or cannot be created on the nodes to be added
             Shared resources check for node addition failed:
Logfile    : Check cluvify log:  $GRID_HOME/cv/log/cvutrace.log.0
             [ 15025@grac1.example.com] [Worker 1] [ 2013-08-29 15:17:08.266 CEST ] [NativeSystem.isCmdScv:499]  isCmdScv: 
             cmd=[/usr/bin/ssh -o FallBackToRsh=no  -o PasswordAuthentication=no  -o StrictHostKeyChecking=yes  
             -o NumberOfPasswordPrompts=0  grac3 -n 
             /bin/sh -c "if [  -d /u01 -a -w /u01 ] ; then echo exists; fi"]
             ...
             [15025@grac1.example.com] [main] [ 2013-08-29 15:17:08.270 CEST ] [TaskNodeAddDelete.checkSharedPath:559]  
             PRVG-1013 : The path "/u01/app/11203/grid" does not exist or cannot be created on the nodes to be added
             [15025@grac1.example.com] [main] [ 2013-08-29 15:17:08.270 CEST ] [ResultSet.traceResultSet:359]
             Node Add/Delete ResultSet trace.
             Overall Status->VERIFICATION_FAILED
             grac3-->VERIFICATION_FAILED
Root cause:  the cluvfy command tries to check the /u01 directory for the write attribute and fails
             /bin/sh -c "if [  -d /u01 -a -w /u01 ] ; then echo exists; fi"
Code Fix     : drop the -w argument and we get the required output
              $  /bin/sh -c "if [  -d /u01 -a /u01 ] ; then echo exists; fi"
               exists
Related BUG:
             Bug 13241453 : LNX64-12.1-CVU: "CLUVFY STAGE -POST NODEADD" COMMAND REPORTS PRVG-1013 ERROR

PRVF-5229 : GNS VIP is active before Clusterware installation

Command    : $ ./bin/cluvfy comp gns -precrsinst -domain grid.example.com -vip 192.168.1.50 -verbose -n grac121
              Verifying GNS integrity 
              Checking GNS integrity...
              Checking if the GNS subdomain name is valid...
              The GNS subdomain name "grid.example.com" is a valid domain name
              Checking if the GNS VIP is a valid address...
              GNS VIP "192.168.1.50" resolves to a valid IP address
              Checking the status of GNS VIP...
Error       : Error PRVF-5229 : GNS VIP is active before Clusterware installation
              GNS integrity check passed
Fix         : If your clusterware is already installed and up and running ignore this error
              If this is a new install use an unused TCP/IP address for your GNS VIP ( note: ping should fail ! )

PRVF-4007 : User equivalence check failed for user “oracle”

Command   : $ ./bin/cluvfy stage -pre crsinst -n grac1 
Error     : PRVF-4007 : User equivalence check failed for user "oracle" 
Fix       : Run  sshUserSetup.sh            
            $ ./sshUserSetup.sh -user grid -hosts "grac1 grac2"  -noPromptPassphrase            
            Verify SSH connectivity:            
            $ /usr/bin/ssh -x -l grid  grac1 date
              Tue Jul 16 12:14:17 CEST 2013            
            $ /usr/bin/ssh -x -l grid  grac2 date
              Tue Jul 16 12:14:25 CEST 2013

PRVF-9992 : Group of device “/dev/oracleasm/disks/DATA1” did not match the expected group

Command    : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1 Checking consistency of device group across all nodes... 
Error      : PRVF-9992 : Group of device "/dev/oracleasm/disks/DATA1" did not match the expected group. [Expected = "dba"; Found = "{asmadmin=[grac1]}"] 
Root cause : Cluvfy doesn't know that grid user belongs to a different group 
Fix:       : Run cluvfy with -asmgrp asmadmin to provide correct group mappings: 
             $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1 -asmgrp asmadmin

PRVF-9802 : Attempt to get udev info from node “grac1” failed

 Command   : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1 
Error     : PRVF-9802 : Attempt to get udev info from node "grac1" failed
           UDev attributes check failed for ASM Disks
Bug       : Bug 12804811 : [11203-LIN64-110725] OUI PREREQUISITE CHECK FAILED IN OL6
Fix       : If using ASMLIB you can ignore currently this error
            If using UDEV you may read the following link. 

PRVF-7539 – User “grid” does not belong to group “dba”

Error       : PRVF-7539 - User "grid" does not belong to group "dba"
Command     : $  ./bin/cluvfy comp sys -p crs -n grac1
Fix         :  Add the grid owner to the dba group ( see the command sketch below )
Note        : ID 1505586.1 : CVU found following errors with Clusterware setup : User "grid" does not 
          belong to group "dba" [ID 1505586.1]
            : ID 316817.1] Cluster Verification Utility (CLUVFY) FAQ [ID 316817.1]
Bug         :  Bug 12422324 : LNX64-112-CMT: HIT PRVF-7539 : GROUP "DBA" DOES NOT EXIST ON OUDA NODE ( Fixed : 11.2.0.4 )
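
A sketch of the group change on Linux ( adjust user and group names to your installation ):

# usermod -a -G dba grid
# id grid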

PRVF-7617 : Node connectivity between “grac1 : 192.168.1.61” and “grac1 : 192.168.1.55” failed

Command     : $ ./bin/cluvfy comp nodecon -n grac1
Error       : PRVF-7617 : Node connectivity between "grac1 : 192.168.1.61" and "grac1 : 192.168.1.55" failed
Action 1    : Disable firewall / IP tables
             # service iptables stop 
             # chkconfig iptables off
             # iptables -F
             # service iptables status 
             If after a reboot the firewall is enabled again please read following post .              
Action 2    : Checking ssh connectivity 
              $ id
              uid=501(grid) gid=54321(oinstall) groups=54321(oinstall),504(asmadmin),506(asmdba),507(asmoper),54322(dba)
              $ ssh grac1 date 
                Sat Jul 27 13:42:19 CEST 2013
Fix         : Seems that we need to run cluvfy comp nodecon with at least 2 Nodes
              Working Command: $ ./bin/cluvfy comp nodecon -n grac1,grac2 
                -> Node connectivity check passed
              Failing Command: $ ./bin/cluvfy comp nodecon -n grac1
                -> Verification of node connectivity was unsuccessful. 
                   Checks did not pass for the following node(s):
               grac1 : 192.168.1.61
            :  Ignore this error if running with a single RAC Node  - Rerun later when both nodes are available 
            : Verify that ping is working with all involved IP addresses

Action 3    : 2 or more network interfaces are using the same network address
              Test your node connectivity by running:
              $ /u01/app/11203/grid/bin//cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose

              Interface information for node "grac32"
              Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU   
              ------ --------------- --------------- --------------- --------------- ----------------- ------
              eth0   10.0.2.15       10.0.2.0        0.0.0.0         10.0.2.2        08:00:27:88:32:F3 1500  
              eth1   192.168.1.122   192.168.1.0     0.0.0.0         10.0.2.2        08:00:27:EB:39:F1 1500  
              eth3   192.168.1.209   192.168.1.0     0.0.0.0         10.0.2.2        08:00:27:69:AE:D2 1500  

              Verifiy current settings via ifconfig
              eth1     Link encap:Ethernet  HWaddr 08:00:27:5A:61:E3  
                       inet addr:192.168.1.121  Bcast:192.168.1.255  Mask:255.255.255.0
              eth3     Link encap:Ethernet  HWaddr 08:00:27:69:AE:D2  
                       inet addr:192.168.1.209  Bcast:192.168.1.255  Mask:255.255.255.0

              --> Both eth1 and eth3 are using the same subnet 192.168.1.0 
Fix           : Set up your network devices and provide a different subnet like 192.168.3.0 for eth3 

Action 4      :Intermittent PRVF-7617 error with cluvfy 11.2.0.3 ( cluvfy Bug )     
               $  /u01/app/11203/grid/bin/cluvfy -version
               11.2.0.3.0 Build 090311x8664
               $ /u01/app/11203/grid/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
               --> Fails intermittent with following ERROR: 
               PRVF-7617 : Node connectivity between "grac31 : 192.168.1.121" and "grac33 : 192.168.1.220" failed

               $  /home/grid/cluvfy_121/bin/cluvfy -version
               12.1.0.1.0 Build 062813x8664
               $  /home/grid/cluvfy_121/bin/cluvfy comp nodecon -i eth1,eth2 -n grac31,grac32,grac33 -verbose
               --> Works for each run
    Fix      : Always use latest 12.1 cluvfy utility to test Node connectivity 

References:  
               PRVF-7617: TCP connectivity check failed for subnet (Doc ID 1335136.1)
               Bug 16176086 - SOLX64-12.1-CVU:CVU REPORT NODE CONNECTIVITY CHECK FAIL FOR NICS ON SAME NODE 
               Bug 17043435 : EM 12C: SPORADIC INTERRUPTION WITHIN RAC-DEPLOYMENT AT THE STEP INSTALL/CLONE OR

PRVG-1172 : The IP address “192.168.122.1” is on multiple interfaces “virbr0” on nodes “grac42,grac41”

Command    :  $ ./bin/cluvfy stage -pre crsinst -asm -presence local -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3,/dev/oracleasm/disks/DATA4 -n grac41,grac42   a
Error      :  PRVG-1172 : The IP address "192.168.122.1" is on multiple interfaces "virbr0,virbr0" on nodes "grac42,grac41"
Root cause :  There are multiple networks ( eth0,eth1,eth2,virbr0  ) defined
Fix        :  use cluvfy with  -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect -n grac41,grac42
Sample     :  $ ./bin/cluvfy stage -pre crsinst -asm -presence local -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3,/dev/oracleasm/disks/DATA4 -networks eth1:192.168.1.0:PUBLIC/eth2:192.168.2.0:cluster_interconnect -n grac41,grac42  ss

Cluvfy Warnings:

PRVG-1101 : SCAN name “grac4-scan.grid4.example.com” failed to resolve  ( PRVF-4664, PRVF-4657 )

Warning:      PRVG-1101 : SCAN name "grac4-scan.grid4.example.com" failed to resolve  
Cause:        An attempt to resolve specified SCAN name to a list of IP addresses failed because SCAN could not be resolved in DNS or GNS using 'nslookup'.
Action:       Verify your GNS/SCAN setup using ping, nslookup and cluvfy
              $  ping -c 1  grac4-scan.grid4.example.com
              PING grac4-scan.grid4.example.com (192.168.1.168) 56(84) bytes of data.
              64 bytes from 192.168.1.168: icmp_seq=1 ttl=64 time=0.021 ms
              --- grac4-scan.grid4.example.com ping statistics ---
              1 packets transmitted, 1 received, 0% packet loss, time 1ms
               rtt min/avg/max/mdev = 0.021/0.021/0.021/0.000 ms

              $  ping -c 1  grac4-scan.grid4.example.com
              PING grac4-scan.grid4.example.com (192.168.1.170) 56(84) bytes of data.
              64 bytes from 192.168.1.170: icmp_seq=1 ttl=64 time=0.031 ms 
              --- grac4-scan.grid4.example.com ping statistics ---
              1 packets transmitted, 1 received, 0% packet loss, time 2ms
              rtt min/avg/max/mdev = 0.031/0.031/0.031/0.000 ms

             $  ping -c 1  grac4-scan.grid4.example.com
             PING grac4-scan.grid4.example.com (192.168.1.165) 56(84) bytes of data.
             64 bytes from 192.168.1.165: icmp_seq=1 ttl=64 time=0.143 ms
             --- grac4-scan.grid4.example.com ping statistics ---
             1 packets transmitted, 1 received, 0% packet loss, time 0ms
             rtt min/avg/max/mdev = 0.143/0.143/0.143/0.000 ms

             $ nslookup grac4-scan.grid4.example.com
             Server:        192.168.1.50
             Address:    192.168.1.50#53
             Non-authoritative answer:
             Name:    grac4-scan.grid4.example.com
             Address: 192.168.1.168
             Name:    grac4-scan.grid4.example.com
             Address: 192.168.1.165
             Name:    grac4-scan.grid4.example.com
             Address: 192.168.1.170

            $ $GRID_HOME/bin/cluvfy comp scan
            Verifying scan 
            Checking Single Client Access Name (SCAN)...
            Checking TCP connectivity to SCAN Listeners...
            TCP connectivity to SCAN Listeners exists on all cluster nodes
            Checking name resolution setup for "grac4-scan.grid4.example.com"...
            Checking integrity of name service switch configuration file "/etc/nsswitch.conf" ...
            Check for integrity of name service switch configuration file "/etc/nsswitch.conf" passed
            Verification of SCAN VIP and Listener setup passed
            Verification of scan was successful. 

 Fix:       As nslookup, ping and cluvfy work as expected, you can ignore this warning   

Reference:  PRVF-4664 PRVF-4657: Found inconsistent name resolution entries for SCAN name (Doc ID 887471.1)

WARNING    : Could not find a suitable set of interfaces for the private interconnect

Root cause : public ( 192.168.1.60 ) and private interface ( 192.168.1.61 ) use the same network address
Fix        : provide a separate network address ( e.g. 192.168.2.xx ) for the private interconnect
                  After fix cluvfy reports : 
                  Interfaces found on subnet "192.168.1.0" that are likely candidates for VIP are:
                  grac1 eth0:192.168.1.60
                  Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
                  grac1 eth1:192.168.2.101

WARNING: Could not find a suitable set of interfaces for VIPs

WARNING: Could not find a suitable set of interfaces for VIPs
             Checking subnet mask consistency...
             Subnet mask consistency check passed for subnet "192.168.1.0".
             Subnet mask consistency check passed for subnet "192.168.2.0".
             Subnet mask consistency check passed.
Fix        : Ignore this warning 
Root Cause : Per BUG:4437727, cluvfy makes an incorrect assumption based on RFC 1918 that any IP address/subnet that 
            begins with any of the following octets is private and hence may not be fit for use as a VIP:
            172.16.x.x  through 172.31.x.x
            192.168.x.x
            10.x.x.x
            However, this assumption does not take into account that it is possible to use these IPs as public IPs on an
            internal network ( or intranet ). Therefore, it is very common to use IP addresses in these ranges as
            public IPs and as virtual IPs, and this is a supported configuration.
Reference:
Note:       CLUVFY Fails With Error: Could not find a suitable set of interfaces for VIPs or Private Interconnect [ID 338924.1]

PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option “-x”

Error        :PRVF-5436 : The NTP daemon running on one or more nodes lacks the slewing option "-x"
Solution     :Change  /etc/sysconfig/ntpd
               # OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid"
                to 
                OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
               Restart NTPD daemon
               [root@ract1 ~]#  service ntpd  restart

PRVF-5217 : An error occurred while trying to look up IP address for “grac1cl.grid2.example.com”

WARNING:    PRVF-5217 : An error occurred while trying to look up IP address for "grac1cl.grid2.example.com"
Action    : Verify with dig and nslookup that the VIP IP address resolves correctly:
            $  dig grac1cl-vip.grid2.example.com
             ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6 <<>> grac1cl-vip.grid2.example.com
             ;; global options: +cmd
             ;; Got answer:
             ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23546
             ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 1
             ;; QUESTION SECTION:
             ;grac1cl-vip.grid2.example.com.    IN    A
             ;; ANSWER SECTION:
             grac1cl-vip.grid2.example.com. 120 IN    A    192.168.1.121
             ;; AUTHORITY SECTION:
             grid2.example.com.    3600    IN    NS    ns1.example.com.
             grid2.example.com.    3600    IN    NS    gns2.grid2.example.com.
            ;; ADDITIONAL SECTION:
            ns1.example.com.    3600    IN    A    192.168.1.50
           ;; Query time: 12 msec
           ;; SERVER: 192.168.1.50#53(192.168.1.50)
           ;; WHEN: Mon Aug 12 09:39:24 2013
           ;; MSG SIZE  rcvd: 116
          $  nslookup grac1cl-vip.grid2.example.com
           Server:        192.168.1.50
           Address:    192.168.1.50#53
           Non-authoritative answer:
           Name:    grac1cl-vip.grid2.example.com
           Address: 192.168.1.121
Fix:      Ignore this warning.
          The DNS server on this system has stripped the authoritative flag. This results in an UnknownHostException
          being thrown when CVU calls InetAddress.getAllByName(..), which is why cluvfy returns a WARNING.
Reference: Bug 12826689 : PRVF-5217 FROM CVU WHEN VALIDATING GNS 

Running cluvfy comp dns -server fails silently – Cluvfy logs show PRCZ-2090 error


The command runcluvfy.sh comp dns -server ... just exits with SUCCESS, which is not what we expect. This command should start a local test DNS server and block until runcluvfy.sh comp dns -client -last is executed.

[grid@ractw21 linuxx64_12201_grid_home]$ runcluvfy.sh comp dns -server -domain grid122.example.com -vipaddress 192.168.1.59/255.255.255.0/enp0s8 -verbose -method root
Enter "ROOT" password:

Verifying Task DNS configuration check ...
Waiting for DNS client requests...
Verifying Task DNS configuration check ...PASSED

Verification of DNS Check was successful. 

CVU operation performed:      DNS Check
Date:                         Apr 11, 2017 3:23:56 PM
CVU home:                     /media/sf_kits/Oracle/122/linuxx64_12201_grid_home/
User:                         grid

Reviewing the CVU traces shows that the cluvfy command fails with error PRCZ-2090:
PRCZ-2090 : failed to create host key repository from file "/home/grid/.ssh/known_hosts" to establish SSH connection to node "ractw21"
[main] [ 2017-04-14 17:38:09.204 CEST ] [ExecCommandNoUserEqImpl.runCmd:374]  Final CompositeOperationException: PRCZ-2009 : Failed to execute command "/media/sf_kits/Oracle/122/linuxx64_12201_grid_home//cv/admin/odnsdlite" as root within 0 seconds on nodes "ractw21"

Fix: log in as user grid via ssh once to create the proper ssh environment ( ~/.ssh/known_hosts )
[grid@ractw21 linuxx64_12201_grid_home]$  ssh grid@ractw21.example.com
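A non-interactive alternative to populate the host key file ( a sketch using the standard OpenSSH tool
ssh-keyscan; host and user names are the ones from the error above ):
[grid@ractw21 linuxx64_12201_grid_home]$ ssh-keyscan -H ractw21 ractw21.example.com >> ~/.ssh/known_hosts
[grid@ractw21 linuxx64_12201_grid_home]$ chmod 600 ~/.ssh/known_hosts
[grid@ractw21 linuxx64_12201_grid_home]$ ssh grid@ractw21.example.com hostname
--> must no longer prompt to confirm the host key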

 

PRVF-5636 , PRVF-5637 : The DNS response time for an unreachable node exceeded “15000” ms

Problem 1: 
Command   : $ ./bin/cluvfy stage -pre crsinst -n grac1 -asm -asmdev /dev/oracleasm/disks/DATA1
Error     : PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: grac1
Root Cause: nslookup returns a wrong exit status
            # nslookup hugo.example.com
            Server:        192.168.1.50
            Address:    192.168.1.50#53
            ** server can't find hugo.example.com: NXDOMAIN
            #  echo $?
            1
            --> Note: the error "can't find hugo.example.com" is expected - but the non-zero exit status is not
 Note:      PRVF-5637 : DNS response time could not be checked on following nodes [ID 1480242.1]
 Bug :      Bug 16038314 : PRVF-5637 : DNS RESPONSE TIME COULD NOT BE CHECKED ON FOLLOWING NODES

 Problem 2:
 Version   : 12.1.0.2
 Command   : $GRID_HOME/addnode/addnode.sh -silent "CLUSTER_NEW_NODES={gract3}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={auto}" "CLUSTER_NEW_NODE_ROLES={hub}"
 Error     : SEVERE: [FATAL] [INS-13013] Target environment does not meet some mandatory requirements.
             FINE: [Task.perform:594]
             sTaskResolvConfIntegrity:Task resolv.conf Integrity[STASKRESOLVCONFINTEGRITY]:TASK_SUMMARY:FAILED:CRITICAL:VERIFICATION_FAILED
             PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: gract1,gract3
Verify     : Run ping/nslookup against the SCAN address for a long time to check node connectivity
             $ ping -v gract-scan.grid12c.example.com
             $ nslookup gract-scan.grid12c.example.com
             Note: you may need to run the above commands for quite a while until the error shows up ( see the loop sketch below )
Root Cause : The intermittent hangs of the above OS commands pointed to a firewall issue
Fix        : Disable the firewall
Reference  : PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes (Doc ID 1356975.1) 
             PRVF-5637 : DNS response time could not be checked on following nodes (Doc ID 1480242.1)
             Using the 11.2 workaround ( $ export IGNORE_PREADDNODE_CHECKS=Y ) did not help
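             A simple loop to surface the intermittent hang/failure ( a sketch - SCAN name and iteration
             count are just examples, timeout is part of coreutils ):
             $ for i in $(seq 1 200)
               do
                 timeout 5 nslookup gract-scan.grid12c.example.com > /dev/null || echo "run $i: lookup hung or failed"
               done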

PRVF-4037 : CRS is not installed on any of the nodes

Error     : PRVF-4037 : CRS is not installed on any of the nodes
            PRVF-5447 : Could not verify sharedness of Oracle Cluster Voting Disk configuration
Command   : $ cluvfy stage -pre crsinst -upgrade -n grac41,grac42,grac43 -rolling -src_crshome $GRID_HOME 
           -dest_crshome /u01/app/grid_new -dest_version 12.1.0.1.0  -fixup -fixupdir /tmp -verbose
Root Cause:  /u01/app/oraInventory/ContentsXML/inventory.xml was corrupted ( missing node_list for GRID HOME )
            <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11204/grid" TYPE="O" IDX="1" CRS="true"/>
            <HOME NAME="OraDb11g_home1" LOC="/u01/app/oracle/product/11204/racdb" TYPE="O" IDX="2">
              <NODE_LIST>
               <NODE NAME="grac41"/>
               <NODE NAME="grac42"/>
               <NODE NAME="grac43"/>
              ....
Fix: Correct entry in inventory.xml
            <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11204/grid" TYPE="O" IDX="1" CRS="true">
               <NODE_LIST>
                  <NODE NAME="grac41"/>
                  <NODE NAME="grac42"/>
                  <NODE NAME="grac43"/>
               </NODE_LIST>
               ...

Reference : CRS is not installed on any of the nodes (Doc ID 1316815.1)
            CRS is not installed on any of the nodes. Inventory.xml is changed even when no problem with TMP files. (Doc ID 1352648.1)

avahi-daemon is running

Cluvfy report : 
     Checking daemon "avahi-daemon" is not configured and running
     Daemon not configured check failed for process "avahi-daemon"
     Check failed on nodes: 
        ract2,ract1
     Daemon not running check failed for process "avahi-daemon"
     Check failed on nodes: 
        ract2,ract1

Verify whether the avahi-daemon is running
     $ ps -elf | grep avahi-daemon
     5 S avahi     4159     1  0  80   0 -  5838 poll_s Apr02 ?        00:00:00 avahi-daemon: running [ract1.local]
     1 S avahi     4160  4159  0  80   0 -  5806 unix_s Apr02 ?        00:00:00 avahi-daemon: chroot helper

Fix it ( run on all nodes ) :
      To shut it down, as root
      # /etc/init.d/avahi-daemon stop
      To disable it, as root:
      # /sbin/chkconfig  avahi-daemon off
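Verify on each node that the daemon is gone and stays off ( a short re-check sketch ):
       # /etc/init.d/avahi-daemon status
       # chkconfig --list avahi-daemon        <-- all runlevels should show off
       # ps -elf | grep [a]vahi-daemon        <-- should return no output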

Reference: 
    Cluster After Private Network Recovered if avahi Daemon is up and Running (Doc ID 1501093.1)

Reference data is not available for verifying prerequisites on this operating system distribution

Command    : ./bin/cluvfy stage -pre crsinst -upgrade -n gract3 -rolling -src_crshome $GRID_HOME 
                -dest_crshome /u01/app/12102/grid -dest_version 12.1.0.2.0 -verbose
Error      :  Reference data is not available for verifying prerequisites on this operating system distribution
              Verification cannot proceed
              Pre-check for cluster services setup was unsuccessful on all the nodes.
Root cause:  cluvfy runs rpm -qa | grep  release
             --> if this command fails, the above error is thrown
             Working Node 
             [root@gract1 log]# rpm -qa | grep  release
             oraclelinux-release-6Server-4.0.4.x86_64
             redhat-release-server-6Server-6.4.0.4.0.1.el6.x86_64
             oraclelinux-release-notes-6Server-9.x86_64
             Failing Node
             [root@gract1 log]#  rpm -qa | grep  release
             rpmdb: /var/lib/rpm/__db.003: No such file or directory
             error: db3 error(2) from dbenv->open: No such file or directory 
             ->  Due to space pressure /var/lib/rpm was partially deleted on a specific RAC node
 Fix        : Restore the RPM database from a remote RAC node or from a backup
             [root@gract1 lib]# pwd
             /var/lib
             [root@gract1 lib]#  scp -r gract3:/var/lib/rpm .
             Verify RPM database
             [root@gract1 log]#   rpm -qa | grep  release
             oraclelinux-release-6Server-4.0.4.x86_64
             redhat-release-server-6Server-6.4.0.4.0.1.el6.x86_64
             oraclelinux-release-notes-6Server-9.x86_64
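             If the copied database still reports db3/__db errors, rebuilding the RPM indexes usually helps
             ( a sketch using standard rpm options ):
             [root@gract1 lib]# rm -f /var/lib/rpm/__db.*
             [root@gract1 lib]# rpm --rebuilddb
             [root@gract1 lib]# rpm -qa | grep  release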
Related Notes:
             - Oracle Secure Enterprise Search 11.2.2.2 Installation Problem On RHEL 6 - [INS-75028] 
               Environment Does Not Meet Minimum Requirements: Unsupported OS Distribution (Doc ID 1568473.1)
             - RHEL6: 12c OUI INS-13001: CVU Fails: Reference data is not available for verifying prerequisites on 
               this operating system distribution (Doc ID 1567127.1)

Cluvfy Debug : PRVG-11049

Create a problem - Shutdown cluster Interconnect:
$ ifconfig eth1 down

Verify error with cluvfy
$ cluvfy comp nodecon -n all -i eth1
Verifying node connectivity 
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
ERROR: 
PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
...

Step 1 - check cvutrace.log.0 trace:
# grep PRVG /home/grid/cluvfy112/cv/log/cvutrace.log.0
[21684@grac1.example.com] [main] [ 2013-07-29 18:32:46.429 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1394]  Found Bad node(s): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"
          ERRORMSG(grac2): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"

Step 2: Create a script and set trace level:  SRVM_TRACE_LEVEL=2
rm -rf /tmp/cvutrace
mkdir /tmp/cvutrace
export CV_TRACELOC=/tmp/cvutrace
export SRVM_TRACE=true
export SRVM_TRACE_LEVEL=2
./bin/cluvfy comp nodecon -n all -i eth1 -verbose
ls /tmp/cvutrace

Run script and check cluvfy trace file:
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.125 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1367]  getting interface eth1 on node grac2
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.126 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1374]  Node: grac2 has no 'eth1' interfaces!
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.126 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1367]  getting interface eth1 on node grac1
[32478@grac1.example.com] [main] [ 2013-07-29 19:08:23.127 CEST ] [TaskNodeConnectivity.performSubnetExistanceCheck:1394]  Found Bad node(s): PRVG-11049 : Interface "eth1" does not exist on nodes "grac2"

Verify problem with ifconfig on grac2 ( eth1 is not up )
# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 08:00:27:8E:6D:24  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:413623 errors:0 dropped:0 overruns:0 frame:0
          TX packets:457739 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:226391378 (215.9 MiB)  TX bytes:300565159 (286.6 MiB)
          Interrupt:16 Base address:0xd240 
Fix : 
Restart eth1 and restart crs 
# ifconfig eth1 up
#  $GRID_HOME/bin/crsctl stop  crs -f
#  $GRID_HOME/bin/crsctl start  crs 
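A quick post-restart sanity check ( a sketch using standard crsctl/cluvfy calls ):
#  $GRID_HOME/bin/crsctl check crs
$  ./bin/cluvfy comp nodecon -n all -i eth1 -verbose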

Debug PRVF-9802

From the cluvfy log the following command is failing:
$  /tmp/CVU_12.1.0.1.0_grid/exectask.sh -getudevinfo oracleasm/disks/DATA1
<CV_ERR><SLOS_LOC>CVU00310</SLOS_LOC><SLOS_OP></SLOS_OP><SLOS_CAT>OTHEROS</SLOS_CAT><SLOS_OTHERINFO>No UDEV rule found for device(s) specified</SLOS_OTHERINFO></CV_ERR>
<CV_VRES>1</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG><CV_CMDLOG>
<CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo oracleasm/disks/DATA1 </CV_INITCMD>
<CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT> Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT>
<CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' 
| awk '{if ("oracleasm/disks/DATA1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>
--> No Output

Failing Command
$ /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' 
| awk '{if ("oracleasm/disks/DATA1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'
Diagnostics : cluvfy is scanning directory /etc/udev/rules.d/ for udev rules for device : oracleasm/disks/DATA1 - but couldn't find a rule for that device

Fix: setup udev rules.
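A minimal example of such a rule file ( a sketch modeled on the 99-oracle-asmdevices.rules output shown
in the Verify step below - device name, owner and group are assumptions for illustration ):
# cat /etc/udev/rules.d/99-oracle-asmdevices.rules
KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"
Reload the rules so the device node gets recreated:
# udevadm control --reload-rules
# udevadm trigger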

After fixing the udev rules the above command works fine and cluvfy doesn't complain anymore 
$ /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/'
kvm @ /etc/udev/rules.d/80-kvm.rules: KERNEL=="kvm", GROUP="kvm", MODE="0666"
fuse @ /etc/udev/rules.d/99-fuse.rules: KERNEL=="fuse", MODE="0666",OWNER="root",GROUP="root"
Verify: $ /tmp/CVU_12.1.0.1.0_grid/exectask.sh -getudevinfo  /dev/asmdisk1_udev_sdb1
<CV_VAL><USMDEV><USMDEV_LINE>/etc/udev/rules.d/99-oracle-asmdevices.rules KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"    
</USMDEV_LINE><USMDEV_NAME>sdb1</USMDEV_NAME><USMDEV_OWNER>grid</USMDEV_OWNER><USMDEV_GROUP>asmadmin</USMDEV_GROUP><USMDEV_PERMS>0660</USMDEV_PERMS></USMDEV></CV_VAL><CV_VRES>0</CV_VRES><CV_LOG>Exectask:getudevinfo success</CV_LOG><CV_CMDLOG><CV_INITCMD>/tmp/CVU_12.1.0.1.0_grid/exectask -getudevinfo /dev/asmdisk1_udev_sdb1 </CV_INITCMD><CV_CMD>popen /etc/udev/udev.conf</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/permissions.d</CV_CMD><CV_CMDOUT></CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>opendir /etc/udev/rules.d</CV_CMD><CV_CMDOUT> Reading: /etc/udev/rules.d</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT><CV_CMD>popen /bin/grep KERNEL== /etc/udev/rules.d/*.rules | grep GROUP | grep MODE | sed -e '/^#/d' -e 's/\*/.*/g' -e 's/\(.*\)KERNEL=="\([^\"]*\)\(.*\)/\2 @ \1 KERNEL=="\2\3/' | awk '{if ("/dev/asmdisk1_udev_sdb1" ~ $1 ) print $3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' | sed -e 's/://' -e 's/\.\*/\*/g'</CV_CMD><CV_CMDOUT> /etc/udev/rules.d/99-oracle-asmdevices.rules KERNEL=="sdb1", NAME="asmdisk1_udev_sdb1", OWNER="grid", GROUP="asmadmin", MODE="0660"    
</CV_CMDOUT><CV_CMDSTAT>0</CV_CMDSTAT></CV_CMDLOG><CV_ERES>0</CV_ERES>

Debug and Fix  PRVG-13606 Error

Reference:

Installing Oracle RAC 11.2.0.3, OEL 6.3 and Virtualbox 4.2 with GNS

Linux, Virtualbox Installation

Check the following link for Linux/VirtualBox installation details: http://www.oracle-base.com/articles/11g/oracle-db-11gr2-rac-installation-on-oracle-linux-6-using-virtualbox.php

  • Install Virtualbox Guest Additions
  • Install package : # yum install oracle-rdbms-server-11gR2-preinstall
  • Update the installation: : # yum update
  • Install Wireshark:  # yum install wireshark     # yum install wireshark-gnome
  • Install ASMlib
  • Install cluvfy as user grid – download here and extract files under user grid
  • Extract grid software to folder grid and  install rpm from  folder:  grid/rpm 
# cd /media/sf_kits/Oracle/11.2.0.4/grid/rpm
# rpm -iv cvuqdisk-1.0.9-1.rpm
Preparing packages for installation...
Using default group oinstall to install package
cvuqdisk-1.0.9-1
  • Verify the current OS status by running : $ ./bin/cluvfy stage -pre crsinst -n grac41

 

Check OS setting

Install X11 applications like xclock
# yum install xorg-x11-apps

Turn off and disable the firewall IPTables and disable SELinux
# service iptables stop
iptables: Flushing firewall rules:                         [  OK  ]
iptables: Setting chains to policy ACCEPT: filter          [  OK  ]
iptables: Unloading modules:                               [  OK  ]
# chkconfig iptables off
# chkconfig --list iptables
iptables        0:off   1:off   2:off   3:off   4:off   5:off   6:off

Disable SELinux. Open the config file and change the SELINUX variable from enforcing to disabled.
# vim /etc/selinux/config
 # This file controls the state of SELinux on the system.
 # SELINUX= can take one of these three values:
 #     enforcing - SELinux security policy is enforced.
 #     permissive - SELinux prints warnings instead of enforcing.
 #     disabled - No SELinux policy is loaded.
 SELINUX=disabled

DNS Setup including BIND, NTP, DHCP in a LAN   on a separate VirtualBox VM  

Even if you are using DNS, Oracle recommends listing the public IP, VIP and private addresses for each node in the hosts file on each node.

Domain:         example.com       Name Server: ns1.example.com            192.168.1.50
RAC Sub-Domain: grid.example.com  Name Server: gns.example.com            192.168.1.55
DHCP Server:    ns1.example.com
NTP  Server:    ns1.example.com
DHCP addresses: 192.168.1.100 ... 192.168.1.254

Configure DNS:
Identity     Home Node    Host Node                          Given Name                      Type        Address        Address Assigned By     Resolved By
GNS VIP        None        Selected by Oracle Clusterware    gns.example.com                 Virtual     192.168.1.55   Net administrator       DNS + GNS
Node 1 Public  Node 1      grac1                             grac1.example.com               Public      192.168.1.61   Fixed                   DNS
Node 1 VIP     Node 1      Selected by Oracle Clusterware    grac1-vip.grid.example.com      Private     Dynamic        DHCP                    GNS
Node 1 Private Node 1      grac1int                          grac1int.example.com            Private     192.168.2.71   Fixed                   DNS
Node 2 Public  Node 2      grac2                             grac2.example.com               Public      192.168.1.62   Fixed                   DNS
Node 2 VIP     Node 2      Selected by Oracle Clusterware    grac2-vip.grid.example.com      Private     Dynamic        DHCP                    GNS
Node 2 Private Node 2      grac2int                          grac2int.example.com            Private     192.168.2.72   Fixed                   DNS
SCAN VIP 1     none        Selected by Oracle Clusterware    GRACE2-scan.grid.example.com    Virtual     Dynamic        DHCP                    GNS
SCAN VIP 2     none        Selected by Oracle Clusterware    GRACE2-scan.grid.example.com    Virtual     Dynamic        DHCP                    GNS
SCAN VIP 3     none        Selected by Oracle Clusterware    GRACE2-scan.grid.example.com    Virtual     Dynamic        DHCP                    GNS

 

Note: the cluster node VIPs and SCANs are obtained via DHCP and if GNS is up all DHCP addresses should be found with nslookup. If you have problems with zone delegation add your GNS name server to /etc/resolv.conf

Install BIND – Make sure the following rpms are installed:

dhcp-common-4.1.1-34.P1.0.1.el6.x86_64
bind-9.8.2-0.17.rc1.0.2.el6_4.4.x86_64.rpm
bind-libs-9.8.2-0.17.rc1.0.2.el6_4.4.x86_64.rpm
bind-utils-9.8.2-0.17.rc1.0.2.el6_4.4.x86_64.rpm

Install the BIND packages:

# rpm -Uvh bind-9.8.2-0.17.rc1.0.2.el6_4.4.x86_64.rpm bind-libs-9.8.2-0.17.rc1.0.2.el6_4.4.x86_64.rpm bind-utils-9.8.2-0.17.rc1.0.2.el6_4.4.x86_64.rpm

For a detailed description of using zone delegations check the following link:

Configure DNS:

-> named.conf
options {
    listen-on port 53 {  192.168.1.50; 127.0.0.1; };
    directory     "/var/named";
    dump-file     "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
    allow-query     {  any; };
    allow-recursion     {  any; };
    recursion yes;
    dnssec-enable no;
    dnssec-validation no;

};
logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};
zone "." IN {
    type hint;
    file "named.ca";
};
zone    "1.168.192.in-addr.arpa" IN { // Reverse zone
    type master;
    file "192.168.1.db";
        allow-transfer { any; };
    allow-update { none; };
};
zone    "2.168.192.in-addr.arpa" IN { // Reverse zone
    type master;
    file "192.168.2.db";
        allow-transfer { any; };
    allow-update { none; };
};
zone  "example.com" IN {
      type master;
       notify no;
       file "example.com.db";
};

-> Forward zone: example.com.db 
;
; see http://www.zytrax.com/books/dns/ch9/delegate.html 
; 
$TTL 1H         ; Time to live
$ORIGIN example.com.
@       IN      SOA     ns1.example.com.  hostmaster.example.com.  (
                        2009011202      ; serial (todays date + todays serial #)
                        3H              ; refresh 3 hours
                        1H              ; retry 1 hour
                        1W              ; expire 1 week
                        1D )            ; minimum 24 hour
;
        IN          A         192.168.1.50
        IN          NS        ns1.example.com. ; name server for example.com
ns1     IN          A         192.168.1.50
grac1   IN          A         192.168.1.61
grac2   IN          A         192.168.1.62
grac3   IN          A         192.168.1.63
;
$ORIGIN grid.example.com.
@       IN          NS        gns.grid.example.com. ; NS  grid.example.com
        IN          NS        ns1.example.com.      ; NS example.com
gns     IN          A         192.168.1.55 ; glue record

-> Reverse zone:  192.168.1.db 
$TTL 1H
@       IN      SOA     ns1.example.com.  hostmaster.example.com.  (
                        2009011201      ; serial (todays date + todays serial #)
                        3H              ; refresh 3 hours
                        1H              ; retry 1 hour
                        1W              ; expire 1 week
                        1D )            ; minimum 24 hour
; 
              NS        ns1.example.com.
              NS        gns.grid.example.com.
50            PTR       ns1.example.com.
55            PTR       gns.grid.example.com. ; reverse mapping for GNS
61            PTR       grac1.example.com. ; reverse mapping for GNS
62            PTR       grac2.example.com. ; reverse mapping for GNS
63            PTR       grac3.example.com. ; reverse mapping for GNS

-> Reverse zone:  192.168.2.db 
$TTL 1H
@       IN      SOA     ns1.example.com. hostmaster.example.com.  (
                        2009011201      ; serial (todays date + todays serial #)
                        3H              ; refresh 3 hours
                        1H              ; retry 1 hour
                        1W              ; expire 1 week
                        1D )            ; minimum 24 hour
; 
             NS        ns1.example.com.
71           PTR       grac1int.example.com. ; reverse mapping for GNS
72           PTR       grac2int.example.com. ; reverse mapping for GNS
73           PTR       grac3int.example.com. ; reverse mapping for GNS

->/etc/resolv.conf
# Generated by NetworkManager
search example.com
nameserver 192.168.1.50
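Before starting named, the configuration and zone files can be syntax-checked ( a sketch - named-checkconf
and named-checkzone ship with the bind package, file names as used above ):
# named-checkconf /etc/named.conf
# named-checkzone example.com /var/named/example.com.db
# named-checkzone 1.168.192.in-addr.arpa /var/named/192.168.1.db
# named-checkzone 2.168.192.in-addr.arpa /var/named/192.168.2.db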

Verify DNS ( Note: the commands were executed with a running GNS - i.e. GRID was already installed )
Check the current GNS status
#   /u01/app/11203/grid/bin/srvctl config gns -a -l
GNS is enabled.
GNS is listening for DNS server requests on port 53
GNS is using port 5353 to connect to mDNS
GNS status: OK
Domain served by GNS: grid3.example.com
GNS version: 11.2.0.3.0
GNS VIP network: ora.net1.network
Name            Type Value
grac3-scan      A    192.168.1.220
grac3-scan      A    192.168.1.221
grac3-scan      A    192.168.1.222
grac3-scan1-vip A    192.168.1.220
grac3-scan2-vip A    192.168.1.221
grac3-scan3-vip A    192.168.1.222
grac31-vip      A    192.168.1.219
grac32-vip      A    192.168.1.224
grac33-vip      A    192.168.1.226


$ nslookup grac1.example.com
Name:    grac1.example.com
Address: 192.168.1.61
$ nslookup grac1.example.com
Name:    grac1.example.com
Address: 192.168.1.61
$ nslookup grac1.example.com
Name:    grac1.example.com
Address: 192.168.1.61
$ nslookup grac1int.example.com
Name:    grac1int.example.com
Address: 192.168.2.71
$ nslookup grac1int.example.com
Name:    grac1int.example.com
Address: 192.168.2.71
$ nslookup grac1int.example.com
Name:    grac1int.example.com
Address: 192.168.2.71
$ nslookup 192.168.2.71
71.2.168.192.in-addr.arpa    name = grac1int.example.com.
$ nslookup 192.168.2.72
72.2.168.192.in-addr.arpa    name = grac2int.example.com.
$ nslookup 192.168.2.73
73.2.168.192.in-addr.arpa    name = grac3int.example.com.
$ nslookup 192.168.1.61
61.1.168.192.in-addr.arpa    name = grac1.example.com.
$ nslookup 192.168.1.62
62.1.168.192.in-addr.arpa    name = grac2.example.com.
$ nslookup 192.168.1.63
63.1.168.192.in-addr.arpa    name = grac3.example.com.
$ nslookup grac1-vip.grid.example.com
Non-authoritative answer:
Name:    grac1-vip.grid.example.com
Address: 192.168.1.107
$ nslookup grac2-vip.grid.example.com
Non-authoritative answer:
Name:    grac2-vip.grid.example.com
Address: 192.168.1.112
$ nslookup GRACE2-scan.grid.example.com
Non-authoritative answer:
Name:    GRACE2-scan.grid.example.com
Address: 192.168.1.108
Name:    GRACE2-scan.grid.example.com
Address: 192.168.1.110
Name:    GRACE2-scan.grid.example.com
Address: 192.168.1.109

Use dig against DNS name server - DNS name server should use Zone Delegation
$ dig @192.168.1.50 GRACE2-scan.grid.example.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6 <<>> @192.168.1.50 GRACE2-scan.grid.example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64626
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 1
;; QUESTION SECTION:
;GRACE2-scan.grid.example.com.    IN    A
;; ANSWER SECTION:
GRACE2-scan.grid.example.com. 1    IN    A    192.168.1.108
GRACE2-scan.grid.example.com. 1    IN    A    192.168.1.109
GRACE2-scan.grid.example.com. 1    IN    A    192.168.1.110
;; AUTHORITY SECTION:
grid.example.com.    3600    IN    NS    ns1.example.com.
grid.example.com.    3600    IN    NS    gns.grid.example.com.
;; ADDITIONAL SECTION:
ns1.example.com.    3600    IN    A    192.168.1.50
;; Query time: 0 msec
;; SERVER: 192.168.1.50#53(192.168.1.50)
;; WHEN: Sun Jul 28 13:50:26 2013
;; MSG SIZE  rcvd: 146

Use dig against GNS name server 
$ dig @192.168.1.55 GRACE2-scan.grid.example.com
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6 <<>> @192.168.1.55 GRACE2-scan.grid.example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 32138
;; flags: qr aa; QUERY: 1, ANSWER: 3, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;GRACE2-scan.grid.example.com.    IN    A
;; ANSWER SECTION:
GRACE2-scan.grid.example.com. 120 IN    A    192.168.1.108
GRACE2-scan.grid.example.com. 120 IN    A    192.168.1.109
GRACE2-scan.grid.example.com. 120 IN    A    192.168.1.110
;; AUTHORITY SECTION:
grid.example.com.    10800    IN    SOA    GRACE2-gns-vip.grid.example.com. GRACE2-gns-vip.grid.example.com. 3173463 10800 10800 30 120
;; ADDITIONAL SECTION:
GRACE2-gns-vip.grid.example.com. 10800 IN A    192.168.1.55
;; Query time: 15 msec
;; SERVER: 192.168.1.55#53(192.168.1.55)
;; WHEN: Sun Jul 28 13:50:26 2013
;; MSG SIZE  rcvd: 161

Start the DNS server

# service named restart

Starting named:                                            [  OK  ]

Ensure the DNS service restarts on reboot

# chkconfig named on

# chkconfig --list named

named              0:off    1:off    2:on    3:on    4:on    5:on    6:off

Display all records for zone example.com with dig 

 

$ dig example.com AXFR
$ dig @192.168.1.55  AXFR
$ dig GRACE2-scan.grid.example.com

 

Configure DHCP server 

  • dhclient recreates /etc/resolv.conf. Run $ service network restart after testing dhclient to get a consistent /etc/resolv.conf on all cluster nodes

 

Verify that you don't use any DHCP server from a bridged network
- Note: if VirtualBox bridged network devices use the same network address as our local router,
  the VirtualBox DHCP server may be used ( of course you can disable it ):
  M:\VM> vboxmanage list bridgedifs
   Name:            Realtek PCIe GBE Family Controller
   GUID:            7e0af9ff-ea37-4e63-b2e5-5128c60ab300
   DHCP:            Enabled
   IPAddress:       192.168.1.4
   NetworkMask:     255.255.255.0

M:\VM\GRAC_OEL64_11203>ipconfig
   Windows IP Configuration
   Ethernet adapter LAN connection:
   Connection-specific DNS Suffix  . : speedport.ip
   Link-local IPv6 Address . . . . . : fe80::c52f:f681:bb0b:c358%11
   IPv4 Address  . . . . . . . . . . : 192.168.1.4
   Subnet Mask . . . . . . . . . . . : 255.255.255.0
   Default Gateway . . . . . . . . . : 192.168.1.1

Solution:  Use Internal Network devices instead of Bridged Network devices for the VirtualBox network setup


-> /etc/sysconfig/dhcpd
Command line options here
 DHCPDARGS="eth0"

-> /etc/dhcp/dhcpd.conf ( don't use domain-name as this would rewrite resolv.conf )
 ddns-update-style interim;
 ignore client-updates;
 subnet 192.168.1.0 netmask 255.255.255.0 {
 option routers                  192.168.1.1;                    # Default gateway to be used by DHCP clients
 option subnet-mask              255.255.255.0;                  # Default subnet mask to be used by DHCP clients.
 option ip-forwarding            off;                            # Do not forward DHCP requests.
 option broadcast-address        192.168.1.255;                  # Default broadcast address to be used by DHCP client.
#  option domain-name              "grid.example.com"; 
 option domain-name-servers      192.168.1.50;                   # IP address of the DNS server. In this document it will be oralab1
 option time-offset              -19000;                           # Central Standard Time
 option ntp-servers              0.pool.ntp.org;                   # Default NTP server to be used by DHCP clients
 range                           192.168.1.100 192.168.1.254;    # Range of IP addresses that can be issued to DHCP client
 default-lease-time              21600;                            # Amount of time in seconds that a client may keep the IP address
 max-lease-time                  43200;
 }
 # service dhcpd restart
 # chkconfig dhcpd on

Test on all cluster instances:
 # dhclient eth0
 Check /var/log/messages
 #  tail -f /var/log/messages
 Jul  8 12:46:09 gns dhclient[3909]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7 (xid=0x6fb12d80)
 Jul  8 12:46:09 gns dhcpd: DHCPDISCOVER from 08:00:27:e6:71:54 via eth0
 Jul  8 12:46:10 gns dhcpd: 0.pool.ntp.org: temporary name server failure
 Jul  8 12:46:10 gns dhcpd: DHCPOFFER on 192.168.1.100 to 08:00:27:e6:71:54 via eth0
 Jul  8 12:46:10 gns dhclient[3909]: DHCPOFFER from 192.168.1.50
 Jul  8 12:46:10 gns dhclient[3909]: DHCPREQUEST on eth0 to 255.255.255.255 port 67 (xid=0x6fb12d80)
 Jul  8 12:46:10 gns dhcpd: DHCPREQUEST for 192.168.1.100 (192.168.1.50) from 08:00:27:e6:71:54 via eth0
 Jul  8 12:46:10 gns dhcpd: DHCPACK on 192.168.1.100 to 08:00:27:e6:71:54 via eth0
 Jul  8 12:46:10 gns dhclient[3909]: DHCPACK from 192.168.1.50 (xid=0x6fb12d80)
 Jul  8 12:46:12 gns avahi-daemon[1407]: Registering new address record for 192.168.1.100 on eth0.IPv4.
 Jul  8 12:46:12 gns NET[3962]: /sbin/dhclient-script : updated /etc/resolv.conf
 Jul  8 12:46:12 gns dhclient[3909]: bound to 192.168.1.100 -- renewal in 9071 seconds.
 Jul  8 12:46:13 gns ntpd[2051]: Listening on interface #6 eth0, 192.168.1.100#123 Enabled
  • Verify that the right DHCP server is in use ( at least check the bound and renewal values )

NTP Setup – Server: gns.example.com

# cat /etc/ntp.conf
 restrict default nomodify notrap noquery
 restrict 127.0.0.1
 # -- CLIENT NETWORK -------
 restrict 192.168.1.0 mask 255.255.255.0 nomodify notrap
 # --- OUR TIMESERVERS -----  can't reach NTP servers - build my own server
 #server 0.pool.ntp.org iburst
 #server 1.pool.ntp.org iburst
 server 127.127.1.0
 # --- NTP MULTICASTCLIENT ---
 # --- GENERAL CONFIGURATION ---
 # Undisciplined Local Clock.
 fudge   127.127.1.0 stratum 9
 # Drift file.
 driftfile /var/lib/ntp/drift
 broadcastdelay  0.008
 # Keys file.
 keys /etc/ntp/keys
 # chkconfig ntpd on
 # ntpq -p
 remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 *LOCAL(0)        .LOCL.           9 l   11   64  377    0.000    0.000   0.000

NTP Setup - Clients: grac1.example.com, grac2.example.com,  ...
 # cat /etc/ntp.conf
 restrict default nomodify notrap noquery
 restrict 127.0.0.1
 # -- CLIENT NETWORK -------
 # --- OUR TIMESERVERS -----
 # 192.168.1.2 is the address for my timeserver,
 # use the address of your own, instead:
 server 192.168.1.50
 server  127.127.1.0
 # --- NTP MULTICASTCLIENT ---
 # --- GENERAL CONFIGURATION ---
 # Undisciplined Local Clock.
 fudge   127.127.1.0 stratum 12
 # Drift file.
 driftfile /var/lib/ntp/drift
 broadcastdelay  0.008
 # Keys file.
 keys /etc/ntp/keys
 # ntpq -p
 remote           refid      st t when poll reach   delay   offset  jitter
 ==============================================================================
 gns.example.com LOCAL(0)        10 u   22   64    1    2.065  -11.015   0.000
 LOCAL(0)        .LOCL.          12 l   21   64    1    0.000    0.000   0.000
 Verify setup with cluvfy :

Add to  our /etc/rc.local
#
service ntpd stop
ntpdate -u 192.168.1.50 
service ntpd start

 

Verify GNS setup with cluvfy:

$ ./bin/cluvfy comp gns -precrsinst -domain grid.example.com -vip 192.168.2.100 -verbose -n grac1,grac2
 Verifying GNS integrity
 Checking GNS integrity...
 Checking if the GNS subdomain name is valid...
 The GNS subdomain name "grid.example.com" is a valid domain name
 Checking if the GNS VIP is a valid address...
 GNS VIP "192.168.2.100" resolves to a valid IP address
 Checking the status of GNS VIP...
 GNS integrity check passed
 Verification of GNS integrity was successful.

 

Setup User Accounts

NOTE: Oracle recommends different users for the installation of the Grid Infrastructure (GI) and the Oracle RDBMS home. The GI will be installed in a separate Oracle base, owned by user ‘grid.’ After the grid install the GI home will be owned by root, and inaccessible to unauthorized users.

Create OS groups using the command below. Enter these commands as the 'root' user:
  #/usr/sbin/groupadd -g 501 oinstall
  #/usr/sbin/groupadd -g 502 dba
  #/usr/sbin/groupadd -g 504 asmadmin
  #/usr/sbin/groupadd -g 506 asmdba
  #/usr/sbin/groupadd -g 507 asmoper

Create the users that will own the Oracle software using the commands:
  #/usr/sbin/useradd -u 501 -g oinstall -G asmadmin,asmdba,asmoper grid
  #/usr/sbin/useradd -u 502 -g oinstall -G dba,asmdba oracle
  $ id
  uid=501(grid) gid=54321(oinstall) groups=54321(oinstall),504(asmadmin),506(asmdba),507(asmoper)
  $ id
  uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),501(vboxsf),506(asmdba),54322(dba)

For the C shell (csh or tcsh), add the following lines to the /etc/csh.login file:
  if ( $USER = "oracle" || $USER = "grid" ) then
  limit maxproc 16384
  limit descriptors 65536
  endif
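For Bourne/bash type shells the analogous lines go into /etc/profile ( a sketch following the standard
preinstall documentation ):
  if [ $USER = "oracle" ] || [ $USER = "grid" ]; then
     if [ $SHELL = "/bin/ksh" ]; then
        ulimit -p 16384
        ulimit -n 65536
     else
        ulimit -u 16384 -n 65536
     fi
  fi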

Modify  /etc/security/limits.conf
  # oracle-rdbms-server-11gR2-preinstall setting for nofile soft limit is 1024
  oracle   soft   nofile    1024
  grid   soft   nofile    1024
  # oracle-rdbms-server-11gR2-preinstall setting for nofile hard limit is 65536
  oracle   hard   nofile    65536
  grid   hard   nofile    65536
  # oracle-rdbms-server-11gR2-preinstall setting for nproc soft limit is 2047
  oracle   soft   nproc    2047
  grid     soft   nproc    2047
  # oracle-rdbms-server-11gR2-preinstall setting for nproc hard limit is 16384
  oracle   hard   nproc    16384
  grid     hard   nproc    16384
  # oracle-rdbms-server-11gR2-preinstall setting for stack soft limit is 10240KB
  oracle   soft   stack    10240
  grid     soft   stack    10240
  # oracle-rdbms-server-11gR2-preinstall setting for stack hard limit is 32768KB
  oracle   hard   stack    32768
  grid     hard   stack    32768

Create Directories:
 - Have a separate ORACLE_BASE for both GRID and RDBMS install !
Create the Oracle Inventory Directory ( needed, or the 11.2.0.3 installer will complain ) 
To create the Oracle Inventory directory, enter the following commands as the root user:
  # mkdir -p /u01/app/oraInventory
  # chown -R grid:oinstall /u01/app/oraInventory

Creating the Oracle Grid Infrastructure Home Directory
To create the Grid Infrastructure home directory, enter the following commands as the root user:
  # mkdir -p /u01/app/grid
  # chown -R grid:oinstall /u01/app/grid
  # chmod -R 775 /u01/app/grid
  # mkdir -p /u01/app/11203/grid
  # chown -R grid:oinstall /u01/app/11203/grid
  # chmod -R 775 /u01/app/11203/grid

Creating the Oracle Base Directory
  To create the Oracle Base directory, enter the following commands as the root user:
  # mkdir -p /u01/app/oracle
  # chown -R oracle:oinstall /u01/app/oracle
  # chmod -R 775 /u01/app/oracle

Creating the Oracle RDBMS Home Directory
  To create the Oracle RDBMS Home directory, enter the following commands as the root user:
  # mkdir -p /u01/app/oracle/product/11203/racdb
  # chown -R oracle:oinstall /u01/app/oracle/product/11203/racdb
  # chmod -R 775 /u01/app/oracle/product/11203/racdb

Add "divider=10" to /boot/grub/grub.conf
Finally, add "divider=10" to the boot parameters in grub.conf to improve VM performance.
This is often recommended as a way to reduce host CPU utilization when a VM is idle, but
it also improves overall guest performance. When I tried my first run-through of this
process without this parameter enabled, the cluster configuration script bogged down
terribly, and failed midway through creating the database.

Verify Initial Virtualbox Image using cluvfy
  Install cluvfy as the Grid owner ( grid ) in ~/cluvfy112

Check the minimum system requirements for our first VirtualBox image by running cluvfy comp sys -p crs:
$ ./bin/cluvfy comp sys -p crs -n grac1
Verifying system requirement 
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "grac1:/u01/app/11203/grid,grac1:/tmp"
Check for multiple users with UID value 501 passed 
User existence check passed for "grid"
Group existence check passed for "oinstall"
Group existence check passed for "dba"
Membership check for user "grid" in group "oinstall" [as Primary] passed
Membership check for user "grid" in group "dba" passed
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "binutils"
Package existence check passed for "compat-libcap1"
Package existence check passed for "compat-libstdc++-33(x86_64)"
Package existence check passed for "libgcc(x86_64)"
Package existence check passed for "libstdc++(x86_64)"
Package existence check passed for "libstdc++-devel(x86_64)"
Package existence check passed for "sysstat"
Package existence check passed for "gcc"
Package existence check passed for "gcc-c++"
Package existence check passed for "ksh"
Package existence check passed for "make"
Package existence check passed for "glibc(x86_64)"
Package existence check passed for "glibc-devel(x86_64)"
Package existence check passed for "libaio(x86_64)"
Package existence check passed for "libaio-devel(x86_64)"
Check for multiple users with UID value 0 passed 
Starting check for consistency of primary group of root user
Check for consistency of root user's primary group passed
Time zone consistency check passed
Verification of system requirement was successful.

 

 Setup ASM disks

Create ASM disks
  Note : Create all ASM disks on my SSD device ( C:\VM\GRACE2\ASM ) 
  Create 6 ASM disks : 
    3 disks with 5 Gbyte each   
    3 disks with 2 Gbyte each   
D:\VM>set_it
D:\VM>set path="d:\Program Files\Oracle\VirtualBox";D:\Windows\system32;D:\Windows;D:\Windows\System32\Wbem;D:\Windows\System32\WindowsPowerShell\v1.0\;D:\Program Files (x86)\IDM Computer Solutions\UltraEdit\

D:\VM>VBoxManage createhd --filename C:\VM\GRACE2\ASM\asm1_5G.vdi --size 5120 --format VDI --variant Fixed
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: 7c9711c7-14e9-4bc4-8390-3e7dbb2ad130
D:\VM>VBoxManage createhd --filename C:\VM\GRACE2\ASM\asm2_5G.vdi --size 5120 --format VDI --variant Fixed
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: 5c801291-7083-4030-9221-cfab1460f527
D:\VM>VBoxManage createhd --filename C:\VM\GRACE2\ASM\asm3_5G.vdi --size 5120 --format VDI --variant Fixed
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: 28b0e0b4-c9ae-474e-b339-d742a10bb120
D:\VM>VBoxManage createhd --filename C:\VM\GRACE2\ASM\asm1_2G.vdi --size 2048 --format VDI --variant Fixed
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: acc2b925-fa58-4d5f-966f-1c9cac014d1b
D:\VM>VBoxManage createhd --filename C:\VM\GRACE2\ASM\asm2_2G.vdi --size 2048 --format VDI --variant Fixed
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: a93f5fd8-bb10-4421-af07-3dfe4fc0d740
D:\VM>VBoxManage createhd --filename C:\VM\GRACE2\ASM\asm3_2G.vdi --size 2048 --format VDI --variant Fixed
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Disk image created. UUID: 89c0f4cd-569e-4a30-9b6e-5ce3044fcde5
D:\VM>dir  C:\VM\GRACE2\ASM\*
 Volume in drive C has no label.
 Volume Serial Number: 20BF-FC17
 Directory of C:\VM\GRACE2\ASM
13.07.2013  13:00     2.147.495.936 asm1_2G.vdi
13.07.2013  12:56     5.368.733.696 asm1_5G.vdi
13.07.2013  13:00     2.147.495.936 asm2_2G.vdi
13.07.2013  12:57     5.368.733.696 asm2_5G.vdi
13.07.2013  13:00     2.147.495.936 asm3_2G.vdi
13.07.2013  12:59     5.368.733.696 asm3_5G.vdi

Attach disk to VM

D:\VM>VBoxManage storageattach grac1 --storagectl "SATA" --port 1  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm1_5G.vdi
D:\VM>VBoxManage storageattach grac1 --storagectl "SATA" --port 2  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm2_5G.vdi
D:\VM>VBoxManage storageattach grac1 --storagectl "SATA" --port 3  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm3_5G.vdi
D:\VM>VBoxManage storageattach grac1 --storagectl "SATA" --port 4  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm1_2G.vdi
D:\VM>VBoxManage storageattach grac1 --storagectl "SATA" --port 5  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm2_2G.vdi
D:\VM>VBoxManage storageattach grac1 --storagectl "SATA" --port 6  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm3_2G.vdi

Change disk type to shareable disks:
D:\VM>VBoxManage modifyhd C:\VM\GRACE2\ASM\asm1_5G.vdi --type shareable
D:\VM>VBoxManage modifyhd C:\VM\GRACE2\ASM\asm2_5G.vdi --type shareable
D:\VM>VBoxManage modifyhd C:\VM\GRACE2\ASM\asm3_5G.vdi --type shareable
D:\VM>VBoxManage modifyhd C:\VM\GRACE2\ASM\asm1_2G.vdi --type shareable
D:\VM>VBoxManage modifyhd C:\VM\GRACE2\ASM\asm2_2G.vdi --type shareable
D:\VM>VBoxManage modifyhd C:\VM\GRACE2\ASM\asm3_2G.vdi --type shareable
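Confirm the change per disk ( a sketch - VBoxManage showhdinfo prints a "Type:" field that should now read shareable ):
D:\VM>VBoxManage showhdinfo C:\VM\GRACE2\ASM\asm1_5G.vdi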

Reboot and format disks

 # ls /dev/sd*
/dev/sda   /dev/sda2  /dev/sdb  /dev/sdd  /dev/sdf
/dev/sda1  /dev/sda3  /dev/sdc  /dev/sde  /dev/sdg
# fdisk /dev/sdb
  Command (m for help): n
  Command action
   e   extended
   p   primary partition (1-4)
  p 
  Partition number (1-4): 1
  First sector (2048-10485759, default 2048): 
  Using default value 2048
  Last sector, +sectors or +size{K,M,G} (2048-10485759, default 10485759): 
  Using default value 10485759
  Command (m for help): w
  The partition table has been altered!
  In each case, the sequence of answers is "n", "p", "1", "Return", "Return" and "w".
  Repeat steps for : /dev/sdb -> /dev/sdg
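  A scripted alternative that feeds the same answer sequence to fdisk for all six disks ( a sketch -
  double-check the device list before running it ):
  # for d in sdb sdc sdd sde sdf sdg
  > do
  >   echo -e "n\np\n1\n\n\nw" | fdisk /dev/$d
  > done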
#  ls /dev/sd*
/dev/sda   /dev/sda3  /dev/sdc   /dev/sdd1  /dev/sdf   /dev/sdg1
/dev/sda1  /dev/sdb   /dev/sdc1  /dev/sde   /dev/sdf1
/dev/sda2  /dev/sdb1  /dev/sdd   /dev/sde1  /dev/sdg

 

Configure ASMLib and Disks

# /usr/sbin/oracleasm configure -i
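The interactive prompts were answered so that the driver interface matches the grid:asmadmin ownership
shown below ( the answers here are assumptions reconstructed for illustration ):
Default user to own the driver interface []: grid
Default group to own the driver interface []: asmadmin
Start Oracle ASM library driver on boot (y/n) [n]: y
Scan for Oracle ASM disks on boot (y/n) [y]: y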

#  /etc/init.d/oracleasm createdisk data1 /dev/sdb1
Marking disk "data1" as an ASM disk:                       [  OK  ]
#  /etc/init.d/oracleasm createdisk data2 /dev/sdc1
Marking disk "data2" as an ASM disk:                       [  OK  ]
# /etc/init.d/oracleasm createdisk data3 /dev/sdd1
Marking disk "data3" as an ASM disk:                       [  OK  ]
#  /etc/init.d/oracleasm createdisk ocr1 /dev/sde1
Marking disk "ocr1" as an ASM disk:                        [  OK  ]
# /etc/init.d/oracleasm createdisk ocr2  /dev/sdf1
Marking disk "ocr2" as an ASM disk:                        [  OK  ]
[root@grac1 Desktop]#  /etc/init.d/oracleasm createdisk ocr3 /dev/sdg1
Marking disk "ocr3" as an ASM disk:                        [  OK  ]

# /etc/init.d/oracleasm listdisks
DATA1
DATA2
DATA3
OCR1
OCR2
OCR3

# ls -l /dev/oracleasm/disks
total 0
brw-rw---- 1 grid asmadmin 8, 17 Jul 13 16:32 DATA1
brw-rw---- 1 grid asmadmin 8, 33 Jul 13 16:32 DATA2
brw-rw---- 1 grid asmadmin 8, 49 Jul 13 16:33 DATA3
brw-rw---- 1 grid asmadmin 8, 65 Jul 13 16:33 OCR1
brw-rw---- 1 grid asmadmin 8, 81 Jul 13 16:33 OCR2
brw-rw---- 1 grid asmadmin 8, 97 Jul 13 16:33 OCR3

#  /etc/init.d/oracleasm status 
Checking if ASM is loaded: yes
Checking if /dev/oracleasm is mounted: yes
[root@grac1 Desktop]# /etc/init.d/oracleasm listdisks
DATA1
DATA2
DATA3
OCR1
OCR2
OCR3

# /etc/init.d/oracleasm querydisk -d DATA1
Disk "DATA1" is a valid ASM disk on device [8, 17]
# /etc/init.d/oracleasm querydisk -d DATA2
Disk "DATA2" is a valid ASM disk on device [8, 33]
# /etc/init.d/oracleasm querydisk -d DATA3
Disk "DATA3" is a valid ASM disk on device [8, 49]
# /etc/init.d/oracleasm querydisk -d OCR1
Disk "OCR1" is a valid ASM disk on device [8, 65]
# /etc/init.d/oracleasm querydisk -d OCR2
# /etc/init.d/oracleasm querydisk -d OCR3
Disk "OCR3" is a valid ASM disk on device [8, 97]
# /etc/init.d/oracleasm  scandisks
Scanning the system for Oracle ASMLib disks:               [  OK  ]

 

Clone VirtualBox Image

Shutdown Virtualbox image 1 and manually clone the "grac1.vdi" disk using the following commands on the host server.
D:\VM> set_it
D:\VM> md D:\VM\GNS_RACE2\grac2

D:\VM> VBoxManage clonehd D:\VM\GNS_RACE2\grac1\grac1.vdi d:\VM\GNS_RACE2\grac2\grac2.vdi
0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Clone hard disk created in format 'VDI'. UUID: 0d626e95-9354-4f65-8fc0-e40ba44e1
Create new VM grac2 by using disk grac2.vdi

Attach disk to VM: grac2
D:\VM>VBoxManage storageattach grac2 --storagectl "SATA" --port 1  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm1_5G.vdi
D:\VM>VBoxManage storageattach grac2 --storagectl "SATA" --port 2  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm2_5G.vdi
D:\VM>VBoxManage storageattach grac2 --storagectl "SATA" --port 3  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm3_5G.vdi
D:\VM>VBoxManage storageattach grac2 --storagectl "SATA" --port 4  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm1_2G.vdi
D:\VM>VBoxManage storageattach grac2 --storagectl "SATA" --port 5  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm2_2G.vdi
D:\VM>VBoxManage storageattach grac2 --storagectl "SATA" --port 6  --device 0 --type hdd --medium C:\VM\GRACE2\ASM\asm3_2G.vdi 
Start the "grac2" virtual machine by clicking the "Start" button on the toolbar. Ignore any network errors during the startup.
Log in to the "grac2" virtual machine as the "root" user so we can reconfigure the network settings to match the following.
    hostname: grac2.example.com
    IP Address eth0: 192.168.1.62 (public address)
    Default Gateway eth0: 192.168.1.1 (public address)
    IP Address eth1: 192.168.2.72 (private address)
    Default Gateway eth1: none
Amend the hostname in the "/etc/sysconfig/network" file.
    NETWORKING=yes
    HOSTNAME=grac2.example.com 
Check the MAC address of each of the available network connections. Don't worry that they are listed as "eth2" and "eth3". These are dynamically created connections because the MAC address of the "eth0" and "eth1" connections is incorrect.

# ifconfig -a | grep eth
eth2      Link encap:Ethernet  HWaddr 08:00:27:1F:2E:33  
eth3      Link encap:Ethernet  HWaddr 08:00:27:8E:6D:24  
Edit the "/etc/sysconfig/network-scripts/ifcfg-eth0", amending only the IPADDR and HWADDR settings as follows and deleting the UUID entry. Note, the HWADDR value comes from the "eth2" interface displayed above.
    IPADDR=192.168.1.62
    HWADDR=08:00:27:1F:2E:33 
Edit the "/etc/sysconfig/network-scripts/ifcfg-eth1", amending only the IPADDR and HWADDR settings as follows and deleting the UUID entry. Note, the HWADDR value comes from the "eth3" interface displayed above.
    HWADDR=08:00:27:8E:6D:24
    IPADDR=192.168.2.72
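The resulting file would look roughly like this ( a sketch assuming a typical static OL6 ifcfg layout -
only IPADDR and HWADDR come from the steps above ):
    DEVICE=eth1
    TYPE=Ethernet
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=192.168.2.72
    NETMASK=255.255.255.0
    HWADDR=08:00:27:8E:6D:24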
Change .login for grid user
 setenv ORACLE_SID +ASM2
Remove udev rules:
# rm  /etc/udev/rules.d/70-persistent-net.rules
# reboot
Verify network devices ( use graphical tool if needed for changes )
# ifconfig
eth0      Link encap:Ethernet  HWaddr 08:00:27:1F:2E:33  
          inet addr:192.168.1.62  Bcast:192.168.1.255  Mask:255.255.255.0
..
eth1      Link encap:Ethernet  HWaddr 08:00:27:8E:6D:24  
          inet addr:192.168.2.72  Bcast:192.168.2.255  Mask:255.255.255.0 
..

Check Ntp
$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gns.example.com LOCAL(0)        10 u   30   64    1    0.462  2233.72   0.000
 LOCAL(0)        .LOCL.          12 l   29   64    1    0.000    0.000   0.000

Check DHCP
$ grep -i dhcp /var/log/messages
Jul 15 19:12:21 grac1 NetworkManager[1528]: <info> Activation (eth2) Beginning DHCPv4 transaction
Jul 15 19:12:21 grac1 NetworkManager[1528]: <info> Activation (eth2) DHCPv4 will time out in 45 seconds
Jul 15 19:12:21 grac1 NetworkManager[1528]: <info> Activation (eth3) Beginning DHCPv4 transaction
Jul 15 19:12:21 grac1 NetworkManager[1528]: <info> Activation (eth3) DHCPv4 will time out in 45 seconds
Jul 15 19:12:21 grac1 dhclient[1547]: Internet Systems Consortium DHCP Client 4.1.1-P1
Jul 15 19:12:21 grac1 dhclient[1547]: For info, please visit https://www.isc.org/software/dhcp/
Jul 15 19:12:21 grac1 dhclient[1537]: Internet Systems Consortium DHCP Client 4.1.1-P1
Jul 15 19:12:21 grac1 dhclient[1537]: For info, please visit https://www.isc.org/software/dhcp/
Jul 15 19:12:21 grac1 NetworkManager[1528]: <info> (eth2): DHCPv4 state changed nbi -> preinit
Jul 15 19:12:21 grac1 NetworkManager[1528]: <info> (eth3): DHCPv4 state changed nbi -> preinit
Jul 15 19:12:22 grac1 dhclient[1537]: DHCPDISCOVER on eth2 to 255.255.255.255 port 67 interval 4 (xid=0x5ddfdccc)
Jul 15 19:12:23 grac1 dhclient[1547]: DHCPDISCOVER on eth3 to 255.255.255.255 port 67 interval 5 (xid=0x5c751799)
Jul 15 19:12:26 grac1 dhclient[1537]: DHCPDISCOVER on eth2 to 255.255.255.255 port 67 interval 11 (xid=0x5ddfdccc)
Jul 15 19:12:28 grac1 dhclient[1547]: DHCPDISCOVER on eth3 to 255.255.255.255 port 67 interval 11 (xid=0x5c751799)
Jul 15 19:12:32 grac1 dhclient[1537]: DHCPOFFER from 192.168.1.50
Jul 15 19:12:32 grac1 dhclient[1537]: DHCPREQUEST on eth2 to 255.255.255.255 port 67 (xid=0x5ddfdccc)
Jul 15 19:12:32 grac1 dhclient[1537]: DHCPACK from 192.168.1.50 (xid=0x5ddfdccc)
Jul 15 19:12:32 grac1 NetworkManager[1528]: <info> (eth2): DHCPv4 state changed preinit -> bound
Jul 15 19:12:33 grac1 dhclient[1547]: DHCPOFFER from 192.168.1.50
Jul 15 19:12:33 grac1 dhclient[1547]: DHCPREQUEST on eth3 to 255.255.255.255 port 67 (xid=0x5c751799)
Jul 15 19:12:33 grac1 dhclient[1547]: DHCPACK from 192.168.1.50 (xid=0x5c751799)
Jul 15 19:12:33 grac1 NetworkManager[1528]: <info> (eth3): DHCPv4 state changed preinit -> bound
Jul 15 19:27:53 grac2 NetworkManager[1617]: <info> Activation (eth2) Beginning DHCPv4 transaction
Jul 15 19:27:53 grac2 NetworkManager[1617]: <info> Activation (eth2) DHCPv4 will time out in 45 seconds
Jul 15 19:27:53 grac2 dhclient[1637]: Internet Systems Consortium DHCP Client 4.1.1-P1
Jul 15 19:27:53 grac2 dhclient[1637]: For info, please visit https://www.isc.org/software/dhcp/
Jul 15 19:27:53 grac2 dhclient[1637]: DHCPDISCOVER on eth2 to 255.255.255.255 port 67 interval 4 (xid=0x44e12e9)
Jul 15 19:27:53 grac2 NetworkManager[1617]: <info> (eth2): DHCPv4 state changed nbi -> preinit
Jul 15 19:27:57 grac2 dhclient[1637]: DHCPDISCOVER on eth2 to 255.255.255.255 port 67 interval 10 (xid=0x44e12e9)
Jul 15 19:28:03 grac2 dhclient[1637]: DHCPOFFER from 192.168.1.50
Jul 15 19:28:03 grac2 dhclient[1637]: DHCPREQUEST on eth2 to 255.255.255.255 port 67 (xid=0x44e12e9)
Jul 15 19:28:03 grac2 dhclient[1637]: DHCPACK from 192.168.1.50 (xid=0x44e12e9)
Jul 15 19:28:03 grac2 NetworkManager[1617]: <info> (eth2): DHCPv4 state changed preinit -> bound
Jul 15 19:32:52 grac2 NetworkManager[1690]: <info> Activation (eth0) Beginning DHCPv4 transaction
Jul 15 19:32:52 grac2 NetworkManager[1690]: <info> Activation (eth0) DHCPv4 will time out in 45 seconds
Jul 15 19:32:52 grac2 dhclient[1703]: Internet Systems Consortium DHCP Client 4.1.1-P1
Jul 15 19:32:52 grac2 dhclient[1703]: For info, please visit https://www.isc.org/software/dhcp/
Jul 15 19:32:52 grac2 dhclient[1703]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6 (xid=0x6781ea4f)
Jul 15 19:32:52 grac2 NetworkManager[1690]: <info> (eth0): DHCPv4 state changed nbi -> preinit
Jul 15 19:32:58 grac2 dhclient[1703]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 12 (xid=0x6781ea4f)
Jul 15 19:33:02 grac2 dhclient[1703]: DHCPOFFER from 192.168.1.50
Jul 15 19:33:02 grac2 dhclient[1703]: DHCPREQUEST on eth0 to 255.255.255.255 port 67 (xid=0x6781ea4f)
Jul 15 19:33:02 grac2 dhclient[1703]: DHCPACK from 192.168.1.50 (xid=0x6781ea4f)
Jul 15 19:33:02 grac2 NetworkManager[1690]: <info> (eth0): DHCPv4 state changed preinit -> bound
Jul 15 19:37:56 grac2 NetworkManager[1690]: <info> (eth0): canceled DHCP transaction, DHCP client pid 1703
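The log shows 192.168.1.50 answering the DHCP requests. As a quick sanity check ( a hedged example, assuming 192.168.1.50 is also the DNS host for example.com in this setup ) confirm the server is reachable and resolves the new node:
$ ping -c 3 192.168.1.50                    ( DHCP/DNS server from the DHCPOFFER lines above )
$ nslookup grac2.example.com 192.168.1.50   ( should return the public address 192.168.1.62 )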
Rerun cluvfy for the 2nd node and test GNS connectivity:

Verify GNS: 
$ ./bin/cluvfy comp gns -precrsinst -domain oracle-gns.example.com -vip 192.168.2.72 -verbose -n grac2
Verifying GNS integrity 
Checking GNS integrity...
Checking if the GNS subdomain name is valid...
The GNS subdomain name "oracle-gns.example.com" is a valid domain name
Checking if the GNS VIP is a valid address...
GNS VIP "192.168.2.72" resolves to a valid IP address
Checking the status of GNS VIP...
GNS integrity check passed
Verification of GNS integrity was successful. 

Verify CRS prerequisites for both nodes using the newly created ASM disks and the asmadmin group 
$ ./bin/cluvfy stage -pre crsinst -n grac1,grac2 -asm -asmgrp asmadmin -asmdev /dev/oracleasm/disks/DATA1,/dev/oracleasm/disks/DATA2,/dev/oracleasm/disks/DATA3
Performing pre-checks for cluster services setup 
Checking node reachability...
Node reachability check passed from node "grac1"
Checking user equivalence...
User equivalence check passed for user "grid"
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
Node connectivity passed for subnet "192.168.1.0" with node(s) grac2,grac1
TCP connectivity check passed for subnet "192.168.1.0"
Node connectivity passed for subnet "192.168.2.0" with node(s) grac2,grac1
TCP connectivity check passed for subnet "192.168.2.0"
Node connectivity passed for subnet "169.254.0.0" with node(s) grac2,grac1
TCP connectivity check passed for subnet "169.254.0.0"
Interfaces found on subnet "169.254.0.0" that are likely candidates for VIP are:
grac2 eth1:169.254.86.205
grac1 eth1:169.254.168.215
Interfaces found on subnet "192.168.2.0" that are likely candidates for a private interconnect are:
grac2 eth1:192.168.2.102
grac1 eth1:192.168.2.101
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed for subnet "169.254.0.0".
Subnet mask consistency check passed.
Node connectivity check passed
Checking ASMLib configuration.
Check for ASMLib configuration passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "grac2:/u01/app/11203/grid,grac2:/tmp"
Free disk space check passed for "grac1:/u01/app/11203/grid,grac1:/tmp"
Check for multiple users with UID value 501 passed 
User existence check passed for "grid"
Group existence check passed for "oinstall"
Group existence check passed for "dba"
Group existence check passed for "asmadmin"
Membership check for user "grid" in group "oinstall" [as Primary] passed
Membership check for user "grid" in group "dba" passed
Membership check for user "grid" in group "asmadmin" passed
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
Kernel parameter check passed for "semmsl"
Kernel parameter check passed for "semmns"
Kernel parameter check passed for "semopm"
Kernel parameter check passed for "semmni"
Kernel parameter check passed for "shmmax"
Kernel parameter check passed for "shmmni"
Kernel parameter check passed for "shmall"
Kernel parameter check passed for "file-max"
Kernel parameter check passed for "ip_local_port_range"
Kernel parameter check passed for "rmem_default"
Kernel parameter check passed for "rmem_max"
Kernel parameter check passed for "wmem_default"
Kernel parameter check passed for "wmem_max"
Kernel parameter check passed for "aio-max-nr"
Package existence check passed for "binutils"
Package existence check passed for "compat-libcap1"
Package existence check passed for "compat-libstdc++-33(x86_64)"
Package existence check passed for "libgcc(x86_64)"
Package existence check passed for "libstdc++(x86_64)"
Package existence check passed for "libstdc++-devel(x86_64)"
Package existence check passed for "sysstat"
Package existence check passed for "gcc"
Package existence check passed for "gcc-c++"
Package existence check passed for "ksh"
Package existence check passed for "make"
Package existence check passed for "glibc(x86_64)"
Package existence check passed for "glibc-devel(x86_64)"
Package existence check passed for "libaio(x86_64)"
Package existence check passed for "libaio-devel(x86_64)"
Check for multiple users with UID value passed 
Current group ID check passed
Starting check for consistency of primary group of root user
Check for consistency of root user's primary group passed
Package existence check passed for "cvuqdisk"
Checking Devices for ASM...
Checking for shared devices...
  Device                                Device Type             
  ------------------------------------  ------------------------
  /dev/oracleasm/disks/DATA3            Disk                    
  /dev/oracleasm/disks/DATA2            Disk                    
  /dev/oracleasm/disks/DATA1            Disk                    
Checking consistency of device owner across all nodes...
Consistency check of device owner for "/dev/oracleasm/disks/DATA3" PASSED
Consistency check of device owner for "/dev/oracleasm/disks/DATA1" PASSED
Consistency check of device owner for "/dev/oracleasm/disks/DATA2" PASSED
Checking consistency of device group across all nodes...
Consistency check of device group for "/dev/oracleasm/disks/DATA3" PASSED
Consistency check of device group for "/dev/oracleasm/disks/DATA1" PASSED
Consistency check of device group for "/dev/oracleasm/disks/DATA2" PASSED
Checking consistency of device permissions across all nodes...
Consistency check of device permissions for "/dev/oracleasm/disks/DATA3" PASSED
Consistency check of device permissions for "/dev/oracleasm/disks/DATA1" PASSED
Consistency check of device permissions for "/dev/oracleasm/disks/DATA2" PASSED
Checking consistency of device size across all nodes...
Consistency check of device size for "/dev/oracleasm/disks/DATA3" PASSED
Consistency check of device size for "/dev/oracleasm/disks/DATA1" PASSED
Consistency check of device size for "/dev/oracleasm/disks/DATA2" PASSED
UDev attributes check for ASM Disks started...
ERROR: 
PRVF-9802 : Attempt to get udev info from node "grac2" failed
ERROR: 
PRVF-9802 : Attempt to get udev info from node "grac1" failed
UDev attributes check failed for ASM Disks 
Devices check for ASM passed
Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
NTP Configuration file check passed
Checking daemon liveness...
Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes
NTP daemon slewing option check passed
NTP daemon's boot time configuration check for slewing option passed
NTP common Time Server Check started...
Check of common NTP Time Server passed
Clock time offset check from NTP Time Server started...
Clock time offset check passed
Clock synchronization check using Network Time Protocol(NTP) passed
Core file name pattern consistency check passed.
User "grid" is not part of "root" group. Check passed
Default user file creation mask check passed
Checking consistency of file "/etc/resolv.conf" across nodes
File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: grac2,grac1
File "/etc/resolv.conf" is not consistent across nodes
Time zone consistency check passed
Pre-check for cluster services setup was unsuccessful on all the nodes. 
Ignore PRVF-9802 and PRVF-5636. For details check the following link.
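PRVF-5636 only means the resolver needed more than 15000 ms to give up on a non-existent host. A rough way to reproduce the check by hand ( the hostname below is made up ):
$ time nslookup some-unreachable-node.example.com   ( if this takes longer than 15 s, review the nameserver and timeout entries in /etc/resolv.conf )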

 

Install Clusterware Software

As user Root 
# xhost +
    access control disabled, clients can connect from any host
As user Grid
$  xclock      ( Testing X connection )
$ cd /KITS/Oracle/11.2.0.3/Linux_64/grid   ( your grid staging area )
$ ./runInstaller  
--> Important : Select Installation type : Advanced Installation
Cluster name   GRACE2  
Scan name:     GRACE2-scan.grid.example.com
Scan port:     1521
Configure GNS
GNS sub domain:  grid.example.com
GNS VIP address: 192.168.1.55
   ( This address shouldn't be in use:   # ping 192.168.1.55 should fail - see the delegation check at the end of this list ) 
  Hostname:  grac1.example.com     Virtual hostnames  : AUTO
  Hostname:  grac2.example.com     Virtual hostnames  : AUTO 
Test and configure SSH connectivity 
Configure ASM disk string: /dev/oracleasm/disks/*
ASM password: sys 
Don't use IPMI
Don't change groups
ORACLE_BASE: /u01/app/grid
Software Location : /u01/app/11.2.0/grid
--> Review the OUI Prerequisite Checks 
  -> Ignore the well-known PRVF-5636, PRVF-9802 errors/warnings ( see the earlier cluvfy reports ) 
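Before running root.sh it can be worth confirming the GNS sub-domain delegation on the DNS side ( a hedged check - the exact output depends on how the example.com zone delegates grid.example.com to the GNS VIP ):
$ nslookup -type=NS grid.example.com   ( should report the delegation for the GNS sub-domain )
# ping -c 1 192.168.1.55               ( must still fail - the GNS VIP is not in use yet )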
Install software and run the related root.sh scripts

Run on grac1:  /u01/app/11203/grid/root.sh
Performing root user operation for Oracle 11g 
The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/11203/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]: 
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11203/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
  root wallet
  root wallet cert
  root cert export
  peer wallet
  profile reader wallet
  pa wallet
  peer wallet keys
  pa wallet keys
  peer cert request
  pa cert request
  peer cert
  pa cert
  peer root cert TP
  profile reader root cert TP
  pa root cert TP
  peer pa cert TP
  pa peer cert TP
  profile reader pa cert TP
  profile reader peer cert TP
  peer user cert
  pa user cert
Adding Clusterware entries to upstart
CRS-2672: Attempting to start 'ora.mdnsd' on 'grac1'
CRS-2676: Start of 'ora.mdnsd' on 'grac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'grac1'
CRS-2676: Start of 'ora.gpnpd' on 'grac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'grac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'grac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'grac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'grac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'grac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'grac1'
CRS-2676: Start of 'ora.diskmon' on 'grac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'grac1' succeeded
ASM created and started successfully.
Disk Group DATA created successfully.
clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4256: Updating the profile
Successful addition of voting disk 3ee007b399cc4f59bfa0fc80ff3fa9ff.
Successful addition of voting disk 7a73147a81dc4f71bfc8757343aee181.
Successful addition of voting disk 25fcfbdb854a4f49bf0addd0fa32d0a2.
Successfully replaced voting disk group with +DATA.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   3ee007b399cc4f59bfa0fc80ff3fa9ff (/dev/oracleasm/disks/DATA1) [DATA]
 2. ONLINE   7a73147a81dc4f71bfc8757343aee181 (/dev/oracleasm/disks/DATA2) [DATA]
 3. ONLINE   25fcfbdb854a4f49bf0addd0fa32d0a2 (/dev/oracleasm/disks/DATA3) [DATA]
Located 3 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'grac1'
CRS-2676: Start of 'ora.asm' on 'grac1' succeeded
CRS-2672: Attempting to start 'ora.DATA.dg' on 'grac1'
CRS-2676: Start of 'ora.DATA.dg' on 'grac1' succeeded
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
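Before moving on to the second node you can verify the stack on grac1 ( standard clusterware commands, run as root from the Grid home ):
# /u01/app/11203/grid/bin/crsctl check crs          ( CRS, CSS and EVM should all be reported online )
# /u01/app/11203/grid/bin/crsctl stat res -t -init  ( status of the local init resources like ora.cssd, ora.crsd )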

Run on grac2:  /u01/app/11203/grid/root.sh
# /u01/app/11203/grid/root.sh
Performing root user operation for Oracle 11g 
The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /u01/app/11203/grid
Enter the full pathname of the local bin directory: [/usr/local/bin]: 
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/app/11203/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
Adding Clusterware entries to upstart
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node grac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
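At this point OCR and the voting disks should live in +DATA on both nodes; a quick check as root:
# /u01/app/11203/grid/bin/crsctl query css votedisk   ( expect the 3 DATA disks listed during root.sh )
# /u01/app/11203/grid/bin/ocrcheck                    ( OCR should point to +DATA and report no errors )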

Run cluvfy and crsctl to verify the Oracle Grid installation
$ ./bin/cluvfy stage -post crsinst -n grac1,grac2 -verbose
Performing post-checks for cluster services setup 
Checking node reachability...
Check: Node reachability from node "grac1"
  Destination Node                      Reachable?              
  ------------------------------------  ------------------------
  grac2                                 yes                     
  grac1                                 yes                     
Result: Node reachability check passed from node "grac1"
Checking user equivalence...
Check: User equivalence for user "grid"
  Node Name                             Status                  
  ------------------------------------  ------------------------
  grac2                                 passed                  
  grac1                                 passed                  
Result: User equivalence check passed for user "grid"
Checking node connectivity...
Checking hosts config file...
  Node Name                             Status                  
  ------------------------------------  ------------------------
  grac2                                 passed                  
  grac1                                 passed                  
Verification of the hosts config file successful
Interface information for node "grac2"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU   
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth0   192.168.1.62    192.168.1.0     0.0.0.0         168.1.0.1       08:00:27:1F:2E:33 1500  
 eth0   192.168.1.112   192.168.1.0     0.0.0.0         168.1.0.1       08:00:27:1F:2E:33 1500  
 eth0   192.168.1.108   192.168.1.0     0.0.0.0         168.1.0.1       08:00:27:1F:2E:33 1500  
 eth1   192.168.2.102   192.168.2.0     0.0.0.0         168.1.0.1       08:00:27:8E:6D:24 1500  
 eth1   169.254.86.205  169.254.0.0     0.0.0.0         168.1.0.1       08:00:27:8E:6D:24 1500  
Interface information for node "grac1"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU   
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth0   192.168.1.61    192.168.1.0     0.0.0.0         192.168.1.1     08:00:27:6E:17:DB 1500  
 eth0   192.168.1.55    192.168.1.0     0.0.0.0         192.168.1.1     08:00:27:6E:17:DB 1500  
 eth0   192.168.1.110   192.168.1.0     0.0.0.0         192.168.1.1     08:00:27:6E:17:DB 1500  
 eth0   192.168.1.109   192.168.1.0     0.0.0.0         192.168.1.1     08:00:27:6E:17:DB 1500  
 eth0   192.168.1.107   192.168.1.0     0.0.0.0         192.168.1.1     08:00:27:6E:17:DB 1500  
 eth1   192.168.2.101   192.168.2.0     0.0.0.0         192.168.1.1     08:00:27:F5:31:22 1500  
 eth1   169.254.168.215 169.254.0.0     0.0.0.0         192.168.1.1     08:00:27:F5:31:22 1500  
Check: Node connectivity for interface "eth0"
  Source                          Destination                     Connected?      
  ------------------------------  ------------------------------  ----------------
  grac2[192.168.1.62]             grac2[192.168.1.112]            yes             
  grac2[192.168.1.62]             grac2[192.168.1.108]            yes             
  grac2[192.168.1.62]             grac1[192.168.1.61]             yes             
  grac2[192.168.1.62]             grac1[192.168.1.55]             yes             
  grac2[192.168.1.62]             grac1[192.168.1.110]            yes             
  grac2[192.168.1.62]             grac1[192.168.1.109]            yes             
  grac2[192.168.1.62]             grac1[192.168.1.107]            yes             
  grac2[192.168.1.112]            grac2[192.168.1.108]            yes             
  grac2[192.168.1.112]            grac1[192.168.1.61]             yes             
  grac2[192.168.1.112]            grac1[192.168.1.55]             yes             
  grac2[192.168.1.112]            grac1[192.168.1.110]            yes             
  grac2[192.168.1.112]            grac1[192.168.1.109]            yes             
  grac2[192.168.1.112]            grac1[192.168.1.107]            yes             
  grac2[192.168.1.108]            grac1[192.168.1.61]             yes             
  grac2[192.168.1.108]            grac1[192.168.1.55]             yes             
  grac2[192.168.1.108]            grac1[192.168.1.110]            yes             
  grac2[192.168.1.108]            grac1[192.168.1.109]            yes             
  grac2[192.168.1.108]            grac1[192.168.1.107]            yes             
  grac1[192.168.1.61]             grac1[192.168.1.55]             yes             
  grac1[192.168.1.61]             grac1[192.168.1.110]            yes             
  grac1[192.168.1.61]             grac1[192.168.1.109]            yes             
  grac1[192.168.1.61]             grac1[192.168.1.107]            yes             
  grac1[192.168.1.55]             grac1[192.168.1.110]            yes             
  grac1[192.168.1.55]             grac1[192.168.1.109]            yes             
  grac1[192.168.1.55]             grac1[192.168.1.107]            yes             
  grac1[192.168.1.110]            grac1[192.168.1.109]            yes             
  grac1[192.168.1.110]            grac1[192.168.1.107]            yes             
  grac1[192.168.1.109]            grac1[192.168.1.107]            yes             
Result: Node connectivity passed for interface "eth0"
Check: TCP connectivity of subnet "192.168.1.0"
  Source                          Destination                     Connected?      
  ------------------------------  ------------------------------  ----------------
  grac1:192.168.1.61              grac2:192.168.1.62              passed          
  grac1:192.168.1.61              grac2:192.168.1.112             passed          
  grac1:192.168.1.61              grac2:192.168.1.108             passed          
  grac1:192.168.1.61              grac1:192.168.1.55              passed          
  grac1:192.168.1.61              grac1:192.168.1.110             passed          
  grac1:192.168.1.61              grac1:192.168.1.109             passed          
  grac1:192.168.1.61              grac1:192.168.1.107             passed          
Result: TCP connectivity check passed for subnet "192.168.1.0"
Check: Node connectivity for interface "eth1"
  Source                          Destination                     Connected?      
  ------------------------------  ------------------------------  ----------------
  grac2[192.168.2.102]            grac1[192.168.2.101]            yes             
Result: Node connectivity passed for interface "eth1"
Check: TCP connectivity of subnet "192.168.2.0"
  Source                          Destination                     Connected?      
  ------------------------------  ------------------------------  ----------------
  grac1:192.168.2.101             grac2:192.168.2.102             passed          
Result: TCP connectivity check passed for subnet "192.168.2.0"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed.
Result: Node connectivity check passed
Checking multicast communication...
Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed.
Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of multicast communication passed.
Check: Time zone consistency 
Result: Time zone consistency check passed
Checking Oracle Cluster Voting Disk configuration...
ASM Running check passed. ASM is running on all specified nodes
Oracle Cluster Voting Disk configuration check passed
Checking Cluster manager integrity... 
Checking CSS daemon...
  Node Name                             Status                  
  ------------------------------------  ------------------------
  grac2                                 running                 
  grac1                                 running                 
Oracle Cluster Synchronization Services appear to be online.
Cluster manager integrity check passed
UDev attributes check for OCR locations started...
Result: UDev attributes check passed for OCR locations 
UDev attributes check for Voting Disk locations started...
Result: UDev attributes check passed for Voting Disk locations 
Check default user file creation mask
  Node Name     Available                 Required                  Comment   
  ------------  ------------------------  ------------------------  ----------
  grac2         22                        0022                      passed    
  grac1         22                        0022                      passed    
Result: Default user file creation mask check passed
Checking cluster integrity...
  Node Name                           
  ------------------------------------
  grac1                               
  grac2                               
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
ASM Running check passed. ASM is running on all specified nodes
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+DATA" available on all the nodes
NOTE: 
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check passed
Checking CRS integrity...
Clusterware version consistency passed
The Oracle Clusterware is healthy on node "grac2"
The Oracle Clusterware is healthy on node "grac1"
CRS integrity check passed
Checking node application existence...
Checking existence of VIP node application (required)
  Node Name     Required                  Running?                  Comment   
  ------------  ------------------------  ------------------------  ----------
  grac2         yes                       yes                       passed    
  grac1         yes                       yes                       passed    
VIP node application check passed
Checking existence of NETWORK node application (required)
  Node Name     Required                  Running?                  Comment   
  ------------  ------------------------  ------------------------  ----------
  grac2         yes                       yes                       passed    
  grac1         yes                       yes                       passed    
NETWORK node application check passed
Checking existence of GSD node application (optional)
  Node Name     Required                  Running?                  Comment   
  ------------  ------------------------  ------------------------  ----------
  grac2         no                        no                        exists    
  grac1         no                        no                        exists    
GSD node application is offline on nodes "grac2,grac1"
Checking existence of ONS node application (optional)
  Node Name     Required                  Running?                  Comment   
  ------------  ------------------------  ------------------------  ----------
  grac2         no                        yes                       passed    
  grac1         no                        yes                       passed    
ONS node application check passed
Checking Single Client Access Name (SCAN)...
  SCAN Name         Node          Running?      ListenerName  Port          Running?    
  ----------------  ------------  ------------  ------------  ------------  ------------
  GRACE2-scan.grid.example.com  grac2         true          LISTENER_SCAN1  1521          true        
  GRACE2-scan.grid.example.com  grac1         true          LISTENER_SCAN2  1521          true        
  GRACE2-scan.grid.example.com  grac1         true          LISTENER_SCAN3  1521          true        
Checking TCP connectivity to SCAN Listeners...
  Node          ListenerName              TCP connectivity?       
  ------------  ------------------------  ------------------------
  grac1         LISTENER_SCAN1            yes                     
  grac1         LISTENER_SCAN2            yes                     
  grac1         LISTENER_SCAN3            yes                     
TCP connectivity to SCAN Listeners exists on all cluster nodes
Checking name resolution setup for "GRACE2-scan.grid.example.com"...
  SCAN Name     IP Address                Status                    Comment   
  ------------  ------------------------  ------------------------  ----------
  GRACE2-scan.grid.example.com  192.168.1.110             passed                              
  GRACE2-scan.grid.example.com  192.168.1.109             passed                              
  GRACE2-scan.grid.example.com  192.168.1.108             passed                              
Verification of SCAN VIP and Listener setup passed
Checking OLR integrity...
Checking OLR config file...
OLR config file check successful
Checking OLR file attributes...
OLR file check successful
WARNING: 
This check does not verify the integrity of the OLR contents. Execute 'ocrcheck -local' as a privileged user to verify the contents of OLR.
OLR integrity check passed
Checking GNS integrity...
Checking if the GNS subdomain name is valid...
The GNS subdomain name "grid.example.com" is a valid domain name
Checking if the GNS VIP belongs to same subnet as the public network...
Public network subnets "192.168.1.0" match with the GNS VIP "192.168.1.0"
Checking if the GNS VIP is a valid address...
GNS VIP "192.168.1.55" resolves to a valid IP address
Checking the status of GNS VIP...
Checking if FDQN names for domain "grid.example.com" are reachable
GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
GNS resolved IP addresses are reachable
Checking status of GNS resource...
  Node          Running?                  Enabled?                
  ------------  ------------------------  ------------------------
  grac2         no                        yes                     
  grac1         yes                       yes                     
GNS resource configuration check passed
Checking status of GNS VIP resource...
  Node          Running?                  Enabled?                
  ------------  ------------------------  ------------------------
  grac2         no                        yes                     
  grac1         yes                       yes                     
GNS VIP resource configuration check passed.
GNS integrity check passed
Checking to make sure user "grid" is not in "root" group
  Node Name     Status                    Comment                 
  ------------  ------------------------  ------------------------
  grac2         passed                    does not exist          
  grac1         passed                    does not exist          
Result: User "grid" is not part of "root" group. Check passed
Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed
Checking if CTSS Resource is running on all nodes...
Check: CTSS Resource running on all nodes
  Node Name                             Status                  
  ------------------------------------  ------------------------
  grac2                                 passed                  
  grac1                                 passed                  
Result: CTSS resource check passed
Querying CTSS for time offset on all nodes...
Result: Query of CTSS for time offset passed
Check CTSS state started...
Check: CTSS state
  Node Name                             State                   
  ------------------------------------  ------------------------
  grac2                                 Observer                
  grac1                                 Observer                
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
The NTP configuration file "/etc/ntp.conf" is available on all nodes
NTP Configuration file check passed
Checking daemon liveness...
Check: Liveness for "ntpd"
  Node Name                             Running?                
  ------------------------------------  ------------------------
  grac2                                 yes                     
  grac1                                 yes                     
Result: Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes
Checking NTP daemon command line for slewing option "-x"
Check: NTP daemon command line
  Node Name                             Slewing Option Set?     
  ------------------------------------  ------------------------
  grac2                                 yes                     
  grac1                                 yes                     
Result: 
NTP daemon slewing option check passed
Checking NTP daemon's boot time configuration, in file "/etc/sysconfig/ntpd", for slewing option "-x"
Check: NTP daemon's boot time configuration
  Node Name                             Slewing Option Set?     
  ------------------------------------  ------------------------
  grac2                                 yes                     
  grac1                                 yes                     
Result: 
NTP daemon's boot time configuration check for slewing option passed
Checking whether NTP daemon or service is using UDP port 123 on all nodes
Check for NTP daemon or service using UDP port 123
  Node Name                             Port Open?              
  ------------------------------------  ------------------------
  grac2                                 yes                     
  grac1                                 yes                     
NTP common Time Server Check started...
NTP Time Server ".LOCL." is common to all nodes on which the NTP daemon is running
Check of common NTP Time Server passed
Clock time offset check from NTP Time Server started...
Checking on nodes "[grac2, grac1]"... 
Check: Clock time offset from NTP Time Server
Time Server: .LOCL. 
Time Offset Limit: 1000.0 msecs
  Node Name     Time Offset               Status                  
  ------------  ------------------------  ------------------------
  grac2         0.0                       passed                  
  grac1         0.0                       passed                  
Time Server ".LOCL." has time offsets that are within permissible limits for nodes "[grac2, grac1]". 
Clock time offset check passed
Result: Clock synchronization check using Network Time Protocol(NTP) passed
Oracle Cluster Time Synchronization Services check passed
Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.
Post-check for cluster services setup was successful. 

Checking CRS status after installation
$ my_crs_stat
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.DATA.dg                    ONLINE     ONLINE          grac1         
ora.DATA.dg                    ONLINE     ONLINE          grac2         
ora.LISTENER.lsnr              ONLINE     ONLINE          grac1         
ora.LISTENER.lsnr              ONLINE     ONLINE          grac2         
ora.asm                        ONLINE     ONLINE          grac1        Started 
ora.asm                        ONLINE     ONLINE          grac2        Started 
ora.gsd                        OFFLINE    OFFLINE         grac1         
ora.gsd                        OFFLINE    OFFLINE         grac2         
ora.net1.network               ONLINE     ONLINE          grac1         
ora.net1.network               ONLINE     ONLINE          grac2         
ora.ons                        ONLINE     ONLINE          grac1         
ora.ons                        ONLINE     ONLINE          grac2         
ora.LISTENER_SCAN1.lsnr        ONLINE     ONLINE          grac2         
ora.LISTENER_SCAN2.lsnr        ONLINE     ONLINE          grac1         
ora.LISTENER_SCAN3.lsnr        ONLINE     ONLINE          grac1         
ora.cvu                        ONLINE     ONLINE          grac1         
ora.gns                        ONLINE     ONLINE          grac1         
ora.gns.vip                    ONLINE     ONLINE          grac1         
ora.grac1.vip                  ONLINE     ONLINE          grac1         
ora.grac2.vip                  ONLINE     ONLINE          grac2         
ora.oc4j                       ONLINE     ONLINE          grac1         
ora.scan1.vip                  ONLINE     ONLINE          grac2         
ora.scan2.vip                  ONLINE     ONLINE          grac1         
ora.scan3.vip                  ONLINE     ONLINE          grac1                              
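my_crs_stat is a custom wrapper script used throughout this blog; if you don't have it, the stock command gives the same picture:
$ $GRID_HOME/bin/crsctl stat res -t     ( tabular resource status for the whole cluster )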

Grid post installation - ologgerd process consumes high CPU time
  It has been noticed that after a while the ologgerd process can consume excessive CPU resources. 
  ologgerd is part of the Oracle Cluster Health Monitor and is used by Oracle Support to troubleshoot RAC problems. 
  You can check this by running top ( sometimes we see up to 60% WA states ): 
  top - 15:02:38 up 15 min,  6 users,  load average: 3.70, 2.54, 1.78
    Tasks: 215 total,   2 running, 213 sleeping,   0 stopped,   0 zombie
    Cpu(s):  3.6%us,  8.9%sy,  0.0%ni, 55.4%id, 31.4%wa,  0.0%hi,  0.8%si,  0.0%st
    Mem:   3234376k total,  2512568k used,   721808k free,   108508k buffers
    Swap:  3227644k total,        0k used,  3227644k free,  1221196k cached
      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
     5602 root      RT   0  501m 145m  60m S 48.1  4.6   0:31.29 ologgerd    
If the ologgerd process is consuming a lot of CPU, you can stop it by executing:
# crsctl stop resource ora.crf -init
  Now top looks good, as idle CPU time increases from 55% to 95%:
  hrac1: 
    top - 15:07:56 up 20 min,  6 users,  load average: 2.57, 3.33, 2.41
    Tasks: 212 total,   1 running, 211 sleeping,   0 stopped,   0 zombie
    Cpu(s):  1.3%us,  4.2%sy,  0.0%ni, 94.3%id,  0.1%wa,  0.0%hi,  0.2%si,  0.0%st
    Mem:   3234376k total,  2339268k used,   895108k free,   132604k buffers
    Swap:  3227644k total,        0k used,  3227644k free,  1126964k cached
  hrac2:   
    top - 15:48:37 up 33 min,  3 users,  load average: 2.63, 2.40, 2.13
    Tasks: 204 total,   1 running, 203 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.9%us,  3.3%sy,  0.0%ni, 95.6%id,  0.1%wa,  0.0%hi,  0.2%si,  0.0%st
    Mem:   2641484k total,  1975444k used,   666040k free,   158212k buffers
    Swap:  3227644k total,        0k used,  3227644k free,   993328k cached
 If you want to disable ologgerd permanently, then execute:
 # crsctl delete resource ora.crf -init
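To see what state the Cluster Health Monitor resource is in before or after these commands:
# crsctl stat res ora.crf -init     ( shows TARGET/STATE of the CHM resource on the local node )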

 

Fixing a failed GRID Installation

Fixing a failed Grid Installation ( run these commands on all cluster nodes )
[grid@grac31 ~]$ rm -rf  /u01/app/11203/grid/*
[grid@grac31 ~]$ rm -rf /u01/app/oraInventory/*
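If root.sh already ran on a node, deleting the files alone is usually not enough - the clusterware configuration should be deconfigured first. A sketch as root, assuming the Grid home used in this guide:
# perl /u01/app/11203/grid/crs/install/rootcrs.pl -deconfig -force            ( on every node except the last )
# perl /u01/app/11203/grid/crs/install/rootcrs.pl -deconfig -force -lastnode  ( on the last node; also cleans up OCR/voting disks )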

 

Install RDBMS and  create RAC database

Log in as the oracle user and verify the account
$ id
  uid=54321(oracle) gid=54321(oinstall) groups=54321(oinstall),501(vboxsf),506(asmdba),54322(dba)
$ env | grep ORA 
  ORACLE_BASE=/u01/app/oracle
  ORACLE_SID=RACE2
  ORACLE_HOME=/u01/app/oracle/product/11203/racdb

Verify the system by running cluvfy with: stage -pre dbinst
$ ./bin/cluvfy stage -pre dbinst -n grac1,grac2
Performing pre-checks for database installation 
Checking node reachability...
Node reachability check passed from node "grac1"
Checking user equivalence...
User equivalence check passed for user "oracle"
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"
TCP connectivity check passed for subnet "192.168.1.0"
Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"
TCP connectivity check passed for subnet "192.168.2.0"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed.
Node connectivity check passed
Checking multicast communication...
Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed.
Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of multicast communication passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "grac2:/tmp"
Free disk space check passed for "grac1:/tmp"
Check for multiple users with UID value 54321 passed 
User existence check passed for "oracle"
Group existence check passed for "oinstall"
Group existence check passed for "dba"
Membership check for user "oracle" in group "oinstall" [as Primary] passed
Membership check for user "oracle" in group "dba" passed
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
...
Check for multiple users with UID value 0 passed 
Current group ID check passed
Starting check for consistency of primary group of root user
Check for consistency of root user's primary group passed
Default user file creation mask check passed
Checking CRS integrity...
Clusterware version consistency passed
CRS integrity check passed
Checking Cluster manager integrity... 
Checking CSS daemon...
Oracle Cluster Synchronization Services appear to be online.
Cluster manager integrity check passed
Checking node application existence...
Checking existence of VIP node application (required)
VIP node application check passed
Checking existence of NETWORK node application (required)
NETWORK node application check passed
Checking existence of GSD node application (optional)
GSD node application is offline on nodes "grac2,grac1"
Checking existence of ONS node application (optional)
ONS node application check passed
Checking if Clusterware is installed on all nodes...
Check of Clusterware install passed
Checking if CTSS Resource is running on all nodes...
CTSS resource check passed
Querying CTSS for time offset on all nodes...
Query of CTSS for time offset passed
Check CTSS state started...
CTSS is in Observer state. Switching over to clock synchronization checks using NTP
Starting Clock synchronization checks using Network Time Protocol(NTP)...
NTP Configuration file check started...
NTP Configuration file check passed
Checking daemon liveness...
Liveness check passed for "ntpd"
Check for NTP daemon or service alive passed on all nodes
NTP daemon slewing option check passed
NTP daemon's boot time configuration check for slewing option passed
NTP common Time Server Check started...
Check of common NTP Time Server passed
Clock time offset check from NTP Time Server started...
Clock time offset check passed
Clock synchronization check using Network Time Protocol(NTP) passed
Oracle Cluster Time Synchronization Services check passed
Checking consistency of file "/etc/resolv.conf" across nodes
File "/etc/resolv.conf" does not have both domain and search entries defined
domain entry in file "/etc/resolv.conf" is consistent across nodes
search entry in file "/etc/resolv.conf" is consistent across nodes
All nodes have one search entry defined in file "/etc/resolv.conf"
PRVF-5636 : The DNS response time for an unreachable node exceeded "15000" ms on following nodes: grac2,grac1
File "/etc/resolv.conf" is not consistent across nodes
Time zone consistency check passed
Checking Single Client Access Name (SCAN)...
Checking TCP connectivity to SCAN Listeners...
TCP connectivity to SCAN Listeners exists on all cluster nodes
Checking name resolution setup for "GRACE2-scan.grid.example.com"...
Verification of SCAN VIP and Listener setup passed
Checking VIP configuration.
Checking VIP Subnet configuration.
Check for VIP Subnet configuration passed.
Checking VIP reachability
Check for VIP reachability passed.
ASM and CRS versions are compatible
Database Clusterware version compatibility passed
Pre-check for database installation was unsuccessful on all the nodes. 

Run cluvfy with:  stage -pre dbcfg
$ ./bin/cluvfy stage -pre dbcfg -n grac1,grac2 -d $ORACLE_HOME
Performing pre-checks for database configuration 
ERROR: 
Unable to determine OSDBA group from Oracle Home "/u01/app/oracle/product/11203/racdb"
Checking node reachability...
Node reachability check passed from node "grac1"
Checking user equivalence...
User equivalence check passed for user "oracle"
Checking node connectivity...
Checking hosts config file...
Verification of the hosts config file successful
Check: Node connectivity for interface "eth0"
Node connectivity passed for interface "eth0"
ERROR: 
PRVF-7617 : Node connectivity between "grac1 : 192.168.1.61" and "grac2 : 192.168.1.108" failed
TCP connectivity check failed for subnet "192.168.1.0"
Check: Node connectivity for interface "eth1"
Node connectivity passed for interface "eth1"
TCP connectivity check passed for subnet "192.168.2.0"
Checking subnet mask consistency...
Subnet mask consistency check passed for subnet "192.168.1.0".
Subnet mask consistency check passed for subnet "192.168.2.0".
Subnet mask consistency check passed.
Node connectivity check failed
Checking multicast communication...
Checking subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.1.0" for multicast communication with multicast group "230.0.1.0" passed.
Checking subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.2.0" for multicast communication with multicast group "230.0.1.0" passed.
Check of multicast communication passed.
Total memory check passed
Available memory check passed
Swap space check passed
Free disk space check passed for "grac2:/u01/app/oracle/product/11203/racdb,grac2:/tmp"
Free disk space check passed for "grac1:/u01/app/oracle/product/11203/racdb,grac1:/tmp"
Check for multiple users with UID value 54321 passed 
User existence check passed for "oracle"
Group existence check passed for "oinstall"
Membership check for user "oracle" in group "oinstall" [as Primary] passed
Run level check passed
Hard limits check passed for "maximum open file descriptors"
Hard limits check passed for "maximum user processes"
System architecture check passed
Kernel version check passed
...
Package existence check passed for "libaio-devel(x86_64)"
Check for multiple users with UID value 0 passed 
Current group ID check passed
Starting check for consistency of primary group of root user
Check for consistency of root user's primary group passed
Checking CRS integrity...
Clusterware version consistency passed
CRS integrity check passed
Checking node application existence...
Checking existence of VIP node application (required)
VIP node application check passed
Checking existence of NETWORK node application (required)
NETWORK node application check passed
Checking existence of GSD node application (optional)
GSD node application is offline on nodes "grac2,grac1"
Checking existence of ONS node application (optional)
ONS node application check passed
Time zone consistency check passed
Pre-check for database configuration was unsuccessful on all the nodes. 

Ignore ERROR: 
   Unable to determine OSDBA group from Oracle Home "/u01/app/oracle/product/11203/racdb"
   -> The Oracle software isn't installed yet, so cluvfy can't find $ORACLE_HOME/bin/osdbagrp
    stat("/u01/app/oracle/product/11203/racdb/bin/osdbagrp", 0x7fff2fd6e530) = -1 ENOENT (No such file or directory) 
   Run cluvfy stage -pre dbcfg only after you have installed the software and before you have created the database.
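After the RDBMS software is installed you can double-check which OSDBA group was compiled into the binaries ( a hedged example - the output depends on your group setup ):
$ $ORACLE_HOME/bin/osdbagrp     ( prints the OSDBA group baked into the oracle executable )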

Run Installer
As user Root 
  # xhost +
    access control disabled, clients can connect from any host
As user Oracle
  $ xclock      ( Testing X connection )
  $ cd /KITS/Oracle/11.2.0.3/Linux_64/database  ( rdbms staging area ) 
  $ ./runInstaller ( select SERVER class )
     Node Name           : grac1,grac2  
     Storage type        : ASM
     Location            : DATA
     OSDBA group         : asmdba
     Global database name: GRACE2
On grac1 run:  /u01/app/oracle/product/11203/racdb/root.sh
On grac2 run:  /u01/app/oracle/product/11203/racdb/root.sh
Enterprise Manager Database Control URL - (RACE2) :   https://hrac1.de.oracle.com:1158/em

Verify RAC Install
$ my_crs_stat
NAME                           TARGET     STATE           SERVER       STATE_DETAILS   
-------------------------      ---------- ----------      ------------ ------------------
ora.DATA.dg                    ONLINE     ONLINE          grac1         
ora.DATA.dg                    ONLINE     ONLINE          grac2         
ora.LISTENER.lsnr              ONLINE     ONLINE          grac1         
ora.LISTENER.lsnr              ONLINE     ONLINE          grac2         
ora.asm                        ONLINE     ONLINE          grac1        Started 
ora.asm                        ONLINE     ONLINE          grac2        Started 
ora.gsd                        OFFLINE    OFFLINE         grac1         
ora.gsd                        OFFLINE    OFFLINE         grac2         
ora.net1.network               ONLINE     ONLINE          grac1         
ora.net1.network               ONLINE     ONLINE          grac2         
ora.ons                        ONLINE     ONLINE          grac1         
ora.ons                        ONLINE     ONLINE          grac2         
ora.LISTENER_SCAN1.lsnr        ONLINE     ONLINE          grac2         
ora.LISTENER_SCAN2.lsnr        ONLINE     ONLINE          grac1         
ora.LISTENER_SCAN3.lsnr        ONLINE     ONLINE          grac1         
ora.cvu                        ONLINE     ONLINE          grac1         
ora.gns                        ONLINE     ONLINE          grac1         
ora.gns.vip                    ONLINE     ONLINE          grac1         
ora.grac1.vip                  ONLINE     ONLINE          grac1         
ora.grac2.vip                  ONLINE     ONLINE          grac2         
ora.grace2.db                  ONLINE     ONLINE          grac1        Open 
ora.grace2.db                  ONLINE     ONLINE          grac2        Open 
ora.oc4j                       ONLINE     ONLINE          grac1         
ora.scan1.vip                  ONLINE     ONLINE          grac2         
ora.scan2.vip                  ONLINE     ONLINE          grac1         
ora.scan3.vip                  ONLINE     ONLINE          grac1     

$ srvctl  status database -d GRACE2
Instance GRACE21 is running on node grac1
Instance GRACE22 is running on node grac2

$GRID_HOME/bin/olsnodes -n
grac1    1
grac2    2
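A few more standard srvctl checks of the new database configuration ( assuming the database name GRACE2 as above ):
$ srvctl config database -d GRACE2   ( shows ORACLE_HOME, spfile, disk group and instance list )
$ srvctl status asm                  ( ASM should be running on grac1 and grac2 )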

 

Reference

Restart Scan Listener

Restart a specific SCAN listener

Check current scan listener status:
$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node grac2
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is not running
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node grac2

Restart SCAN listener LISTENER_SCAN2: 
$ srvctl start scan_listener -i 2

Verify new SCAN listener status:
$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node grac2
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node grac1
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node grac2
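Instead of stopping and starting, a SCAN listener can also be relocated in one step ( standard 11.2 syntax, shown here as a sketch ):
$ srvctl relocate scan_listener -i 2 -n grac2   ( example: move LISTENER_SCAN2 over to grac2 )
$ srvctl status scan                            ( shows on which node each SCAN VIP is running )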

Change VIP status from INTERMEDIATE state back to ONLINE state

 

Check current VIP status:
$  crsctl status resource ora.grac1.vip
NAME=ora.grac1.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=INTERMEDIATE on grac2

Stop the VIP resource:
$ crsctl stop resource ora.grac1.vip
CRS-2673: Attempting to stop 'ora.grac1.vip' on 'grac2'
CRS-2677: Stop of 'ora.grac1.vip' on 'grac2' succeeded

Start the VIP resource:
$ crsctl start resource ora.grac1.vip
CRS-2672: Attempting to start 'ora.grac1.vip' on 'grac1'
CRS-2676: Start of 'ora.grac1.vip' on 'grac1' succeeded

Verify VIP resource:
$  crsctl status resource ora.grac1.vip
NAME=ora.grac1.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on grac1
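The same can be done with srvctl, which works on the node VIP without needing the full resource name:
$ srvctl status vip -n grac1    ( reports the VIP state for node grac1 )
$ srvctl stop vip -n grac1 -f   ( -f is needed if a dependent listener is still running )
$ srvctl start vip -n grac1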