Case III: GPnP fails due to unreadable CW “special” socket files located in var/tmp/.oracle/
Typically this directory contains a number of "special" socket files that are used by local clients to
connect via the IPC protocol (sqlnet) to various Oracle processes including the TNS listener, the CSS,
CRS & EVM daemons or even database or ASM instances
Force that error and monitor Clusterware Resource status after startup:
[root@hract21 DTRACE]# chmod 000 /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock
[root@hract21 DTRACE]# ls -l /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock
---------- 1 grid oinstall 0 Feb 15 08:18 /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock
***** Local Resources: *****
Resource NAME INST TARGET STATE SERVER STATE_DETAILS
--------------------------- ---- ------------ ------------ --------------- -----------------------------------------
ora.asm 1 ONLINE OFFLINE - STABLE
ora.cluster_interconnect.haip 1 ONLINE OFFLINE - STABLE
ora.crf 1 ONLINE OFFLINE - STABLE
ora.crsd 1 ONLINE OFFLINE - STABLE
ora.cssd 1 ONLINE OFFLINE - STABLE
ora.cssdmonitor 1 OFFLINE OFFLINE - STABLE
ora.ctssd 1 ONLINE OFFLINE - STABLE
ora.diskmon 1 OFFLINE OFFLINE - STABLE
ora.drivers.acfs 1 ONLINE ONLINE hract21 STABLE
ora.evmd 1 ONLINE INTERMEDIATE hract21 STABLE
ora.gipcd 1 ONLINE OFFLINE - STABLE
ora.gpnpd 1 ONLINE OFFLINE hract21 STARTING
ora.mdnsd 1 ONLINE ONLINE hract21 STABLE
ora.storage 1 ONLINE OFFLINE - STABLE
--> GPnPD doesn't start !
CLUVFY:
Found no cluvfy command to detect this error
TRACFILE review :
alert.log :
2015-02-16 12:37:45.345 [GPNPD(21849)]CRS-8500: Oracle Clusterware GPNPD process is starting with operating system process ID 21849
2015-02-16 12:37:45.388 [GPNPD(21849)]CRS-8501: Oracle Clusterware GPNPD process with operating system process ID 21849 is ending with return value 48
2015-02-16 12:47:45.281 [ORAAGENT(21818)]CRS-5818: Aborted command 'start' for resource 'ora.gpnpd'. Details at (:CRSAGF00113:) {0:0:2}
in /u01/app/grid/diag/crs/hract21/crs/trace/ohasd_oraagent_grid.trc.
ohasd_oraagent_grid.trc:
2015-02-16 12:47:45.837353 : AGFW:308582144: {0:0:2} Agent sending reply for: RESOURCE_START[ora.gpnpd 1 1] ID 4098:270
2015-02-16 12:47:45.838926 :CLSDYNAM:310683392: [ora.gpnpd]{0:0:2} [start] (:CLSN00107:) clsn_agent::start }
2015-02-16 12:47:49.288144 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [start] got lock
2015-02-16 12:47:49.288165 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [start] tryActionLock }
2015-02-16 12:47:49.288187 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [start] abort }
2015-02-16 12:47:49.288205 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [start] (:CLSN00110:) clsn_agent::abort }
2015-02-16 12:47:49.288230 : AGFW:304379648: {0:0:2} Command: start for resource: ora.gpnpd 1 1 completed with status: TIMEDOUT
2015-02-16 12:47:49.288313 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] Utils::getOracleHomeAttrib getEnvVar oracle_home:/u01/app/121/grid
2015-02-16 12:47:49.288361 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] Utils::getOracleHomeAttrib oracle_home:/u01/app/121/grid
2015-02-16 12:47:49.288672 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] PID 20189 from /u01/app/121/grid/gpnp/init/hract21.pid
2015-02-16 12:47:49.288708 : CLSDMC:304379648: Connecting to ipc://hract21_DBG_GPNPD
2015-02-16 12:47:49.289427 : CLSDMC:304379648: Error: gipcWait for gipcConnect - ret_gipcreqinfo=gipcretConnectionRefused, type_gipcreqinfo=gipcreqtypeConnect
2015-02-16 12:47:49.290221 : AGFW:308582144: {0:0:2} Agent sending reply for: RESOURCE_START[ora.gpnpd 1 1] ID 4098:270
2015-02-16 12:47:49.292126 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:
2015-02-16 12:47:49.292208 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] Calling PID check for daemon
2015-02-16 12:47:49.292269 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] Process id 20189 translated to
2015-02-16 12:47:49.292384 : CLSDMC:304379648: Connecting to ipc://hract21_DBG_GPNPD
2015-02-16 12:47:49.292604 : CLSDMC:304379648: Error: gipcWait for gipcConnect - ret_gipcreqinfo=gipcretConnectionRefused, type_gipcreqinfo=gipcreqtypeConnect
2015-02-16 12:47:49.292658 :CLSDYNAM:304379648: [ora.gpnpd]{0:0:2} [check] ClsdmClient::sendMessage clsdmc_send error rmsg:0 ecode:-7 errbuf:
2015-02-16 12:47:49.293759 : AGFW:308582144: {0:0:2} ora.gpnpd 1 1 state changed from: STARTING to: OFFLINE
2015-02-16 12:47:49.293871 : AGFW:308582144: {0:0:2} Agent sending last reply for: RESOURCE_START[ora.gpnpd 1 1] ID 40
--> Got no further indications that file permissons on file /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock is the root cause !
DTRACE SCRIPT :
syscall::open:entry
{
self->path = copyinstr(arg0);
}
syscall::open:return
/arg0<0 && substr( self->path,0,var_tmp_len) == var_tmp_loc && substr(execname,0,4) != "asm_" /
{
printf("- Exec: %s - open() %s failed with error: %d - scan_dir: %s ", execname, self->path, arg0, substr( self->path,0,var_tmp_len) );
}
DTRACE OUTPUT :
Again DTrace helps us to find that problem very quickly :
[root@hract21 DTRACE]# !dt
dtrace -s check_rac.d
dtrace: script 'check_rac.d' matched 21 probes
CPU ID FUNCTION:NAME
0 1 :BEGIN GRIDHOME: /u01/app/121/grid - GRIDHOME/bin: /u01/app/121/grid/bin - Temp Loc: /var/tmp/.oracle - PIDFILE: hract21.pid - Port for bind: 53
0 9 open:return - Exec: ohasd.bin - open() /var/tmp/.oracle/npohasd failed with error: -6 - scan_dir: /var/tmp/.oracle
0 9 open:return - Exec: ohasd.bin - open() /var/tmp/.oracle/npohasd failed with error: -6 - scan_dir: /var/tmp/.oracle
0 89 connect:return - Exec: mdnsd.bin - PID: 20575 connect() failed with error : -101 - fd : 39 - IP: 17.17.17.17 - Port: 256
0 9 open:return - Exec: gpnpd.bin - open() /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock failed with error: -13 - scan_dir: /var/tmp/.oracle
FIX :
Change permission and reboot Clusterware :
[root@hract21 DTRACE]# chmod 644 /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock
[root@hract21 DTRACE]# ls -l /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock
-rw-r--r-- 1 grid oinstall 0 Feb 15 08:18 /var/tmp/.oracle/ora_gipc_GPNPD_hract21_lock
Pages: Page 1, Page 2, Page 3, Page 4, Page 5, Page 6, Page 7, Page 8, Page 9, Page 10, Page 11, Page 12, Page 13, Page 14, Page 15, Page 16
Many thx
This is very helpful
I really like looking through an article that can make people think.
Also, thank you for permitting me to comment!