Simulate a listener HANG scenario
To simulate a listener hang, attach a debugger to the local tnslsnr process ( tnslsnr LISTENER ). While gdb is attached, the process is stopped and cannot service requests.

[root@gract3 Desktop]# ps -elf | grep tnslsnr
0 S grid 4463 1 0 80 0 - 45777 ep_pol Aug05 ? 00:00:01 /u01/app/121/grid/bin/tnslsnr ASMNET1LSNR_ASM -no_crs_notify -inherit
0 S grid 4518 1 0 80 0 - 45906 ep_pol Aug05 ? 00:00:01 /u01/app/121/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit
0 S grid 4525 1 0 80 0 - 45847 ep_pol Aug05 ? 00:00:01 /u01/app/121/grid/bin/tnslsnr LISTENER_SCAN1 -no_crs_notify -inherit

[root@gract3 Desktop]# gdb -p 4518
(gdb) where
#0 0x0000003ae10e8f43 in epoll_wait () from /lib64/libc.so.6
#1 0x00007f809eadad91 in sntevepoll () from /u01/app/121/grid/lib/libclntsh.so.12.1
#2 0x00007f809eada308 in nteveque () from /u01/app/121/grid/lib/libclntsh.so.12.1
#3 0x00007f809ead6a9a in ntevque () from /u01/app/121/grid/lib/libclntsh.so.12.1
#4 0x00007f809ea78650 in nsevwait () from /u01/app/121/grid/lib/libclntsh.so.12.1
#5 0x00000000004066dc in nsglma ()
#6 0x0000000000405939 in main ()

--> Check listener and resource status

[oracle@gract3 ~]$ lsnrctl status
LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 06-AUG-2014 08:34:58
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
--> lsnrctl hangs

[root@gract3 Desktop]# crs | egrep 'ora.LISTENER.lsnr|STATE' | grep gract3
ora.LISTENER.lsnr ONLINE INTERMEDIATE gract3 CHECK TIMED OUT,STABLE
--> The LISTENER state changed from ONLINE to INTERMEDIATE with state details CHECK TIMED OUT,STABLE
    This is the expected behaviour, as clusterware uses lsnrctl status to verify the listener resource status
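Because lsnrctl status blocks when the listener is stopped, a scripted check needs a time limit of its own. The following is a minimal sketch, assuming GNU coreutils timeout is available; the function name probe_listener and the 10 second limit in the usage comment are illustrative assumptions, not part of any Oracle tooling.

```shell
#!/bin/bash
# probe_listener LIMIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND under a time limit and prints
#   OK      if the command exits with status 0
#   HUNG    if the command had to be terminated by timeout (status 124)
#   FAILED  for any other non-zero exit status
probe_listener() {
    local limit rc
    limit="$1"
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG"
    elif [ "$rc" -eq 0 ]; then
        echo "OK"
    else
        echo "FAILED"
    fi
}

# Usage (assumed limit of 10 seconds):
#   probe_listener 10 lsnrctl status LISTENER
```

With the debugger attached as described above, such a probe would be expected to report HUNG instead of blocking the calling script.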
Use strace to get details about the listener status
[grid@gract3 ~]$ strace -f -o LISTENER.trc lsnrctl status
LSNRCTL for Linux: Version 12.1.0.1.0 - Production on 06-AUG-2014 08:38:17
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))

Strace Output :
18597 socket(PF_FILE, SOCK_STREAM, 0) = 7
18597 access("/var/tmp/.oracle/sLISTENER", F_OK) = 0
18597 connect(7, {sa_family=AF_FILE, path="/var/tmp/.oracle/sLISTENER"}, 110) = 0
18597 fcntl(7, F_SETFD, FD_CLOEXEC) = 0
18597 brk(0x1aad000) = 0x1aad000
18597 rt_sigaction(SIGPIPE, {SIG_IGN, ~[ILL ABRT BUS FPE SEGV USR2 TERM XCPU XFSZ SYS RTMIN RT_1], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x3ae180f500}, {SIG_DFL, [], 0}, 8) = 0
18597 write(7, "\0\332\0\0\1\0\0\0\1;\1,\0\201 \0\177\377s\10\0\0\1\0\0\224\0F\0\0\7\370"..., 218) = 218
18597 read(7, 0x1a87896, 8208) = ? ERESTARTSYS (To be restarted)

--> The IPC socket sLISTENER is used for socket communication on the local node
    The server process tnslsnr can't read from the IPC socket sLISTENER because it is stopped by gdb
    ( Normal processing: tnslsnr reads the message and sends a reply to the lsnrctl process )
    The client process lsnrctl writes its request, then reads from the socket, never gets a reply and blocks
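The two facts of interest in such a trace are which socket file the client connected to and which call it was in when the trace ended. A minimal sketch for pulling both out of a trace file follows; the function names are hypothetical, and the pattern assumes the connect() line format shown above (it also accepts the sun_path= spelling printed by newer strace versions).

```shell
#!/bin/bash
# unix_sockets_in_trace FILE
# Hypothetical helper: prints each distinct socket path that appears in a
# connect() call in the strace output file.
unix_sockets_in_trace() {
    sed -n 's/.*connect(.*path="\([^"]*\)".*/\1/p' "$1" | sort -u
}

# last_syscall FILE
# Hypothetical helper: prints the final line of the trace. For a client that
# hung, this is typically the call it was blocked in. With strace -f and
# several traced processes, the last line may belong to a different PID.
last_syscall() {
    tail -n 1 "$1"
}

# Usage, with the trace file name from the strace command above:
#   unix_sockets_in_trace LISTENER.trc
#   last_syscall LISTENER.trc
```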
Fix the problem
Find the processes that use that socket file and are not sending a reply

[root@gract3 Desktop]# lsof | grep sLISTENER
tnslsnr 4518 grid 9u unix 0xffff8800085ef200 0t0 15687340 /var/tmp/.oracle/sLISTENER
tnslsnr 4518 grid 14u unix 0xffff880003a56780 0t0 15690441 /var/tmp/.oracle/sLISTENER
tnslsnr 4525 grid 9u unix 0xffff880028e11c80 0t0 15687917 /var/tmp/.oracle/sLISTENER_SCAN1
tnslsnr 4525 grid 14u unix 0xffff880037ec3540 0t0 15690968 /var/tmp/.oracle/sLISTENER_SCAN1

[root@gract3 Desktop]# ps -elf | grep 4518
0 t grid 4518 1 0 80 0 - 45906 ptrace Aug05 ? 00:00:01 /u01/app/121/grid/bin/tnslsnr LISTENER -no_crs_notify -inherit
--> As expected: our listener process ( PID 4518 ) is using that IPC socket file.
    Process state t and wait channel ptrace show that it is stopped by a debugger.

Potential problems:
- A debugger is attached to the tnslsnr process
- The tnslsnr process is not functioning any more ( blocked by resources, wild running/looping program )

Solution: kill that process and check the listener again

[root@gract3 Desktop]# kill -9 4518

Check listener status

[grid@gract3 ~]$ lsnrctl status
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 12.1.0.1.0 - Production
Start Date                06-AUG-2014 08:54:36
Uptime                    0 days 0 hr. 0 min. 35 sec

Check resource status

Resource NAME             TARGET     STATE      SERVER       STATE_DETAILS
------------------------- ---------- ---------- ------------ ------------------
ora.LISTENER.lsnr         ONLINE     ONLINE     gract3       STABLE
--> Note: The local listener is automatically restarted by clusterware
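Note that grep sLISTENER also matches sLISTENER_SCAN1, so the lsof listing has to be read carefully. The sketch below filters lsof output on the exact socket path and translates the ps state letter into a short description. Both function names are hypothetical; the state letters t ( stopped by a debugger during tracing ) and T ( stopped by a job control signal ) are the ones documented in the Linux ps man page.

```shell
#!/bin/bash
# socket_holders SOCKET_PATH
# Hypothetical helper: reads lsof output on stdin and prints the unique PIDs
# whose last column is exactly SOCKET_PATH (no substring matches).
socket_holders() {
    awk -v path="$1" '$NF == path { print $2 }' | sort -u
}

# describe_state STATE
# Hypothetical helper: maps a ps state string to a short description.
describe_state() {
    case "$1" in
        t*) echo "traced (debugger attached)" ;;
        T*) echo "stopped by signal" ;;
        *)  echo "not stopped" ;;
    esac
}

# Usage (run as root so that lsof can see all processes):
#   lsof 2>/dev/null | socket_holders /var/tmp/.oracle/sLISTENER |
#   while read -r pid; do
#       echo "$pid: $(describe_state "$(ps -o stat= -p "$pid")")"
#   done
```

Checking the state before running kill -9 helps distinguish a listener that is merely held by a debugger from one that is looping or blocked for another reason.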