Overview
- Always check /var/log/messages for generic mount problems
- Mount your DBFS filesystem with dbfs_client to rule out problems with the mount-dbfs.sh script
- Before deploying the mount-dbfs.sh script, test it on all nodes using the following sequence
$ mount-dbfs.sh status
$ mount-dbfs.sh start
$ mount-dbfs.sh status ( if the status is OFFLINE, repeat this command, as the DBFS start may take some time )
$ mount-dbfs.sh stop
$ mount-dbfs.sh status
- Version Overview
Linux grac41.example.com 3.8.13-35.1.2.el6uek.x86_64
CRS 11.2.0.4.3
Fuse RPMs used
[oracle@grac41 DBFS]$ rpm -qa | grep fuse
fuse-2.8.3-4.0.2.el6.x86_64
fuse-libs-2.8.3-4.0.2.el6.x86_64
gvfs-fuse-1.4.3-16.el6_5.x86_64
Debugging generic DBFS filesystem mount problems
To rule out generic errors, first try to mount your DBFS using dbfs_client:
[oracle@grac41 DBFS]$ echo dbfs_user > pw
[oracle@grac41 DBFS]$ dbfs_client dbfs_user@grac41 -otrace_file=/tmp/dbfs.out -otrace_level=1 -otrace_size=0 /u01/oradata/dbfs_direct <pw &
[1] 17049
If the above mount doesn't work, check /var/log/messages.
If the mount works, you can test mount-dbfs.sh start.

Some typical problems reported in /var/log/messages:

Error 1:
Jul 25 10:15:50 grac42 DBFS_/u01/oradata/dbfs_direct: mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
Jul 25 10:15:51 grac42 DBFS_/u01/oradata/dbfs_direct: ORACLE_SID is grac42
Jul 25 10:15:51 grac42 DBFS_/u01/oradata/dbfs_direct: spawning dbfs_client command using SID grac42
Jul 25 10:15:51 grac42 kernel: fuse init (API version 7.20)
Jul 25 10:15:51 grac42 DBFS_/u01/oradata/dbfs_direct: fuse: failed to exec fusermount: Permission denied
Solution: Make /bin/fusermount executable
# chmod +x /bin/fusermount
For details read: DBFS resource not starting as crs resource (Doc ID 1908868.1)

Error 2:
Jul 25 09:26:02 grac43 DBFS_/u01/oradata/dbfs_direct: spawning dbfs_client command using SID grac43
Jul 25 09:26:02 grac43 DBFS_/u01/oradata/dbfs_direct: Fail to load library libfuse.so.
Jul 25 09:26:02 grac43 DBFS_/u01/oradata/dbfs_direct: A dynamic linking error occurred: (libfuse.so: cannot open shared object file: No such file or directory)
Jul 25 09:26:09 grac43 DBFS_/u01/oradata/dbfs_direct: Start -- OFFLINE
Fix: Check your current shared library configuration for the FUSE libs
[root@grac42 lib]# ldconfig -p | grep fuse
libfuse.so.2 (libc6,x86-64) => /lib64/libfuse.so.2
--> Here we are missing libfuse.so
# cd /usr/local/lib
# locate libfuse.so
/lib64/libfuse.so.2
/lib64/libfuse.so.2.8.3
/usr/local/lib/libfuse.so
# ln -s /lib64/libfuse.so.2 libfuse.so
# ldconfig
# ldconfig -p | grep fuse
libfuse.so.2 (libc6,x86-64) => /lib64/libfuse.so.2
libfuse.so (libc6,x86-64) => /usr/local/lib/libfuse.so
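The libfuse.so check from Error 2 can be scripted so it is easy to repeat on every node. The following is only a sketch: the function name has_unversioned_libfuse is hypothetical and not part of mount-dbfs.sh. It reads `ldconfig -p` output on stdin and succeeds only if an unversioned libfuse.so entry is listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of mount-dbfs.sh): reads `ldconfig -p`
# output on stdin and returns 0 only if an unversioned "libfuse.so"
# entry is present in the linker cache listing.
has_unversioned_libfuse() {
    # Cache lines look like:
    #   libfuse.so.2 (libc6,x86-64) => /lib64/libfuse.so.2
    # Requiring whitespace right after "libfuse.so" excludes libfuse.so.2.
    grep -Eq '^[[:space:]]*libfuse\.so[[:space:]]'
}

# Usage on a live node:
#   ldconfig -p | has_unversioned_libfuse || echo "libfuse.so is missing"
```

Reading from stdin keeps the check independent of the node it runs on, so the same function can be fed captured output from another host.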
Debugging and fixing mount-dbfs.sh script before deploying script as a CW resource
Before deploying mount-dbfs.sh as a CW resource, test on every node that the sequence
mount-dbfs.sh status
mount-dbfs.sh start
mount-dbfs.sh status
mount-dbfs.sh stop
mount-dbfs.sh status
works and returns the expected results for mount-dbfs.sh status. Let's start with the initial mount test.

Error 1: Problem with password file - OS mount fails ( critical )
[oracle@grac41 DBFS]$ mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
./mount-dbfs.sh: line 198: /tmp/.dbfs-passwd.txt.31706: No such file or directory
Start -- OFFLINE
--> DBFS FS not mounted
Checking the code around line 198:
(nohup $DBFS_CLIENT ${DBFS_USER}@ -o $MOUNT_OPTIONS \
      $MOUNT_POINT < $DBFS_PWDFILE | $LOGGER -p ${LOGGER_FACILITY}.info 2>&1 & ) &
$RMF $DBFS_PWDFILE
--> It seems the password file was deleted before the background command could open it. Note that nohup .. & runs the process in the background.
Proposed fix: add a short sleep between starting the client and deleting the password file
(nohup $DBFS_CLIENT ${DBFS_USER}@ -o $MOUNT_OPTIONS \
      $MOUNT_POINT < $DBFS_PWDFILE | $LOGGER -p ${LOGGER_FACILITY}.info 2>&1 & ) &
sleep 2       <-- proposed code change
$RMF $DBFS_PWDFILE
This code change fixes the error: /tmp/.dbfs-passwd.txt.31706: No such file or directory

Retry the mount
[oracle@grac41 DBFS]$ mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
nohup: redirecting stderr to stdout
Start -- OFFLINE
[oracle@grac41 DBFS]$ mount
dbfs-dbfs_user@grac41:/ on /u01/oradata/dbfs_direct type fuse (rw,nosuid,nodev,max_read=1048576,default_permissions,user=oracle)
--> Now DBFS is mounted, but the returned status is wrong

Error 2: Debugging wrong status - DBFS is OFFLINE but OS mount status is ok ( critical )
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- OFFLINE
Note: the status remains OFFLINE even though the OS mount was successful. Even rerunning the script doesn't help.
--> Checking the mount-dbfs.sh script
'check'|'status')
  ### check to see if it is mounted
  ### fire off a short process in perl to do the check (need the alarm builtin)
  logit debug "Checking status now"
  $PERL <<'TOT'
  $timeout = $ENV{'PERL_ALARM_TIMEOUT'};
  $SIG{ALRM} = sub {
    ### we have a problem and need to cleanup
    exit 3;
    die "timeout" ;
  };
  alarm $timeout;
  eval {
    $STATUSOUT=`$ENV{'STAT'} -f -c "%T" $ENV{'MOUNT_POINT'} 2>&1 `;
    chomp($STATUSOUT);
    if ( ( $ENV{'SOLARIS'} == 1 && $STATUSOUT eq 'uvfs' ) ||
         ( $ENV{'LINUX'} == 1 && $STATUSOUT eq 'UNKNOWN (0x65735546)' ) ) {
      ### status is okay
      exit 0;
Using strace to find the command the script uses to detect the filesystem status
[oracle@grac41 DBFS]$ strace -f -o mount-dbfs.trc mount-dbfs.sh status
26156 execve("/usr/bin/stat", ["/usr/bin/stat", "-f", "-c", "%T", "/u01/oradata/dbfs_direct"], [/* 35 vars */]) = 0
--> To check the DBFS status, the perl script runs the following command
[oracle@grac41 DBFS]$ /usr/bin/stat -f -c %T /u01/oradata/dbfs_direct
fuseblk
--> For a mounted DBFS filesystem, stat returns fuseblk
Change the line
         ( $ENV{'LINUX'} == 1 && $STATUSOUT eq 'UNKNOWN (0x65735546)' ) ) {
to
         ( $ENV{'LINUX'} == 1 && $STATUSOUT eq 'fuseblk' ) ) {
Now we get the correct status
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- ONLINE

Error 3: mount-dbfs.sh start still reports status OFFLINE after start ( not critical )
[oracle@grac41 DBFS]$ mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
nohup: redirecting stderr to stdout
Start -- OFFLINE
After waiting a few seconds, the status report looks good
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- ONLINE
Potential fix: increase the sleep time before checking the mount status
### allow time for the mount table update before checking it
$SLEEP 1
### set return code based on success of mounting
$SCRIPTPATH status > /dev/null 2>&1
if [ $? -eq 0 ]; then
  logit info "Start -- ONLINE"
  exit 0
else
  logit info "Start -- OFFLINE"
  exit 1
Change line 210 from $SLEEP 1 to $SLEEP 5
Note: This error is not critical, as CW will test the resource status again and again.
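The status problem in Error 2 comes down to which string stat -f -c %T prints for a FUSE mount. As a sketch only, a check could accept both strings seen in this article instead of hard-coding one. The function name is_fuse_fstype is hypothetical and not part of mount-dbfs.sh, and treating both strings as equivalent is an assumption based on the two values shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of mount-dbfs.sh): succeeds if the given
# `stat -f -c %T` output identifies a FUSE mount. Both strings come from
# this article: the value the stock script compares against and the value
# stat actually returned on this system.
is_fuse_fstype() {
    case "$1" in
        'fuseblk'|'UNKNOWN (0x65735546)') return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live node:
#   is_fuse_fstype "$(stat -f -c %T /u01/oradata/dbfs_direct 2>&1)" && echo ONLINE
```

Accepting both values would let the same script work on nodes where stat reports the filesystem type differently, at the cost of a slightly looser check.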
Testing CW resource script : mount-dbfs.sh
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- OFFLINE
[oracle@grac41 DBFS]$ mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
nohup: redirecting stderr to stdout
Start -- ONLINE
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- ONLINE
--> If the status is OFFLINE, repeat mount-dbfs.sh status for at least 1 minute.
[oracle@grac41 DBFS]$ mount-dbfs.sh stop
unmounting DBFS from /u01/oradata/dbfs_direct
umounting the filesystem using '/bin/fusermount -u /u01/oradata/dbfs_direct'
Stop - stopped, now not mounted
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- OFFLINE
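Repeating the status check by hand for a minute can be replaced by a small polling loop. This is a minimal sketch, assuming mount-dbfs.sh status exits with 0 when the filesystem is ONLINE (the start branch of the script relies on the same exit code). The function name wait_for_online and the retry values are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to <tries> times, sleeping
# <delay> seconds between attempts, until it exits with status 0.
# Usage: wait_for_online <tries> <delay> <command> [args...]
wait_for_online() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0            # command reported success
        fi
        i=$((i + 1))
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"      # no sleep after the final attempt
        fi
    done
    return 1                    # never succeeded within the given tries
}

# Usage on a live node (12 tries, 5 seconds apart, about 1 minute):
#   wait_for_online 12 5 mount-dbfs.sh status && echo ONLINE || echo OFFLINE
```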
Manually mount DBFS using dbfs_client
[oracle@grac41 DBFS]$ echo dbfs_user > pw
[oracle@grac41 DBFS]$ dbfs_client dbfs_user@grac41 -otrace_file=/tmp/dbfs.out -otrace_level=1 -otrace_size=0 /u01/oradata/dbfs_direct <pw &
[1] 17049
[grid@grac41 ~]$ mount
dbfs-dbfs_user@grac41:/ on /u01/oradata/dbfs_direct type fuse (rw,nosuid,nodev,max_read=1048576,default_permissions,user=oracle)
Test file access
[oracle@grac41 DBFS]$ touch /u01/oradata/dbfs_direct/FS1/t
[oracle@grac41 DBFS]$ ls /u01/oradata/dbfs_direct/FS1/t
/u01/oradata/dbfs_direct/FS1/t
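The visual check of the mount output above can also be scripted. The following sketch assumes the Linux mount output format shown in this article ("<device> on <mount point> type <fstype> (<options>)"). The function name is_fuse_mounted is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `mount` output on stdin and succeeds only if
# the mount point given as $1 is listed with filesystem type "fuse".
is_fuse_mounted() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $4 == "type" && $5 == "fuse" { found = 1 }
        END { exit !found }
    '
}

# Usage on a live node:
#   mount | is_fuse_mounted /u01/oradata/dbfs_direct && echo "DBFS is mounted"
```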
Reference
- DBFS resource not starting as crs resource (Doc ID 1908868.1)