Debugging problems when mounting a DBFS filesystem

Overview

  • Always check /var/log/messages for generic mount problems
  • Mount your DBFS filesystem with dbfs_client to rule out problems with the mount-dbfs.sh script
  • Before deploying the mount-dbfs.sh script, test it on all nodes using the following sequence:

$  mount-dbfs.sh status
$  mount-dbfs.sh start
$  mount-dbfs.sh status   ( if the status is OFFLINE, repeat this command, as the DBFS start may take some time )
$  mount-dbfs.sh stop
$  mount-dbfs.sh status
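
The sequence above can be scripted so that it is easy to repeat on every node. The following is a minimal sketch, not part of mount-dbfs.sh: run_sequence is a hypothetical helper name, and it assumes the script returns exit code 0 for a successful action or an ONLINE status and a non-zero code otherwise. Because the DBFS start may take some time, the status step right after start can still report a non-zero code.

```shell
#!/bin/bash
# Hypothetical helper: run the status/start/status/stop/status sequence
# against a resource script and print the exit code of every step.
run_sequence() {
    local step rc
    for step in status start status stop status; do
        rc=0
        # "$@" is the command under test, e.g. ./mount-dbfs.sh
        "$@" "$step" > /dev/null 2>&1 || rc=$?
        echo "$step rc=$rc"
    done
}

# Example call (commented out so the snippet can be sourced safely):
# run_sequence ./mount-dbfs.sh
```

On a healthy node one would expect the first and last status steps to report a non-zero code (not mounted) and the middle one to report 0.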

  •   Version Overview

Linux grac41.example.com 3.8.13-35.1.2.el6uek.x86_64
CRS 11.2.0.4.3
Fuse RPMs used
[oracle@grac41 DBFS]$ rpm -qa | grep fuse
fuse-2.8.3-4.0.2.el6.x86_64
fuse-libs-2.8.3-4.0.2.el6.x86_64
gvfs-fuse-1.4.3-16.el6_5.x86_64

Debugging generic DBFS filesystem mount problems

To rule out any generic errors, first try to mount your DBFS using dbfs_client:
[oracle@grac41 DBFS]$ echo dbfs_user > pw
[oracle@grac41 DBFS]$ dbfs_client dbfs_user@grac41 -otrace_file=/tmp/dbfs.out -otrace_level=1 -otrace_size=0 /u01/oradata/dbfs_direct <pw &
[1] 17049 

If the above mount doesn't work, check /var/log/messages.
If the mount works, you can test mount-dbfs.sh start.

Some typical problems reported in /var/log/messages:
Error 1:
Jul 25 10:15:50 grac42 DBFS_/u01/oradata/dbfs_direct: mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
Jul 25 10:15:51 grac42 DBFS_/u01/oradata/dbfs_direct: ORACLE_SID is grac42
Jul 25 10:15:51 grac42 DBFS_/u01/oradata/dbfs_direct: spawning dbfs_client command using SID grac42
Jul 25 10:15:51 grac42 kernel: fuse init (API version 7.20)
Jul 25 10:15:51 grac42 DBFS_/u01/oradata/dbfs_direct: fuse: failed to exec fusermount: Permission denied

Solution: Set proper permissions on /bin/fusermount
# chmod +x /bin/fusermount   
For details read :  DBFS resource not starting as crs resource (Doc ID 1908868.1)
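
Before or after applying the chmod, the permission can be verified from the account that mounts DBFS. This is a small sketch; check_executable is a hypothetical helper name, and the test only reflects the permissions of the user who runs it (root passes as soon as any execute bit is set), so it should be run as the DBFS OS user.

```shell
#!/bin/bash
# Hypothetical helper: report whether a binary is executable for the
# current user. Run it as the user that mounts DBFS (oracle in this post).
check_executable() {
    local path=$1
    if [ -x "$path" ]; then
        echo "OK: $path is executable"
        return 0
    fi
    echo "PROBLEM: $path is missing or not executable"
    return 1
}

# Example call (commented out so the snippet can be sourced safely):
# check_executable /bin/fusermount
```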

Error 2:
Jul 25 09:26:02 grac43 DBFS_/u01/oradata/dbfs_direct: spawning dbfs_client command using SID grac43
Jul 25 09:26:02 grac43 DBFS_/u01/oradata/dbfs_direct: Fail to load library libfuse.so.
Jul 25 09:26:02 grac43 DBFS_/u01/oradata/dbfs_direct: A dynamic linking error occurred: (libfuse.so: cannot open shared object file: No such file or directory)
Jul 25 09:26:09 grac43 DBFS_/u01/oradata/dbfs_direct: Start -- OFFLINE

Fix:
Check the current shared library configuration for the FUSE libraries
[root@grac42 lib]# ldconfig -p | grep fuse
    libfuse.so.2 (libc6,x86-64) => /lib64/libfuse.so.2
--> Here we are missing libfuse.so

# cd /usr/local/lib
# locate libfuse.so
/lib64/libfuse.so.2
/lib64/libfuse.so.2.8.3
/usr/local/lib/libfuse.so
# ln -s /lib64/libfuse.so.2 libfuse.so
# ldconfig
# ldconfig -p | grep fuse
    libfuse.so.2 (libc6,x86-64) => /lib64/libfuse.so.2
    libfuse.so (libc6,x86-64) => /usr/local/lib/libfuse.so
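
The same check can be automated across nodes. The sketch below reads "ldconfig -p" style output on stdin and succeeds only if an unversioned libfuse.so entry is present; has_unversioned_libfuse is a hypothetical helper name, and the pattern assumes the entry format shown in the listings above.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the library cache listing on stdin
# contains an unversioned "libfuse.so" entry (libfuse.so.2 does not count).
has_unversioned_libfuse() {
    grep -q '^[[:space:]]*libfuse\.so[[:space:]]'
}

# Example call (commented out so the snippet can be sourced safely):
# if ldconfig -p | has_unversioned_libfuse; then
#     echo "libfuse.so found"
# else
#     echo "libfuse.so missing"
# fi
```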

 

Debugging and fixing the mount-dbfs.sh script before deploying it as a CW resource

Before deploying mount-dbfs.sh as a Clusterware (CW) resource, you should test on every node that
  mount-dbfs.sh status
  mount-dbfs.sh start
  mount-dbfs.sh status 
  mount-dbfs.sh stop
  mount-dbfs.sh status 
works and returns the expected results for mount-dbfs.sh status.

Let's start with the initial mount test 
Error 1: Problem with password file - OS mount fails ( critical ) 
[oracle@grac41 DBFS]$  mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
./mount-dbfs.sh: line 198: /tmp/.dbfs-passwd.txt.31706: No such file or directory
Start -- OFFLINE
--> DBFS FS not mounted 

Checking the code around line 198:
    (nohup $DBFS_CLIENT ${DBFS_USER}@ -o $MOUNT_OPTIONS \
          $MOUNT_POINT < $DBFS_PWDFILE | $LOGGER -p ${LOGGER_FACILITY}.info 2>&1 & ) &

    $RMF $DBFS_PWDFILE
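--> It seems the password file was deleted too early. Note that nohup .. & runs the process in the background, so $RMF can delete the file before the redirection < $DBFS_PWDFILE has opened it.
Proposed fix: add a short sleep between starting the client and deleting the password file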
    (nohup $DBFS_CLIENT ${DBFS_USER}@ -o $MOUNT_OPTIONS \
          $MOUNT_POINT < $DBFS_PWDFILE | $LOGGER -p ${LOGGER_FACILITY}.info 2>&1 & ) &
    sleep 2   <-- proposed code change 
    $RMF $DBFS_PWDFILE

This code change fixes the error: /tmp/.dbfs-passwd.txt.31706: No such file or directory
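
A fixed sleep only makes the race less likely. As an alternative idea, the file can be opened on a file descriptor before it is deleted, since an already open descriptor stays readable after the name is removed. The sketch below only demonstrates that shell technique: read_after_unlink is a hypothetical name, cat stands in for dbfs_client, the file content is a dummy, and this variant has not been tested inside mount-dbfs.sh.

```shell
#!/bin/bash
# Sketch only: cat stands in for dbfs_client, and the content is a dummy.
read_after_unlink() {
    local pwfile
    pwfile=$(mktemp) || return 1
    echo "dummy-password" > "$pwfile"

    exec 3< "$pwfile"   # open the file in the parent shell first
    rm -f "$pwfile"     # the name is gone, but descriptor 3 still reads the data

    cat <&3 &           # the background reader inherits descriptor 3
    wait $!
    exec 3<&-           # close the descriptor once the reader is done
}

# Example call (commented out so the snippet can be sourced safely):
# read_after_unlink
```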

Retry the mount  
[oracle@grac41 DBFS]$ mount-dbfs.sh  start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
nohup: redirecting stderr to stdout
Start -- OFFLINE
[oracle@grac41 DBFS]$ mount
dbfs-dbfs_user@grac41:/ on /u01/oradata/dbfs_direct type fuse (rw,nosuid,nodev,max_read=1048576,default_permissions,user=oracle)
--> Now DBFS is mounted, but the reported status is wrong

Error 2: Debugging wrong status - DBFS is OFFLINE but OS mount status is ok ! ( critical )
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- OFFLINE
Note: the status remains OFFLINE even though the OS mount was successful. Even rerunning the script doesn't help.
--> Checking mount-dbfs.sh script 
'check'|'status')
  ### check to see if it is mounted
  ### fire off a short process in perl to do the check (need the alarm builtin)
  logit debug "Checking status now"
  $PERL <<'TOT'
    $timeout = $ENV{'PERL_ALARM_TIMEOUT'};
    $SIG{ALRM} = sub { 
      ### we have a problem and need to cleanup
      exit 3;
      die "timeout" ;
    };
    alarm $timeout;
    eval {
      $STATUSOUT=`$ENV{'STAT'} -f -c "%T" $ENV{'MOUNT_POINT'} 2>&1 `; 
      chomp($STATUSOUT);
      if ( ( $ENV{'SOLARIS'} == 1 && $STATUSOUT eq 'uvfs' ) ||
           ( $ENV{'LINUX'} == 1   && $STATUSOUT eq 'UNKNOWN (0x65735546)' ) ) {
        ### status is okay
        exit 0;
Using strace to find the command the script uses to detect the filesystem status
[oracle@grac41 DBFS]$ strace -f -o mount-dbfs.trc mount-dbfs.sh status
26156 execve("/usr/bin/stat", ["/usr/bin/stat", "-f", "-c", "%T", "/u01/oradata/dbfs_direct"], [/* 35 vars */]) = 0
--> To check the DBFS status, the perl script runs the following command
[oracle@grac41 DBFS]$ /usr/bin/stat -f -c %T /u01/oradata/dbfs_direct
fuseblk
--> For a mounted DBFS filesystem, stat returns fuseblk
Change the line
  ( $ENV{'LINUX'} == 1   && $STATUSOUT eq 'UNKNOWN (0x65735546)' ) ) {
to 
  ( $ENV{'LINUX'} == 1   && $STATUSOUT eq 'fuseblk' ) ) {
Now we get the correct status
[oracle@grac41 DBFS]$  mount-dbfs.sh status
Checking status now
Check -- ONLINE
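
Since the string printed by stat differs between systems (the original script expects 'UNKNOWN (0x65735546)', while this system prints fuseblk for the same filesystem), a check that accepts both values may be more portable. The following is a shell sketch of that idea, not the perl code from mount-dbfs.sh; is_dbfs_fstype is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: accept both strings that "stat -f -c %T" was seen
# to report for a FUSE mount in this post.
is_dbfs_fstype() {
    case "$1" in
        'fuseblk'|'UNKNOWN (0x65735546)') return 0 ;;
        *) return 1 ;;
    esac
}

# Example call (commented out so the snippet can be sourced safely):
# fstype=$(stat -f -c %T /u01/oradata/dbfs_direct)
# if is_dbfs_fstype "$fstype"; then echo ONLINE; else echo OFFLINE; fi
```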

Error 3: mount-dbfs.sh start still reports status OFFLINE after start ( not critical )
[oracle@grac41 DBFS]$ mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
nohup: redirecting stderr to stdout
Start -- OFFLINE

After waiting a few seconds the status report looks good
[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- ONLINE
Potential fix: Increase the sleep time before checking the mount status
  ### allow time for the mount table update before checking it
  $SLEEP 1
  ### set return code based on success of mounting
  $SCRIPTPATH status > /dev/null 2>&1
  if [ $? -eq 0 ]; then
    logit info "Start -- ONLINE"
    exit 0
  else
    logit info "Start -- OFFLINE"
    exit 1
Change line 210 from
   $SLEEP 1
to
   $SLEEP 5
Note: This error is not critical, as CW checks the resource status repeatedly.
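
Instead of a longer fixed sleep, the status could be polled until it succeeds or a timeout expires. The following is a minimal sketch under assumptions: wait_for_online is a hypothetical helper that is not part of mount-dbfs.sh, and it relies on the status command returning exit code 0 when the filesystem is ONLINE, as the start branch above does.

```shell
#!/bin/bash
# Hypothetical helper: wait_for_online TIMEOUT INTERVAL COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL seconds until it succeeds or TIMEOUT
# seconds have been spent waiting.
wait_for_online() {
    local timeout=$1
    local interval=$2
    local waited=0
    shift 2
    while true; do
        if "$@" > /dev/null 2>&1; then
            echo "ONLINE after ${waited}s"
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "still OFFLINE after ${waited}s"
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Example call (commented out so the snippet can be sourced safely):
# wait_for_online 60 5 mount-dbfs.sh status
```

Polling returns as soon as the mount is visible, so a fast start is not delayed by the full sleep time.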

Testing CW resource script : mount-dbfs.sh

[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- OFFLINE

[oracle@grac41 DBFS]$ mount-dbfs.sh start
mount-dbfs.sh mounting DBFS at /u01/oradata/dbfs_direct from database grac4
ORACLE_SID is grac41
spawning dbfs_client command using SID grac41
nohup: redirecting stderr to stdout
Start -- ONLINE

[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- ONLINE
 --> If the status is OFFLINE, repeat mount-dbfs.sh status for at least 1 minute.

[oracle@grac41 DBFS]$ mount-dbfs.sh stop
unmounting DBFS from /u01/oradata/dbfs_direct
umounting the filesystem using '/bin/fusermount -u /u01/oradata/dbfs_direct'
Stop - stopped, now not mounted

[oracle@grac41 DBFS]$ mount-dbfs.sh status
Checking status now
Check -- OFFLINE

Manually mount DBFS using dbfs_client

[oracle@grac41 DBFS]$ echo dbfs_user > pw
[oracle@grac41 DBFS]$ dbfs_client dbfs_user@grac41 -otrace_file=/tmp/dbfs.out -otrace_level=1 -otrace_size=0 /u01/oradata/dbfs_direct <pw &
[1] 17049

[grid@grac41 ~]$ mount
dbfs-dbfs_user@grac41:/ on /u01/oradata/dbfs_direct type fuse (rw,nosuid,nodev,max_read=1048576,default_permissions,user=oracle)

Test file access 
[oracle@grac41 DBFS]$ touch /u01/oradata/dbfs_direct/FS1/t
[oracle@grac41 DBFS]$ ls  /u01/oradata/dbfs_direct/FS1/t
/u01/oradata/dbfs_direct/FS1/t
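
The touch/ls test only shows that a file can be created. A slightly stronger check writes a small file and reads it back. The sketch below illustrates this; check_rw and the probe file name are hypothetical, and the directory is passed as an argument.

```shell
#!/bin/bash
# Hypothetical helper: write a probe file into a directory, read it back,
# and remove it again.
check_rw() {
    local dir=$1
    local probe="$dir/.rw_probe.$$"
    local content
    if ! { echo "probe" > "$probe"; } 2>/dev/null; then
        echo "write failed: $dir"
        return 1
    fi
    content=$(cat "$probe" 2>/dev/null) || content=""
    rm -f "$probe"
    if [ "$content" = "probe" ]; then
        echo "read/write ok: $dir"
        return 0
    fi
    echo "read back failed: $dir"
    return 1
}

# Example call (commented out so the snippet can be sourced safely):
# check_rw /u01/oradata/dbfs_direct/FS1
```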

Reference

  • DBFS resource not starting as crs resource (Doc ID 1908868.1)
