Read Me(libexplain)                                        Read Me(libexplain)



NAME
       libexplain - Explain errno values returned by libc functions

DESCRIPTION
       The libexplain package provides a library which may be used to explain
       Unix and Linux system call errors.  This will make your application's
       error messages much more informative to your users.

       The library is not quite a drop-in replacement for strerror(3), but it
       comes close.  Each system call has a dedicated libexplain function, for
       example
              fd = open(path, flags, mode);
              if (fd < 0)
              {
                  fprintf(stderr, "%s\n", explain_open(path, flags, mode));
                  exit(EXIT_FAILURE);
              }
       If, for example, you were to try to open no-such-dir/some-file, you
       would see a message like
              open(pathname = "no-such-dir/some-file", flags = O_RDONLY)
              failed, No such file or directory (2, ENOENT) because there is
              no "no-such-dir" directory in the current directory

       The good news is that for each of these functions there is a wrapper
       function, in this case explain_open_or_die(3), that includes the above
       code fragment.  Adding good error reporting is as simple as using a
       different, but similarly named, function.  The library also provides
       thread safe variants of each explanation function.

       Coverage includes 122 system calls and 374 ioctl requests.

RATIONALE
       Picture yourself as being the local "Unix guy", and all day long a
       parade of people knock on your door, to have this or that error message
       explained.  And these are usually smart people, but they don't
       understand the subtle nuances of strerror output.

       "No such file or directory" is my all time favorite.  Users all too
       often assume it is only talking about the final component of the path
       being operated on.  When that isn't the problem, they are lost.
       Libexplain actually locates the problematic component for the user, and
       includes it in the error message.
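
       The idea of locating the problematic component can be pictured with a
       minimal sketch in plain C.  It assumes that testing each leading
       portion of the path with stat(2) is good enough for illustration; the
       name first_missing_prefix is hypothetical and is not part of
       libexplain, whose real path resolution is far more thorough.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not libexplain API): walk `path` one component at
 * a time and copy the shortest leading portion that does not exist into
 * `out`.  Returns 1 if such a portion was found, 0 if every portion
 * exists or the path does not fit the internal buffer.
 */
int
first_missing_prefix(const char *path, char *out, size_t out_size)
{
    char prefix[4096];
    size_t len = strlen(path);
    size_t i;

    assert(out != NULL && out_size > 0);
    if (len == 0 || len >= sizeof(prefix))
        return 0;

    for (i = 1; i <= len; ++i)
    {
        struct stat st;

        /* Only test at component boundaries: a '/' or the end of string. */
        if (path[i] != '/' && path[i] != '\0')
            continue;
        memcpy(prefix, path, i);
        prefix[i] = '\0';
        if (stat(prefix, &st) < 0 && errno == ENOENT)
        {
            snprintf(out, out_size, "%s", prefix);
            return 1;
        }
    }
    return 0;
}
```

       Given "no-such-dir/some-file", the sketch reports "no-such-dir" rather
       than leaving the user to guess which component is absent.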

       "Permission denied" is another source of confusion.  Again, users
       assume that it is only the final path component that has the problem.
       And if it isn't, they are lost again.  Libexplain actually locates the
       problematic component for the user, and includes it in the error
       message.

       But there is a level beyond helping out the users: the library also has
       to help out those poor benighted souls who man telephone help desks.
       They can't see the screen, they can't just "hop on the system" and have
       a quick poke around.  And, of course, the user says "it crashed again".
       Libexplain attempts to provide as much context as possible, so that
       when the user reads back the text of the message over the phone, there
       is more for the help desk staffer to work with.

   Too Technical
       The error messages produced by libexplain can seem overly technical to
       non-technical users, mostly as a result of accurately printing the
       system call itself at the beginning of the error message.

       The first consumers of the error messages are the developers
       themselves, and way way way down the food^H^H^H^Hsupport chain are the
       tech support programmers.  Both groups need bugs explained accurately,
       so they can know which part of the system is broken and what is
       actually happening within the system.

       So, an example of support problems: The worst offenders are stupid
       programs that say
              $ stupid
              can't open file
              $

       and of course you can't tell what this stupid program is trying to tell
       you.  Is there a file called "file" that has a problem, or some other
       file?  So the intrepid support person will break out their trusty
       strace(1) or truss(1) command and look at what is happening.  But users
       can't do that.  And help desk staff usually can't, either.  And
       sometimes the system with the problem is half a world (and several
       firewalls) away, making it impossible for anyone but the customer to go
       hunting.

       So you, as nth level support staff, take a clue stick to the author of
       said stupid program, telling him to look up perror(3).  Lo and behold,
       the next problem report looks like this:
              $ stupid
              open: No such file or directory
              $

       Slightly more informative, but you still need strace(1) to figure it
       out, and you take yet another run with the clue stick, this time
       standing over them until their code reads
              FILE *fp = fopen(filename, "r");
              if (!fp)
              {
                  fprintf(stderr, "open \"%s\": %s\n", filename, strerror(errno));
                  exit(1);
              }

       Of course, the developer's system is unable to reproduce the bug, so
       the next bug registered by the user says
              $ stupid
              open /user/lib/stupid/datafile: No such file or directory
              $

       And finally we discover the bug in the hard-coded path inside the
       stupid program.  Of course, the help desk may still not be able to spot
       the bug, but the next level of support should.

       Wouldn't it be easier if all that had to be done for routine error
       handling like this was to use some kind of "open-or-die" function that
       only ever returned on success?  One that produced error messages that
       would be of immediate use to the user, without calling tech support?

       And thus libexplain was born.  Code like this
              FILE *fp = explain_fopen_or_die(filename, "r");

       factors out all the error detecting and reporting.

       In general, for any libc library function, section 2 or section 3,
       there is an equivalent explain_*_or_die function that calls the libc
       function and, in the case of errors, reports them and exits, and in
       the case of success returns normally.
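
       The shape of such a wrapper can be sketched in plain C.  This is only
       an illustration of the pattern: the name sketch_fopen_or_die is
       hypothetical, and its message is far less detailed than the one
       libexplain itself produces.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the "or die" pattern (not libexplain's code):
 * call fopen(3); on failure print a message naming the call, its
 * arguments and the errno text, then exit; on success return the stream.
 */
FILE *
sketch_fopen_or_die(const char *path, const char *mode)
{
    FILE *fp;

    assert(path != NULL && mode != NULL);
    fp = fopen(path, mode);
    if (!fp)
    {
        int err = errno; /* save errno before any other library call */

        fprintf(stderr,
            "fopen(path = \"%s\", flags = \"%s\") failed, %s (%d)\n",
            path, mode, strerror(err), err);
        exit(EXIT_FAILURE);
    }
    return fp;
}
```

       Because the wrapper only ever returns on success, the caller needs no
       error branch of its own.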

       Now our example program would say
              $ stupid
              fopen(path = "/user/lib/stupid/datafile", flags = "r") failed,
              No such file or directory (ENOENT) because there is no "user"
              directory in the "/" directory, but "usr" is similar
              $

       Not so stupid any more.  If the user still can't work it out, tech
       support will be able to.  Actually, of course, it's a bug, so grab your
       clue stick and take another run at the developer.  ("Did it not occur
       to you that there was a tiny bit of a problem when you made that
       symlink?"  rm.  Now the bug is reproducible.  Whack.)

       And here is the "too technical" part: That bit that says fopen(path =
       "/user/lib/stupid/datafile", flags = "r") is alleged to be too scary
       for users to see.  Unfortunately it is essential for the nth level
       support person, because without it they can't know if it was an "open
       for reading" or an "open for writing" operation that had a problem,
       because that can be relevant (not to ENOENT, but, say EACCES) when
       performing psychic silicon divination half a planet away.
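
       To make the "open for reading" versus "open for writing" distinction
       concrete, here is a minimal sketch that renders an open(2) flags value
       symbolically.  The name sketch_open_flags is hypothetical, only a
       handful of option bits are decoded, and the output format is an
       assumption rather than libexplain's own.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper (not libexplain API): render the access mode and a
 * few common option bits of an open(2) flags value as symbolic text.
 */
void
sketch_open_flags(int flags, char *buf, size_t size)
{
    const char *mode;

    assert(buf != NULL && size > 0);
    switch (flags & O_ACCMODE)
    {
    case O_RDONLY: mode = "O_RDONLY"; break;
    case O_WRONLY: mode = "O_WRONLY"; break;
    case O_RDWR:   mode = "O_RDWR";   break;
    default:       mode = "O_ACCMODE?"; break; /* unrecognised mode */
    }
    snprintf(buf, size, "%s%s%s%s", mode,
        (flags & O_CREAT) ? " | O_CREAT" : "",
        (flags & O_TRUNC) ? " | O_TRUNC" : "",
        (flags & O_APPEND) ? " | O_APPEND" : "");
}
```

       A support person reading "O_WRONLY | O_CREAT" in a report knows at
       once that write permission on the directory is in play, which a bare
       "Permission denied" cannot convey.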

       If this information is not in the error message, it will never get into
       the bug report.  It can't get into the bug report, no matter how tech-
       savvy the customer is, no matter how brilliant the phone support staff
       are.  In order for the tech support programmer to do his job, it needs
       to be there.  And give your users a bit more credit, too.

   Too Verbose
       The error messages produced by libexplain can seem overly verbose to
       technical users, mostly as a result of using every-day language which
       is often poorly adapted to explain technical issues.

       The error messages relating to EACCES (Permission denied) are
       particularly prone to being quite long, because the error message must
       also explain how the Unix file permission modes work, in the process of
       explaining exactly which directory or file's mode bits were offended.
       It may come as a surprise to seasoned Linux hackers, but users just
       don't "get" that permission stuff, even though it is ridiculously easy
       and applied completely consistently.

       And, of course, the above ENOENT example is another case that can be
       quite long.  The message actually locates and identifies the missing
       directory entry, particularly as users are mostly just confused when
       any but the last path component is the one that is not there.  And by
       suggesting similarly named alternatives that are there, the library
       also deals with the commonest result of typing mistakes in a helpful
       way.
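
       Suggesting a similarly named alternative can be sketched with a
       textbook edit distance.  The source does not say which similarity
       measure libexplain uses, so Levenshtein distance is an assumption
       made purely for illustration, and both function names are
       hypothetical.

```c
#include <string.h>

/*
 * Levenshtein edit distance between two short names, using a single
 * rolling row.  Names of 256 bytes or more are treated as very distant
 * so that the row buffer can stay on the stack.
 */
int
sketch_edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b), i, j;
    int row[256];

    if (lb >= sizeof(row) / sizeof(row[0]))
        return 1 << 20;
    for (j = 0; j <= lb; ++j)
        row[j] = (int)j;
    for (i = 1; i <= la; ++i)
    {
        int diag = row[0];          /* D[i-1][j-1] */

        row[0] = (int)i;
        for (j = 1; j <= lb; ++j)
        {
            int up = row[j];        /* D[i-1][j] */
            int best = diag + (a[i - 1] != b[j - 1]);

            if (up + 1 < best)
                best = up + 1;
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;
            diag = up;
            row[j] = best;
        }
    }
    return row[lb];
}

/*
 * Pick the candidate closest to `wanted`, or NULL if none is within
 * `max_dist` edits.  Ties go to the earliest candidate.
 */
const char *
sketch_closest(const char *wanted, const char *const *cands, size_t n,
    int max_dist)
{
    const char *best = NULL;
    int best_d = max_dist + 1;
    size_t k;

    for (k = 0; k < n; ++k)
    {
        int d = sketch_edit_distance(wanted, cands[k]);

        if (d < best_d)
        {
            best_d = d;
            best = cands[k];
        }
    }
    return best;
}
```

       Fed the entries of "/" as candidates, the sketch would pick "usr" for
       the mistyped "user", since the two differ by a single deleted letter.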

   Internationalization
       The library has globalization support (gettext) in place.  As yet there
       are no language translations.

HOME PAGE
       The latest version of libexplain is available on the Web from:

          URL:    http://libexplain.sourceforge.net/
          File:   index.html               # the libexplain page
          File:   libexplain.0.19.README   # Description, from the tar file
          File:   libexplain.0.19.lsm      # Description, LSM format
          File:   libexplain.0.19.tar.gz   # the complete source
          File:   libexplain.0.19.pdf      # Reference Manual

BUILDING LIBEXPLAIN
       Full instructions for building libexplain may be found in the BUILDING
       file included in this distribution.

COPYRIGHT
       libexplain version 0.19
       Copyright (C) 2008, 2009 Peter Miller

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU Lesser General Public License as published
       by the Free Software Foundation; either version 3 of the License, or
       (at your option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
       Lesser General Public License for more details.

       You should have received a copy of the GNU Lesser General Public
       License along with this program. If not, see
       <http://www.gnu.org/licenses/>.

       It should be in the LICENSE file included with this distribution.

AUTHOR
       Peter Miller   E-Mail:   pmiller@opensource.org.au
       /\/\*             WWW:   http://www.canb.auug.org.au/~millerp/

RELEASE NOTES
       This section details the various features and bug fixes of the various
       releases.  For excruciating and complete detail, and also credits for
       those of you who have generously sent me suggestions and bug reports,
       see the etc/CHANGES.* files.

       * The name prefix on all of the library functions has been changed from
       "libexplain_" to just "explain_".  This was the most requested change.
       You will need to change your code and recompile.  Apologies for the
       inconvenience.

   Version 0.9 (2009-Feb-27)
       * Two false negatives in the tests have been fixed.

       * The ./configure script now explicitly looks for bison(1), and
       complains if it cannot be found.

       * The socket(7) address family is now decoded.

   Version 0.8 (2009-Feb-14)
       * A problem with the Debian packaging has been fixed.

       * The decoding of IPv4 sockaddr structs has been improved.

   Version 0.7 (2009-Feb-10)
       * Coverage has been extended to include getsockopt(2), getpeername(2),
       getsockname(2) and setsockopt(2).

       * Build problems on Debian Sid have been fixed.

       * More magnetic tape ioctl controls, from operating systems other than
       Linux, have been added.

   Version 0.6 (2009-Jan-16)
       * Coverage has been extended to include execvp(3), ioctl(2), malloc(3),
       pclose(3), pipe(2), popen(3) and realloc(3) system calls.

       * The coverage for ioctl(2) includes Linux console controls, magnetic
       tape controls, socket controls, and terminal controls.

       * A false negative from test 31 has been fixed.

   Version 0.5 (2009-Jan-03)
       * A build problem on Debian sid has been fixed.

       * There is a new explain_system_success(3) function, that performs all
       that explain_system_success_or_die(3) performs, except that it does not
       call exit(2).

       * There is more i18n support.

       * A bug with the pkg-config(1) support has been fixed.

   Version 0.4 (2008-Dec-24)
       * Coverage now includes accept(2), bind(2), connect(2), dup2(2),
       fchown(2), fdopen(3), fpathconf(2), fputc(3), futimes(2),
       getaddrinfo(3), getcwd(2), getrlimit(2), listen(2), pathconf(2),
       putc(3), putchar(3) and select(2).

       * Internationalization has been improved.

       * The thread safety of the code has been improved.

       * The code is now able to be compiled on OpenBSD.  The test suite still
       gives many false negatives, due to differences in strerror(3) results.

   Version 0.3 (2008-Nov-23)
       * Coverage has been extended to include closedir(3), execve(2),
       ferror(3), fgetc(3), fgets(3), fork(2), fread(3), getc(3),
       gettimeofday(2), lchown(2), socket(2), system(3), utime(2), wait3(2),
       wait4(2), wait(2) and waitpid(2).

       * More internationalization support has been added.

       * A bug has been fixed in the C++ insulation.

   Version 0.2 (2008-Nov-11)
   Version 0.19 (2009-Sep-07)
       * The ioctl requests from linux/hdreg.h are now understood.

       * Some build problems on Debian Lenny have been fixed.

   Version 0.18 (2009-Sep-05)
       * More ioctl requests are understood.

       * Explanations are now available for errors reported by the
       tcsendbreak(3), tcsetattr(3), tcgetattr(3), tcflush(3) and tcdrain(3)
       system calls.

   Version 0.17 (2009-Sep-03)
       * Explanations are now available for errors reported by the telldir(3)
       system call.

       * A number of Linux build problems have been fixed.

       * Explanations for a number of corner-cases of the open(2) system call
       have been improved, where flags values interact with file types and
       mount options.

       * A number of BSD build problems have been fixed.

       * More ioctl(2) commands are understood.

       * A bug has been fixed in the way absolute symbolic links are processed
       by the path_resolution code.

   Version 0.16 (2009-Aug-03)
       * The EROFS and ENOMEDIUM explanations are now greatly improved.

       * A number of build problems and false negatives have been fixed on
       x86_64 architecture.

       * The Linux floppy disk and CD-ROM ioctl requests are now supported.

       * Explanations are now available for the errors reported by the
       getdomainname(2), readv(2), setdomainname(2), ustat(2) and writev(2)
       system calls.

   Version 0.15 (2009-Jul-26)
       * A number of build errors and warnings on amd64 have been fixed.  The
       problems were only detectable on 64-bit systems.

   Version 0.14 (2009-Jul-19)
       * Coverage now includes another 29 system calls: accept4(2), acct(2),
       adjtime(3), adjtimex(2), chroot(2), dirfd(3), eventfd(2), fflush(3),
       fileno(3), flock(2), fstatfs(2), ftime(3), getgroups(2),
       gethostname(2), kill(2), nice(2), pread(2), pwrite(2), sethostname(2),
       signalfd(2), strdup(3), strtod(3), strtof(3), strtol(3), strtold(3),
       strtoll(3), strtoul(3), strtoull(3), and timerfd_create(2).  A total of
       110 system calls are now supported.

       * The ./configure script no longer demands lsof(1).  The Linux
       libexplain code doesn't need lsof(1).  On systems not supported by
       lsof(1), the error messages aren't quite as useful, but libexplain
       still works.

       * There is now an explain_*_on_error function for each system call,
       each reports errors but still returns the original return value to the
       caller.
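
       The "on error" pattern described in the last item can be sketched in
       plain C.  The name sketch_close_on_error is hypothetical and the
       message is only a crude stand-in for what libexplain prints; the
       point is that the original return value reaches the caller
       unchanged.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the "on error" pattern (not libexplain's
 * code): call close(2), report a failure on stderr, and hand the
 * original return value back so the caller can still decide what to do.
 */
int
sketch_close_on_error(int fd)
{
    int result = close(fd);

    if (result < 0)
    {
        int err = errno;

        fprintf(stderr, "close(fildes = %d) failed, %s (%d)\n",
            fd, strerror(err), err);
        errno = err; /* fprintf may have changed errno; restore it */
    }
    return result;
}
```

       Unlike an "or die" wrapper, this form suits callers that want the
       diagnostic printed but still need to clean up or retry afterwards.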

   Version 0.13 (2009-May-17)
       * The web site now links to a number of services provided by
       SourceForge.

       * Several problems have been fixed with compiling libexplain on 64-bit
       systems.

   Version 0.12 (2009-May-04)
       * A build problem has been fixed on hosts that didn't need to do
       anything special for large file support.

   Version 0.11 (2009-Mar-29)
       * The current directory is replaced in messages with an absolute path
       in cases where the user's idea of the current directory may differ from
       that of the current process.

   Version 0.10 (2009-Mar-24)
       * Coverage now includes chmod(2), chown(2), dup(2), fchdir(2),
       fchmod(2), fstat(2), ftruncate(2), fwrite(3), mkdir(2), readdir(3),
       readlink(2), remove(3), rmdir(2) and truncate(2).

       * The lsof(1) command is used to obtain supplementary file information
       on those systems with limited /proc implementations.

       * The explanations now understand Linux capabilities.

   Version 0.1 (2008-Oct-26)
       First public release.



Reference Manual                  libexplain               Read Me(libexplain)
