``ftputil`` -- a high-level FTP client library
==============================================

:Version:   2.4.2
:Date:      2009-11-12
:Summary:   high-level FTP client library for Python
:Keywords:  FTP, ``ftplib`` substitute, virtual filesystem, pure Python
:Author:    Stefan Schwarzer <sschwarzer@sschwarzer.net>
:`Russian translation`__: Anton Stepanov <antymail@mail.ru>

.. __: ftputil_ru.html

.. contents::


Introduction
------------

The ``ftputil`` module is a high-level interface to the ftplib_
module. The `FTPHost objects`_ generated from it allow many operations
similar to those of os_, `os.path`_ and `shutil`_.

.. _ftplib: http://www.python.org/doc/current/lib/module-ftplib.html
.. _os: http://www.python.org/doc/current/lib/module-os.html
.. _`os.path`: http://www.python.org/doc/current/lib/module-os.path.html
.. _`shutil`: http://www.python.org/doc/current/lib/module-shutil.html

Examples::

    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')         # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()

Also, there are `FTPHost.lstat`_ and `FTPHost.stat`_ to request size and
modification time of a file. The latter can also follow links, similar
to `os.stat`_. Even `FTPHost.walk`_ and `FTPHost.path.walk`_ work.

.. _`os.stat`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2698


``ftputil`` features
--------------------

* Method names are familiar from Python's ``os``, ``os.path`` and
  ``shutil`` modules

* Remote file system navigation (``getcwd``, ``chdir``)

* Upload and download files (``upload``, ``upload_if_newer``,
  ``download``, ``download_if_newer``)

* Time zone synchronization between client and server (needed
  for ``upload_if_newer`` and ``download_if_newer``)

* Create and remove directories (``mkdir``, ``makedirs``, ``rmdir``,
  ``rmtree``) and remove files (``remove``)

* Get information about directories, files and links (``listdir``,
  ``stat``, ``lstat``, ``exists``, ``isdir``, ``isfile``, ``islink``,
  ``abspath``, ``split``, ``join``, ``dirname``, ``basename`` etc.)

* Iterate over remote file systems (``walk``)

* Local caching of results from ``lstat`` and ``stat`` calls to reduce
  network access (also applies to ``exists``, ``getmtime`` etc.).

* Read files from and write files to remote hosts via
  file-like objects (``FTPHost.file``; the generated file-like objects
  have many common methods like ``read``, ``readline``, ``readlines``,
  ``write``, ``writelines``, ``close`` and can do automatic line
  ending conversions on the fly, i. e. text/binary mode).


Exception hierarchy
-------------------

The exceptions are in the namespace of the ``ftp_error`` module, e. g.
``ftp_error.TemporaryError``. Getting the exception classes from the
"package module" ``ftputil`` is deprecated and will no longer be
supported in ``ftputil`` version 2.5.

The exception classes are organized as follows::

    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
                CommandNotImplementedError(PermanentError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)

and are described here:

- ``FTPError``

  is the root of the exception hierarchy of the module.

- ``FTPOSError``

  is derived from ``OSError``. This is for similarity between the
  ``os`` module and ``FTPHost`` objects. Compare

  ::

    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...

  with

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...

  Imagine a function

  ::

    def func(path, file):
        ...

  which works on the local file system and catches ``OSError`` exceptions. If you
  change the parameter list to

  ::

    def func(path, file, os=os):
        ...

  where ``os`` denotes the ``os`` module, you can also call the function as

  ::

    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)

  to use the same code for both a local and remote file system.
  Another similarity between ``OSError`` and ``FTPOSError`` is that
  the latter holds the FTP server return code in the ``errno``
  attribute of the exception object and the error text in
  ``strerror``.

- ``PermanentError``

  is raised for 5xx return codes from the FTP server. This
  corresponds to ``ftplib.error_perm`` (though ``PermanentError`` and
  ``ftplib.error_perm`` are *not* identical).

- ``CommandNotImplementedError``

  indicates that an underlying command the code tries to use is not
  implemented. For an example, see the description of the
  `FTPHost.chmod`_ method.

- ``TemporaryError``

  is raised for FTP return codes from the 4xx category. This
  corresponds to ``ftplib.error_temp`` (though ``TemporaryError`` and
  ``ftplib.error_temp`` are *not* identical).

- ``FTPIOError``

  denotes an I/O error on the remote host. This appears
  mainly with file-like objects which are retrieved by invoking
  ``FTPHost.file`` (``FTPHost.open`` is an alias). Compare

  ::

    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory

  with

  ::

    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.

  As you can see, both code snippets are similar. However, the error
  codes aren't the same.

- ``InternalError``

  subsumes exception classes for signaling errors due to limitations
  of the FTP protocol or the concrete implementation of ``ftputil``.

- ``InaccessibleLoginDirError``

  This exception is only raised if *both* of the following conditions
  are met:

  - The directory in which "you" are placed upon login is not
    accessible, i. e. a ``chdir`` call with the directory as
    argument would fail.

  - You try to access a path which contains whitespace.

- ``ParserError``

  is used for errors during the parsing of directory
  listings from the server. This exception is used by the ``FTPHost``
  methods ``stat``, ``lstat``, and ``listdir``.

- ``RootDirError``

  Because of the implementation of the ``lstat`` method, it is not
  possible to do a ``stat`` call on the root directory ``/``.
  If you know *any* way to do it, please let me know. :-)

  This problem does *not* affect stat calls on items *in* the root
  directory.

- ``TimeShiftError``

  is used to denote errors which relate to setting the `time shift`_,
  *for example* trying to set a value which is not a multiple of a full
  hour.
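
The parameter substitution described under ``FTPOSError`` can be
sketched as follows. The function name ``list_subdirs`` is made up for
this illustration. It relies only on ``listdir``, ``path.join`` and
``path.isdir``, which both the ``os`` module and ``FTPHost`` objects
provide::

    import os

    def list_subdirs(path, os=os):
        """Return the sorted names of the subdirectories of `path`.

        `os` may be the `os` module or an `FTPHost` instance.
        """
        try:
            names = os.listdir(path)
        except OSError:
            # Raised by the `os` module; `FTPOSError` is derived from
            # `OSError`, so errors from an `FTPHost` object should end
            # up here, too.
            return []
        return sorted(name for name in names
                      if os.path.isdir(os.path.join(path, name)))

Called with just a path, the function inspects the local file system;
called with ``os=host``, it would inspect the remote one.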


``FTPHost`` objects
-------------------

.. _`FTPHost construction`:

Construction
~~~~~~~~~~~~

Basics
``````

``FTPHost`` instances can be generated with the following call::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

The first four parameters are strings with the same meaning as for the
FTP class in the ``ftplib`` module.

Session factories
`````````````````

The keyword argument ``session_factory`` may be used to generate FTP
connections with other factories than the default ``ftplib.FTP``. For
example, the M2Crypto distribution uses a secure FTP class which is
derived from ``ftplib.FTP``.

In fact, all positional and keyword arguments other than
``session_factory`` are passed to the factory to generate a new
background session. This happens for every remote file that is opened;
see below.

This functionality of the constructor also makes it possible to wrap
``ftplib.FTP`` objects to do something that wouldn't be possible with
the ``ftplib.FTP`` constructor alone.

As an example, assume you want to connect to a port other than the
default one, but ``ftplib.FTP`` only offers this by means of its
``connect`` method, not via its constructor. The solution is to use a
wrapper class::

    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to another port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # pass the class itself as the factory, not an instance like `MySession()`
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual
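
If several ports are in use, a small helper can build such wrapper
classes on demand. The following is only a sketch;
``make_session_class`` and ``PortSession`` are hypothetical names, not
part of ``ftputil``::

    import ftplib

    def make_session_class(port):
        """Return an `ftplib.FTP` subclass that connects to `port`."""
        class PortSession(ftplib.FTP):
            def __init__(self, host, userid, password):
                ftplib.FTP.__init__(self)
                self.connect(host, port)
                self.login(userid, password)
        return PortSession

    # Hypothetical usage (`server`, `userid` and `password` are
    # placeholders):
    # host = ftputil.FTPHost(server, userid, password,
    #                        session_factory=make_session_class(50001))

With this variant the port is fixed when the class is created, so it
doesn't have to be passed to the ``FTPHost`` constructor as a keyword
argument.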

On login, the format of the directory listings (needed for stat'ing
files and directories) should be determined automatically. If not,
please `file a bug report`_.

.. _`file a bug report`: http://ftputil.sschwarzer.net/issuetrackernotes

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    with ftputil.FTPHost(host, user, password) as host:
        print host.listdir(host.curdir)

After the ``with`` block, the ``FTPHost`` instance and the
associated FTP sessions will be closed automatically.

If something goes wrong during the ``FTPHost`` construction or in the
body of the ``with`` statement, the instance is closed as well.
Exceptions will be propagated (as with ``try ... finally``).

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

``FTPHost`` attributes and methods
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attributes
``````````

- ``curdir``, ``pardir``, ``sep``

  are strings which denote the current and the parent directory on the
  remote server. ``sep`` holds the path separator. Though `RFC 959`_
  (File Transfer Protocol) notes that these values may depend on the
  FTP server implementation, the Unix variants seem to work well in
  practice, even for non-Unix servers.

Remote file system navigation
`````````````````````````````

- ``getcwd()``

  returns the absolute current directory on the remote host. This
  method acts similarly to ``os.getcwd``.

- ``chdir(directory)``

  sets the current directory on the FTP server. This resembles
  ``os.chdir``, as you may have expected.

Uploading and downloading files
```````````````````````````````

- ``upload(source, target, mode='')``

  copies a local source file (given by a filename, i. e. a string)
  to the remote host under the name ``target``. Both ``source`` and
  ``target`` may be absolute paths or relative to their corresponding
  current directory (on the local or the remote host, respectively).
  The mode may be "" or "a" for ASCII uploads or "b" for binary
  uploads. ASCII mode is the default, similar to regular local
  file objects.

- ``download(source, target, mode='')``

  performs a download from the remote source to a target file. Both
  ``source`` and ``target`` are strings. Most of the description of
  the ``upload`` method applies here, too.

.. _`upload_if_newer`:

- ``upload_if_newer(source, target, mode='')``

  is similar to the ``upload`` method. The only difference is that the
  upload is only invoked if the time of the last modification for the
  source file is more recent than that of the target file or the
  target doesn't exist at all. If an upload actually happened, the
  return value is a true value, else a false value.

  Note that this method only checks the existence and/or the
  modification time of the source and target file; it can't recognize
  a change in the transfer mode, e. g.

  ::

    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')

  Similarly, if a transfer is interrupted, the remote file will have a
  newer modification time than the local file, and thus the transfer
  won't be repeated if ``upload_if_newer`` is used a second time.
  There are at least two possibilities after a failed upload:

  - use ``upload`` instead of ``upload_if_newer``, or

  - remove the incomplete target file with ``FTPHost.remove``, then
    use ``upload`` or ``upload_if_newer`` to transfer it again.

  If it seems that a file is uploaded unnecessarily, or isn't uploaded
  when it should be, read the subsection on `time shift`_ settings.

.. _`download_if_newer`:

- ``download_if_newer(source, target, mode='')``

  corresponds to ``upload_if_newer`` but performs a download from the
  server to the local host. Read the descriptions of ``download`` and
  ``upload_if_newer`` for more. If a download actually happened, the
  return value is a true value, else a false value.

  If it seems that a file is downloaded unnecessarily, or isn't
  downloaded when it should be, read the subsection on `time zone
  correction`_.
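
As a sketch of how the transfer methods combine with ``listdir`` and
``path``, the following function downloads the regular files of one
remote directory. The function name ``download_newer_files`` is an
assumption made for this example; ``host`` is expected to be an
``FTPHost`` instance::

    import os

    def download_newer_files(host, remote_dir, local_dir):
        """Download files from `remote_dir` that are newer than local copies.

        Return the names of the files that were actually transferred.
        """
        transferred = []
        for name in host.listdir(remote_dir):
            remote_path = host.path.join(remote_dir, name)
            if not host.path.isfile(remote_path):
                continue
            local_path = os.path.join(local_dir, name)
            # binary mode; a true return value means a transfer took place
            if host.download_if_newer(remote_path, local_path, 'b'):
                transferred.append(name)
        return transferred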

.. _`time shift`:
.. _`time zone correction`:

Time zone correction
````````````````````

If the client where ``ftputil`` runs and the server have a different
understanding of their local times, this has to be taken into account
for ``upload_if_newer`` and ``download_if_newer`` to work correctly.

Note that even if the client and the server are in the same time zone
(or even on the same computer), the time shift value (see below) may
be different from zero. For example, my computer is set to use local
time whereas the server running on the very same host insists on using
UTC time.

.. _`set_time_shift`:

- ``set_time_shift(time_shift)``

  sets the so-called time shift value, measured in seconds. The time
  shift is the difference between the local time of the server and the
  local time of the client at a given moment, i. e. by definition

  ::

    time_shift = server_time - client_time

  Setting this value is important for `upload_if_newer`_ and
  `download_if_newer`_ to work correctly even if the time zone of the
  FTP server differs from that of the client. Note that the time shift
  value *can be negative*.

  If the time shift value is invalid, e. g. if it isn't a multiple of
  a full hour or its absolute value is larger than 24 hours, a
  ``TimeShiftError`` is raised.

  See also `synchronize_times`_ for a way to set the time shift with a
  simple method call.

- ``time_shift()``

  returns the currently-set time shift value. See ``set_time_shift``
  above for its definition.

.. _`synchronize_times`:

- ``synchronize_times()``

  synchronizes the local times of the server and the client, so that
  `upload_if_newer`_ and `download_if_newer`_ work as expected, even
  if the client and the server use different time zones. For this
  to work, *all* of the following conditions must be true:

  - The connection between server and client is established.

  - The client has write access to the directory that is current when
    ``synchronize_times`` is called.

  If you can't fulfill these conditions, you can nevertheless set the
  time shift value explicitly with `set_time_shift`_. Trying to call
  ``synchronize_times`` if the above conditions aren't met results in
  a ``TimeShiftError`` exception.
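
The arithmetic behind a valid time shift value can be illustrated
with a few lines of plain Python. This is not the code ``ftputil``
uses; the function name ``rounded_time_shift`` is hypothetical, and
both arguments are assumed to be timestamps in seconds taken at the
same moment::

    def rounded_time_shift(server_time, client_time):
        """Return `server_time - client_time`, rounded to full hours."""
        hour = 3600.0
        raw_shift = server_time - client_time
        shift = round(raw_shift / hour) * hour
        if abs(shift) > 24 * hour:
            # such a value would be rejected by `set_time_shift`
            raise ValueError("time shift larger than 24 hours")
        return shift

    # Hypothetical usage with an `FTPHost` instance `host`:
    # host.set_time_shift(rounded_time_shift(server_time, client_time))

Rounding absorbs the few seconds by which two clock readings usually
differ, so that the result is a multiple of a full hour.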

Creating and removing directories
`````````````````````````````````

- ``mkdir(path, [mode])``

  makes the given directory on the remote host. This doesn't construct
  "intermediate" directories which don't already exist. The ``mode``
  parameter is ignored; this is for compatibility with ``os.mkdir`` if
  an ``FTPHost`` object is passed into a function instead of the
  ``os`` module. See the explanation in the subsection `Exception
  hierarchy`_.

- ``makedirs(path, [mode])``

  works similarly to ``mkdir`` (see above), but also makes intermediate
  directories like ``os.makedirs``. The ``mode`` parameter is only
  there for compatibility with ``os.makedirs`` and is ignored.

- ``rmdir(path)``

  removes the given remote directory. If the directory isn't empty, a
  ``PermanentError`` is raised.

- ``rmtree(path, ignore_errors=False, onerror=None)``

  removes the given remote, possibly non-empty, directory tree.
  The interface of this method is rather complex, in favor of
  compatibility with ``shutil.rmtree``.

  If ``ignore_errors`` is set to a true value, errors are ignored.
  If ``ignore_errors`` is a false value *and* ``onerror`` isn't
  set, all exceptions occurring during the tree iteration and
  processing are raised. These exceptions are all of type
  ``PermanentError``.

  To distinguish between different kinds of errors, pass in a callable
  for ``onerror``. This callable must accept three arguments:
  ``func``, ``path`` and ``exc_info``. ``func`` is a bound method
  object, *for example* ``your_host_object.listdir``. ``path`` is the
  path that was the most recent argument of the respective method
  (``listdir``, ``remove``, ``rmdir``). ``exc_info`` is the exception
  info as returned by ``sys.exc_info``.

  The code of ``rmtree`` is taken from Python's ``shutil`` module
  and adapted for ``ftputil``.
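
A callable for ``onerror`` might look like the following sketch, which
collects the errors instead of raising them. The helper name
``make_error_collector`` is made up for this example::

    def make_error_collector():
        """Return a list and an `onerror` callable that appends to it."""
        errors = []

        def onerror(func, path, exc_info):
            # `func` is the method that failed, `path` its argument and
            # `exc_info` the tuple from `sys.exc_info()`
            name = getattr(func, "__name__", repr(func))
            errors.append((name, path, exc_info[1]))

        return errors, onerror

    # Hypothetical usage with an `FTPHost` instance `host`:
    # errors, onerror = make_error_collector()
    # host.rmtree("some_directory", onerror=onerror)

After the ``rmtree`` call, ``errors`` would hold one tuple of method
name, path and exception object for each failed operation.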

Removing files and links
````````````````````````

- ``remove(path)``

  removes a file or link on the remote host, similar to ``os.remove``.

- ``unlink(path)``

  is an alias for ``remove``.

Retrieving information about directories, files and links
`````````````````````````````````````````````````````````

- ``listdir(path)``

  returns a list containing the names of the files and directories
  in the given path, similar to ``os.listdir``. The special names
  ``.`` and ``..`` are not in the list.

The methods ``lstat`` and ``stat`` (and some others) rely on the
directory listing format used by the FTP server. When connecting to a
host, ``FTPHost``'s constructor tries to guess the right format, which
succeeds in most cases. However, if you get strange results or
``ParserError`` exceptions from a mere ``lstat`` call, please `file a
bug report`_.

If ``lstat`` or ``stat`` yield wrong modification dates or times, look
at the methods that deal with time zone differences (`time zone
correction`_).

.. _`FTPHost.lstat`:

- ``lstat(path)``

  returns an object similar to that from ``os.lstat``. This is a
  "tuple" with additional attributes; see the documentation of the
  ``os`` module for details.

  The result is derived by parsing the output of a ``DIR`` command on
  the server. Therefore, the result from ``FTPHost.lstat`` cannot
  contain more information than the received text. In particular:

  - User and group ids can only be determined as strings, not as
    numbers, and that only if the server supplies them. This is
    usually the case with Unix servers but maybe not for other FTP
    server programs.

  - Values for the time of the last modification may be rough,
    depending on the information from the server. For timestamps
    older than a year, this usually means that the precision of the
    modification timestamp value is not better than days. For newer
    files, the information may be accurate to a minute.

  - Links can only be recognized on servers that provide this
    information in the ``DIR`` output.

  - Stat attributes that can't be determined at all are set to
    ``None``. For example, a line of a directory listing may not
    contain the date/time of a directory's last modification.

  - There's a special problem with stat'ing the root directory.
    (Stat'ing things *in* the root directory is fine though.) In
    this case, a ``RootDirError`` is raised. This has to do with the
    algorithm used by ``(l)stat``, and I know of no approach which
    mends this problem.

  Currently, ``ftputil`` recognizes the common Unix-style and
  Microsoft/DOS-style directory formats. If you need to parse output
  from another server type, please write to the `ftputil mailing
  list`_. Alternatively, you can `write your own parser`_.

.. _`ftputil mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`write your own parser`: `Writing directory parsers`_

.. _`FTPHost.stat`:

- ``stat(path)``

  returns ``stat`` information also for files which are pointed to by a
  link. This method follows multiple links until a regular file or
  directory is found. If an infinite link chain is encountered or the
  target of the last link in the chain doesn't exist, a
  ``PermanentError`` is raised.
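
Because stat attributes that can't be determined are ``None``, code
that formats stat results should be prepared for that. A minimal
sketch, with the hypothetical function name ``describe`` and ``host``
assumed to be an ``FTPHost`` instance::

    import time

    def describe(host, path):
        """Return a one-line description of `path`."""
        stat_result = host.stat(path)
        if stat_result.st_mtime is None:
            # the directory listing didn't contain a usable timestamp
            modified = "unknown"
        else:
            modified = time.strftime("%Y-%m-%d %H:%M",
                                     time.localtime(stat_result.st_mtime))
        return "%s: %s bytes, modified %s" % (path, stat_result.st_size,
                                              modified)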

.. _`FTPHost.path`:

``FTPHost`` objects contain an attribute named ``path``, similar to
`os.path`_. The following methods can be applied to the remote host
with the same semantics as for ``os.path``:

::

    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)

Like Python's counterparts under `os.path`_, ``ftputil``'s ``is...``
methods return ``False`` if they can't find the path given by their
argument.
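
The following sketch combines several of these ``path`` methods with
``listdir``. The function name ``files_with_extension`` is an
assumption for this example; since only ``os``-like methods are used,
the same function should also work when the ``os`` module is passed in
place of an ``FTPHost`` object::

    def files_with_extension(host, directory, extension):
        """Return paths of regular files in `directory` ending in `extension`."""
        matches = []
        for name in host.listdir(directory):
            path = host.path.join(directory, name)
            if (host.path.isfile(path) and
                host.path.splitext(name)[1] == extension):
                matches.append(path)
        return sorted(matches)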

Local caching of file system information
````````````````````````````````````````

Many of the above methods need access to the remote file system to
obtain data on directories and files. To get the most recent data,
*each* call to ``lstat``, ``stat``, ``exists``, ``getmtime`` etc.
would require fetching a directory listing from the server, which can
make the program *very* slow. This effect is more pronounced for
operations which mostly scan the file system rather than transferring
file data.

For this reason, ``ftputil`` by default saves the results from
directory listings locally and reuses those results. This reduces
network accesses and so speeds up the software a lot. However, since
data is more rarely fetched from the server, the risk of obsolete data
also increases. This will be discussed below.

Caching can be controlled -- if necessary at all -- via the
``stat_cache`` object in an ``FTPHost``'s namespace. For example,
after calling

::

    host = ftputil.FTPHost(host, user, password, account,
                           session_factory=ftplib.FTP)

the cache can be accessed as ``host.stat_cache``.

While ``ftputil`` usually manages the cache quite well, there are two
possible reasons that may suggest modifying cache parameters.
The first is when the number of possible entries is too low. You may
notice that when you are processing very large directories, e. g.
containing more than 1000 directories or files, and the program
becomes much slower than before. It's common for code to read a
directory with ``listdir`` and then process the found directories and
files. For this application, it's a good rule of thumb to set the
cache size to somewhat more than the number of directory entries
fetched with ``listdir``. This is done by the ``resize`` method::

    host.stat_cache.resize(2000)

where the argument is the maximum number of ``lstat`` results to store
(the default is 1000). Note that each path on the server, e. g.
"/home/schwa/some_dir", corresponds to a single cache entry. Methods
like ``exists`` or ``getmtime`` all derive their results from a
previously fetched ``lstat`` result.

The value 2000 above means that the cache will hold at most 2000
entries. If more are about to be stored, the entries which haven't
been used for the longest time will be deleted to make room for newer
entries.
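
The replacement strategy described above can be pictured with a toy
model. This is *not* the cache implementation of ``ftputil``; the
class name ``TinyLRUCache`` is made up and the sketch only illustrates
how the least recently used entry is dropped once the size limit is
exceeded::

    import collections

    class TinyLRUCache(object):
        def __init__(self, size):
            self.size = size
            self._entries = collections.OrderedDict()

        def __setitem__(self, path, stat_result):
            # re-inserting moves the entry to the "most recently used" end
            self._entries.pop(path, None)
            self._entries[path] = stat_result
            while len(self._entries) > self.size:
                # drop the entry that hasn't been used for the longest time
                self._entries.popitem(last=False)

        def __getitem__(self, path):
            stat_result = self._entries.pop(path)  # `KeyError` if missing
            self._entries[path] = stat_result
            return stat_result

        def __contains__(self, path):
            return path in self._entries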

Caching is so effective because it reduces network accesses. This can
also be a disadvantage if the file system data on the remote server
changes after a stat result has been retrieved; the client, when
looking at the cached stat data, will use obsolete information.

There are two ways to get such out-of-date stat data. The first
happens when an ``FTPHost`` instance modifies a file path for which it
has a cache entry, e. g. by calling ``remove`` or ``rmdir``. Such
changes are handled transparently; the path will be deleted from the
cache. The second way involves changes unknown to the ``FTPHost``
object which inspects its cache. The obvious example is a change made
by a program running on the remote host. However, cache
inconsistencies can also occur if two ``FTPHost`` objects change a
file system simultaneously::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # `host1` will still see the obsolete cache entry!
        print host1.stat("some_file")
        # will raise an exception since an `FTPHost` object
        #  knows of its own changes
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

At first sight, it may appear to be a good idea to have a shared cache
among several ``FTPHost`` objects. After some thinking, this turns out
to be very error-prone. For example, it won't help with different
processes using ``ftputil``. So, if you have to deal with concurrent
write/read accesses to a server, you have to handle them explicitly.

The most useful tool for this is the ``invalidate`` method. In the
example above, it could be used like this::

    host1 = ftputil.FTPHost(server, user1, password1)
    host2 = ftputil.FTPHost(server, user1, password1)
    try:
        stat_result1 = host1.stat("some_file")
        stat_result2 = host2.stat("some_file")
        host2.remove("some_file")
        # invalidate using an absolute path
        absolute_path = host1.path.abspath(
                        host1.path.join(host1.curdir, "some_file"))
        host1.stat_cache.invalidate(absolute_path)
        # will now raise an exception as it should
        print host1.stat("some_file")
        # would raise an exception since an `FTPHost` object
        #  knows of its own changes, even without `invalidate`
        print host2.stat("some_file")
    finally:
        host1.close()
        host2.close()

The method ``invalidate`` can be used on any *absolute* path, be it a
directory, a file or a link.

By default, the cache entries (if not replaced by newer ones) are
stored for an infinite time. That is, if you start your Python process
using ``ftputil`` and let it run for three days, a stat call may still
access cache data that old. To avoid this, you can set the ``max_age``
attribute::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.max_age = 60 * 60  # = 3600 seconds

This sets the maximum age of entries in the cache to an hour. This
means any entry older won't be retrieved from the cache but its data
instead fetched again from the remote host and then again stored for
up to an hour. To reset ``max_age`` to the default of unlimited age,
i. e. cache entries never expire, use ``None`` as value.

If you are certain that the cache will be in the way, you can disable
and later re-enable it completely with ``disable`` and ``enable``::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    ...
    host.stat_cache.enable()

During that time, the cache won't be used; all data will be fetched
from the network. After enabling the cache, its entries will be the
same as when the cache was disabled, that is, entries won't get
updated with newer data during this period. Note that even when the
cache is disabled, the file system data in the code can become
inconsistent::

    host = ftputil.FTPHost(server, user, password)
    host.stat_cache.disable()
    if host.path.exists("some_file"):
        mtime = host.path.getmtime("some_file")

In that case, the file ``some_file`` may have been removed by another
process between the calls to ``exists`` and ``getmtime``!
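
One way to avoid this race is to skip the ``exists`` check and handle
the error from the second call instead. The sketch below uses the
hypothetical name ``getmtime_or_none`` and assumes that the failure
surfaces as an ``OSError``, from which ``FTPOSError`` is derived::

    def getmtime_or_none(host, path):
        """Return the modification time of `path` or `None` if unavailable."""
        try:
            return host.path.getmtime(path)
        except OSError:
            # the path vanished or never existed
            return None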

Iteration over directories
``````````````````````````

.. _`FTPHost.walk`:

- ``walk(top, topdown=True, onerror=None)``

  iterates over a directory tree, similar to `os.walk`_. Actually,
  ``FTPHost.walk`` uses the code from Python with just the necessary
  modifications, so see the linked documentation.
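
As a small sketch of ``walk`` in use, the following function adds up
the sizes of all files below a directory. The function name
``total_size`` is an assumption for this example and ``host`` stands
for an ``FTPHost`` instance::

    def total_size(host, top):
        """Return the summed size in bytes of all files below `top`."""
        total = 0
        for dirpath, dirnames, filenames in host.walk(top):
            for filename in filenames:
                total += host.path.getsize(host.path.join(dirpath, filename))
        return total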

.. _`os.walk`: http://www.python.org/doc/2.5/lib/os-file-dir.html#l2h-2707

.. _`FTPHost.path.walk`:

- ``path.walk(path, func, arg)``

  Similar to ``os.path.walk``, the ``walk`` method in
  `FTPHost.path`_ can be used, though ``FTPHost.walk`` is probably
  easier to use.

Other methods
`````````````

- ``close()``

  closes the connection to the remote host. After this, no more
  interaction with the FTP server is possible without using a new
  ``FTPHost`` object.

- ``rename(source, target)``

  renames the source file (or directory) on the FTP server.

.. _`FTPHost.chmod`:

- ``chmod(path, mode)``

  sets the access mode (permission flags) for the given path. The mode
  is an integer as returned for the mode by the ``stat`` and ``lstat``
  methods. Be careful: Usually, mode values are written as octal
  numbers, for example 0755 to make a directory readable and writable
  for the owner, but not writable for the group and others. If you
  want to use such octal values, rely on Python's support for them::

    host.chmod("some_directory", 0755)

  *Note the leading zero.*

  Not all FTP servers support the ``chmod`` command. In case of
  an exception, how do you know if the path doesn't exist or if
  the command itself is invalid? If the FTP server complies with
  `RFC 959`_, it should return a status code 502 if the ``SITE CHMOD``
  command isn't allowed. ``ftputil`` maps this special error
  response to a ``CommandNotImplementedError`` which is derived from
  ``PermanentError``.

  So you need to write code like this::

    import ftputil
    from ftputil import ftp_error

    host = ftputil.FTPHost(server, user, password)
    try:
        host.chmod("some_file", 0644)
    except ftp_error.CommandNotImplementedError:
        # chmod not supported
        ...
    except ftp_error.PermanentError:
        # possibly a non-existent file
        ...

  Because the ``CommandNotImplementedError`` is more specific, you
  have to test for it first.
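
If you prefer not to write octal literals, the mode integer can also
be assembled from the constants in the ``stat`` module of the standard
library. The name ``DIRECTORY_MODE`` is only chosen for this sketch::

    import stat

    # rwx for the owner, r-x for the group and for others;
    # this is the same integer as octal 755
    DIRECTORY_MODE = (stat.S_IRWXU |
                      stat.S_IRGRP | stat.S_IXGRP |
                      stat.S_IROTH | stat.S_IXOTH)

    # Hypothetical usage with an `FTPHost` instance `host`:
    # host.chmod("some_directory", DIRECTORY_MODE)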

.. _`RFC 959`: `RFC 959 - File Transfer Protocol (FTP)`_

- ``copyfileobj(source, target, length=64*1024)``

  copies the contents from the file-like object ``source`` to the
  file-like object ``target``. The only difference from
  ``shutil.copyfileobj`` is the default buffer size. Note that
  arbitrary file-like objects can be used as arguments (e. g. local
  files, remote FTP files). See `File-like objects`_ for construction
  and use of remote file-like objects.
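
What a ``copyfileobj`` style function does can be pictured as a simple
chunked copy loop. The function below is an illustrative stand-in with
the made-up name ``copy_in_chunks``, not the code from ``ftputil``; any
objects with ``read`` and ``write`` methods can be passed in::

    def copy_in_chunks(source, target, length=64*1024):
        """Copy data from `source` to `target`, `length` bytes at a time."""
        while True:
            chunk = source.read(length)
            if not chunk:
                # an empty result signals the end of the source
                break
            target.write(chunk)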

.. _`set_parser`:

- ``set_parser(parser)``

  sets a custom parser for FTP directories. Note that you have to pass
  in a parser *instance*, not the class.

  An `extra section`_ shows how to write your own parsers if the default
  parsers in ``ftputil`` don't work for you. Possibly you are lucky
  and someone has already written a parser you can use. Please ask on
  the `mailing list`_.

.. _`extra section`: `Writing directory parsers`_


File-like objects
-----------------

Construction
~~~~~~~~~~~~

Basics
``````

``FTPFile`` objects are returned by a call to ``FTPHost.file`` or
``FTPHost.open``; never use the constructor directly.

- ``FTPHost.file(path, mode='r')``

  returns a file-like object that refers to the path on the remote
  host. This path may be absolute or relative to the current directory
  on the remote host (this directory can be determined with the ``getcwd``
  method). As with local file objects the default mode is "r", i. e.
  reading text files. Valid modes are "r", "rb", "w", and "wb".

- ``FTPHost.open(path, mode='r')``

  is an alias for ``file`` (see above).

Support for the ``with`` statement
``````````````````````````````````

If you are sure that all the users of your code use at least Python
2.5, you can use Python's `with statement`_ with the ``FTPFile``
objects returned by ``FTPHost.file`` or ``FTPHost.open``::

    # not needed for Python 2.6 and later
    from __future__ import with_statement

    import ftputil

    # get an ``FTPHost`` object from somewhere
    ...

    with host.file("new_file", "w") as f:
        f.write("This is some text.")

At the end of the ``with`` block, the file will be closed
automatically.

If something goes wrong during the construction of the file or in the
body of the ``with`` statement, the file will be closed as well.
Exceptions will be propagated as with ``try ... finally``.
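
The closing behaviour can be tried without an FTP server. The following
sketch uses an in-memory ``io.StringIO`` object as a stand-in for an
``FTPFile`` object (an assumption made only so that the example runs
locally). It shows that the file is closed even if the body of the
``with`` block raises an exception, and that the exception still
reaches the caller.

```python
import io

stand_in = io.StringIO()  # stand-in for the result of host.file(...)
error_was_propagated = False

try:
    with stand_in as f:
        f.write(u"This is some text.")
        raise ValueError("something went wrong in the block")
except ValueError:
    # the exception from the block arrives here
    error_was_propagated = True

# closed on leaving the block, although an exception occurred
closed_after_block = stand_in.closed
```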

.. _`with statement`: http://www.python.org/dev/peps/pep-0343/

Attributes and methods
~~~~~~~~~~~~~~~~~~~~~~

The methods

::

    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()

and the attribute ``closed`` have the same semantics as for file
objects of a local disk file system. The iterator protocol is
supported as well, i. e. you can use a loop to read a file line by
line::

    host = ftputil.FTPHost(...)
    input_file = host.file("some_file")
    for line in input_file:
        # do something with the line, e. g.
        print line.strip().replace("ftplib", "ftputil")
    input_file.close()

This feature obsoletes the ``xreadlines`` method, which is deprecated
and will be removed in ``ftputil`` version 2.5.

For more on file objects, see the section `File objects`_ in the
Python Library Reference.

.. _`file objects`: http://www.python.org/doc/current/lib/bltin-file-objects.html

Note that ``ftputil`` supports both binary mode and text mode with the
appropriate line ending conversions.


Writing directory parsers
-------------------------

``ftputil`` recognizes the two most widely-used FTP directory formats
(Unix and MS style) and adjusts itself automatically. However, if your
server uses a format which is different from the two provided by
``ftputil``, you can write a custom parser and activate it with a
single method call.

For this, you need to write a parser class by inheriting from the
class ``Parser`` in the ``ftp_stat`` module. Here's an example::

    from ftputil import ftp_error
    from ftputil import ftp_stat

    class XyzParser(ftp_stat.Parser):
        """
        Parse the default format of the FTP server of the XYZ
        corporation.
        """
        def parse_line(self, line, time_shift=0.0):
            """
            Parse a `line` from the directory listing and return a
            corresponding `StatResult` object. If the line can't
            be parsed, raise `ftp_error.ParserError`.

            The `time_shift` argument can be used to fine-tune the
            parsing of dates and times. See the class
            `ftp_stat.UnixParser` for an example.
            """
            # split the `line` argument and examine it further; if
            #  something goes wrong, raise an `ftp_error.ParserError`
            ...
            # make a `StatResult` object from the parts above
            stat_result = ftp_stat.StatResult(...)
            # `_st_name` and `_st_target` are optional
            stat_result._st_name = ...
            stat_result._st_target = ...
            return stat_result

        # define `ignores_line` only if the default in the base class
        #  doesn't do enough!
        def ignores_line(self, line):
            """
            Return a true value if the line should be ignored. For
            example, the implementation in the base class handles
            lines like "total 17". On the other hand, if the line
            should be used for stat'ing, return a false value.
            """
            is_total_line = super(XyzParser, self).ignores_line(line)
            my_test = ...
            return is_total_line or my_test

A ``StatResult`` object is similar to the value returned by
`os.stat`_ and is usually built with statements like

::

    stat_result = StatResult(
                  (st_mode, st_ino, st_dev, st_nlink, st_uid,
                   st_gid, st_size, st_atime, st_mtime, st_ctime) )
    stat_result._st_name = ...
    stat_result._st_target = ...

with the arguments of the ``StatResult`` constructor described in
the following table.

===== ========== ============ =============== =======================
Index Attribute  os.stat type StatResult type Notes
===== ========== ============ =============== =======================
0     st_mode    int          int
1     st_ino     long         long
2     st_dev     long         long
3     st_nlink   int          int
4     st_uid     int          str             usually only available as string
5     st_gid     int          str             usually only available as string
6     st_size    long         long
7     st_atime   int/float    float
8     st_mtime   int/float    float
9     st_ctime   int/float    float
\-    _st_name   \-           str             file name without directory part
\-    _st_target \-           str             link target
===== ========== ============ =============== =======================

If you can't extract all the desirable data from a line (for
example, the MS format doesn't contain any information about the
owner of a file), set the corresponding values in the ``StatResult``
instance to ``None``.
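
As a sketch of the tuple layout in the table, the following function
parses a line of a made-up listing format (``name;size;mtime``, with
``mtime`` in seconds since the epoch) and fills in ``None`` for
everything this format doesn't provide. The format and the function
name ``parse_xyz_line`` are hypothetical; in a real parser the tuple
would be passed to ``ftp_stat.StatResult`` inside ``parse_line``, and
the format would normally provide at least ``st_mode`` as well.

```python
def parse_xyz_line(line):
    """
    Return `(stat_tuple, name)` for a line of the hypothetical
    format "name;size;mtime", e. g. "readme.txt;1024;1258000000".
    """
    parts = line.strip().split(";")
    if len(parts) != 3:
        # a real parser would raise `ftp_error.ParserError` here
        raise ValueError("can't parse line %r" % line)
    name, size, mtime = parts
    st_size = int(size)
    st_mtime = float(mtime)
    # order as in the table: st_mode, st_ino, st_dev, st_nlink, st_uid,
    # st_gid, st_size, st_atime, st_mtime, st_ctime
    stat_tuple = (None, None, None, None, None,
                  None, st_size, None, st_mtime, None)
    return stat_tuple, name

stat_tuple, name = parse_xyz_line("readme.txt;1024;1258000000")
```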

Parser classes can use several helper methods which are defined in
the class ``Parser``:

- ``parse_unix_mode`` parses strings like "drwxr-xr-x" and returns
  an appropriate ``st_mode`` value.

- ``parse_unix_time`` returns a float number usable for the
  ``st_...time`` values by parsing arguments like "Nov"/"23"/"02:33" or
  "May"/"26"/"2005". Note that the method expects the timestamp string
  already split at whitespace.

- ``parse_ms_time`` parses arguments like "10-23-01"/"03:25PM" and
  returns a float number like the one returned by ``time.mktime``.
  Note that the method expects the timestamp string already split at
  whitespace.

Additionally, there's an attribute ``_month_numbers`` which maps
lowercase three-letter month abbreviations to integers.
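
To give an idea of what a helper like ``parse_unix_mode`` has to do,
here is a standalone sketch that converts a mode string to an
``st_mode`` value with Python's ``stat`` module. It is not the code
used in ``ftputil``; the function name is made up, and setuid, setgid
and sticky flags are deliberately left out.

```python
import stat

# permission bits in the order they appear in a string like "rwxr-xr-x"
PERMISSION_BITS = [
    stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR,
    stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP,
    stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH,
]
FILE_TYPES = {"d": stat.S_IFDIR, "l": stat.S_IFLNK, "-": stat.S_IFREG}

def unix_mode_to_st_mode(mode_string):
    """Convert a string like "drwxr-xr-x" to an `st_mode` integer."""
    if len(mode_string) != 10 or mode_string[0] not in FILE_TYPES:
        raise ValueError("unsupported mode string %r" % mode_string)
    st_mode = FILE_TYPES[mode_string[0]]
    for char, expected, bit in zip(mode_string[1:], "rwxrwxrwx",
                                   PERMISSION_BITS):
        if char == expected:
            st_mode |= bit
        elif char != "-":
            # special flags (e. g. "s" or "t") aren't handled here
            raise ValueError("unsupported mode string %r" % mode_string)
    return st_mode

directory_mode = unix_mode_to_st_mode("drwxr-xr-x")
```

The result can be examined with the functions from the ``stat`` module,
for example ``stat.S_ISDIR`` and ``stat.S_IMODE``.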

For more details, see the two "standard" parsers ``UnixParser`` and
``MSParser`` in the module ``ftp_stat.py``.

To actually *use* the parser, call the method `set_parser`_ of the
``FTPHost`` instance.

If you can't write a parser or don't want to, please ask on the
`ftputil mailing list`_. Possibly someone has already written a parser
for your server or can help to do it.


FAQ / Tips and tricks
---------------------

Where can I get the latest version?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See the `download page`_. Announcements will be sent to the `mailing
list`_. Announcements of major updates will also be posted to the
newsgroup `comp.lang.python`_.

.. _`download page`: http://ftputil.sschwarzer.net/download
.. _`mailing list`: http://ftputil.sschwarzer.net/mailinglist
.. _`comp.lang.python`: news:comp.lang.python

Is there a mailing list on ``ftputil``?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Yes, please visit http://ftputil.sschwarzer.net/mailinglist to
subscribe or read the archives.

Though you can *technically* post without subscribing first, I can't
recommend that: mails from non-subscribers have to be approved by
me, and because the arriving mails contain *lots* of spam, I rarely go
through them.

I found a bug! What now?
~~~~~~~~~~~~~~~~~~~~~~~~

Before reporting a bug, make sure that you have already tried the
`latest version`_ of ``ftputil``. The bug might already be fixed there.

.. _`latest version`: http://ftputil.sschwarzer.net/download

Please see http://ftputil.sschwarzer.net/issuetrackernotes for
guidelines on entering a bug in ``ftputil``'s ticket system. If you
are unsure whether the behaviour you found is a bug, you should write
to the `ftputil mailing list`_. In *either* case you *must not*
include confidential information (user id, password, file names, etc.)
in the problem report! Be careful!

Does ``ftputil`` support SSL?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``ftputil`` has no *built-in* SSL support. On the other hand,
you can use M2Crypto_ (in the source code archive, look for the
file ``M2Crypto/ftpslib.py``) which has a class derived from
``ftplib.FTP`` that supports SSL. You then can use a class
(not an object of it) similar to the following as a "session
factory" in ``ftputil.FTPHost``'s constructor::

    import ftputil

    from M2Crypto import ftpslib

    class SSLFTPSession(ftpslib.FTP_TLS):
        def __init__(self, host, userid, password, port=21):
            """
            Use M2Crypto's `FTP_TLS` class to establish an
            SSL connection.
            """
            ftpslib.FTP_TLS.__init__(self)
            # do anything necessary to set up the SSL connection
            ...
            # `port` defaults to the standard FTP control port
            self.connect(host, port)
            self.login(userid, password)
            ...

    # note the `session_factory` parameter
    host = ftputil.FTPHost(host, userid, password,
                           session_factory=SSLFTPSession)
    # use `host` as usual

.. _M2Crypto: http://wiki.osafoundation.org/bin/view/Projects/MeTooCrypto#Downloads

Connecting on another port
~~~~~~~~~~~~~~~~~~~~~~~~~~

By default, an instantiated ``FTPHost`` object connects on the usual
FTP ports. If you have to use a different port, refer to the
section `FTPHost construction`_.

You can use the same approach to connect in active or passive mode, as
you like.

Using active or passive connections
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a wrapper class for ``ftplib.FTP``, as described in section
`FTPHost construction`_::

    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password, port=21):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # `port` defaults to the standard FTP control port
            self.connect(host, port)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)

Use this class as the ``session_factory`` argument in ``FTPHost``'s
constructor.

Conditional upload/download to/from a server in a different time zone
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You may find that ``ftputil`` uploads or downloads files
unnecessarily, or not when it should. This can happen when the FTP
server is in a different time zone than the client on which
``ftputil`` runs. Please see the section on `time zone correction`_.
It may even be sufficient to call `synchronize_times`_.
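
To see why a time zone difference leads to wrong decisions, consider
the following sketch of a "download only if newer" check. It is not
``ftputil``'s implementation; the function name and the convention that
the time shift is server time minus client time (in seconds) are
assumptions made for this illustration.

```python
def remote_is_newer(remote_mtime, local_mtime, time_shift=0.0):
    """
    Return a true value if the remote file is newer than the local one.

    `remote_mtime` is given in the server's time, `local_mtime` in the
    client's time. `time_shift` is assumed to be server time minus
    client time, in seconds.
    """
    remote_mtime_in_client_time = remote_mtime - time_shift
    return remote_mtime_in_client_time > local_mtime

# The server's clock is two hours ahead of the client's clock.
time_shift = 2 * 60 * 60
local_mtime = 1258000000.0
# The remote file was modified ten minutes *before* the local copy,
# but the server reports the time in its own time zone.
remote_mtime = local_mtime - 600 + time_shift

without_correction = remote_is_newer(remote_mtime, local_mtime)
with_correction = remote_is_newer(remote_mtime, local_mtime, time_shift)
```

Without the correction, the older remote file looks newer and would be
downloaded unnecessarily; with the correction it is skipped.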

Wrong dates or times when stat'ing on a server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please see the previous tip.

I tried to upload or download a file and it's corrupt
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Perhaps you used the upload or download methods without a ``mode``
argument. For compatibility with Python's code for local file systems,
``ftputil`` defaults to ASCII/text mode, which converts anything that
looks like a line ending and thus corrupts binary files. Pass "b" as
the ``mode`` argument (see `Uploading and downloading files`_).
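
The effect can be reproduced without a server. The sketch below applies
a simple line ending conversion, as a text mode transfer might, to some
binary data. The exact conversion that ``ftputil`` performs may differ,
and the helper name ``convert_line_endings`` is made up for this
example.

```python
def convert_line_endings(data):
    """Replace CR LF pairs with LF, as a text mode download might do."""
    return data.replace(b"\r\n", b"\n")

# Binary data (e. g. the start of an image file) that happens to
# contain the byte pair 0x0D 0x0A twice.
binary_data = b"\x89PNG\r\n\x1a\n\x00\x01\r\n\x02"

converted = convert_line_endings(binary_data)
# number of bytes that silently disappeared from the "file"
bytes_lost = len(binary_data) - len(converted)
```

The converted data is shorter than the original, so a program that
reads it as binary content will most likely reject it.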

When I use ``ftputil``, all I get is a ``ParserError`` exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The FTP server you connect to uses a directory format that
``ftputil`` doesn't understand. You can either write and
`plug in your own parser`_, or preferably ask on the `mailing list`_
for help.

.. _`plug in your own parser`: `Writing directory parsers`_

``isdir``, ``isfile`` or ``islink`` incorrectly return ``False``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Like Python's counterparts under `os.path`_, ``ftputil``'s methods
return ``False`` if they can't find the given path.

Probably you used ``listdir`` on a directory and called ``is...()`` on
the returned names. But if the argument for ``listdir`` wasn't the
current directory, the paths won't be found and so all ``is...()``
variants will return ``False``.
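
Because the methods behave like their local counterparts, the pitfall
and its fix can be shown with ``os`` and ``os.path`` on a temporary
local directory. With ``ftputil``, the corresponding calls would be
``host.listdir``, ``host.path.join`` and ``host.path.isfile``. The
sketch assumes that the current directory doesn't contain a file with
the same (randomly generated) name.

```python
import os
import shutil
import tempfile

directory = tempfile.mkdtemp()
try:
    # create one file in a directory that is *not* the current directory
    file_descriptor, file_path = tempfile.mkstemp(dir=directory)
    os.close(file_descriptor)

    names = os.listdir(directory)
    # wrong: bare names are looked up relative to the current directory
    found_by_name = [os.path.isfile(name) for name in names]
    # right: join the directory and the name before testing
    found_by_path = [os.path.isfile(os.path.join(directory, name))
                     for name in names]
finally:
    shutil.rmtree(directory)
```

Alternatively, change to the directory first (``host.chdir`` in
``ftputil``) and then test the bare names.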

I don't find an answer to my problem in this document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Please send an email with your problem report or question to the
`ftputil mailing list`_, and we'll see what we can do for you. :-)


Bugs and limitations
--------------------

- ``ftputil`` needs at least Python 2.3 to work.

- Due to the implementation of ``lstat``, it cannot return a sensible
  value for the root directory ``/``, though stat'ing entries *in* the
  root directory isn't a problem. If you know an implementation that
  can do this, please let me know. The root directory is handled
  appropriately in ``FTPHost.path.exists/isfile/isdir/islink``, though.

- Timeouts of individual child sessions currently are not handled.
  This is only a problem if your ``FTPHost`` object or the generated
  ``FTPFile`` objects are inactive for about ten minutes or longer.

- Until now, I haven't paid attention to thread safety. In principle,
  at least, different ``FTPFile`` objects should be usable in different
  threads. If you are in doubt whether your approach will work, ask on
  the mailing list.

- ``FTPFile`` objects in text mode *may not* support charsets with
  more than one byte per character. Please e-mail your experiences to
  the mailing list (see above), if you work with multibyte text
  streams in FTP sessions.

- Currently, it is not possible to continue an interrupted upload or
  download. Contact me if you have problems with that.

- There's exactly one cache for lstat results for each ``FTPHost``
  object, i. e. there's no sharing of cache results determined by
  several ``FTPHost`` objects.


Files
-----

If not overridden via installation options, the ``ftputil`` files
reside in the ``ftputil`` package. The documentation in
`reStructuredText`_ and in HTML format is in the same directory.

.. _`reStructuredText`: http://docutils.sourceforge.net/rst.html

The files ``_test_*.py`` and ``_mock_ftplib.py`` are for unit-testing.
If you only *use* ``ftputil``, i. e. *don't* modify it, you can
delete these files.


References
----------

- Mackinnon T, Freeman S, Craig P. 2000. `Endo-Testing:
  Unit Testing with Mock Objects`_.

- Postel J, Reynolds J. 1985. `RFC 959 - File Transfer Protocol (FTP)`_.

- Van Rossum G, Drake Jr FL. 2003. `Python Library Reference`_.

.. _`Endo-Testing: Unit Testing with Mock Objects`:
   http://www.connextra.com/aboutUs/mockobjects.pdf
.. _`RFC 959 - File Transfer Protocol (FTP)`: http://www.ietf.org/rfc/rfc959.txt
.. _`Python Library Reference`: http://www.python.org/doc/current/lib/lib.html


Authors
-------

``ftputil`` is written by Stefan Schwarzer
<sschwarzer@sschwarzer.net>, in part based on suggestions
from users.

The ``lrucache`` module is written by Evan Prodromou
<evan@prodromou.name>.

Feedback is appreciated. :-)

