ProFTPD Developer's Guide: Introduction

ProFTPD Version 1.2

ProFTPD is an FTP server modeled around the Apache HTTP server, with a similar configuration file syntax and modular structure. In light of this similarity, I have borrowed heavily from (i.e. plagiarized) the Apache API documentation, as many of the concepts are the same. Some of the words and explanations below are not mine.

These are some notes on the ProFTPD API and the data structures you have to deal with. They are far from complete, but hopefully they will help you get your bearings. Keep in mind that the API is still subject to change as we gain experience with it. However, it should be easy to adapt modules to any changes that are made.

Handlers, Modules, and Commands
ProFTPD breaks down command handling into a series of simple steps, similar to the way the Apache API breaks down request handling. These are:

Preprocess the command
Process the command
Postprocess the command
Log the command

These phases are handled by looking at each of a succession of modules, checking whether each of them has a handler for the phase, and attempting to invoke it if so. The handler can typically do one of three things:

Handle the command, and indicate to the processing engine that it should consider the command completed, and continue its processing.
Decline to handle the command, and indicate to the processing engine that it should act as if the handler was never called, and continue its processing.
Signal an error, by returning one of the FTP error codes. This terminates normal handling of the request; the command may be logged.

Most phases are terminated by the first module that handles them. The handlers themselves are functions of one argument (a cmd_rec structure), which return a MODRET (a modret_struc typedefed to MODRET).
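The walk over the modules described above can be sketched in a few lines of C. This is only an illustration, not ProFTPD's actual dispatcher: every name prefixed with sk_ is hypothetical, the command record is reduced to two strings, and the sketch assumes that HANDLED always ends the phase.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three handler outcomes. */
typedef enum { SK_HANDLED, SK_DECLINED, SK_ERROR } sk_result;

/* A hypothetical, stripped-down command record. */
typedef struct {
  const char *name;  /* e.g. "RETR" */
  const char *arg;   /* e.g. "file.txt" */
} sk_cmd;

typedef sk_result (*sk_handler)(sk_cmd *);

/* Walk the handlers in order.  DECLINED moves on to the next handler;
 * HANDLED or ERROR ends the phase.  Returns SK_DECLINED if no handler
 * took the command.
 */
sk_result sk_dispatch(sk_handler *handlers, size_t count, sk_cmd *cmd) {
  size_t i;

  assert(handlers != NULL && cmd != NULL);

  for (i = 0; i < count; i++) {
    sk_result res;

    if (handlers[i] == NULL)
      continue;   /* this module has no handler for the phase */

    res = handlers[i](cmd);
    if (res != SK_DECLINED)
      return res;
  }

  return SK_DECLINED;
}

/* Three trivial example handlers. */
sk_result sk_always_decline(sk_cmd *cmd) { (void) cmd; return SK_DECLINED; }
sk_result sk_always_handle(sk_cmd *cmd) { (void) cmd; return SK_HANDLED; }
sk_result sk_always_fail(sk_cmd *cmd) { (void) cmd; return SK_ERROR; }
```

A NULL slot in the array plays the role of a module that has no handler for the phase, so it is skipped just as a declining handler would be.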

At this point, we need to explain the structure of a module. Our candidate will be one of the simple ones, the "case" module -- this will alter the case (e.g. lowercase or uppercase) of the letters in the name of the file requested by a client for download, before the server looks up that file.

Let's start with the command handlers. In order to catch the names of the requested files before the server retrieves them, the module declares a command handler that is interested in handling any download commands issued by a client.

This module also contains code to handle the DowncaseFileNames and UpcaseFileNames configuration directives themselves. To handle these, modules have configuration tables which declare their directives and the corresponding configuration directive handlers. A configuration directive handler performs such checks as whether the directive is in an appropriate context, whether the arguments are correct, etc.

A final note on the declared types of the arguments of some of these commands: a pool is a pointer to a resource pool structure; these are used by the server to keep track of the memory which has been allocated, files opened, etc., either to service a particular request, or to handle the process of configuration itself. That way, when the request is over (or, for the configuration pool, when the server is restarting), the memory can be freed, and the files closed, en masse, without anyone having to write explicit code to track them all down and dispose of them.

With no further ado, the module itself:

  /* Declaration of command handler
   */
  MODRET fixup_filenames(cmd_rec *);

  /* Declaration of configuration directive handlers
   */
  MODRET set_lowercase_filenames(cmd_rec *);
  MODRET set_uppercase_filenames(cmd_rec *);

  /* Define the "configuration handler" table, which links configuration file
   * directives with the appropriate handlers in this module
   */
  static conftable case_conftab[] = {
    { "DowncaseFileNames", set_lowercase_filenames, NULL },
    { "UpcaseFileNames", set_uppercase_filenames, NULL },
    { NULL }
  };

  /* Define the "command handler" table, which links client-issued commands
   * with the interested handlers in this module
   */
  static cmdtable case_cmdtab[] = {
    { PRE_CMD, C_RETR, G_NONE, fixup_filenames, TRUE, FALSE },
    { 0, NULL }
  };

  module case_module = {

    NULL,                   /* pointer to the next module -- ALWAYS NULL */
    NULL,                   /* pointer to the previous module -- ALWAYS NULL */
    0x20,                   /* ProFTPD Module API version 2.0 */
    "case",                 /* the module's name */
    case_conftab,           /* configuration command handler table */
    case_cmdtab,            /* command handler table */
    NULL,                   /* authentication function table */
    NULL,                   /* initialization function */
    NULL                    /* "child" initialization function */
  };

How Handlers Work
The sole argument to handlers is a cmd_rec structure. This structure describes a particular command which has been sent to the server by a client. Each connection by a client generates multiple cmd_rec structures, starting with the USER command.

The cmd_rec contains pointers to a resource pool which will be cleared when the server is finished handling the command, to structures containing per-server information, and, most importantly, to information on the command itself.

Also present are pointers to private data a handler has built in the course of servicing the command (so modules' handlers for one phase can pass `notes' to their handlers for other phases), and to a server_rec, which contains per (virtual) server configuration data.

Here is a declaration of the cmd_rec struct.

Most cmd_rec structures are built when the processing engine reads an FTP command from a client and fills in the fields. The filled-in cmd_rec is then handed off to the command handlers that have registered an interest in handling that particular FTP command.
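To make the "reads a command and fills in the fields" step concrete, here is a small sketch that splits one FTP command line into a verb and an argument. It is not the real cmd_rec or the real parser: sk_cmd, sk_parse_command() and the size limits are all made up for illustration.

```c
#include <assert.h>
#include <string.h>

#define SK_NAME_MAX 16    /* assumed limit on the verb length */
#define SK_ARG_MAX  256   /* assumed limit on the argument length */

/* Hypothetical, much-simplified command record. */
typedef struct {
  char name[SK_NAME_MAX];  /* command verb, e.g. "RETR" */
  char arg[SK_ARG_MAX];    /* everything after the first space */
} sk_cmd;

/* Split "VERB argument text\r\n" into verb and argument.
 * Returns 0 on success, -1 if the verb is empty or a part is too long.
 */
int sk_parse_command(const char *line, sk_cmd *cmd) {
  size_t len, verb_len, arg_len;
  const char *space;

  assert(line != NULL && cmd != NULL);

  /* Ignore the CRLF that terminates an FTP command line. */
  len = strcspn(line, "\r\n");

  space = memchr(line, ' ', len);
  verb_len = space ? (size_t) (space - line) : len;
  if (verb_len == 0 || verb_len >= SK_NAME_MAX)
    return -1;

  arg_len = space ? len - verb_len - 1 : 0;
  if (arg_len >= SK_ARG_MAX)
    return -1;

  memcpy(cmd->name, line, verb_len);
  cmd->name[verb_len] = '\0';

  if (arg_len > 0)
    memcpy(cmd->arg, space + 1, arg_len);
  cmd->arg[arg_len] = '\0';

  return 0;
}
```

Given the line "RETR file.txt" followed by CRLF, the sketch stores "RETR" as the verb and "file.txt" as the argument; the filled-in record is what would then be handed to the interested handlers.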

Command Responses
As discussed above, each handler, when invoked to handle a particular cmd_rec, has to return a MODRET, usually one generated by some macros, to indicate what happened. That can be one of:

HANDLED -- the command was handled successfully. This may or may not terminate the phase.
DECLINED -- no erroneous condition exists, but the module declines to handle the phase; the server tries to find another handler.
ERROR -- an error has occurred while processing the command, which aborts its handling.

There are two main ways to respond to a client command inside a command handler. The first way, which bypasses the internal response chain, is incompatible with other handlers, and should only be used if the handler is about to terminate the current connection (usually with end_login()). In that situation the first method must be used, because the internal response lists will never be processed by the processing engine once the connection has been terminated.

The second, and preferred, method of transmitting numeric plus text message responses to clients is via the internal response chain. Using this allows all handlers to add their own individual responses which will all be sent en masse after the command successfully completes (or fails).
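The idea of a response chain can be sketched as a small queue that handlers append to and that is written out in one go afterwards. This is not ProFTPD's response API; the sk_ names and the fixed-size list are assumptions made for the example. The output format follows the usual FTP multi-line reply convention (a hyphen after the code on every line but the last), and assumes all queued responses share one reply code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SK_MAX_RESPONSES 8
#define SK_MSG_MAX       128

typedef struct {
  int  code;               /* FTP numeric reply code, e.g. 226 */
  char msg[SK_MSG_MAX];
} sk_response;

typedef struct {
  sk_response items[SK_MAX_RESPONSES];
  size_t count;
} sk_response_list;

/* Queue a response.  Returns 0 on success, -1 if the list is full. */
int sk_add_response(sk_response_list *list, int code, const char *msg) {
  assert(list != NULL && msg != NULL);

  if (list->count >= SK_MAX_RESPONSES)
    return -1;

  list->items[list->count].code = code;
  snprintf(list->items[list->count].msg, SK_MSG_MAX, "%s", msg);
  list->count++;
  return 0;
}

/* Format every queued response into buf, oldest first, then empty the
 * list.  Returns the number of characters written, or -1 if buf is too
 * small (in which case the list is left untouched).
 */
int sk_flush_responses(sk_response_list *list, char *buf, size_t buflen) {
  size_t i, used = 0;

  assert(list != NULL && buf != NULL && buflen > 0);
  buf[0] = '\0';

  for (i = 0; i < list->count; i++) {
    char sep = (i + 1 < list->count) ? '-' : ' ';
    int n = snprintf(buf + used, buflen - used, "%d%c%s\r\n",
      list->items[i].code, sep, list->items[i].msg);

    if (n < 0 || (size_t) n >= buflen - used)
      return -1;
    used += (size_t) n;
  }

  list->count = 0;
  return (int) used;
}
```

Because nothing is written until the flush, every handler gets a chance to add its own line before the client sees any of them.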

Here is more detailed information on ProFTPD's response chain mechanisms.

Authentication Handlers
The processing of authentication commands is a little different from the other FTP commands.

NOTE: Stuff that should be discussed here:

authentication commands of USER, PASS (RFC2228 AUTH, ADAT)
authtab and specific authentication handlers (mod_unixpw and mod_lap examples)
relevant FTP error response codes

Logging Handlers
The logging of commands occurs as part of the handling process, and can be done at multiple points in the process.

Stuff that should be discussed here:

LOG_CMD, LOG_CMD_ERR
mod_log, mod_xfer, mod_sample's log_cmd()

Resource Allocation and Resource Pools
One of the problems of writing and designing a server-pool server is that of preventing leakage, that is, allocating resources (memory, open files, etc.) without subsequently releasing them. The resource pool machinery is designed to make it easy to prevent this from happening, by allowing resources to be allocated in such a way that they are automatically released when the server is done with them.

The way this works is as follows: the memory which is allocated, files opened, etc., to deal with a particular command are tied to a resource pool which is allocated for the command. The pool is a data structure which itself tracks the resources in question.

When the command has been processed, the pool is cleared. At that point, all the memory associated with it is released for reuse, all files associated with it are closed, and any other clean-up functions which are associated with the pool are run. When this is over, we can be confident that all the resources tied to the pool have been released, and that none of them have leaked.
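The "tie a resource to a pool, release everything when the pool is cleared" idea can be sketched with a linked list of cleanup callbacks. This is a toy built on malloc(), not ProFTPD's pool implementation, and all sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sk_cleanup {
  void (*fn)(void *);        /* what to run when the pool is cleared */
  void *data;                /* the resource it applies to */
  struct sk_cleanup *next;
} sk_cleanup;

typedef struct {
  sk_cleanup *cleanups;      /* most recently registered first */
} sk_pool;

/* Tie a resource to the pool.  Returns 0 on success, -1 on failure. */
int sk_register_cleanup(sk_pool *p, void *data, void (*fn)(void *)) {
  sk_cleanup *c;

  assert(p != NULL && fn != NULL);

  c = malloc(sizeof(*c));
  if (c == NULL)
    return -1;

  c->fn = fn;
  c->data = data;
  c->next = p->cleanups;
  p->cleanups = c;
  return 0;
}

/* Allocate memory whose lifetime is tied to the pool. */
void *sk_pool_alloc(sk_pool *p, size_t size) {
  void *mem = malloc(size);

  if (mem != NULL && sk_register_cleanup(p, mem, free) < 0) {
    free(mem);
    return NULL;
  }
  return mem;
}

/* Release everything tied to the pool and return how many cleanups ran.
 * The pool itself stays usable afterwards.
 */
int sk_clear_pool(sk_pool *p) {
  int ran = 0;

  assert(p != NULL);

  while (p->cleanups != NULL) {
    sk_cleanup *c = p->cleanups;

    p->cleanups = c->next;
    c->fn(c->data);
    free(c);
    ran++;
  }
  return ran;
}

/* Example cleanup: bump a counter, standing in for "close this file". */
void sk_count_cleanup(void *data) { (*(int *) data)++; }
```

The caller never frees anything individually; one sk_clear_pool() call at the end of the command releases the lot, which is the property the real pools are designed to give you.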

Server restarts, and allocation of memory and resources for per-server configuration, are handled in a similar way. There is a configuration pool, which keeps track of resources which were allocated while reading the server configuration files, and handling the commands therein (for instance, the memory that was allocated for per-server module configuration, log files and other files that were opened, and so forth). When the server restarts, and has to reread the configuration files, the configuration pool is cleared, and so the memory and file descriptors which were taken up by reading them the last time are made available for reuse.

We begin here by describing how memory is allocated to pools, and then discuss how other resources are tracked by the resource pool machinery.

Allocation of memory in pools
Memory is allocated to pools by calling the function palloc(), which takes two arguments, one being a pointer to a resource pool structure, and the other being the amount of memory to allocate (in bytes). Within handlers for handling commands, the most common way of getting a resource pool structure is by looking at the pool (or tmp_pool, if appropriate) slot of the relevant cmd_rec; hence the repeated appearance of the following idiom in module code:

  MODRET my_handler(cmd_rec *cmd) {
      struct my_structure *foo;
      ...

      /* the cast and sizeof must both name the structure type */
      foo = (struct my_structure *) palloc(cmd->pool, sizeof(struct my_structure));
  }

Note that there is no pfree() -- palloc()ed memory is freed only when the associated resource pool is cleared. This means that palloc() does not have to do as much accounting as malloc(); all it does in the typical case is to round up the size, bump a pointer, and do a range check.
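The "round up the size, bump a pointer, and do a range check" description maps onto a classic bump allocator. The sketch below shows the technique on a single fixed block; the real allocator manages chains of blocks, and the sk_ names, the block size and the 8-byte alignment are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SK_ARENA_SIZE 1024   /* assumed block size, a multiple of SK_ALIGN */
#define SK_ALIGN      8      /* assumed alignment */

typedef struct {
  union {
    unsigned char bytes[SK_ARENA_SIZE];
    long double force_align_ld;   /* only here to align the block */
    void *force_align_ptr;
  } block;
  size_t used;                    /* offset of the first free byte */
} sk_arena;

/* Hand out `size` bytes from the arena, or NULL if it does not fit. */
void *sk_arena_alloc(sk_arena *a, size_t size) {
  size_t rounded;
  void *mem;

  assert(a != NULL);

  /* Reject oversized requests first so the rounding cannot overflow. */
  if (size > SK_ARENA_SIZE - a->used)
    return NULL;

  /* Round up to the alignment, then do the range check. */
  rounded = (size + SK_ALIGN - 1) & ~(size_t) (SK_ALIGN - 1);
  if (rounded > SK_ARENA_SIZE - a->used)
    return NULL;

  /* Bump the pointer. */
  mem = a->block.bytes + a->used;
  a->used += rounded;
  return mem;
}

/* "Free" everything at once by resetting the offset. */
void sk_arena_clear(sk_arena *a) {
  assert(a != NULL);
  a->used = 0;
}
```

There is no per-allocation bookkeeping at all, which is why there is nothing for a pfree() to work with: the only way to give memory back is to reset the whole arena.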

Allocating initialized memory
There are functions which allocate initialized memory, and are frequently useful. The function pcalloc() has the same interface as palloc(), but clears out the memory it allocates before it returns it. The function pstrdup() takes a resource pool and a char * as arguments (pstrndup() takes an additional int), and allocates memory for a copy of the string the pointer points to, returning a pointer to the copy. Finally, pstrcat() is a varargs-style function, which takes a pointer to a resource pool followed by any number of char * arguments, the last of which should be NULL. It allocates enough memory to fit copies of each of the strings, as a unit; for instance:

     pstrcat(cmd->pool, "foo", "/", "bar", NULL);

returns a pointer to 8 bytes worth of memory, initialized to "foo/bar".
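For readers who have not written a NULL-terminated varargs function before, here is roughly how such a concatenation can be done. The sketch uses malloc() rather than a pool, so the caller has to free() the result; sk_strcat_all() is a hypothetical name, not part of the ProFTPD API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate a NULL-terminated list of strings into one malloc()ed
 * buffer.  Returns NULL if the allocation fails.
 */
char *sk_strcat_all(const char *first, ...) {
  va_list ap;
  const char *s;
  size_t total = 1;   /* room for the terminating NUL */
  char *result, *end;

  /* First pass: add up the lengths. */
  va_start(ap, first);
  for (s = first; s != NULL; s = va_arg(ap, const char *))
    total += strlen(s);
  va_end(ap);

  result = malloc(total);
  if (result == NULL)
    return NULL;

  /* Second pass: copy each piece into place. */
  end = result;
  va_start(ap, first);
  for (s = first; s != NULL; s = va_arg(ap, const char *)) {
    size_t len = strlen(s);

    memcpy(end, s, len);
    end += len;
  }
  va_end(ap);

  /* Both passes must have seen the same number of characters. */
  assert((size_t) (end - result) == total - 1);

  *end = '\0';
  return result;
}
```

Calling sk_strcat_all("foo", "/", "bar", (char *) NULL) allocates 8 bytes, seven for the characters and one for the terminating NUL. The cast on the final NULL is a precaution for platforms where a bare NULL in a variable argument list is not passed as a pointer.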

For almost everything folks do, cmd->pool is the pool to use. For memory needed just for the scope of the handler function, cmd->tmp_pool is also useful.

Tracking open files, etc.
As indicated above, resource pools are also used to track other sorts of resources besides memory. The most common are open files. The routine which is typically used for this is pfopen(), which takes a resource pool and two strings as arguments; the strings are the same as the typical arguments to fopen(), e.g.:

     ...

     FILE *f = pfopen(cmd->pool, cmd->args, "r");

     if (f == NULL) { ... } else { ... }

     ...

There is also a popenf() routine, which parallels the lower-level open() system call. Both of these routines arrange for the file to be closed when the resource pool in question is cleared.

Unlike the case for memory, there are functions to close files allocated with pfopen() and popenf(), namely pfclose() and pclosef(). (This is because, on many systems, the number of files which a single process can have open is quite limited.) If you do close such a file yourself, it is important to use these functions rather than fclose() or close(): otherwise the pool will close the file a second time when it is cleared, which could cause fatal errors on systems such as Linux, which react badly if the same FILE * is closed more than once.

(Using pfclose() and pclosef() is not mandatory, since the file will eventually be closed regardless, but you should consider it in cases where your module is opening, or could open, a lot of files.)

Pool cleanups live until clear_pool() is called: clear_pool(p) recursively calls destroy_pool() on all subpools of p; then calls all the cleanups for p; then releases all the memory for p. destroy_pool(p) calls clear_pool(p) and then releases the pool structure itself, i.e., clear_pool(p) doesn't delete p, it just frees up all the resources and you can start using it again immediately.
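The relationship between clear_pool() and destroy_pool() is easier to see in code. The following sketch models a pool tree with a plain counter standing in for the memory and cleanups; the sk_ functions are hypothetical and only mirror the order of operations described above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sk_pool {
  struct sk_pool *parent;
  struct sk_pool *first_child;
  struct sk_pool *next_sibling;
  int live_resources;   /* stand-in for memory, files and cleanups */
} sk_pool;

/* Create a pool; pass NULL to create a top-level pool. */
sk_pool *sk_make_sub_pool(sk_pool *parent) {
  sk_pool *p = calloc(1, sizeof(*p));

  if (p == NULL)
    return NULL;

  if (parent != NULL) {
    p->parent = parent;
    p->next_sibling = parent->first_child;
    parent->first_child = p;
  }
  return p;
}

void sk_destroy_pool(sk_pool *p);

/* Destroy all subpools, then release this pool's own resources.
 * The pool structure survives and can be reused immediately.
 */
void sk_clear_pool(sk_pool *p) {
  assert(p != NULL);

  while (p->first_child != NULL)
    sk_destroy_pool(p->first_child);

  p->live_resources = 0;   /* "run the cleanups, free the memory" */
}

/* Clear the pool, unlink it from its parent, and free the structure. */
void sk_destroy_pool(sk_pool *p) {
  assert(p != NULL);

  sk_clear_pool(p);

  if (p->parent != NULL) {
    sk_pool **link = &p->parent->first_child;

    while (*link != p)
      link = &(*link)->next_sibling;
    *link = p->next_sibling;
  }
  free(p);
}
```

Clearing the root of a three-level tree destroys the child and grandchild but leaves the root itself valid, so new subpools can be hung off it straight away.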

Configuration Directives
Given configuration directives, we need to be able to figure out what to do with them. In this case, it involves processing the actual DowncaseFileNames and UpcaseFileNames directives. To find directives, the processing engine looks in the module's configuration table. That table contains information on what directives the module handles and the corresponding configuration handler. Without further ado, let's look at the DowncaseFileNames configuration handler, which looks like this (the UpcaseFileNames handler looks basically the same, and won't be shown here):

  MODRET set_lowercase_filenames(cmd_rec *cmd) {

    int bool = -1;
    config_rec *c = NULL;

    /* make sure the directive was given one, and only one, argument
     */
    CHECK_ARGS(cmd, 1);

    /* check the context in which the directive was used, make sure that
     * it was one of the allowed contexts of "server config", <Anonymous>,
     * <Limit>, or <VirtualHost>.
     */
    CHECK_CONF(cmd, CONF_ROOT|CONF_ANON|CONF_LIMIT|CONF_VIRTUAL);

    /* parse the argument as a Boolean; get_boolean() returns -1 if it
     * couldn't find a valid Boolean value
     */
    if ((bool = get_boolean(cmd, 1)) == -1)
      CONF_ERROR(cmd, "requires a boolean value");

    c = add_config_param("DowncaseFileNames", 1, (void *) bool);

    /* merge this configuration directive "down", so that it affects any
     * contained contexts
     */
    c->flags |= CF_MERGEDOWN;

    return HANDLED(cmd);
  }

This is a fairly typical configuration handler. As you can see, it takes only one argument, a cmd_rec pointer. That structure contains a number of fields which are useful to some, but not all, handlers, including a resource pool (from which memory can be allocated, and to which cleanups should be tied), and the (virtual) server being configured, from which the module's per-server configuration data can be obtained if required.

It is also fairly typical in its checking of the configuration directive arguments. The number of arguments is checked with CHECK_ARGS, which in this case requires that only one argument be used with the directive. Next, the configuration context is checked with CHECK_CONF. Finally, since this configuration directive needs only a true or false argument, the given argument is parsed as a Boolean value, and an error generated if this is not the case.
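The Boolean parsing step can be pictured as a small lookup over accepted spellings. The sketch below is not get_boolean() itself: the function name and the particular list of accepted words are assumptions, chosen only to show the return convention of 1, 0 or -1 that the handler above relies on.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Case-insensitive string equality; returns 1 if equal, 0 otherwise. */
static int sk_equals_nocase(const char *a, const char *b) {
  while (*a != '\0' && *b != '\0') {
    if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
      return 0;
    a++;
    b++;
  }
  return *a == '\0' && *b == '\0';
}

/* Returns 1 for a true value, 0 for a false value, and -1 if the
 * string is not a recognized Boolean.  The accepted spellings are an
 * assumption made for this sketch.
 */
int sk_parse_boolean(const char *s) {
  static const char *const truthy[] = { "on", "yes", "true", "1" };
  static const char *const falsy[]  = { "off", "no", "false", "0" };
  size_t i;

  assert(s != NULL);

  for (i = 0; i < sizeof(truthy) / sizeof(truthy[0]); i++) {
    if (sk_equals_nocase(s, truthy[i]))
      return 1;
  }

  for (i = 0; i < sizeof(falsy) / sizeof(falsy[0]); i++) {
    if (sk_equals_nocase(s, falsy[i]))
      return 0;
  }

  return -1;
}
```

Using -1 as a distinct "not a Boolean" result is what lets a configuration handler tell a deliberate "off" apart from a typo and report the latter as an error.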

The DowncaseFileNames configuration directive will automatically be stored in the configuration records for the containing server, either "main" (for anything outside of <Anonymous> and <VirtualHost> contexts), <Anonymous>, or <VirtualHost>. The server_rec for the containing server of the configuration directive's cmd_rec is pointed to by cmd->server.

The "case" module's configuration table has entries for these directives, which look like this (as seen above):

  static conftable case_conftab[] = {
    { "DowncaseFileNames", set_lowercase_filenames, NULL },
    { "UpcaseFileNames", set_uppercase_filenames, NULL },
    { NULL }
  };

The entries in these tables are:

the name of the configuration directive
the function which handles it
a pointer which is set to the "owning" module when the module code is compiled; should always be set to NULL.
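How the processing engine might locate a handler in such a table can be sketched as a linear scan up to the NULL sentinel. The types and function names below are hypothetical simplifications (the third, module-pointer field is omitted), and the exact-match comparison with strcmp() is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified handler type and table entry. */
typedef int (*sk_conf_handler)(const char *arg);

typedef struct {
  const char *directive;     /* name of the configuration directive */
  sk_conf_handler handler;   /* function which handles it */
} sk_conftable;

/* Stub handlers which just return a distinct value. */
int sk_set_downcase(const char *arg) { (void) arg; return 1; }
int sk_set_upcase(const char *arg) { (void) arg; return 2; }

const sk_conftable sk_case_conftab[] = {
  { "DowncaseFileNames", sk_set_downcase },
  { "UpcaseFileNames",   sk_set_upcase },
  { NULL, NULL }             /* sentinel marks the end of the table */
};

/* Scan the table for `name`; returns its handler, or NULL if the
 * module does not handle that directive.
 */
sk_conf_handler sk_find_handler(const sk_conftable *tab, const char *name) {
  assert(tab != NULL && name != NULL);

  for (; tab->directive != NULL; tab++) {
    if (strcmp(tab->directive, name) == 0)
      return tab->handler;
  }
  return NULL;
}
```

The NULL entry at the end of the table is what stops the scan, which is why the real tables also end with a { NULL } entry.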

Finally, having set this all up, we have to use it. This is ultimately done in the module's handlers, specifically in its filename handler, which looks more or less like this:

  MODRET fixup_filenames(cmd_rec *cmd) {

    char *new_filename = NULL;
    config_rec *conf_downcase = NULL, *conf_upcase = NULL;

    /* look up both directives in the current configuration context
     */
    conf_downcase = find_config(CURRENT_CONF, CONF_PARAM, "DowncaseFileNames",
      FALSE);
    conf_upcase = find_config(CURRENT_CONF, CONF_PARAM, "UpcaseFileNames",
      FALSE);

    /* get an adjusted requested filename.  Each config_rec is checked for
     * NULL before use, since either directive may be absent.  The copy is
     * only needed until the end of this function, so it comes from
     * cmd->tmp_pool
     */
    if (conf_downcase && conf_downcase->argv[0])
      new_filename = adjust_filename(cmd->tmp_pool, cmd->arg, PR_CASE_DOWN);

    else if (conf_upcase && conf_upcase->argv[0])
      new_filename = adjust_filename(cmd->tmp_pool, cmd->arg, PR_CASE_UP);

    /* neither directive is present and enabled -- nothing to do
     */
    if (!new_filename)
      return DECLINED(cmd);

    /* copy the new filename, including its terminating NUL, into the
     * command record; changing case does not change the length, so it fits
     */
    sstrcpy(cmd->arg, new_filename, strlen(new_filename) + 1);

    /* done -- proceed with the download
     */
    return DECLINED(cmd);
  }

The registration of this as a command handler for downloads (the FTP RETR command) is done in the cmdtable, shown earlier:

  { PRE_CMD, C_RETR, G_NONE, fixup_filenames, TRUE, FALSE },

The DowncaseFileNames and UpcaseFileNames configuration directives are retrieved. If neither applies to the file requested for the RETR command, the handler DECLINEs, and processing continues on to the next handler that is registered for this command. If one of the configuration directives does apply, the "fixup" is done on the filename, then the handler exits with a DECLINE, letting other handlers work on the cmd_rec, which now has the adjusted filename.

The writing of the adjust_filename() function is left as an exercise to you, the budding module developer.
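If you would like a starting point for that exercise, here is one possible shape for it. This version allocates with malloc() instead of a pool so that it can be compiled on its own, changes the case of every character in the string (directory parts included), and makes up numeric values for PR_CASE_DOWN and PR_CASE_UP; all of these are assumptions, and a real module would take a pool argument and use palloc().

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* The names come from the handler shown earlier; the values are assumed. */
#define PR_CASE_DOWN 0
#define PR_CASE_UP   1

/* Return a newly allocated copy of `name` with every letter lowercased
 * or uppercased according to `mode`.  Returns NULL if the allocation
 * fails.  The caller owns the result and must free() it.
 */
char *sk_adjust_filename(const char *name, int mode) {
  size_t i, len;
  char *copy;

  assert(name != NULL);
  assert(mode == PR_CASE_DOWN || mode == PR_CASE_UP);

  len = strlen(name);
  copy = malloc(len + 1);
  if (copy == NULL)
    return NULL;

  for (i = 0; i < len; i++) {
    /* tolower()/toupper() need an unsigned char value */
    unsigned char c = (unsigned char) name[i];

    copy[i] = (char) (mode == PR_CASE_DOWN ? tolower(c) : toupper(c));
  }
  copy[len] = '\0';

  return copy;
}
```

Because the result is always exactly as long as the input, it can be copied back over the original argument without any risk of overrunning it.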

Author: $Author: castaglia $
Last Updated: $Date: 2003/09/15 15:16:21 $