Background Processes

Note

CLI Command examples on this page are always provided without the --application (shorthand -A) argument, assuming you’re running these commands in a connected folder (at creation or using the dotcloud connect command). For more details on connected folders, see Migrating to the CLI 0.9.

Sometimes, you need to run a program for a longer time than a single HTTP request. Use cases include:

  • CPU-intensive jobs (e.g. video transcoding);
  • running some code at a specified time, or on regular intervals;
  • background activity (e.g. crawling 3rd party service to update your database);
  • running a specific web server, like Node.js or Tornado;
  • and much more!

dotCloud provides “worker services” dedicated to those tasks. There is a different service for each language: Ruby Worker, PHP & PHP Worker, Perl Worker, Python Worker, Node.js... The only differences between them are the set of pre-installed packages and the way dependencies are handled: python-worker supports requirements.txt while ruby-worker, for instance, supports Gemfile.

All worker services rely on Supervisor to start and monitor your processes. Supervisor will start defined programs automatically, and will restart them automatically if they crash or exit. If you just need to run a program at a specified interval, you can also use a crontab and ignore Supervisor.

Note

You can also use a “non-worker” service to run some background jobs. More specifically, all services feature crontab, allowing you to run Periodic Tasks. So if you want to run a daily Python script (e.g. to update a stock ticker in your database), you don’t have to dedicate a python-worker to this task: you can set up the crontab in the same python service that you use for your web application. However, be aware that when you scale your application, the cron tasks will be scheduled in all scaled instances, which is probably not what you need! So in many cases, it will still be better to use a separate service.

Similarly, a lot of (non-worker) services already run Supervisor, so you can run additional background jobs in those services. Then again, remember that those background jobs will run in multiple instances if you scale your application. Moreover, if you add background jobs to your web service, it will get fewer resources to serve pages, and your performance will take a significant hit.

Technically, there is almost no difference between worker and non-worker services. For instance, the python-worker service is basically the python service without Nginx (HTTP server) and uWSGI (Python web workers) running. Both can optionally run background processes using Supervisor.

Defining Daemons

To run a background process you need to write a supervisord.conf file in the .ini format, for example:

[program:daemonname]
command = php /home/dotcloud/current/my_daemon.php

This is the simplest configuration that you can write. It defines a daemon called daemonname that is launched using the “php /home/dotcloud/current/my_daemon.php” command.

You can use multiple [program:x] sections to run different daemons.
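
Before pushing, it can be handy to check that every [program:x] section actually defines a command. The following is only a minimal sketch in Python using the standard configparser module. The list_programs helper, the otherdaemon section and other_daemon.php are hypothetical names used for illustration; they are not part of dotCloud or Supervisor.

```python
import configparser


def list_programs(conf_text):
    """Return {program_name: command} for each [program:x] section."""
    # interpolation=None keeps Supervisor expressions such as
    # %(process_num)s from being expanded by configparser itself.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    programs = {}
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        name = section.split(":", 1)[1]
        if not parser.has_option(section, "command"):
            raise ValueError("program %r has no command" % name)
        programs[name] = parser.get(section, "command")
    return programs


# Two daemons declared in one file (the second one is hypothetical).
EXAMPLE_CONF = """
[program:daemonname]
command = php /home/dotcloud/current/my_daemon.php

[program:otherdaemon]
command = php /home/dotcloud/current/other_daemon.php
"""

if __name__ == "__main__":
    for name, command in sorted(list_programs(EXAMPLE_CONF).items()):
        print("%s -> %s" % (name, command))
```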

Once you have the code of your daemon, your supervisord.conf file, and your dotcloud.yml Build File, you can push your application to dotCloud using “dotcloud push”.

Note

When you run "dotcloud push", your code will be installed in /home/dotcloud/current by the dotCloud builder process. That’s why we specified this path.

If your script is installed in the $PATH, you don’t need to include its full path (i.e. you can just write “command = my_daemon.php” instead).

Note

If you are not on Windows, do not forget to set the executable bit on your daemon by using “chmod +x”.

Configuring The Environment

You can modify the execution environment of your daemon with the “directory” and “environment” directives: the first changes the directory where the command is executed, and the second defines additional environment variables. For example:

[program:daemonname]
command = php my_daemon.php
directory = /home/dotcloud/current/
environment = QUEUE="*" , VERBOSE="TRUE"

Note

Don’t forget the quotes around the environment variable values! Supervisor is quite picky about this. If you specify something like PYTHONPATH=/foo/bar instead of PYTHONPATH="/foo/bar", Supervisor will truncate it to /. So beware!
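
Inside the daemon, the values set with the “environment” directive are ordinary environment variables. As a small sketch in Python, assuming the QUEUE and VERBOSE variables from the example above (the read_settings helper and its default values are hypothetical):

```python
import os


def read_settings(environ=os.environ):
    """Read the daemon settings from environment variables."""
    return {
        # Fall back to defaults when a variable is not set, e.g. when
        # the script is run by hand outside Supervisor.
        "queue": environ.get("QUEUE", "*"),
        "verbose": environ.get("VERBOSE", "FALSE").upper() == "TRUE",
    }


if __name__ == "__main__":
    print(read_settings())
```

Passing the environment as a parameter keeps the function easy to try with a plain dictionary.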

Exit Cleanly With Signals

You are advised to catch the SIGTERM signal sent by Supervisor when it tries to gracefully stop or restart your daemon. This will happen each time you push a new revision of your application. This is important if your daemon cannot be interrupted in the middle of a job.

Here are some complete daemon examples that exit cleanly on SIGTERM:

PHP

To catch signals in PHP you need to add a “declare()” at the top of your PHP files and then register a callback:

#!/usr/bin/env php
<?php

    // This is mandatory to use the UNIX signal functions:
    // http://php.net/manual/en/function.pcntl-signal.php
    declare(ticks = 1);

    // A function to write on the error output
    function    warn($msg)
    {
        $stderr = fopen("php://stderr", "w+");
        fputs($stderr, $msg);
    }

    // Callback called when you run `supervisorctl stop'
    function    sigterm_handler($signo)
    {
        warn("Kaboom Baby!\n");
        exit(0);
    }

    function    main()
    {
        while (true) {
            warn("Tick\n");
            sleep(1);
        }
    }

    // Bind our callback on the SIGTERM signal and run the daemon
    pcntl_signal(SIGTERM, "sigterm_handler");
    main();

?>

You can find the full reference on Unix signal functions here: http://www.php.net/manual/en/intro.pcntl.php

Ruby

#!/usr/bin/env ruby

# Callback called when you run `supervisorctl stop'
def sigterm_handler
    warn "Kaboom Baby!"
    exit
end

def main
    while true do
        warn "Tick"
        sleep 1
    end
end

# Bind our callback to the SIGTERM signal and run the daemon:
Signal.trap("TERM") { sigterm_handler }
main

Perl

#!/usr/bin/env perl

use strict;
use warnings;

sub sigterm_handler()
{
    print STDERR "Kaboom Baby!\n";
    exit 0;
}

sub main()
{
    while (1) {
        print STDERR "Tick\n";
        sleep 1;
    }
}

$SIG{TERM} = \&sigterm_handler;
main;

Python

#!/usr/bin/env python

import sys
import time
import signal

# Callback called when you run `supervisorctl stop'
def sigterm_handler(signum, frame):
    sys.stderr.write("Kaboom Baby!\n")
    sys.exit(0)

def main():
    while True:
        sys.stderr.write("Tick\n")
        time.sleep(1)

# Bind our callback to the SIGTERM signal and run the daemon:
signal.signal(signal.SIGTERM, sigterm_handler)
main()

NodeJS

#!/usr/bin/env node

function sigterm_handler() {
    console.warn('Kaboom Baby!');
    process.exit(0);
}

function main() {
    setInterval(function() {
        console.warn('Tick');
    }, 1000);
}

process.on('SIGTERM', sigterm_handler);
main();

If your daemon doesn’t follow the SIGTERM convention, you can tell Supervisor to use the signal of your choice with the “stopsignal” directive:

[program:daemonname]
command = php /home/dotcloud/current/my_daemon.php
stopsignal = QUIT

This example will use SIGQUIT to try to gracefully stop the daemon instead of SIGTERM.
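
On the daemon side, the matching signal has to be handled as well. A minimal Python sketch, assuming a Unix system where SIGQUIT exists (stop_handler is a hypothetical name):

```python
import signal
import sys


def stop_handler(signum, frame):
    sys.stderr.write("Stopping on signal %d\n" % signum)
    sys.exit(0)


# Register the same callback for both signals so the daemon stops
# cleanly whichever one Supervisor is configured to send.
for sig in (signal.SIGTERM, signal.SIGQUIT):
    signal.signal(sig, stop_handler)
```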

Configure Logging

By default, Supervisor will create log files for you in the /var/log/supervisor/ directory.

You can change this by using the “stderr_logfile” directive (for the error output) and the “stdout_logfile” directive (for the standard output):

[program:daemonname]
command = php /home/dotcloud/current/my_daemon.php
stderr_logfile = /var/log/supervisor/daemonname_error.log
stdout_logfile = /var/log/supervisor/daemonname.log

You can also choose to redirect the error output to the standard output, to get everything in one log file, with redirect_stderr = true.

Launching Multiple “Workers”

You can use Supervisor to launch multiple instances of one daemon. To do this you need three things in your supervisord.conf.

First, a “numprocs” entry, for the number of identical processes to launch:

numprocs = 2

Second, a “process_name” entry, to give each process a different name:

process_name = "%(program_name)s-%(process_num)s"

Finally, you also need to give each log file a different name:

stderr_logfile = /var/log/supervisor/daemonname_error-%(process_num)s.log
stdout_logfile = /var/log/supervisor/daemonname-%(process_num)s.log

Here is a complete supervisord.conf example:

[program:daemonname]
command = php /home/dotcloud/current/my_daemon.php
numprocs = 2
process_name = "%(program_name)s-%(process_num)s"
stderr_logfile = /var/log/supervisor/daemonname_error-%(process_num)s.log
stdout_logfile = /var/log/supervisor/daemonname-%(process_num)s.log
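
The %(...)s expressions follow Python's string formatting syntax, so you can preview the resulting names with a few lines of Python. This sketch assumes that process numbers start at 0, which is Supervisor's default when numprocs_start is not set; the expand helper is hypothetical.

```python
def expand(template, program_name, numprocs):
    """Expand a Supervisor-style %(...)s template once per process."""
    return [
        template % {"program_name": program_name, "process_num": num}
        for num in range(numprocs)
    ]


names = expand("%(program_name)s-%(process_num)s", "daemonname", 2)
logs = expand(
    "/var/log/supervisor/daemonname-%(process_num)s.log", "daemonname", 2
)

print(names)  # ['daemonname-0', 'daemonname-1']
print(logs)
```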

Troubleshooting

You can check that your daemon has been started properly with the following command, where workers is the name of your service:

dotcloud run workers supervisorctl status

If everything’s fine, you should see an output similar to this one:

daemonname                          RUNNING    pid 975, uptime 0:03:20
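
If you want to check this from a script, a line of that output can be split into its fields. The sketch below assumes the output keeps the layout of the sample line above (name, state, then free-form details); parse_status_line is a hypothetical helper, not something provided by Supervisor.

```python
def parse_status_line(line):
    """Split one line of `supervisorctl status` output into fields."""
    parts = line.split(None, 2)
    name, state = parts[0], parts[1]
    # Some states may come without extra details.
    details = parts[2].strip() if len(parts) > 2 else ""
    return {"name": name, "state": state, "details": details}


line = "daemonname                          RUNNING    pid 975, uptime 0:03:20"
status = parse_status_line(line)
print(status["state"])
```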

Supervisor provides some useful commands to start, stop, and restart programs (in these examples, the service is named tick):

dotcloud run tick supervisorctl stop daemonname
dotcloud run tick supervisorctl start daemonname
dotcloud run tick supervisorctl restart daemonname

You can also run the Supervisor shell interactively if you like:

dotcloud run tick supervisorctl

Caveats

For performance reasons, Supervisor buffers the standard output, so the “echoes” you do in your daemon will not show immediately in the log file. However, the error output is written right away in the corresponding log file. Here is how to write on the error output for different languages:

PHP

#!/usr/bin/env php
<?php

    function    warn($msg)
    {
        $stderr = fopen("php://stderr", "w+");
        fputs($stderr, $msg);
    }

    warn("This message will be written on the error output\n");

?>

Ruby

#!/usr/bin/env ruby

warn "This message will be written on the error output"

Perl

#!/usr/bin/env perl

print STDERR "This message will be written on the error output\n";

Python

#!/usr/bin/env python

import sys

sys.stderr.write("This message will be written on the error output\n")
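
If you would rather keep informational messages on the standard output, one option is to flush after every write. This sketch assumes that the delay comes from buffering inside the Python process itself, which may not be the whole story on the platform; the log helper is hypothetical.

```python
import sys


def log(message, stream=sys.stdout):
    """Write one line and flush it immediately."""
    stream.write(message + "\n")
    stream.flush()


log("This message is flushed right away")
```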

NodeJS

#!/usr/bin/env node

console.warn("This message will be written on the error output");

Supervisor stops reading output as soon as it begins to stop daemons [1]. This means, for example, that the output of the sigterm_handler functions above will never show up in the logs, even though the function is executed.

You can write daemons that listen on TCP or UDP ports, but you will not be able to reach them over the Internet or from another dotCloud instance. However, this is one of our next features.

You can find the supervisord.conf reference in the Supervisor documentation.


[1] http://www.plope.com/software/collector/271