:title: Django and Celery
:description: How to build a simple Django application that runs tasks on a remote daemon with Celery and integrates Celery in the Django admin with django-celery.
:keywords: dotCloud, tutorial, documentation, python, Django Celery, remote, daemon
Django and Celery
=================
.. include:: ../../dotcloud2note.inc
As you write your application, you will almost certainly need to execute some
asynchronous tasks. These could be anything that requires lengthy
processing: image resizing, archiving, document analysis, and so on.
These tasks could run on the same machine as your application server,
but best practice is to run them on a different machine because:
#. Background jobs cannot degrade the performance of your application;
#. Decoupling parts of the application eases maintenance and scaling.
According to the `Celery project homepage `_,
Celery is "an asynchronous task queue/job queue based on distributed message
passing. It is focused on real-time operation, but supports scheduling as
well".
`Django `_ is the famous Python framework that
describes itself as a "high-level Python Web framework that encourages rapid
development and clean, pragmatic design".
This tutorial will show how to use Celery and Django to build a `simple web
application `_ that executes tasks on a
remote daemon. The integration of Celery in the Django administrative panel
using `django-celery `_
will be covered too.
This tutorial is based on the :doc:`/tutorials/python/django` tutorial.
.. contents::
:local:
:depth: 1
Application Architecture
------------------------
To experiment with this tutorial, you can clone the application from
https://bitbucket.org/lopter/dj-celery/. Here is what the application directory
looks like::

  .
  ├── dotcloud.yml      # The description of our stack
  ├── minestrone/       # The Django project directory
  │   ├── __init__.py
  │   ├── manage.py
  │   ├── settings.py
  │   ├── soup/         # Holds the application code
  │   ├── templates/    # Holds the templates
  │   └── urls.py
  ├── mkadmin.py        # Used to create the admin account after `dotcloud push'
  ├── nginx.conf        # Some Nginx rules to serve Django static files
  ├── postinstall*      # Run at the end of the dotCloud build to set up Django and Celery
  ├── requirements.txt  # Holds the Python dependencies: `Django' and `django-celery'
  └── wsgi.py           # The entry point of Django for Nginx

The relevant Python code is located in the ``minestrone/`` [#]_ directory,
where we have:
- ``settings.py``: configures the database as well as the Celery broker
  (RabbitMQ), which stores the list of tasks to execute;
- ``soup/views.py``: defines a web page to enqueue tasks and a page to display
  the active ones;
- ``soup/tasks.py``: holds the job definitions.
Once deployed the application runs like this:
.. literalinclude:: django-celery-diagram.utf8
:language: none
:encoding: utf-8
We will see how to connect Celery to the RabbitMQ broker and launch some Celery
workers, then how to create tasks. There are also some dotCloud specific files
that will be covered last.
Setting up a RabbitMQ Server
----------------------------
To follow this tutorial, you first need a RabbitMQ service. dotCloud recommends `CloudAMQP `_ for this. Follow the directions in our :doc:`CloudAMQP tutorial ` to set up your RabbitMQ server.
Connect Celery to RabbitMQ
--------------------------
This is just a matter of editing ``settings.py`` to specify the host, port,
username and password of RabbitMQ. Assuming you followed the steps above to set up your RabbitMQ server, these credentials are found in the file
:doc:`~/environment.json ` generated by the dotCloud build
process when you push your application to dotCloud.
The environment file is loaded into a Python dictionary at the beginning of the
``settings.py`` file:
.. code-block:: python

    # minestrone/settings.py:
    # Django settings for minestrone project.

    import os
    import json

    import djcelery

    # Load the dotCloud environment
    with open('/home/dotcloud/environment.json') as f:
        dotcloud_env = json.load(f)

    # …

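The environment file only exists on dotCloud, so the ``open()`` call above fails on a development machine. As a minimal sketch of one way around this, assuming an empty dictionary is an acceptable fallback, the loading step could be wrapped in a helper. The ``load_dotcloud_env`` name is hypothetical and not part of the example application, and the code assumes Python 3:

```python
import json


def load_dotcloud_env(path='/home/dotcloud/environment.json', default=None):
    """Return the dotCloud environment as a dict.

    Hypothetical helper: falls back to `default` (an empty dict when not
    given) if the file does not exist, e.g. on a development machine.
    """
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:  # Python 3 only
        return {} if default is None else dict(default)
```

With such a helper, ``settings.py`` would have to decide what to do when the expected keys are absent, for example by using local development credentials.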
With the credentials parsed, django-celery is set up and the Celery broker
is configured:
.. code-block:: python
# minestrone/settings.py:
# …
# Configure Celery using the RabbitMQ credentials found in the dotCloud
# environment.
djcelery.setup_loader()
BROKER_HOST = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_HOST']
BROKER_PORT = int(dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_PORT'])
BROKER_USER = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_LOGIN']
BROKER_PASSWORD = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_PASSWORD']
BROKER_VHOST = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_VIRTUALHOST']
``BROKER_VHOST`` corresponds to a namespace in RabbitMQ where exchanges and queues
are stored. If you decide to use a different broker (e.g. MongoDB or Redis), this
setting corresponds to the database name or number to use.
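Taken together, the five ``BROKER_*`` values describe a single AMQP endpoint. Purely as an illustration, the sketch below assembles them into one ``amqp://`` URL, the form accepted by Celery's ``BROKER_URL`` setting in versions that support it. The ``build_broker_url`` helper and the sample credentials are hypothetical, and the import assumes Python 3:

```python
from urllib.parse import quote


def build_broker_url(host, port, user, password, vhost):
    """Assemble an AMQP URL from separate connection settings.

    User, password and virtual host are percent-encoded so that characters
    such as '/' or '@' cannot be mistaken for URL delimiters.
    """
    return 'amqp://{user}:{password}@{host}:{port}/{vhost}'.format(
        user=quote(user, safe=''),
        password=quote(password, safe=''),
        host=host,
        port=int(port),
        vhost=quote(vhost, safe=''),
    )


# Made-up credentials, for illustration only.
print(build_broker_url('rabbit.example.com', '5672', 'guest', 'p@ss', 'my/vhost'))
# prints: amqp://guest:p%40ss@rabbit.example.com:5672/my%2Fvhost
```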
We also configure the default Celery queue. This is completely optional; we
use it in this tutorial only to illustrate how you can route tasks with
RabbitMQ:
.. code-block:: python

    # minestrone/settings.py:

    # A very simple queue, just to illustrate the principle of routing.
    CELERY_DEFAULT_QUEUE = 'default'
    CELERY_QUEUES = {
        'default': {
            'exchange': 'default',
            'exchange_type': 'topic',
            'binding_key': 'tasks.#'
        }
    }

    # …

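To make the ``tasks.#`` binding key more concrete, here is a small sketch of the matching rule that AMQP topic exchanges apply: keys are split into dot-separated words, ``*`` stands for exactly one word and ``#`` for zero or more words. The ``topic_matches`` function is a hypothetical re-implementation for illustration; in practice RabbitMQ performs this matching itself:

```python
def topic_matches(binding_key, routing_key):
    """Return True if `routing_key` matches the AMQP topic `binding_key`."""
    pattern = binding_key.split('.')
    words = routing_key.split('.')

    def match(p, w):
        # Pattern exhausted: succeed only if every word was consumed too.
        if p == len(pattern):
            return w == len(words)
        if pattern[p] == '#':
            # '#' absorbs zero words, or one word and then tries again.
            return match(p + 1, w) or (w < len(words) and match(p, w + 1))
        if w < len(words) and pattern[p] in ('*', words[w]):
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


for key in ('tasks.resize', 'tasks.video.encode', 'mail.send'):
    print(key, topic_matches('tasks.#', key))
# prints:
# tasks.resize True
# tasks.video.encode True
# mail.send False
```

Any routing key that starts with ``tasks.`` therefore ends up in the "default" queue defined above.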
Running the Workers
-------------------
With django-celery, the Celery workers are launched using ``manage.py`` from the
root of the application, with the command::
python minestrone/manage.py celeryd -E -l info -c 2
Here is what each command switch does:
- ``-E`` activates events: it tells the workers to send notifications of what
  they are doing (started/finished a task, etc.);
- ``-l info`` asks the workers to log every message with a priority
  greater than or equal to "info";
- ``-c 2`` launches two worker processes ("c" as in "concurrency").
We will see how to run this automatically on dotCloud when you push your code,
but let's first create some tasks to execute.
Create Some Tasks
-----------------
Tasks in the Celery sense can be plain functions using the "@celery.task.task"
decorator or classes that inherit from the "celery.task.Task" class. In a
Django application, you place your tasks in a ``tasks.py`` module, side by side
with your views or models.
This tutorial defines a very simple task, called "lazy_job", in the "soup"
application:
.. code-block:: python

    # minestrone/soup/tasks.py:

    import time

    from celery.task import task

    @task(ignore_result=True)
    def lazy_job(name):
        logger = lazy_job.get_logger()
        logger.info('Starting the lazy job: {0}'.format(name))
        time.sleep(5)
        logger.info('Lazy job {0} completed'.format(name))

We pass the "ignore_result=True" argument to the task decorator to tell Celery
that we don't care about the result of our task. The Celery documentation
advises this to reduce resource usage [#]_.
New "lazy_job" tasks are created from the "EditorView" view using the
"apply_async" method:
.. code-block:: python

    # minestrone/soup/views.py:

    from django.http import HttpResponseRedirect
    from django.views.generic import TemplateView, FormView
    from django import forms

    from celery.task.control import inspect

    from minestrone.soup import tasks

    # …

    class EditorView(FormView):
        # …

        def form_valid(self, form):
            name = form.cleaned_data['job_name']
            routing_key = 'tasks.{0}'.format(form.cleaned_data['routing_key_name'])
            tasks.lazy_job.apply_async(args=[name], routing_key=routing_key)
            return HttpResponseRedirect(self.get_success_url())

The "name" and "routing_key" variables are extracted from the form embedded in
the view. The "apply_async" method accepts several arguments, all optional.
Here we only use:
#. "args", which contains the list of arguments to forward to the "lazy_job"
   function;
#. "routing_key", which can be used to route tasks to different workers. Here,
   the key will always be matched by the "default" queue defined in
   ``minestrone/settings.py``.
The Django Celery Admin Panel
-----------------------------
Without really letting you know, we already did half of the Django and Celery
integration. The two missing things are:
- a database to store all the Celery events;
- running the ``celerycam`` daemon that takes snapshots of the events sent by
the workers, storing them in a database.
The database is configured from the settings file, using the dotCloud
environment:
.. code-block:: python

    # minestrone/settings.py:
    # …

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'template1',
            'USER': dotcloud_env['DOTCLOUD_DB_SQL_LOGIN'],
            'PASSWORD': dotcloud_env['DOTCLOUD_DB_SQL_PASSWORD'],
            'HOST': dotcloud_env['DOTCLOUD_DB_SQL_HOST'],
            'PORT': int(dotcloud_env['DOTCLOUD_DB_SQL_PORT']),
        }
    }

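If one of the ``DOTCLOUD_DB_SQL_*`` variables is absent, the dictionary literal above stops at the first missing key with a bare ``KeyError``. The following sketch shows one possible way to build the same structure while reporting every missing variable at once. The ``build_databases`` helper is hypothetical and not part of the example application:

```python
def build_databases(env, name='template1'):
    """Build a Django DATABASES dict from the dotCloud environment.

    Raises a KeyError that names every missing variable, rather than
    failing on the first absent key.
    """
    required = {
        'USER': 'DOTCLOUD_DB_SQL_LOGIN',
        'PASSWORD': 'DOTCLOUD_DB_SQL_PASSWORD',
        'HOST': 'DOTCLOUD_DB_SQL_HOST',
        'PORT': 'DOTCLOUD_DB_SQL_PORT',
    }
    missing = sorted(var for var in required.values() if var not in env)
    if missing:
        raise KeyError('missing dotCloud variables: ' + ', '.join(missing))

    default = {setting: env[var] for setting, var in required.items()}
    default['PORT'] = int(default['PORT'])  # the environment stores strings
    default['ENGINE'] = 'django.db.backends.postgresql_psycopg2'
    default['NAME'] = name
    return {'default': default}
```

In ``settings.py`` this would be used as ``DATABASES = build_databases(dotcloud_env)``.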
We are using PostgreSQL, but any database supported by Django would work.
The command to launch the ``celerycam`` daemon is::
python minestrone/manage.py celerycam
As with the Celery workers, we will see in the next sections how to execute this
command automatically when you push to dotCloud.
For your reference, the other parts of the Django Celery integration were:
- Celery is configured from the Django ``settings.py`` file (instead of the
  usual ``celeryconfig.py`` file);
- ``celeryd`` is invoked through Django's ``manage.py`` command instead of being
  run directly from the shell;
- You have to pass the ``-E`` argument to ``celeryd`` to collect worker events.
dotCloud Specific Details
-------------------------
We have already seen how we use the ``environment.json`` file to configure
Celery and PostgreSQL automatically. It is not the only dotCloud specific
detail, though. There are also:
- A dotCloud build file: ``dotcloud.yml``;
- A Supervisor configuration include file, ``supervisord.conf``, that is installed
  from the ``postinstall`` hook;
- A ``wsgi.py`` file that bridges the web server and Django;
- An ``nginx.conf`` file that defines a couple of locations where the
  Django static and media files are stored.
For the Python dependencies, dotCloud follows the ``requirements.txt``
convention. This file contains::
Django
django-celery
setproctitle
These dependencies will be installed by `pip `_
(an ``easy_install`` replacement) from `PyPI `_
when the application is pushed to dotCloud. The setproctitle package is an
optional dependency of Celery. When it is installed, Celery can display useful
information in the process title instead of the command line. You can see it
in action if you log into the ``workers`` service and run ``ps aux``.
The setup of the Django static and media files and of ``wsgi.py`` is already
covered in the :doc:`django` tutorial, so only the ``dotcloud.yml`` and
``supervisord.conf`` are detailed here.
The dotCloud Build File
~~~~~~~~~~~~~~~~~~~~~~~
The dotCloud build file is straightforward and just describes the architecture
of our application on dotCloud, with its three services:
- a web server pre-configured for Python WSGI-compatible applications like
  Django;
- a service to launch workers;
- a PostgreSQL server.
``dotcloud.yml``:
.. code-block:: yaml

    www:
      type: python
    workers:
      type: python-worker
    db:
      type: postgresql

The Supervisor Configuration Include
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The Supervisor configuration include, named ``supervisord.conf``, is used to
launch and monitor the Celery workers and the ``celerycam`` daemon:
.. code-block:: ini
[program:djcelery]
directory = /home/dotcloud/current/
command = python minestrone/manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
[program:celerycam]
directory = /home/dotcloud/current/
command = python minestrone/manage.py celerycam
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
You may have recognized the .ini file format. Let's break it down:
.. code-block:: ini
[program:djcelery]
Defines a new background process configuration block. The process is called
"djcelery" here.
.. code-block:: ini
directory = /home/dotcloud/current/
command = python minestrone/manage.py celeryd -E -l info -c 2
These two lines tell Supervisor how the background process should be launched:
- In the directory ``/home/dotcloud/current``, which is where the :ref:`dotCloud
  builder ` will install your application and thus where
  the Django application lives;
- Using the command: "python minestrone/manage.py celeryd -E -l info -c 2".
.. code-block:: ini
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
These lines redirect the output of the workers into two log files, where
"%(program_name)s" is replaced by "djcelery". If you omit these two lines, Supervisor will
create the log files for you, but with a *less readable* (random) name.
The exact same thing is repeated for the ``celerycam`` daemon.
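To see the ``%(program_name)s`` substitution at work, the sketch below parses a copy of the "djcelery" block with Python's ``configparser`` and performs the same expansion by hand. This only mimics what Supervisor does internally, under the assumption that the section name after ``program:`` is the program name. The ``expanded_log_files`` function is hypothetical:

```python
import configparser

SUPERVISOR_INCLUDE = """\
[program:djcelery]
directory = /home/dotcloud/current/
command = python minestrone/manage.py celeryd -E -l info -c 2
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
"""


def expanded_log_files(text):
    """Map each program name to its (stdout, stderr) log paths."""
    # Disable configparser's own '%' interpolation: the %(program_name)s
    # placeholder is meant for Supervisor, not for configparser.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        kind, _, name = section.partition(':')
        if kind != 'program':
            continue
        values = {'program_name': name}
        result[name] = (
            parser[section]['stdout_logfile'] % values,
            parser[section]['stderr_logfile'] % values,
        )
    return result


print(expanded_log_files(SUPERVISOR_INCLUDE))
# prints: {'djcelery': ('/var/log/supervisor/djcelery.log', '/var/log/supervisor/djcelery_error.log')}
```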
This configuration include is generated by the ``postinstall`` hook (see
:doc:`/guides/hooks`) when it runs on the "workers" service:
.. code-block:: sh

    # postinstall:
    # …

    # Print the value of the given key from the dotCloud environment file.
    dotcloud_get_env() {
        sed -n "/$1/ s/.*: \"\(.*\)\".*/\1/p" < "/home/dotcloud/environment.json"
    }

    # Write the Supervisor include that starts the workers and celerycam.
    setup_django_celery() {
        cat > /home/dotcloud/current/supervisord.conf << EOF
    [program:djcelery]
    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celeryd -E -l info -c 2
    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log

    [program:celerycam]
    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celerycam
    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log
    EOF
    }

    # Quoted so the test does not break if the variable is missing.
    if [ "$(dotcloud_get_env DOTCLOUD_SERVICE_NAME)" = workers ] ; then
        setup_django_celery
    # …

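The ``sed`` expression in ``dotcloud_get_env`` works for flat string values but does not really parse JSON. Purely as a sketch of an alternative, and not how the example application does it, the same decision could be made in Python with the ``json`` module. ``PROGRAM_TEMPLATE`` and ``render_supervisor_include`` are hypothetical names:

```python
import json

# One Supervisor [program:x] block; %(program_name)s is left untouched so
# that Supervisor expands it later.
PROGRAM_TEMPLATE = """\
[program:{name}]
directory = /home/dotcloud/current/
command = {command}
stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
stdout_logfile = /var/log/supervisor/%(program_name)s.log
"""


def render_supervisor_include(env_json, programs):
    """Return the include text on the "workers" service, otherwise None.

    `env_json` is the content of environment.json as a string and
    `programs` is a list of (name, command) pairs.
    """
    env = json.loads(env_json)
    if env.get('DOTCLOUD_SERVICE_NAME') != 'workers':
        return None
    return '\n'.join(
        PROGRAM_TEMPLATE.format(name=name, command=command)
        for name, command in programs
    )


programs = [
    ('djcelery', 'python minestrone/manage.py celeryd -E -l info -c 2'),
    ('celerycam', 'python minestrone/manage.py celerycam'),
]
print(render_supervisor_include('{"DOTCLOUD_SERVICE_NAME": "workers"}', programs))
```

The returned text would still have to be written to ``/home/dotcloud/current/supervisord.conf``, as the shell version does.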
Conclusion
----------
Django and Celery are very easy to get running on dotCloud. Let's review what we
did here:
#. Wrote a ``dotcloud.yml`` with all the services we need;
#. Set up a RabbitMQ server and added its environment variables to our application;
#. Configured Celery and PostgreSQL in ``settings.py``;
#. Defined some tasks in the ``tasks.py`` module of the Django application;
#. Launched some :doc:`workers ` by using a
   ``supervisord.conf`` file on the :doc:`/services/python-worker/` service;
#. Enqueued some tasks using their "apply_async" method from a Django view.
The example application lives at: http://django-celery.dotcloudapp.com/.
You can clone the code from https://bitbucket.org/lopter/dj-celery/, "cd" into
it, create your application with the :doc:`flavor ` of
your choice and push it to dotCloud::
dotcloud create djcelery
dotcloud push
While you are experimenting with the web interface, you can watch the jobs being
performed in the worker logs::
dotcloud logs workers
You can also have a look at the Django administration panel at:
http:///admin/. The user name to use is "admin" with the default
password (configured in ``mkadmin.py``): "password".
Celery is very powerful, especially when coupled with RabbitMQ. You are highly
encouraged to take a look at its excellent documentation: http://celery.readthedocs.org/en/latest/.
----
.. [#] The Minestrone soup is a famous Italian dish that contains some Celery…
.. [#] http://celery.readthedocs.org/en/latest/userguide/tasks.html#tips-and-best-practices