:title: Django and Celery
:description: How to build a simple Django application that runs tasks on a remote daemon with Celery and integrates Celery in the Django admin with django-celery.
:keywords: dotCloud, tutorial, documentation, python, Django Celery, remote, daemon

Django and Celery
=================

.. include:: ../../dotcloud2note.inc

As you write your application, you will almost certainly need to execute some
asynchronous tasks. These can be anything that requires some form of (lengthy)
processing: image resizing, archiving, document analysis...

These tasks could run on the same machine as your application server, but
best practice is to run them on a different machine because:

#. Background jobs cannot slow down your application;
#. Decoupling parts of the application eases maintenance and scaling.

According to the `Celery project homepage `_, Celery is "an asynchronous task
queue/job queue based on distributed message passing. It is focused on
real-time operation, but supports scheduling as well".

`Django `_ is the famous Python framework that describes itself as a
"high-level Python Web framework that encourages rapid development and clean,
pragmatic design".

This tutorial shows how to use Celery and Django to build a
`simple web application `_ that executes tasks on a remote daemon. It also
covers the integration of Celery in the Django administrative panel using
`django-celery `_.

This tutorial is based on the :doc:`/tutorials/python/django` tutorial.

.. contents::
   :local:
   :depth: 1

Application Architecture
------------------------

To experiment with this tutorial, you can clone the application from
https://bitbucket.org/lopter/dj-celery/. Here is what the application
directory looks like::

    .
    ├── dotcloud.yml      # The description of our stack
    ├── minestrone/       # The Django project directory
    │   ├── __init__.py
    │   ├── manage.py
    │   ├── settings.py
    │   ├── soup/         # Holds the application code
    │   ├── templates/    # Holds the templates
    │   └── urls.py
    ├── mkadmin.py        # Used to create the admin account after `dotcloud push'
    ├── nginx.conf        # Some Nginx rules to serve Django static files
    ├── postinstall*      # Run at the end of the dotCloud build to set up Django and Celery
    ├── requirements.txt  # Holds the Python dependencies: `Django' and `django-celery'
    └── wsgi.py           # The entry point of Django for Nginx

The relevant Python code is located in the ``minestrone/`` [#]_ directory
where we have:

- ``settings.py``: Configures the database as well as the Celery broker
  (RabbitMQ), which stores the list of tasks to execute;
- ``soup/views.py``: Defines a web page to enqueue tasks and a page to display
  the active ones;
- ``soup/tasks.py``: Holds the job definitions.

Once deployed, the application runs like this:

.. literalinclude:: django-celery-diagram.utf8
   :language: none
   :encoding: utf-8

We will see how to connect Celery to the RabbitMQ broker and launch some
Celery workers, then how to create tasks. There are also some dotCloud
specific files that will be covered last.

Setting up a RabbitMQ Server
----------------------------

To follow this tutorial, you first need a RabbitMQ service. dotCloud
recommends `CloudAMQP `_ for getting a RabbitMQ server. Follow the directions
in our :doc:`CloudAMQP tutorial ` to set up your RabbitMQ server.

Connect Celery to RabbitMQ
--------------------------

This is just a matter of editing ``settings.py`` to specify the host, port,
username and password of RabbitMQ. Assuming you followed the steps above when
setting up your RabbitMQ server, these credentials are found in the file
:doc:`~/environment.json ` generated by the dotCloud build process when you
push your application to dotCloud.
The environment file is loaded into a Python dictionary at the beginning of
the ``settings.py`` file:

.. code-block:: python

    # minestrone/settings.py:

    # Django settings for minestrone project.

    import os
    import json
    import djcelery

    # Load the dotCloud environment
    with open('/home/dotcloud/environment.json') as f:
        dotcloud_env = json.load(f)

    # …

With the credentials parsed, django-celery is set up and the Celery broker
configured:

.. code-block:: python

    # minestrone/settings.py:

    # …

    # Configure Celery using the RabbitMQ credentials found in the dotCloud
    # environment.
    djcelery.setup_loader()
    BROKER_HOST = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_HOST']
    BROKER_PORT = int(dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_PORT'])
    BROKER_USER = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_LOGIN']
    BROKER_PASSWORD = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_PASSWORD']
    BROKER_VHOST = dotcloud_env['CLOUDAMQP_RABBITMQ_AMQP_VIRTUALHOST']

BROKER_VHOST corresponds to a namespace in RabbitMQ where exchanges and
queues are stored. If you decide to use a different broker (e.g., MongoDB or
Redis), this will correspond to the database name or number to use.

We also configure the default Celery queue. This is completely optional; we
simply use it in this tutorial to illustrate how you can route tasks with
RabbitMQ:

.. code-block:: python

    # minestrone/settings.py:

    # A very simple queue, just to illustrate the principle of routing.
    CELERY_DEFAULT_QUEUE = 'default'
    CELERY_QUEUES = {
        'default': {
            'exchange': 'default',
            'exchange_type': 'topic',
            'binding_key': 'tasks.#'
        }
    }

    # …

Running the Workers
-------------------

With django-celery, the Celery workers are launched using ``manage.py`` from
the root of the application, with the command::

    python minestrone/manage.py celeryd -E -l info -c 2

Here is what each command switch does:

- ``-E`` activates events, which tells the workers to send notifications of
  what they are doing (started/finished a task, etc.);
- ``-l info`` asks the workers to log every message with a priority of "info"
  or higher;
- ``-c 2`` launches two workers ("c" as in "concurrency").

We will see how to automatically run this on dotCloud when you push your code,
but let's create some tasks to execute first.

Create Some Tasks
-----------------

Tasks in the Celery sense can be plain functions using a "@celery.task.task"
decorator or classes inherited from the "celery.task.Task" class. In a Django
application you place your tasks in a ``tasks.py`` module, side by side with
your views or models.

This tutorial defines a very simple task, called "lazy_job", in the "soup"
application:

.. code-block:: python

    # minestrone/soup/tasks.py:

    import time

    from celery.task import task

    @task(ignore_result=True)
    def lazy_job(name):
        logger = lazy_job.get_logger()
        logger.info('Starting the lazy job: {0}'.format(name))
        time.sleep(5)
        logger.info('Lazy job {0} completed'.format(name))

We give the "ignore_result=True" argument to the task decorator to tell Celery
that we don't care about the result of our task. The Celery documentation
advises this to reduce resource usage [#]_.

New "lazy_job" tasks are created from the "EditorView" view using the
"apply_async" method:

.. code-block:: python

    # minestrone/soup/views.py:

    from django.http import HttpResponseRedirect
    from django.views.generic import TemplateView, FormView
    from django import forms

    from celery.task.control import inspect

    from minestrone.soup import tasks

    # …

    class EditorView(FormView):

        # …

        def form_valid(self, form):
            name = form.cleaned_data['job_name']
            routing_key = 'tasks.{0}'.format(form.cleaned_data['routing_key_name'])
            tasks.lazy_job.apply_async(args=[name], routing_key=routing_key)
            return HttpResponseRedirect(self.get_success_url())

The "name" and "routing_key" variables are built from the form embedded in
the view. The "apply_async" method accepts several arguments, all optional.
Here we only use:

#. "args", which contains the list of arguments to forward to the "lazy_job"
   function;
#. "routing_key", which can be used to route tasks to different workers. Here
   the key will always be matched by the "default" queue defined in
   ``minestrone/settings.py``, because every key starts with "tasks." and the
   queue's binding key is "tasks.#".

The Django Celery Admin Panel
-----------------------------

Without really letting you know, we already did half of the Django and Celery
integration. The two missing things are:

- a database to store all the Celery events;
- running the ``celerycam`` daemon, which takes snapshots of the events sent
  by the workers and stores them in a database.

The database is configured from the settings file, using the dotCloud
environment:

.. code-block:: python

    # minestrone/settings.py:

    # …

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'template1',
            'USER': dotcloud_env['DOTCLOUD_DB_SQL_LOGIN'],
            'PASSWORD': dotcloud_env['DOTCLOUD_DB_SQL_PASSWORD'],
            'HOST': dotcloud_env['DOTCLOUD_DB_SQL_HOST'],
            'PORT': int(dotcloud_env['DOTCLOUD_DB_SQL_PORT']),
        }
    }

We are using PostgreSQL, but any database supported by Django would work.
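
As a minimal sketch of the database configuration above, the mapping from the dotCloud environment to a Django ``DATABASES`` entry could be factored into a small helper. The function name ``database_settings``, its ``engine``, ``name`` and ``prefix`` parameters, and the sample values are illustrative assumptions; they are not part of the example application:

```python
def database_settings(env,
                      engine='django.db.backends.postgresql_psycopg2',
                      name='template1',
                      prefix='DOTCLOUD_DB_SQL_'):
    """Build one Django DATABASES entry from a dotCloud-style environment dict.

    Hypothetical helper: only the key names and the default engine and
    database name mirror the tutorial's settings.py.
    """
    return {
        'ENGINE': engine,
        'NAME': name,
        'USER': env[prefix + 'LOGIN'],
        'PASSWORD': env[prefix + 'PASSWORD'],
        'HOST': env[prefix + 'HOST'],
        # settings.py converts the port with int(), so we do the same here.
        'PORT': int(env[prefix + 'PORT']),
    }


# Made-up sample values standing in for the content of environment.json.
sample_env = {
    'DOTCLOUD_DB_SQL_LOGIN': 'root',
    'DOTCLOUD_DB_SQL_PASSWORD': 'secret',
    'DOTCLOUD_DB_SQL_HOST': 'db.example.invalid',
    'DOTCLOUD_DB_SQL_PORT': '5432',
}

DATABASES = {'default': database_settings(sample_env)}
```

In ``settings.py`` the dictionary loaded from ``environment.json`` would be passed instead of ``sample_env``. A missing key raises ``KeyError`` immediately, which makes a misnamed service easy to spot.
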
The command to launch the ``celerycam`` daemon is::

    python minestrone/manage.py celerycam

As for the Celery worker, we will see in the next sections how to
automatically execute this command when you push to dotCloud.

For your reference, the other parts of the Django Celery integration were:

- Celery is configured from the Django ``settings.py`` file (instead of the
  usual ``celeryconfig.py`` file);
- ``celeryd`` is invoked through Django's ``manage.py`` command instead of
  being run directly from the shell;
- You have to pass the ``-E`` argument to ``celeryd`` to collect worker
  events.

dotCloud Specific Details
-------------------------

We have already seen how we use the ``environment.json`` file to configure
Celery and PostgreSQL automatically. It is not the only dotCloud specific
detail. There are also:

- A dotCloud build file: ``dotcloud.yml``;
- A Supervisor configuration include file, ``supervisord.conf``, that is
  installed from the ``postinstall`` hook;
- A ``wsgi.py`` file that bridges the web server and Django;
- An ``nginx.conf`` that defines a couple of locations where the Django static
  and media files are stored.

For the Python dependencies, dotCloud follows the ``requirements.txt``
convention. This file contains::

    Django
    django-celery
    setproctitle

These dependencies will be installed by `pip `_ (an ``easy_install``
replacement) from `PyPI `_ when the application is pushed to dotCloud.

The setproctitle package is an optional dependency of Celery. When it is
installed, Celery can display useful information in the process title instead
of the command line. You can see it in action if you log into the ``workers``
service and run ``ps aux``.

The setup of the Django static and media files and of ``wsgi.py`` is already
covered in the :doc:`django` tutorial, so only the ``dotcloud.yml`` and
``supervisord.conf`` are detailed here.
The dotCloud Build File
~~~~~~~~~~~~~~~~~~~~~~~

The dotCloud Build File is straightforward and just describes the
architecture of our application on dotCloud, with its three services:

- a web server pre-configured for Python WSGI-compatible applications like
  Django;
- a service to launch workers;
- a PostgreSQL server.

``dotcloud.yml``:

.. code-block:: yaml

    www:
      type: python
    workers:
      type: python-worker
    db:
      type: postgresql

The Supervisor Configuration Include
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Supervisor configuration include, named ``supervisord.conf``, is used to
launch and monitor the Celery workers and the ``celerycam`` daemon:

.. code-block:: ini

    [program:djcelery]
    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celeryd -E -l info -c 2
    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log

    [program:celerycam]
    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celerycam
    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log

You may have recognized the .ini file format. Let's break it down:

.. code-block:: ini

    [program:djcelery]

Defines a new background process configuration block. The process is called
"djcelery" here.

.. code-block:: ini

    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celeryd -E -l info -c 2

These two lines tell Supervisor how the background process should be
launched:

- In the directory ``/home/dotcloud/current``, which is where the
  :ref:`dotCloud builder ` will install your application and thus where the
  Django application lives;
- Using the command: "python minestrone/manage.py celeryd -E -l info -c 2".

.. code-block:: ini

    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log

These two lines redirect the output of the workers into two log files;
"%(program_name)s" will be replaced by "djcelery". If you omit these two
lines, Supervisor will create the log files for you, but with a *less
readable* (random) name.

The exact same thing is repeated for the ``celerycam`` daemon.

This configuration include is generated from the :doc:`/guides/hooks` when
the script runs on the "workers" service:

.. code-block:: sh

    # postinstall:

    # …

    dotcloud_get_env() {
        sed -n "/$1/ s/.*: \"\(.*\)\".*/\1/p" < "/home/dotcloud/environment.json"
    }

    setup_django_celery() {
        cat > /home/dotcloud/current/supervisord.conf << EOF
    [program:djcelery]
    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celeryd -E -l info -c 2
    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log

    [program:celerycam]
    directory = /home/dotcloud/current/
    command = python minestrone/manage.py celerycam
    stderr_logfile = /var/log/supervisor/%(program_name)s_error.log
    stdout_logfile = /var/log/supervisor/%(program_name)s.log
    EOF
    }

    if [ `dotcloud_get_env DOTCLOUD_SERVICE_NAME` = workers ] ; then
        setup_django_celery
    # …

Conclusion
----------

Django and Celery are very easy to get running on dotCloud. Let's review what
we did here:

#. Wrote a ``dotcloud.yml`` with all the services we need;
#. Set up a RabbitMQ server and added the ENV variables to our application;
#. Configured Celery and PostgreSQL in ``settings.py``;
#. Defined some tasks in the ``tasks.py`` module of the Django application;
#. Launched some :doc:`workers ` by using a ``supervisord.conf`` file on the
   :doc:`/services/python-worker/` service;
#. Enqueued some tasks using their "apply_async" method from a Django view.

The example application lives on: http://django-celery.dotcloudapp.com/.
You can clone the code from https://bitbucket.org/lopter/dj-celery/, "cd" into
it, create your application with the :doc:`flavor ` of your choice and push
it to dotCloud::

    dotcloud create djcelery
    dotcloud push

While you are fiddling with the web interface, you can see the jobs being
performed in the worker logs::

    dotcloud logs workers

You can also have a look at the Django administration panel at:
http:///admin/. The user name to use is "admin" with the default password
(configured in ``mkadmin.py``): "password".

Celery is very powerful, especially when coupled with RabbitMQ. You are highly
encouraged to take a look at its excellent documentation:
http://celery.readthedocs.org/en/latest/.

----

.. [#] The Minestrone soup is a famous Italian dish that contains some Celery…
.. [#] http://celery.readthedocs.org/en/latest/userguide/tasks.html#tips-and-best-practices