Celery Debugging, Tricks, Etc…

I am not sure why, but Celery always takes me forever to get working. Here are some notes on setting it up with Django.

Celery Troubleshooting

Let's say you have Supervisor daemonizing Celery. You can confirm the Celery processes are running with:

ps aux | grep celery

But things are still not quite right. What to do? Troubleshooting goes a lot quicker if you do not run Celery as a daemon. Just cd to the directory containing manage.py and run:

celery worker -A <my_project> -E -l debug -B

Here -A names your project, -E enables task events, -l debug turns on verbose logging, and -B embeds the beat scheduler. This gives you plenty of messages and is much easier to start and stop than the daemonized version.

Restarting Supervisor

As you debug, you may discover errors in your Supervisor scripts. After changing them, you need to restart Supervisor:

service supervisor restart
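
To check what Supervisor thinks is running, supervisorctl is handy:

supervisorctl status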

Celery Beat

If you want to run periodic tasks, Celery beat is the answer. In the past, I started Celery with the -B flag in my worker command. But the docs say you should not do this in production; instead, run beat as its own process.
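
For example (a sketch, not my exact production setup; substitute your own project name), you would run two processes side by side:

celery worker -A <my_project> -l info
celery beat -A <my_project> -l info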

Celery Logging

In the Celery docs, they recommend setting up task logging like this:

from celery.utils.log import get_task_logger
logger = get_task_logger(__name__)

Easy enough. But where do those messages go? As is, you would think they would go to the root logger. If you did not configure the root logger, they should fall back to a StreamHandler writing to the console. I set up Supervisor to redirect the process output to a file, yet no Celery logging appears there.

Others have suggested that Celery mucks with the logger tree. Maybe that’s true. In any case, you can “kind of” solve it by defining a ‘celery’ logger with its own handler; get_task_logger hands back loggers under the celery hierarchy, so their messages propagate up to it. I say “kind of” because I still cannot figure out where Celery Beat tasks log to. The Django LOGGING config is below.

# settings.py
from os.path import join, normpath  # SITE_ROOT is assumed to be defined elsewhere in settings

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'simple': {
            'format': '%(levelname)s %(message)s',
            'datefmt': '%y %b %d, %H:%M:%S',
        },
    },
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse'
        }
    },
    'handlers': {
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        },
        'celery': {
            'level': 'DEBUG',
            'class': 'logging.FileHandler',
            'filename': normpath(join(SITE_ROOT, 'celery.log')),
            'formatter': 'simple',
        }
    },
    'loggers': {
        'django.request': {
            'handlers': ['mail_admins'],
            'level': 'ERROR',
            'propagate': True,
        },
        'celery': {
            'handlers': ['celery'],
            'level': 'DEBUG',
        }
    }
}
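
For reference, here is a minimal sketch of a task using that setup (the module and task names are made up). Because get_task_logger returns loggers under the celery hierarchy, these messages propagate to the ‘celery’ handler above:

# tasks.py -- illustrative only
from celery import shared_task
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@shared_task
def add(x, y):
    # Should end up in celery.log via the 'celery' logger defined in settings
    logger.debug('adding %s and %s', x, y)
    return x + y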

Celery Beat: Connection refused

In my case, this error was due to RabbitMQ being down. Running:

ps aux | grep rabbit

showed only:

/usr/lib/erlang/erts-5.8.5/bin/epmd -daemon

To start RabbitMQ, run:

sudo rabbitmq-server -detached

Now you should see a bunch of new RabbitMQ processes.
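
To double-check the broker rather than grepping ps, rabbitmqctl can report its status:

sudo rabbitmqctl status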

Stopping Celery Worker Processes

If you use Supervisor to start and stop Celery, you will notice worker processes accumulating. Evidently starting with Supervisor creates workers, but stopping does not shut them down. I tried various settings in my Supervisor config file with no luck, and Googling has not turned up an answer. So for now I am using a shell command to clean up (see the config sketch below for options that are supposed to help):

ps auxww | grep 'celery worker' | awk '{print $2}' | xargs kill -9

This is from the Celery docs.
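
For what it is worth, Supervisor does have stopasgroup and killasgroup options that are supposed to signal the whole process group on stop. A sketch of a program entry (paths and names are illustrative, and I have not confirmed this cures the leak):

[program:celery]
command=/path/to/virtualenv/bin/celery worker -A <my_project> -l info
directory=/path/to/project
stopasgroup=true
killasgroup=true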

Django Signals Example

Recently I have gotten serious about writing modular, reusable Django apps, which has led me to Django signals. In the past, I avoided signals based on the advice in “Two Scoops of Django”. Their point is that Django noobs often resort to signals when there are simpler solutions, such as overriding a model's save method. If you look at the Django signals questions on Stack Overflow, you will see that the “Two Scoops” advice is wise.

However, when it comes to modular apps, signals are the ideal way to let one module react when something happens in another. In my case, I needed one module to do something when a model was saved in another module. Here is how I did it.

Module 1 models.py

No changes required. Django models send signals automagically.
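
For context, here is a minimal stand-in for the model used below (AModel and its field are just placeholders):

# module1/models.py
from django.db import models

class AModel(models.Model):
    name = models.CharField(max_length=100)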

Module 2 Receiver

You can call this file whatever you want. I called mine receivers.py. Here is the code:

# receivers.py

from django.dispatch import receiver
from django.db.models.signals import post_save

from module1.models import AModel

@receiver(post_save, sender=AModel)
def handle_a_model_save(sender, **kwargs):
    print 'signals: a model was saved'

This is where I screwed up (which prompted this blog post), so pay attention. Django learns about your receivers via your app's urls.py file. Something like this:

# urls.py

import receivers  # this is your receivers file
from django.conf.urls import patterns

urlpatterns = patterns('',
    ...your normal urls go here...   
)

That’s it. Just import your receivers file. You do not need to add any new urlpatterns. It’s almost too easy.
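
To sanity-check the wiring, save an instance from the Django shell and watch for the print:

>>> from module1.models import AModel
>>> AModel(name='test').save()
signals: a model was saved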

I will say it again for the bots: if your Django signal receivers are not working, it may be because you never imported your receivers module, e.g. in your urls.py file.