Python Dates and Times Cheatsheet

Python (2.7) datetimes are a little frustrating because there are many ways to do each manipulation and many applicable third-party libraries. Below are code snippets that I think show the best solution for each task. While I appreciate the concept of minimizing dependencies, some third-party libraries are so useful that I just include them as requirements on all my projects.

Aware UTC Now


import datetime

import pytz

now = pytz.utc.localize(datetime.datetime.utcnow())
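
On Python 3 the same aware value is available without pytz. This is a minimal sketch of that standard-library alternative, assuming Python 3.2 or later. It is a swapped-in technique, not the pytz call above.

```python
import datetime

# Standard library only (Python 3.2+): request the current time with UTC
# attached, instead of localizing a naive utcnow() value afterwards.
now = datetime.datetime.now(datetime.timezone.utc)

# A datetime is "aware" when tzinfo is set and utcoffset() returns a value.
is_aware = now.tzinfo is not None and now.utcoffset() is not None
print(now.isoformat(), is_aware)
```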

Aware Now in Timezone

The code below will create an aware time in US/Pacific time regardless of the time zone the machine is running in.


import datetime

import pytz

now = datetime.datetime.now(pytz.timezone('US/Pacific'))

Django Long Running Processes

The most commonly suggested solution for long running processes is to use Celery. I suspect that if you need scalability, high volume, and so on, Celery is the best solution. That said, I have been down the Celery rabbit hole more than once. It has never been pleasant. Since my needs are more modest, maybe there is a better alternative?

My needs involve running a process that might take 15 minutes or so. The process might run a dozen times a day and be launched by as many users. It must be launchable from the website by authorized users.

I have solved this problem by going to the polar opposite of Celery – cron. Every minute cron would launch a custom Django command. That command would look in a database table for tasks and get input data from the database. When a task was completed, that fact was written to the database table. Honestly, this approach has worked well. Nevertheless, I always wonder if there is a stable, simple, robust solution that lies somewhere between cron and Celery.
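
The cron-plus-table pattern described above can be sketched roughly as follows. This is an illustration only: sqlite3 stands in for the Django ORM, and the table, the column names, and the functions create_task_table and run_pending_tasks are hypothetical, not the original command.

```python
import sqlite3


def create_task_table(conn):
    # Hypothetical schema: one row per requested task.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task ("
        "id INTEGER PRIMARY KEY, "
        "payload TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', "
        "result TEXT)"
    )
    conn.commit()


def run_pending_tasks(conn, handler):
    """Run every pending task once and record the outcome in the table."""
    rows = conn.execute(
        "SELECT id, payload FROM task WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    completed = 0
    for task_id, payload in rows:
        # Claim the row first so an overlapping cron run skips it.
        cursor = conn.execute(
            "UPDATE task SET status = 'running' "
            "WHERE id = ? AND status = 'pending'",
            (task_id,),
        )
        conn.commit()
        if cursor.rowcount == 0:
            continue
        result = handler(payload)
        conn.execute(
            "UPDATE task SET status = 'done', result = ? WHERE id = ?",
            (result, task_id),
        )
        conn.commit()
        completed += 1
    return completed


# Usage: the website inserts a row; the cron-launched command processes it.
conn = sqlite3.connect(":memory:")
create_task_table(conn)
conn.execute("INSERT INTO task (payload) VALUES (?)", ("monthly report",))
conn.commit()
completed = run_pending_tasks(conn, lambda payload: payload.upper())
```

Claiming the row before doing the work matters because a 15 minute task will still be running when cron fires again a minute later.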

Maybe RedisRQ and Django RQ? These are my notes so that a year from now, when this issue comes up again, I can get up to speed quickly.

Step 1: Install Redis and Start Redis Server

These instructions are pretty good.

Step 2: Is Redis Server Working?

Maybe you installed Redis Server a long time ago and you want to see if it’s still working? Go here.

Or you could type:

$ redis-cli ping
PONG

Step 3: Install RQ

pip install rq

Step 4: Install and Configure django-rq

Go here.

Step 5: Read the RQ Docs

Seriously – read the RQ docs. They are brief and to-the-point.

Step 6: Daemonize the RQ Workers

If you use supervisord, here is the Ansible template I use to do that:

[program:django_rq]
command= {{ virtualenv_path }}/bin/python manage.py rqworker high default low
stdout_logfile = /var/log/redis/redis_6379.log

numprocs=1

directory={{ django_manage_path }}
environment = DJANGO_SETTINGS_MODULE="{{ django_settings_import }}",PATH="{{ virtualenv_path }}/bin"
user = vagrant
stopsignal=TERM

autostart=true
autorestart=true


[program:rqscheduler]
command={{ virtualenv_path }}/bin/python manage.py rqscheduler
stdout_logfile = /var/log/redis/rq_scheduler.log

numprocs=1

directory={{ django_manage_path }}
environment = DJANGO_SETTINGS_MODULE="{{ django_settings_import }}",PATH="{{ virtualenv_path }}/bin"
user = vagrant
stopsignal=TERM

autostart=true
autorestart=true

Reflections

I do not recall all the problems I had with Celery. After reviewing the RQ solution above, it is clear that one of the advantages of that solution is that the documentation is really good. Or at least it clearly and directly addressed what I was trying to do.

Additionally, I wish I had implemented this a long time ago. It is so easy to use. And it’s so freeing to be able to run long processes.

Reflections part Deux, Troubleshooting and Gotchas

It’s coming back to me. The supervisor config in the original post started the daemon OK. But it turns out there was an error in the config that caused the queued processes to fail. Finding and fixing that bug was a pain in the ass. Maybe my troubles with Celery were really troubles with supervisor? Down the rabbit hole we go!

It turns out that the Django RQ Queue Statistics are helpful for debugging. They show failed tasks along with a Python traceback! Very nice. In my case, I was getting the error:

ImportError: No module named XXXX

Clearly one of my paths in the supervisor conf file was wrong. Time to start hacking:

  1. Edit conf file
  2. Run supervisorctl stop django_rq
  3. Run supervisorctl start django_rq
  4. Queue a task
  5. It failed again? How is that possible? Back to step 1

GOTCHA! After a while you notice the changes you are making are not having any effect. And then you recall that to reload the config file you must run:

service supervisor restart

Now my config file works. All I have to do is figure out which of the ever-kludgier hacks I made can be removed. The config file above has been updated.

Son of Reflections part Deux – Adding PATH to Supervisor Config

I thought I had it working. Then when I added a slightly more complex task that interacted with the database, it failed with an ImportError. After flailing around for a while, I found that adding a PATH to the supervisor environment variable solved the problem.
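
One way to see what a daemonized worker actually inherits is to queue a small diagnostic task and compare its output between the dev server and supervisor. This is a minimal sketch; describe_environment is a hypothetical helper name, not part of RQ or django-rq.

```python
import os
import sys


def describe_environment():
    """Return the interpreter, PATH, and import path the current process sees."""
    return {
        "executable": sys.executable,
        "path_env": os.environ.get("PATH", ""),
        "settings_module": os.environ.get("DJANGO_SETTINGS_MODULE"),
        "sys_path": list(sys.path),
    }


# Print one item per line so two runs are easy to diff.
for key, value in sorted(describe_environment().items()):
    print(key, value)
```

If the module named in the ImportError lives in a directory that is missing from sys_path under supervisor, the environment or directory lines in the conf file are the first place to look.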

During my flailing, I found this blog post. Lots of great ideas.

Still Falling Down the Rabbit Hole – Logging to the Rescue

Everything was working almost everywhere… except with the daemonized workers on the server. Luckily, Django-RQ now comes with logging. I implemented the logging in the Django settings files as per the docs, restarted the dev server, and… no logging. Turns out you have to restart the workers.

Also, although the docs show the use of a logging class made for rq (rq.utils.ColorizingStreamHandler), it turns out you can use logging.FileHandler, which is what you want for debugging the code when running from a daemonized worker.
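
A file-based variant of that logging setup might look like the sketch below. The log path and the formatter and handler names are assumptions for illustration. The "rq.worker" logger name follows the example in the django-rq docs.

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical log location; on a server this would be a path the worker
# user can write to.
RQ_LOG_FILE = os.path.join(tempfile.gettempdir(), "rq_worker.log")

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "rq_file": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"},
    },
    "handlers": {
        "rq_file": {
            "level": "DEBUG",
            "class": "logging.FileHandler",
            "filename": RQ_LOG_FILE,
            "formatter": "rq_file",
        },
    },
    "loggers": {
        # Logger name taken from the django-rq docs' logging example.
        "rq.worker": {"handlers": ["rq_file"], "level": "DEBUG"},
    },
}

# Django applies settings.LOGGING itself; dictConfig is called here only so
# the sketch can be run standalone.
logging.config.dictConfig(LOGGING)
logging.getLogger("rq.worker").debug("worker logging configured")
```

In a real project only the LOGGING dict would go in the settings file, and the workers need a restart before it takes effect.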

For what it’s worth, it turns out the problem was with the Python locale module. The docs say something about it not being thread safe. The function locale.getlocale() returned a value when the workers were run via the dev server, but it returned None when run from a daemonized worker.
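
Code that depends on the locale can guard against the unset case. This is a rough sketch: language_or_default and the "en_US" fallback are assumptions for illustration, not the fix used in the project above.

```python
import locale


def language_or_default(default="en_US"):
    """Return the current locale's language code, or a fallback when unset."""
    # getlocale() returns a (language code, encoding) tuple; either item
    # may be None when no locale has been configured for the process.
    language, _encoding = locale.getlocale()
    return language or default


print(language_or_default())
```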

Datepickers for Django Sites that Work on the Desktop and Mobile

As of April 2017, HTML datepickers are still a mess. Here’s how it should be: all browsers should support the date value of the <input> type attribute, and when the browser sees <input type="date" ...> it should offer up an awesome built-in date picker. Unfortunately we are far from that world.

My workaround starts with a JavaScript datepicker plugin. There are lots of choices and I have not tried them all. The one I am using is Bootstrap Datetime Picker. When using this picker, make sure to set <input type="text" ...>; this prevents the browser from simultaneously providing a native date picker. For example, when you are running in “no icon” mode, with type="text" this is what you get in Chrome:

but when type="date" you get this:

Pretty ugly. It’s even worse in IE. The JavaScript picker works pretty well on big screens, but not as well on small screens (e.g. phones).

On the small screen, it’s best to use the native datepicker. To make that happen, skip initializing the datepicker widget and change the type attribute to "date". Here is one way to do that using the Responsive Bootstrap Toolkit:

(function($, viewport){
    $(document).ready(function() {
        var datepickers = $(".bootstrapdatepickerwidget3");

        // MD and larger: initialize the JavaScript picker.
        // XS and SM fall through to the native date input below.
        if(viewport.is('>=md')) {
            datepickers.each(function(index, el) {
                $(el).datetimepicker({format: 'YYYY-MM-DD'});
            })
        } else {
            datepickers.each(function(index, el) {
                $(el).attr('type','date');
            })
        }
    });
})(jQuery, ResponsiveBootstrapToolkit);

Add a custom Django form field widget and you can make all these steps happen automatically:

from django import forms

class BootstrapDatePickerWidget(forms.DateInput):
    # noinspection PyClassHasNoInit
    class Media:
        css = {'all': [
            '3s_hts/js/bootstrap-datetimepicker-master/bootstrap-datetimepicker.min.css'
        ]}

        js = ('3s_hts/js/bootstrap-datetimepicker-master/moment.js',
              '3s_hts/js/bootstrap-datetimepicker-master/bootstrap-datetimepicker.min.js',
              '3s_hts/js/responsive-toolkit/dist/bootstrap-toolkit.min.js',
              '3s_hts/js/setup_bootstrap_datepicker.js')

class BootstrapDatePickerField(forms.DateField):
    widget = BootstrapDatePickerWidget

Grouping Choices in a Django Select

To make an ordinary Django ChoiceField you can do something like this:

choices = [[1, 'Apples'], [2, 'Oranges'], [3, 'Carrots'], [4, 'Beans']]
my_choices = forms.ChoiceField(choices=choices)

But what if you want to group the choices by using the HTML OPTGROUP tag? My first thought was to override some of the methods in the Django Select widget code. But when I inspected the code, I found that it already supports this (Django 1.8). Here is how to do it:

choices = [
    ['Fruit', [[1, 'Apples'], [2, 'Oranges']]], 
    ['Veggies', [[3, 'Carrots'], [4, 'Beans']]]
]
my_choices = forms.ChoiceField(choices=choices)
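
When the options come from flat data, the nested structure can be built programmatically. This is a minimal sketch; group_choices and the (value, label, group) row layout are hypothetical, chosen only to reproduce the list shown above.

```python
from collections import OrderedDict


def group_choices(rows):
    """Turn (value, label, group) rows into nested optgroup-style choices."""
    grouped = OrderedDict()  # keeps groups in first-seen order
    for value, label, group in rows:
        grouped.setdefault(group, []).append([value, label])
    return [[group, options] for group, options in grouped.items()]


rows = [
    (1, 'Apples', 'Fruit'),
    (2, 'Oranges', 'Fruit'),
    (3, 'Carrots', 'Veggies'),
    (4, 'Beans', 'Veggies'),
]
choices = group_choices(rows)
```

The resulting choices list can then be passed to forms.ChoiceField(choices=choices) in the same way as the hand-written one.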

Running from __main__ in PyCharm

This applies to PyCharm 2016.3.2.

Sometimes it’s useful to run a module that is part of a bigger project directly from PyCharm. To prevent that code from running when the module is imported from elsewhere, you can use something like this:

def my_module(x, y):
    return 'hello world'

if __name__ == '__main__':
    results = my_module(3, 7)

Sometimes when you do that, you will get import errors. Not a problem: just make sure the directory that contains the package that contains the module is on sys.path. To check that, put this at the top of the module:

import sys
for p in sys.path:
    print(p)

Here’s where PyCharm gets a little tricky. You would think you could just go into the Run/Debug Configurations and set the Working directory. Unfortunately that does not work. Instead you need to add an Environment variable PYTHONPATH. Like this:

[Screenshot: PyCharm Run/Debug Configurations dialog with the PYTHONPATH environment variable set]
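
To confirm that the PYTHONPATH value actually reached the interpreter, a check along these lines can be run from the module. This is a rough sketch; missing_pythonpath_entries is a hypothetical helper name.

```python
import os
import sys


def missing_pythonpath_entries():
    """List PYTHONPATH entries that are not present on sys.path."""
    raw = os.environ.get("PYTHONPATH", "")
    entries = [p for p in raw.split(os.pathsep) if p]
    on_path = set(os.path.abspath(p) for p in sys.path if p)
    return [p for p in entries if os.path.abspath(p) not in on_path]


print(missing_pythonpath_entries())
```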

 

Python Dedupe Errors

ValueError: No records have been blocked together

I got this error in Python Dedupe 1.6.1 while using Gazetteer to see if a single new record would be a duplicate. I successively replaced values in the new record dict (the keys in the dict stayed the same). Eventually Dedupe indicated a potential match. This shows that a sufficiently distinct new record will give this ValueError.
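
If that observation holds, one option is to treat this particular ValueError as "no candidates found". The sketch below assumes that interpretation is acceptable for your data. match_fn is a hypothetical stand-in for whatever Gazetteer matching call you use, and fake_match exists only to exercise the wrapper.

```python
NO_BLOCKS_MESSAGE = "No records have been blocked together"


def match_or_empty(match_fn, record):
    """Call match_fn(record); treat the 'no blocks' ValueError as no candidates."""
    try:
        return match_fn(record)
    except ValueError as exc:
        if NO_BLOCKS_MESSAGE in str(exc):
            return []
        raise  # any other ValueError is a real problem


# Stand-in for the real matching call; it always raises the error above.
def fake_match(record):
    raise ValueError("No records have been blocked together")


matches = match_or_empty(fake_match, {"name": "a very distinct record"})
```

Checking the message text keeps unrelated ValueErrors, such as malformed records, from being silently swallowed.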