#Django #TransactionTestCase with REUSE_DB=1 of #django-nose

Recently, I found out that Django’s TransactionTestCase leaves test data in the database after the test case has run. The database is not flushed until the _pre_setup method of the next TransactionTestCase instance is executed. This is troublesome when tests are run with django-nose’s test runner with REUSE_DB=1, because the test database, along with any data left in it, is kept between runs.

An easy fix is to customize TransactionTestCase so that it deletes the test data on exit. I wrote a simple wrapper around Django’s TransactionTestCase, which I now extend when writing other transaction test cases.

from django.core import management
from django.test import TransactionTestCase
from django.db import connections, DEFAULT_DB_ALIAS

def flushdb(cls):
    if getattr(cls, 'multi_db', False):
        databases = connections
    else:
        databases = [DEFAULT_DB_ALIAS]
    for db in databases:
        management.call_command('flush', verbosity=0,
            interactive=False, database=db)

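class BaseTransactionTestCase(TransactionTestCase):
    @classmethod
    def tearDownClass(cls):
        # Flush the test data as soon as this test case class is done,
        # then let the parent class do its own cleanup.
        flushdb(cls)
        super(BaseTransactionTestCase, cls).tearDownClass()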

Validate Python string translation in Transifex

Transifex already supports validating translations of old-style Python format strings, e.g.,

"A sample string with a %(keyword)s argument." % {'keyword': 'key word'}

The validation is done by checking if all the positional and keyword arguments are present in the translation string and the translation string does not contain any extra argument which is not in the source string. You can have a look at the validator code here.
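The check described above can be sketched roughly as follows. This is a minimal illustration and not the Transifex validator itself: the PRINTF_RE pattern and the specifiers() and validate() helpers are hypothetical names, and the pattern only covers the common %s-style and %(name)s-style specifiers.

```python
import re
from collections import Counter

# Hypothetical, simplified pattern for old-style format specifiers.
PRINTF_RE = re.compile(
    r'%(?:\((?P<key>\w+)\))?'      # optional mapping key, e.g. %(keyword)s
    r'[-+#0 ]*'                    # optional flags
    r'(?:\d+|\*)?'                 # optional width
    r'(?:\.(?:\d+|\*))?'           # optional precision
    r'[hlL]?'                      # optional length modifier
    r'(?P<type>[diouxXeEfFgGcrs%])'
)


def specifiers(text):
    """Count the format specifiers in text, ignoring the literal '%%'."""
    return Counter(
        m.group(0) for m in PRINTF_RE.finditer(text)
        if m.group('type') != '%'
    )


def validate(source, translation):
    """Return (missing, extra) specifiers of translation relative to source."""
    src, trans = specifiers(source), specifiers(translation)
    missing = sorted((src - trans).elements())
    extra = sorted((trans - src).elements())
    return missing, extra
```

A translation passes when both returned lists are empty, that is, when it uses exactly the specifiers of the source string.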

However, the existing validator is not able to check for replacement fields in new-style Python format strings, e.g.

"This is a sample string with different replacement fields: {0} {1} {foo[bar]:^30}".format(
"arg0", "arg1", foo={"bar":"a kwarg"})

I tried to devise a regex to extract the replacement fields in the Python format string based on the grammar defined here.

# Regex to find format specifiers in a Python string

import re

field_name = '(?P<field_name>(?P<arg_name>\w+|\d+){0,1}'\
                '(?:(?P<attribute_name>\.\w+)|'\
                '(?P<element_index>\[(?:\d+|(?:[^\]]+))\]))*)'
conversion = '(?P<conversion>r|s)'
align = '(?:(?P<fill>[^}{]?)(?P<align>[<>^=]))'
sign = '(?P<sign>[\+\- ])'
width = '(?P<width>\d+)'
precision = '(?P<precision>\d+)'
type_ = '(?P<type_>[bcdeEfFgGnosxX%])'
format_spec = ''\
    '(?P<format_spec>'\
        '%(align)s{0,1}'\
        '%(sign)s{0,1}#?0?'\
        '%(width)s{0,1},?'\
        '(?:\.%(precision)s){0,1}'\
        '%(type)s{0,1}'\
    ')' % {
        'align': align,
        'sign': sign,
        'width': width,
        'precision': precision,
        'type': type_
}
replacement_field = ''\
    '\{'\
    '(?:'\
        '%(field_name)s{0,1}'\
        '(?:!%(conversion)s){0,1}'\
        '(?:\:%(format_spec)s){0,1}'\
    ')'\
    '\}' % {
        'field_name': field_name,
        'conversion': conversion,
        'format_spec': format_spec
}

printf_re = re.compile(
    '(?:' + replacement_field + '|'
        '%((?:(?P<ord>\d+)\$|\((?P<key>\w+)\))?(?P<fullvar>[+#-]*(?:\d+)?'
            '(?:\.\d+)?(hh\|h\|l\|ll)?(?P<type>[\w%])))'
    ')'
)

Well, with the above, I was able to parse almost all the cases discussed here except for this one:

import datetime
d = datetime.datetime(2010, 7, 4, 12, 15, 58)
s = '{:%Y-%m-%d %H:%M:%S}'.format(d)

I was not sure how to fit the above case into my regex. After some discussion in #python on IRC, I learned about some limitations of regular expressions, including that they are not Turing complete. People suggested that I use a parser tool instead.

Being a strong supporter of “never reinvent the wheel”, I made another attempt to find an existing solution, and I was lucky to come across the _formatter_parser() method of a Python string object. It correctly found all the replacement fields in Python format strings and returned an iterable of tuples (literal_text, field_name, format_spec, conversion). All I needed then was to convert this info into a list of replacement fields in a format string. The simple script below is all that I needed to extract the replacement fields from a format string in Python:

replacement_fields = []
s = "{foo:^+30f} bar {0} foo {} {time:%Y-%m-%d %H:%M:%S}"

for literal_text, field_name, format_spec, conversion in \
        s._formatter_parser():
    if field_name is not None:
        replacement_field = field_name
        if conversion is not None:
            replacement_field += '!' + conversion
        if format_spec:
            replacement_field += ':' + format_spec
        replacement_field = '{' + replacement_field + '}'
        replacement_fields.append(replacement_field)
print replacement_fields
# Output:
# ['{foo:^+30f}', '{0}', '{}', '{time:%Y-%m-%d %H:%M:%S}']
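Note that _formatter_parser() is a private method of Python 2 strings. On Python 3, the documented string.Formatter.parse() yields the same kind of tuples, so the script above could be sketched as below. The replacement_fields() and same_fields() helpers are hypothetical names used for illustration, and same_fields() is just one possible way to compare a source string with its translation.

```python
from string import Formatter


def replacement_fields(format_string):
    """Return the replacement fields of a new-style format string, in order."""
    fields = []
    for literal_text, field_name, format_spec, conversion in \
            Formatter().parse(format_string):
        if field_name is None:
            # Trailing literal text with no replacement field after it.
            continue
        field = field_name
        if conversion is not None:
            field += '!' + conversion
        if format_spec:
            field += ':' + format_spec
        fields.append('{' + field + '}')
    return fields


def same_fields(source, translation):
    """Hypothetical check: both strings use the same fields, in any order."""
    return (sorted(replacement_fields(source)) ==
            sorted(replacement_fields(translation)))
```

For example, same_fields("Hello {name}", "Bonjour {name}") is True, while dropping {name} from the translation makes it False.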

That’s all. Simple and easy, isn’t it?

App specific logging in Transifex

Yesterday, I was working on adding app specific loggers in Transifex. By app specific logger I mean a logger which shows the app name which generated the log. As of now, the logs in Transifex look something like this:


2012-06-29 13:01:43,300 tx DEBUG Saved: Project Avant Window Navigator
2012-06-29 13:01:43,312 tx DEBUG Saved: Project Switchdesk
2012-06-29 13:01:43,324 tx DEBUG Saved: Project Usermode
2012-06-29 13:01:43,342 tx DEBUG Saved: Project desktop-effects
2012-06-29 13:01:43,349 tx DEBUG Saved: Project im-chooser
2012-06-29 13:01:43,355 tx DEBUG Saved: Project Test Project
2012-06-29 13:01:43,364 tx DEBUG Saved: Project Test Private Project
2012-06-29 13:01:45,704 tx DEBUG Saved: Project Test Project
2012-06-29 13:01:45,717 tx DEBUG Saved: Project Test Private Project
2012-06-29 13:01:45,731 tx DEBUG Resource Resource1: New ResourcePriority created.

It does not tell us anything about which app generated the logs. At first glance, fixing this looks pretty straightforward, if dumb. All it needs is to customize this https://github.com/transifex/transifex/tree/devel/transifex/txcommon/log module for each app and, instead of importing the logger from txcommon.log, import it from the log module inside the app.
But this would lead to a lot of code duplication and a lot of boring changes in the code. So, I decided to customize the transifex.txcommon.log module itself so that it can detect the caller of the logger. It was pretty straightforward to do this for the handler at https://github.com/transifex/transifex/blob/devel/transifex/txcommon/log/receivers.py#L6: def model_named() in the following way:

import re

tx_module_regex = re.compile(
    r'transifex(\.addons)?\.(?P<app_name>\w+)(\..*)?')
def model_named(sender, message='', **kwargs):
    """
    Receive signals for objects with a .name attribute.
    """
    from txcommon.log import _logger as logger
    sender_module = sender.__module__
    m = tx_module_regex.search(sender_module)
    app_name = '.' + m.group('app_name') if m else ''
    logger.name = 'tx' + app_name
    obj = kwargs['instance']
    logger.debug("%(msg)s %(obj)s %(name)s" %
            {'msg': message,
            'obj': sender.__name__,
            'name': getattr(obj, 'name', '')})

sender is the class that sent the signal for which the log is being generated. In our case, it’s a model class, and the saved object itself arrives as kwargs['instance']. So, sender.__module__ gives the module in which the sender is defined. Using a regular expression, we extract the app name from the module name and set the name of the logger to 'tx.<app_name>'. And we are done here (for now)! But when we do something like

from transifex.txcommon.log import logger
logger.debug('foo bar')

we do not have a sender to tell us the name of the calling module. After some searching, I found out about Python’s inspect module, and all I needed was inspect.stack(). Here’s what I did in https://github.com/transifex/transifex/tree/devel/transifex/txcommon/log/__init__.py:

  1. Write a wrapper around logger instance,
  2. find the caller calling the logger using inspect.stack(),
  3. accordingly set the logger name,
  4. and finally, log the event.

import logging, re, inspect

_logger = logging.getLogger('tx')

# regex to extract app name from a file path to a TXC app
tx_app_path_regex = re.compile(
    r'txc/transifex(/addons)?/(?P<app_name>\w+)/(\..*)?')

class Logger:
    """
    A wrapper class around _logger. This is used to log events
    along with app names.
    """
    @classmethod
    def get_app_name_from_path(cls, path):
        """
        Extracts app name from a file path to a TXC app

        Args:
            path: A string for the file path
        Returns:
            A string for the app name or ''
        """
        m = tx_app_path_regex.search(path)
        return m.group('app_name') if m else ''

    @classmethod
    def set_logger_name(cls):
        """
        Sets logger name to show calling app's name.
        """
        # inspect.stack()[2] since cls.debug() has now become the
        # immediate caller of this method in the stack. We want the caller
        # of cls.debug() or of the other logging method wrappers.
        caller_module_path = inspect.stack()[2][1]
        app_name = cls.get_app_name_from_path(caller_module_path)
        # Parenthesized so that the name falls back to 'tx' (and not '')
        # when no app name could be extracted.
        _logger.name = 'tx' + ('.%s' % app_name if app_name else '')

    @classmethod
    def debug(cls, *args, **kwargs):
        """Wrapper for _logger.debug"""
        cls.set_logger_name()
        _logger.debug(*args, **kwargs)

    # And similarly for other logger methods like info(), warning(), error(), critical()

logger = Logger

Now, this is sweet! No one needs to bother about logging events with app names. I am saved from editing hundreds of files and duplicating code 😉 It’s transparent and scalable. The logs now look like:

2012-06-29 20:39:03,635 tx.projects DEBUG Saved: Project Foo Project
2012-06-29 20:39:05,575 tx.projects DEBUG Saved: Project Avant Window Navigator
2012-06-29 20:39:05,587 tx.projects DEBUG Saved: Project Switchdesk
2012-06-29 20:39:05,599 tx.projects DEBUG Saved: Project Usermode
2012-06-29 20:39:05,612 tx.projects DEBUG Saved: Project desktop-effects
..........
..........
..........
2012-06-29 22:15:07,088 tx.webhooks DEBUG Project project1 has no web hooks
2012-06-29 22:15:07,177 tx.releases DEBUG Deleted: ReleaseNotifications
2012-06-29 22:15:07,177 tx.releases DEBUG Deleted: Release All Resources
2012-06-29 22:15:07,466 tx.txcommon DEBUG Running low-level command 'msgfmt -o /dev/null --check-format --check-domain -'
2012-06-29 22:15:07,469 tx.txcommon DEBUG CWD: '/home/rtnpro/transifex/rtnpro/github/txc/transifex'
2012-06-29 22:15:07,661 tx.releases DEBUG release: Checking string freeze breakage.
2012-06-29 22:15:07,702 tx.resources DEBUG resource: Checking if resource translation is fully reviewed: Test Project: Resource1 (pt_BR)
2012-06-29 22:15:07,707 tx.webhooks DEBUG Project project1 has no web hooks
2012-06-29 22:15:07,740 tx.resources DEBUG resource: Checking if resource translation is fully reviewed: Test Project: Resource1 (ar)
2012-06-29 22:15:07,745 tx.webhooks DEBUG Project project1 has no web hooks
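The name derivation used by the wrapper can be tried in isolation. The sketch below reuses the path regex from above. The logger_name() and caller_filename() helpers and the sample paths are hypothetical, added only to show how a file path maps to a logger name and how inspect.stack() exposes the caller’s file.

```python
import inspect
import re

# Same regex as in the wrapper: extracts the app name from a file path.
tx_app_path_regex = re.compile(
    r'txc/transifex(/addons)?/(?P<app_name>\w+)/(\..*)?')


def logger_name(path):
    """Map a source file path to 'tx.<app_name>', or plain 'tx'."""
    m = tx_app_path_regex.search(path)
    app_name = m.group('app_name') if m else ''
    return 'tx' + ('.%s' % app_name if app_name else '')


def caller_filename():
    """Return the file name of whoever called this function."""
    # stack()[0] is this function's own frame; stack()[1] is its caller.
    # Index 1 of a frame record is the file name.
    return inspect.stack()[1][1]


# Hypothetical paths, for illustration only.
for path in ('/home/user/txc/transifex/projects/models.py',
             '/home/user/txc/transifex/addons/webhooks/handlers.py',
             '/tmp/script.py'):
    print(path, '->', logger_name(path))
```

A path outside the txc/transifex tree falls back to the plain 'tx' logger name.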

Thanks for reading. If you have any suggestions or queries, please feel free to comment.

Add plug-n-play functionality to your Django project using Django-addons

What is Django-addons?

A Django app used to add true plug-n-play functionality to your own Django applications and projects. Django-addons is brought to you by Indifex, the company behind Transifex.

Django-addons is a bunch of code that makes writing addons/plugins for your Django project much easier. Add django-addons to your Django project and you can drop all your addons into the ‘/addons’ directory.

How to install Django-addons?

You can install the latest version of django-addons running
pip install django-addons
or
easy_install django-addons

You can also install the development version of django-addons with
pip install django-addons==dev
or
easy_install django-addons==dev

Source code

http://code.indifex.com/django-addons/

Features

  • Addons overview page
  • Automatic signal connecting of addons
  • Automatic URL discovery of addons
  • Template hooking system (inject code from addons to your main project)
  • Django-staticfiles to serve site media from each addon
  • Django-notifications support (automatic registration of noticetypes)
  • Per addon localization
  • Per addon settings
  • Disabling addons via ./manage.py addons

A brief introduction to coverage.py

Coverage.py is a tool for measuring code coverage of Python programs. It monitors your program, noting which parts of the code have been executed, then analyzes the source to identify code that could have been executed but was not.

Coverage measurement is typically used to gauge the effectiveness of tests. It can show which parts of your code are being exercised by tests, and which are not.
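To get a feel for what “noting which parts of the code have been executed” means, here is a toy sketch built on Python’s sys.settrace(). It only illustrates the idea of line tracing and is not how coverage.py is implemented. The run_and_record_lines() helper and the classify() function are hypothetical.

```python
import sys


def run_and_record_lines(func, *args):
    """Run func and return the executed line offsets, relative to its def line."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None          # do not trace other functions
        if event == 'line':
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer            # keep tracing lines in this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)   # always restore the previous tracer
    return executed


def classify(n):
    if n < 0:
        return 'negative'
    return 'non-negative'


# With a non-negative argument, the "return 'negative'" line (offset 2)
# is never reached, which is exactly what a coverage report would flag.
print(sorted(run_and_record_lines(classify, 5)))
```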

Getting started:

1. Install coverage:

  • pip install coverage
  • easy_install coverage

2. Use coverage to run your program and gather data:

$ coverage run my_program.py arg1 arg2
blah blah ..your program's output.. blah blah

3. Generate reports with coverage:

$ coverage report -m

Name                      Stmts   Miss  Cover   Missing
-------------------------------------------------------
my_program                   20      4    80%   33-35, 39
my_other_module              56      6    89%   17-23
-------------------------------------------------------
TOTAL                        76     10    87%

4. You can also use coverage to generate reports in other presentation-oriented formats like HTML:

$ coverage html

You can also use coverage.py with Django. You can run your Django tests under coverage to check which code in your app is exercised by your tests. With the coverage data, you can write new tests for the code that your tests have not covered so far. For example:

  • $ coverage erase        # Delete previous coverage data
  • $ coverage run manage.py test foo_app.FooTest.foo_method        # Run manage.py under coverage to collect coverage data
  • $ coverage report -m | grep 'foo_app'      # Filter the report to show the coverage of foo_app
  • $ coverage run --include="*foo_app*" --omit="*tests*" manage.py test foo_app        # Measure only files matching the *foo_app* pattern and skip files matching the *tests* pattern

You can also write custom Test Runners using the coverage API to measure code coverage in a more controlled manner. You can find more detailed documentation about coverage here.

Testing coverage of your Django code

Just writing tests for your Django codebase is not enough. You need to check how much of the code is covered in your tests. For this, there are some tools available. Again, it is not just the number of lines of code tested that matters. What matters is “Are these lines important?”. Well, for this, we have to use our head.

coverage is a tool for checking the test coverage of Python applications. django-test-coverage was built on top of coverage.py to meet the requirements of Django tests.

I was able to plug django-test-coverage into Transifex. For some test cases it ran; for others, it failed. It raised a warning saying that a module was being imported more than once. Worse, the stats it generated were misleading: it reported 100% coverage only for files with 0 lines of code (like __init__.py) and 0% for every other file. I hacked on its code and was able to run it for the cases where it had failed previously, but the statistics were still misleading. Time was running out, so I decided to revisit its code some time later.

I resorted to using coverage.py. You can find an introduction to coverage.py here. Using coverage boils down to 3 steps:

  1. Erase previously collected data: $ coverage erase
  2. Execute the necessary tests and collect data: $ coverage run <test module>
  3. Display the result: $ coverage report -m

If the test module is large, the report generated by coverage is also large. I usually save the report to a file: $ coverage report -m > report.txt. Then I can use grep to narrow the report down to the files that concern me at the moment. That’s pretty easy.
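The same filtering can also be done in Python when grep is not enough, for example to sort or threshold by coverage. The sketch below assumes the column layout of the text report shown in the coverage.py introduction (name, statements, misses, cover, missing lines). The parse_report() and files_matching() helpers are hypothetical, and SAMPLE_REPORT simply repeats that sample report.

```python
SAMPLE_REPORT = """\
Name                      Stmts   Miss  Cover   Missing
-------------------------------------------------------
my_program                   20      4    80%   33-35, 39
my_other_module              56      6    89%   17-23
-------------------------------------------------------
TOTAL                        76     10    87%
"""


def parse_report(text):
    """Parse the rows of a coverage text report into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 4)
        # Data rows look like: name, statements, misses, "NN%", [missing]
        if len(parts) < 4 or not parts[3].endswith('%'):
            continue  # header, separator, or blank line
        rows.append({
            'name': parts[0],
            'stmts': int(parts[1]),
            'miss': int(parts[2]),
            'cover': int(parts[3].rstrip('%')),
            'missing': parts[4] if len(parts) > 4 else '',
        })
    return rows


def files_matching(text, pattern):
    """Keep only the rows whose file name contains pattern (like grep)."""
    return [row for row in parse_report(text) if pattern in row['name']]


for row in files_matching(SAMPLE_REPORT, 'my_'):
    print(row['name'], row['cover'], row['missing'])
```

In practice, the text would come from the saved file, e.g. open('report.txt').read().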

coverage.py gives you very useful information such as the percentage of code covered in a file, the missing statements, etc. A higher coverage percentage is better, but the importance of the covered lines also matters a lot. You could increase the code coverage by covering 10 not-so-important lines rather than 1 important line. So, code coverage statistics help us write tests that cover more code, but they are not a replacement for thinking. The final judgement is ours to make.