#Django #TransactionTestCase with REUSE_DB=1 of #django-nose

Lately, I found out that Django’s TransactionTestCase leaves test data in the database after the test case has run. The database is not flushed until the _pre_setup method of the next TransactionTestCase instance is executed. This is troublesome when tests are run with Django Nose’s test runner with REUSE_DB=1.

An easy fix is to customize TransactionTestCase so that it deletes the test data on exit. I wrote a simple wrapper around Django’s TransactionTestCase, and I now extend this wrapper when writing other transaction test cases.

from django.core import management
from django.test import TransactionTestCase
from django.db import connections, DEFAULT_DB_ALIAS

def flushdb(cls):
    if getattr(cls, 'multi_db', False):
        databases = connections
    else:
        databases = [DEFAULT_DB_ALIAS]
    for db in databases:
        management.call_command('flush', verbosity=0,
            interactive=False, database=db)

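class BaseTransactionTestCase(TransactionTestCase):
    @classmethod
    def tearDownClass(cls):
        # Flush the test data as soon as this test case finishes, instead
        # of waiting for the next TransactionTestCase's _pre_setup().
        flushdb(cls)
        super(BaseTransactionTestCase, cls).tearDownClass()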

My talk got selected for #Pycon India 2012

My proposed talk, titled “Develop for an international audience”, got selected for Pycon India 2012. It’s time to start working on the slides. I am thinking of using rst to write them. Also, I have booked my flight tickets for Pycon 🙂

Thanks to everyone who voted for my talk.

Validate Python string translation in Transifex

Transifex already supports validating translations of old-style Python format strings, e.g.,

"A sample string with a %(keyword)s argument." % {'keyword': 'key word'}

The validation is done by checking if all the positional and keyword arguments are present in the translation string and the translation string does not contain any extra argument which is not in the source string. You can have a look at the validator code here.
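The check described above can be sketched in a few lines. This is only an illustration of the idea, not the Transifex validator: the regex is a simplified assumption, and the names printf_specifiers and validate_translation are hypothetical.

```python
import re
from collections import Counter

# Simplified pattern for old-style specifiers such as %s, %d, %5.2f and
# %(keyword)s. It is an illustrative assumption, not Transifex's own regex.
PRINTF_RE = re.compile(
    r'%(?:\((?P<key>\w+)\))?[#0\- +]*\d*(?:\.\d+)?(?P<type>[a-zA-Z%])')


def printf_specifiers(text):
    """Count every format specifier in text, ignoring the literal '%%'."""
    return Counter(m.group(0) for m in PRINTF_RE.finditer(text)
                   if m.group('type') != '%')


def validate_translation(source, translation):
    """Return (missing, extra) specifiers of translation relative to source."""
    expected = printf_specifiers(source)
    found = printf_specifiers(translation)
    missing = sorted((expected - found).elements())
    extra = sorted((found - expected).elements())
    return missing, extra
```

A translation passes when both returned lists are empty. Anything in the first list was dropped by the translator, and anything in the second was added.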

However, the existing validator is not able to check for replacement fields in new style Python format strings, e.g.

"This is a sample string with different replacement fields: {0} {1} {foo[bar]:^30}".format(
    "arg0", "arg1", foo={"bar": "a kwarg"})

I tried to devise a regex to extract the replacement fields in the Python format string based on the grammar defined here.

# Regex to find format specifiers in a Python string

import re

field_name = '(?P<field_name>(?P<arg_name>\w+|\d+){0,1}'\
                '(?:(?P<attribute_name>\.\w+)|'\
                '(?P<element_index>\[(?:\d+|(?:[^\]]+))\]))*)'
conversion = '(?P<conversion>r|s)'
align = '(?:(?P<fill>[^}{]?)(?P<align>[<>^=]))'
sign = '(?P<sign>[\+\- ])'
width = '(?P<width>\d+)'
precision = '(?P<precision>\d+)'
type_ = '(?P<type_>[bcdeEfFgGnosxX%])'
format_spec = ''\
    '(?P<format_spec>'\
        '%(align)s{0,1}'\
        '%(sign)s{0,1}#?0?'\
        '%(width)s{0,1},?'\
        '(?:\.%(precision)s){0,1}'\
        '%(type)s{0,1}'\
    ')' % {
        'align': align,
        'sign': sign,
        'width': width,
        'precision': precision,
        'type': type_
}
replacement_field = ''\
    '\{'\
    '(?:'\
        '%(field_name)s{0,1}'\
        '(?:!%(conversion)s){0,1}'\
        '(?:\:%(format_spec)s){0,1}'\
    ')'\
    '\}' % {
        'field_name': field_name,
        'conversion': conversion,
        'format_spec': format_spec
}

printf_re = re.compile(
    '(?:' + replacement_field + '|'
        '%((?:(?P<ord>\d+)\$|\((?P<key>\w+)\))?(?P<fullvar>[+#-]*(?:\d+)?'
            '(?:\.\d+)?(hh\|h\|l\|ll)?(?P<type>[\w%])))'
    ')'
)

Well, with the above, I was able to parse almost all the cases discussed here except for this one:

import datetime
d = datetime.datetime(2010, 7, 4, 12, 15, 58)
s = '{:%Y-%m-%d %H:%M:%S}'.format(d)

I was not sure how to fit this case into my regex. After some discussion in #python on IRC, I learned about some limitations of regular expressions, in particular that they are not Turing complete. People suggested that I use a parser tool instead.

Being a strong supporter of “never reinvent the wheel”, I took another shot at finding an existing solution, and I was lucky to come across the _formatter_parser() method of Python string objects. It found all the replacement fields in Python format strings correctly and returned an iterable of tuples (literal_text, field_name, format_spec, conversion). All I needed then was to convert this information into a list of replacement fields. The simple script below is all I needed to extract the replacement fields from a format string in Python:

replacement_fields = []
s = "{foo:^+30f} bar {0} foo {} {time:%Y-%m-%d %H:%M:%S}"

for literal_text, field_name, format_spec, conversion in \
        s._formatter_parser():
    if field_name is not None:
        replacement_field = field_name
        if conversion is not None:
            replacement_field += '!' + conversion
        if format_spec:
            replacement_field += ':' + format_spec
        replacement_field = '{' + replacement_field + '}'
        replacement_fields.append(replacement_field)
print replacement_fields
# Output:
# ['{foo:^+30f}', '{0}', '{}', '{time:%Y-%m-%d %H:%M:%S}']
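Note that _formatter_parser() is a private method of Python 2 strings. As far as I know, the same parsing is available through the public string.Formatter().parse() API, which also works on Python 3. Here is a minimal sketch of the same extraction using it (the function name extract_replacement_fields is just my choice for illustration):

```python
from string import Formatter


def extract_replacement_fields(format_string):
    """Return the replacement fields of a format string, in order."""
    fields = []
    parsed = Formatter().parse(format_string)
    for _literal_text, field_name, format_spec, conversion in parsed:
        # field_name is None for trailing literal text without a field.
        if field_name is None:
            continue
        field = field_name
        if conversion is not None:
            field += '!' + conversion
        if format_spec:
            field += ':' + format_spec
        fields.append('{' + field + '}')
    return fields


print(extract_replacement_fields(
    "{foo:^+30f} bar {0} foo {} {time:%Y-%m-%d %H:%M:%S}"))
```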

That’s all. Simple and easy, isn’t it?

A year at Transifex

I have been working at Transifex for more than a year now. It’s a great experience to be a part of the Transifex team. It’s been a roller coaster ride for me: I had to climb steep learning curves, work with new technologies, deliver great features and meet strict deadlines. It was fun, because I was part of an awesome team. I am very thankful to Apostolis, Konstantinos, John and Diego for guiding me and helping me.

During the course of this journey, I have made many friends. It’s always great to speak with the OpenTranslators folk. They all rock. I am also very much thankful to Youversion, Eventbrite, Pinterest, Dropbox folks for their queries and feedback. It makes me happy to help resolve issues with Transifex and add new features for our users. It gives me a sense of satisfaction that I am able to play my part for the greater good.

Mozilla has fascinated me for a long time. I love Mozilla, its logos, its products and its goals. At Transifex, I worked on developing new API features to let Pontoon use Transifex as a backend. Following this, I also got involved with Pontoon’s development and have made a few commits. Finally, I have contributed in some way to a Mozilla project.

A long road lies ahead. I hope it will be full of new challenges and excitement.

App specific logging in Transifex

Yesterday, I was working on adding app specific loggers in Transifex. By app specific logger I mean a logger which shows the app name which generated the log. As of now, the logs in Transifex look something like this:


2012-06-29 13:01:43,300 tx DEBUG Saved: Project Avant Window Navigator
2012-06-29 13:01:43,312 tx DEBUG Saved: Project Switchdesk
2012-06-29 13:01:43,324 tx DEBUG Saved: Project Usermode
2012-06-29 13:01:43,342 tx DEBUG Saved: Project desktop-effects
2012-06-29 13:01:43,349 tx DEBUG Saved: Project im-chooser
2012-06-29 13:01:43,355 tx DEBUG Saved: Project Test Project
2012-06-29 13:01:43,364 tx DEBUG Saved: Project Test Private Project
2012-06-29 13:01:45,704 tx DEBUG Saved: Project Test Project
2012-06-29 13:01:45,717 tx DEBUG Saved: Project Test Private Project
2012-06-29 13:01:45,731 tx DEBUG Resource Resource1: New ResourcePriority created.

They do not tell us anything about which app generated the logs. At first glance, fixing this looks pretty straightforward and dumb. All it needs is to customize this https://github.com/transifex/transifex/tree/devel/transifex/txcommon/log module for each app and, instead of importing the logger from txcommon.log, import it from the log module inside the app.
But this would lead to a lot of code duplication and a lot of boring changes in the code. So, I decided to customize the transifex.txcommon.log module itself so that it can detect the function calling the logger. It was pretty straightforward to do this for the handler at https://github.com/transifex/transifex/blob/devel/transifex/txcommon/log/receivers.py#L6: def model_named() in the following way:

import re

tx_module_regex = re.compile(
    r'transifex(\.addons)?\.(?P<app_name>\w+)(\..*)?')
def model_named(sender, message='', **kwargs):
    """
    Receive signals for objects with a .name attribute.
    """
    from txcommon.log import _logger as logger
    sender_module = sender.__module__
    m = tx_module_regex.search(sender_module)
    app_name = '.' + m.group('app_name') if m else ''
    logger.name = 'tx' + app_name
    obj = kwargs['instance']
    logger.debug("%(msg)s %(obj)s %(name)s" %
            {'msg': message,
            'obj': sender.__name__,
            'name': getattr(obj, 'name', '')})

sender is the class that sends the signal. In our case, it’s a model class, so sender.__module__ gives the module in which the model is defined. Using a regular expression, we extract the app name from the module name and set the name of the logger to ‘tx.<app_name>’. And we are done here (for now)! But when we do something like

from transifex.txcommon.log import logger
logger.debug('foo bar')

we do not have a sender to tell us the name of the calling module. After some searching, I found out about Python’s inspect module, and all I needed was inspect.stack(). Here’s what I did in https://github.com/transifex/transifex/tree/devel/transifex/txcommon/log/__init__.py:

  1. Write a wrapper around the logger instance,
  2. find the caller of the logger using inspect.stack(),
  3. set the logger name accordingly,
  4. and finally, log the event.

import logging, re, inspect

_logger = logging.getLogger('tx')

# regex to extract app name from a file path to a TXC app
tx_app_path_regex = re.compile(
    r'txc/transifex(/addons)?/(?P<app_name>\w+)/(\..*)?')

class Logger:
    """
    A wrapper class around _logger. This is used to log events
    along with app names.
    """
    @classmethod
    def get_app_name_from_path(cls, path):
        """
        Extracts app name from a file path to a TXC app

        Args:
            path: A string for the file path
        Returns:
            A string for the app name or ''
        """
        m = tx_app_path_regex.search(path)
        return m.group('app_name') if m else ''

    @classmethod
    def set_logger_name(cls):
        """
        Sets logger name to show calling app's name.
        """
        # Use inspect.stack()[2]: stack()[0] is this method and stack()[1]
        # is cls.debug() (or another logging wrapper), which calls this
        # method. We want the caller of that wrapper.
        caller_module_path = inspect.stack()[2][1]
        app_name = cls.get_app_name_from_path(caller_module_path)
        # Parenthesize the conditional so that the name falls back to 'tx'
        # (not '') when no app name is found.
        _logger.name = 'tx' + ('.%s' % app_name if app_name else '')

    @classmethod
    def debug(cls, *args, **kwargs):
        """Wrapper for _logger.debug"""
        cls.set_logger_name()
        _logger.debug(*args, **kwargs)

    # And similarly for other logger methods like info(), warning(), error(), critical()

logger = Logger
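For readers who want to try the idea outside Transifex, here is a small self-contained sketch in Python 3. It is not the Transifex code: the helper names are hypothetical, the path layout is assumed from the regex above, and instead of renaming one shared logger it asks logging.getLogger() for a child logger per app, which is a different (and, I think, slightly safer) design.

```python
import inspect
import logging
import re

# Assumed path layout, modelled on the regex used in the wrapper above.
APP_PATH_RE = re.compile(r'txc/transifex(?:/addons)?/(?P<app_name>\w+)/')


def app_name_from_path(path):
    """Return the app name found in a file path, or '' if there is none."""
    m = APP_PATH_RE.search(path)
    return m.group('app_name') if m else ''


def logger_name_for(path, root='tx'):
    """Build 'tx.<app_name>' for a path, falling back to plain 'tx'."""
    app_name = app_name_from_path(path)
    return root + ('.' + app_name if app_name else '')


def debug(msg, *args):
    """Log msg at DEBUG level under a logger named after the calling app."""
    # stack()[0] is this function; stack()[1] is whoever called debug().
    caller_path = inspect.stack()[1].filename
    logging.getLogger(logger_name_for(caller_path)).debug(msg, *args)
```

Calling debug('foo bar') from a module under txc/transifex/projects/ would log under tx.projects. From anywhere else it logs under plain tx.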

Now, this is sweet! No one needs to bother about logging events with app names. I am saved from editing hundreds of files and duplicating code 😉 It’s transparent and scalable. The logs now look like this:

2012-06-29 20:39:03,635 tx.projects DEBUG Saved: Project Foo Project
2012-06-29 20:39:05,575 tx.projects DEBUG Saved: Project Avant Window Navigator
2012-06-29 20:39:05,587 tx.projects DEBUG Saved: Project Switchdesk
2012-06-29 20:39:05,599 tx.projects DEBUG Saved: Project Usermode
2012-06-29 20:39:05,612 tx.projects DEBUG Saved: Project desktop-effects
..........
..........
..........
2012-06-29 22:15:07,088 tx.webhooks DEBUG Project project1 has no web hooks
2012-06-29 22:15:07,177 tx.releases DEBUG Deleted: ReleaseNotifications
2012-06-29 22:15:07,177 tx.releases DEBUG Deleted: Release All Resources
2012-06-29 22:15:07,466 tx.txcommon DEBUG Running low-level command 'msgfmt -o /dev/null --check-format --check-domain -'
2012-06-29 22:15:07,469 tx.txcommon DEBUG CWD: '/home/rtnpro/transifex/rtnpro/github/txc/transifex'
2012-06-29 22:15:07,661 tx.releases DEBUG release: Checking string freeze breakage.
2012-06-29 22:15:07,702 tx.resources DEBUG resource: Checking if resource translation is fully reviewed: Test Project: Resource1 (pt_BR)
2012-06-29 22:15:07,707 tx.webhooks DEBUG Project project1 has no web hooks
2012-06-29 22:15:07,740 tx.resources DEBUG resource: Checking if resource translation is fully reviewed: Test Project: Resource1 (ar)
2012-06-29 22:15:07,745 tx.webhooks DEBUG Project project1 has no web hooks

Thanks for reading. If you have any suggestions or queries, please feel free to comment.

FUDCON KL Day 3

The 3rd day of FUDCON KL started a bit sluggishly for me, maybe because of brainstorming and hacking till late at night. We (Kushal, Soumya and I) decided to work on a new app to display system logs in a user-friendly manner. We named the application Tower log tower, tlogt in short, after the Twin Towers of Kuala Lumpur 😉

During the first few hours of the day, we went to visit some tourist spots in Kuala Lumpur: Aquaria and the Petronas Towers. After we returned, we settled down for the ongoing talks. Amid the various talks on the 3rd day of FUDCON, I was sometimes in listening mode, but most of the time I was in coding mode. We decided to try something different in TlogT. The UI would be rendered by a Django daemon with all the WOW factor of HTML, CSS and JS. I was to write the Django server code, while Kushal and Soumya were writing the parsers for extracting the logs of various processes. In a few hours, we had a decent, functional Django-based desktop app ready, although quite some work remains to be done on the UI.

There was Kushal’s session on Python for newbies in the afternoon. It’s always nice to see Kushal talk on Python. I don’t have the exact count, but I am sure that Python sessions by Kushal inspired many (including me and my friends) to start coding in Python. The ending keynote for the day was given by Abu Mansur Manaf. This should really boost newbies to become contributors 🙂

I spent the evening in the hotel room listening to hircus and Kushal talk about various functional programming languages and their features. Later, we went out for dinner with the rest of the event crew to a local food joint. I stayed up at night to see off Kushal, Soumya and the others who had to catch an early morning flight back to India. After bidding them goodbye, I packed my bags and went to sleep.

FUDCon KL 2012 Day 1

I arrived at Hotel Sri Petaling at around 3 AM on May 18, 2012. After very little sleep, I woke up at 6:30 AM. After breakfast at the hotel, we left for UCTI (the venue for FUDCON KL) in a bus arranged by the FUDCON KL organizers. At the venue, we got ourselves registered at the FUDCON registration booth and received our coupons for lunch and tea. It didn’t take long for the other attendees to arrive.

The first day of FUDCON began with a keynote speech by Christoph Wickert on “Leadership in leaderless organizations” and how this works in practice in Fedora.

Christoph Wickert speaking on “Leadership in leaderless organizations”

Christoph Wickert with his key note

After the keynote, it was time for the bar camp. Although many were skeptical about having a bar camp on the 1st day of a FUDCon, it turned out to be awesome. Many people pitched their topics for the bar camp. Among others, I also pitched two topics:

  • Transifex: A developer’s perspective
  • Agile system tests in Django

FUDCON Attendees

Since there were a lot of topics proposed and there was limited time, the topics were shortlisted based on votes by the attendees. One of my pitched topics: “Agile system tests using Django” finally made it into the shortlist. The bar camp sessions were scheduled to start at 2:30 PM. Over lunch, I was having some nice conversation with Kushal, Michel and Soumya on various topics related to Open Source and the LUG related activities in our places.

Among the many interesting lightning talks, I managed to attend some (though not all): Improving collaboration with other open source projects, Fedora for students, Git, Fedora and packages, etc. I had my talk at 4 PM. I spoke on:

  • Why system tests are needed
  • Why we need fast system tests
  • How the Django test framework makes system tests slow
  • How we can remove unnecessary overheads
  • How we run faster system tests at Transifex

For this, I used this post from blog.transifex.com.

It was an awesome 1st day of FUDCON KL. I participated in my first bar camp and it turned out to be a lot of fun and excitement. We had some post-event chats and discussions while leaving the venue. I also had a long conversation with Joshua Wulf on matters related to spirituality while heading back to the hotel. This was an eye opener, and I had many of my doubts resolved by the grace of an elevated devotee like him.

So, it was an enlightening day for me, both in terms of FOSS and spirituality.

Create a dummy Fedora repository (for Dorrie)

I have been fiddling with Dorrie lately. Dorrie is a web interface to create Fedora spins, remixes and installation media. But working with Dorrie requires a Fedora repository at one’s disposal in order to build Live ISOs. I ruled out downloading packages from the internet for testing. You need a local repository to test it in a nice and efficient way. I tried to create a dummy Fedora repository with 0-byte .rpm files and tried to mock livecd-creator, but I failed. I forgot that livecd-creator installs the packages while creating a Live ISO. Nevertheless, the attempt was not fruitless: I fixed and updated Dorrie’s code to handle offline repositories.

Now, we (Sayan and I) are working on another feature for Dorrie: building installation media (just like the Fedora installation DVD). We are using pungi for this purpose. If my calculations are correct, building installation media leaves open the option of using 0-byte .rpm files. But that is not the topic of this post ;). Here, I will talk about how to create a dummy Fedora repository.

Fedora repositories have two important components for each Fedora version: releases and updates. Not all files in the dummy repository can be fake: the metadata files have to be real, otherwise the package names won’t be found. I have devised a shell script for creating such a dummy repository.


#!/bin/bash
# Destination directory
DESTDIR="dummy_repo/"

# RSYNC URL
URL="rsync://fedora.c3sl.ufpr.br/fedora/linux/releases/16/"

#file extensions to be touched
FILE_EXTENSIONS_PATTERN=".*\.rpm$\|.*\.iso$\|.*\.img$"

#patterns to exclude from rsync
EXCLUDE_PATTERN="--exclude=source/ --exclude=debug/ --exclude=repoview/
--exclude=x86_64 --exclude=Everything/ --exclude=Live/"

#This will create only the directories and 0-byte .iso and .rpm files
#Since the files created will have newer timestamp, when we run
#rsync for real, the .iso and .rpm files won't be downloaded
rsync -auvr "$URL" "$DESTDIR" --dry-run $EXCLUDE_PATTERN |
while read -r line; do
    if [ `expr match "$line" ".*/$"` -ne 0 ]; then
        echo "$line"
        mkdir -p "$DESTDIR/$line"
    elif [ `expr match "$line" "$FILE_EXTENSIONS_PATTERN"` -ne 0 ]; then
        echo "$line"
        touch "$DESTDIR/$line"
    fi
done

#Download metadata files
rsync -auvr $URL $DESTDIR $EXCLUDE_PATTERN --progress

This is just sample code. It can be customized to do more versatile dummy cloning of Fedora repositories.
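The placeholder step of the script (create the directories, touch the big files, skip everything else) can also be sketched in Python. This is only an illustration under my own assumptions: create_placeholders is a hypothetical name, and it expects the lines of an rsync dry-run listing to be passed in rather than running rsync itself.

```python
import os
import re

# Same idea as FILE_EXTENSIONS_PATTERN above: files we fake as 0-byte files.
PLACEHOLDER_RE = re.compile(r'\.(rpm|iso|img)$')


def create_placeholders(listing, destdir):
    """Create directories and empty placeholder files named in listing.

    listing is an iterable of relative paths, one per line of an
    `rsync --dry-run` listing. Returns the placeholder files created.
    """
    created = []
    for line in listing:
        line = line.strip()
        if not line:
            continue
        target = os.path.join(destdir, line)
        if line.endswith('/'):
            os.makedirs(target, exist_ok=True)
        elif PLACEHOLDER_RE.search(line):
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Opening in append mode creates the file without truncating
            # an existing one, like `touch`.
            with open(target, 'a'):
                pass
            created.append(line)
        # Anything else (metadata, rsync status lines) is left for the
        # real rsync run to download.
    return created
```

Metadata files such as repomd.xml are deliberately not created here, so a later real rsync run still fetches them.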

Dear Turkish translators

Transifex usually defines plural rules for languages according to http://unicode.org/repos/cldr-tmp/trunk/diff/supplemental/language_plural_rules.html. So, the plural rule for the Turkish language in Transifex is other → everything. However, lately there have been some requests that the Turkish language should have two plural forms:

nplurals=2; plural=(n>1)

The requests have been with reference to http://translate.sourceforge.net/wiki/l10n/pluralforms.
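To make the difference concrete, here is a tiny sketch of how the two rules pick a plural form. In the gettext convention, the plural expression evaluates to the index of the form to use. The function names are mine, for illustration only.

```python
def plural_index_requested(n):
    """Requested rule: nplurals=2; plural=(n>1)."""
    return int(n > 1)


def plural_index_current(n):
    """Current rule in Transifex: a single form ('other') for every n."""
    return 0


for n in (0, 1, 2, 6):
    print(n, plural_index_current(n), plural_index_requested(n))
```

With the requested rule, 0 and 1 use the first form and everything else uses the second. With the current rule, every count uses the same form.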

Here is a quote from a user at https://bitbucket.org/indifex/transifex/issue/26/turkish-plural-forms:

Turkish behaves like Akan for example. The rule should be:

One: 0, 1 Other: 2-999

It is only when including a count that there are no plural forms. For example:

“You posted a photo”, “You posted several photos”

is correct in Turkish, as is:

“You posted 1 photo”, “You posted 6 photo”.

So, dear Turkish translators, please share your opinion on this issue. This will help a lot in resolving it at Transifex and in fixing plural translations in the Turkish language.

rtnpro @ Mukti 2012

Mukti is the annual FOSS festival organised by the GNU/Linux Users Group of NIT Durgapur. Mukti 2012 was held on 3-5 February 2012. I attended every Mukti at NIT Durgapur from 2008 to 2011 as a student, and this time (in 2012) as a speaker. My talk was on Localization and Transifex. NITDGPLUG, as always, put a lot of effort into making Mukti a grand FOSS event in the region. It was packed with a plethora of events and had a large number of participants. Mukti serves as a great means to bring together people interested in FOSS in the Eastern and North Eastern parts of India. It helps newbies get more insight into FOSS.

Day 1, February 3, 2012

The first day of Mukti began with an inauguration programme. After it, students queued at the registration desk to register themselves. Sayan and Gaurav came there with a small group of 1st year students (interested in FOSS) from Dr. B. C. Roy Engineering College. I spoke to them for 1-2 hours on FOSS, how to contribute, my experience with FOSS and how I made it to Transifex. After bidding goodbye to the 1st year students from BCREC, we (Sayan, Gaurav, a few others and I) settled in my room at the Guest House, NIT Durgapur and started discussing various things like Transifex, Django, unit testing, some college news, etc. There was also a workshop on KDE development that day by Smit Shah. After the workshop was over, the Transifex community guys from Durgapur crashed at my place and we kept hacking till late at night.

Day 2, February 4, 2012

For the 1st half of the day, I went to BCREC to talk with the students about FOSS and to meet my teachers and other friends. After returning to NIT Durgapur, I had a discussion with some folks interested in web development and Transifex. We talked about Transifex: what it is, why it was created, how it works and how it is written. We also discussed other things like contributing to FOSS, Python, Django, etc. We spent the entire evening hacking on Transifex. We fired up our local Transifex instance and started discussing bugs and areas of improvement. I also explained in detail to the Transifex contributors how to write unit tests for Transifex, and showed them how to write a handler for a file format in Transifex.

In between, I had a good conversation with Smit Shah. We shared our views on FOSS and contributing to it, and also our experience and excitement in working for a startup. We also discussed manga: Naruto, Fullmetal Alchemist Brotherhood and One Piece 😉 Even after dinner, we kept hacking till midnight. The day was quite eventful. We triaged some tickets at trac.transifex.org, fixed some bugs, found new bugs to work on, etc.

Day 3, February 5, 2012

This was the final day of Mukti, and my talk on Localization, Transifex and FOSS contribution in general was scheduled for this day. I started with the “What” and “Why” of localization and how it helps the global usage of software. I also explained that localization is one of the easiest ways to start contributing to FOSS, get one’s feet wet in the community, learn new technologies, etc. Then, I discussed the workflow of localization and its pros and cons.

Then, I came to Transifex: why it was needed, how and when it started, and how it takes localization to a whole new level. I discussed the technologies behind Transifex and gave the audience a tour of it. Transifex is no small thing now. It has grown over the years and it takes a lot to explain its features. Enough with technical jargon. To make the session interactive, I called Sayan to share his experience of contributing to Transifex. I also shared the story of how a group of 3 newbie translators made http://www.transifex.net available in Hindi in just a few days.

Then, I told the audience how they can start contributing to Transifex and to any open source project in general. But there was still the impression that contributing is a VERY DIFFICULT task. So, I decided to hack live in front of the audience and fix a few Transifex bugs (bugs we had worked on the previous day, during the hackfest). I fixed 2-3 small bugs and showed what a patch is and how to commit one. The patches changed just 1-2 lines each. I hope the audience got my point, that fixing bugs is not a very difficult job.

Then, I shared my experiences with FOSS: how I came into the FOSS community, how I started contributing and how I made it to Transifex. With this, I concluded my talk. After the session, a few students came to me with queries and we had a kind of group discussion.

You can find the slide deck I used for my talk at http://rtnpro.fedorapeople.org/Transifex-Mukti2012/presentation.pdf

After the talk, we headed back to the guest house and had some gossip and masti with my college juniors. In the evening, we attended the prize distribution function and then headed back to the guest room. After dinner, we discussed how the boys should proceed in their open source endeavours, brainstormed some crazy project ideas, etc.

It was an awesome experience at Mukti this year. I met many people, made new friends, had lots of fun and did a lot of hacking.