Sqlalchemy in Django
But not the way you think. My main purpose is to use SqlAlchemy 0.4.0 its more advanced sql-building to perform as few queries as possible. Especial on one-to-many and many-to-many relations, as select_related() is no good there. Moreover SqlAlchemy has as an added bonus that it's session performs some basic caching. (Yet so does Django's Query object, however the session object could, in theory, be used in multiple requests, however in multi-server environments using it in multiple request is a sure way to mess up sync between the servers)
To get these goodies I decided to "simply" translate a Django model to an SqlAlchemy model. That basically works, but more advanced fields are being mapped to their basic counterpart (e.g. IpAdressField to string). Moreover only ForeignKey and ManyToManyrealations are partially mapped. OneToOne and ManyToOne are not (yet). References to self are not supported on any type of relation yet, and related_name only for ForeignKey relations (for others the default naming, FIELD_set, is used).
This package requires SqlAlchemy 0.4.0 and Django pre 0.97 (because of the new DecimalField)
When will it be done
Not this month, neither the following. It will be done when I need it for my website. At the moment I’m afraid I have more important things to do (completing my master's thesis). It'll be done this (2007) year.. Let's say 4th quarter 2007. Mind you, this is project is not meant to be a drop-in replacement for Django's model definitions. Its sole goal is to create a Django model and then manipulate it via SqlAlchemy. Of course, I intend to make it possible to add more advanced mapping to your model using the SqlAlchemy syntax. But these mappings will never be usable when using the Django ORM
The larger changes to jrd.djangoalchemy will be posted in changes.
Major Gotcha's
The single session used in all requests will create a mess if multiple projects exist.
Also thread safety is virtually non-existent.. yet. (Due to some lookups on predefined lists to map a Django model to a the SqlAlchemy model)
Did I mention I use partials, a feature of python 2.5? But I'm sure this code can be made compatible with python 2.4 or even 2.3.
Apart from depending on SqlAlchemy, there is a dependency on DecoratorTools
The relation Django and SqlAlchemy
One important issue is that SqlAlchemy sessions cache their objects. This is great! But if a Django model is changed, its SqlAlchemy object should be uncached. The good news is that this is already implemented. The bad news is that it might be a bit slow, every time a Django model is saved the SqlAlchemy session is looped through to see whether the same object is in the SqlAlchemy session. Depending on how many objects are in the session this might result in slow performance. Moreover, at the moment, signals are used, which is not all that fast either.. But.. it works. A bit.
Why does it only works a bit? Because at the moment not all relations are mapped. Meaning that if Django knows two objects are related SqlAlchemy might not know, and thus it does not know to invalidate a related object.
The other way around, changing properties on an SqlAlchemy does not invalidate the Django object. However, as Django only caches on a per QuerySet basis, and thus, at most, on a per-request basis this should not be a major issue. Moreover to change data Django objects should be used anyways. SqlAlchemy objects are primarily intended to be read-only, allowing for more advanced queries to display data. As an average website does a lot more displaying than altering data (this should certainly go for an average Django website) using the less optimal Django mapper to alter data should be a non-issue
Example usage
Assume we have the following model:
class Story(models.Model):
title = models.CharField(maxlength=200)
slug = models.SlugField(prepopulate_from=('title',) , maxlength=200, unique=True)
class StoryStats(model.Model):
rating = models.DecimalField(max_digits=3, decimal_places=1)
story = models.ForeignKey(Story, unique=True)
As it is now, no SqlAlchemy methods are in place. But we can add them by calling enable_sqlalchemy in the model.
However, this, sadly, is not enough. Due to how python classes work; them only coming into existence after their bodies are evaluated, references cannot be followed at evaluation time,
we need to create the SqlAlchemy mappings after evaluation time. to do this finish_sqlalchemy is called. This need be done only once per program execution.
from jrd.djangoalchemy import enable_sqlalchemy, finish_sqlalchemy
class Story(models.Model):
enable_sqlalchemy()
title = models.CharField(maxlength=200)
slug = models.SlugField(prepopulate_from=('title',) , maxlength=200, unique=True)
class StoryStats(model.Model):
enable_sqlalchemy()
rating = models.DecimalField(max_digits=3, decimal_places=1)
story = models.ForeignKey(Story, unique=True)
finish_sqlalchemy()
As references are followed automagically, thus you need not explicitly call enable_sqlalchemy on Story as it referenced from StoryStats. This won't work the other way around, as there is no explicit
reference from Story to StoryStats. (The one implied by the ForeignKey on StoryStats, however, does create the relation Story --> StoryStats)
The class SqlAlchemy maps to is not the model class, instead an innerclass DjangoAlchemy is added to the model. This class is SqlAlchemy enabled.
To get the SqlAlchemy session use the DjangoAlchemyManager object that resides in jrd.djangoalchemy.config. The session method gets the session. (At the moment, not much magic is going on)
from jrd.djangoalchemy.config import DjangoAlchemyManager
session = DjangoAlchemyManager.session()
story = session.query(Story.DjangoAlchemy).select_by(id=1)[0]
Fortunately, some convenience methods are created; all the following statements are the same.
story = session.query(Story.DjangoAlchemy).select_by(id=1)[0]
story = Story.DjangoAlchemy.query().select_by(id=1)[0]
story = Story.sa().select_by(id=1)[0]
stories = Story.DjangoAlchemy.query().order_by(Story.DjangoAlchemy.c.slug)
stories = Story.sa().order_by(Story.c.slug)
# but if you only need to order on a single attribute, Django's syntax can be used as well,
# the argument need be a list, as in the future ordering on multiple columns will be supported
stories = Story.sa(order=['slug'])
# And for descending order
from sqlalchemy import desc
stories = Story.sa().order_by(desc(Story.c.slug))
stories = Story.sa(order=['-slug'])
The ForeignKey, being defined as unique, creates a one-to-one relation. For these relations the reversed relation is created as well.
story = session.query(Story.DjangoAlchemy).select_by(id=1)[0]
story.storystats.rating # returns the rating for the story
Perhaps you are wondering why the innerclass DjangoAlchemy is created, as it can be circumvented by using shortcuts. Its primary goed is to allow for extended configuration of the mapping. Which is not supported yet (Eventhough it currently is used to set some configuration magic).
Another, perhaps even more interesting, reason the innerclass exists is that it allows for the creation of custom methods. Though in the end it is intended to also allow the use of methods on the Django Model. (I'm still doubting how to implement it, simply attach all methods, or require explicit reference to the methods that need be carried over)
An example is below:
from jrd.djangoalchemy import enable_sqlalchemy, finish_sqlalchemy
class Story(models.Model):
enable_sqlalchemy()
title = models.CharField(maxlength=200)
slug = models.SlugField(prepopulate_from=('title',) , maxlength=200, unique=True)
class DjangoAlchemy(object):
def print_title_and_slug(self):
print "T:%s - S:%s" % (self.title, self.slug)
finish_sqlalchemy()
story = Story.sa().select_by(id=1)[0]
story.print_title_and_slug() # as expected, prints title and slug
Above I mentioned I especially wanted to use eager loading. ono-to-one relations are eager loaded. one-to-many and many-to-many
relations are lazy. But there is a shortcut to load them eagerly. (shortcuts for `lazyload() and noload() will eventually be introduced,
just as allowing to override the default behavior by adding configuration attributes to CLASS.DjangoAlchemy
Let's introduce another model, so we will have a many-to-many relation:
from jrd.djangoalchemy import enable_sqlalchemy, finish_sqlalchemy
from sqlalchemy import eagerload
class Genre(model.Model):
title = models.CharField(maxlength=25)
class Story(models.Model):
enable_sqlalchemy()
title = models.CharField(maxlength=200)
slug = models.SlugField(prepopulate_from=('title',) , maxlength=200, unique=True)
genres = models.ManyToManyField(Genre)
finish_sqlalchemy()
# now eagerly load genres via SqlAlchemy syntax. The following is done by a single query
stories = Story.sa().options(eagerload('genres'))
for story in stories:
print ', '.join([genre.title for genre in story.genres])
# and via the shortcut
# the argument need be a list
stories = Story.sa(eager=['genres'])
for story in stories:
print ', '.join([genre.title for genre in story.genres])
Sugar
Now that you have SqlAlchemy enabled models, you, of course, want to marvel at the queries. SqlAlchemy queries, when Django is in debug mode, are added to Django's query log. Thus the following will show both Django queries and SqlAlchemy queries. Note however, that the timing for SqlAlchemy queries is rather rough as it uses the timing from the logger.
from django.db import connection
for query in connection.queries:
print query["time"]
print query["sql"]
Final Note
THIS IS WORK IN PROGRESS. ESPECIALLY THE HANDLING OF THE SqlAlchemy SESSION IS QUIRKY!!!
Read To Do to see all unfinished business.. It is more likely to expand than to shorten, as I'm certain to come up with nice features. In Changes the more signifficant updats will be listed
Author
JRD Nijenhuis