Mysql_cluster is a Django DB backend that supports master-slave MySQL replication. It works by switching Django's global DB connection between the master database and slave replicas, which lets the standard ORM use two connections.

Download version 1.0

INSTALLATION

  1. Install mysql_cluster into your Python path (the usual "python setup.py install" will work).

  2. In settings.py, set the DB backend name and the hosts and ports for the master and the replica:

    DATABASE_ENGINE = 'mysql_cluster'
    DATABASE_HOST = 'master.db'
    DATABASE_PORT = ''
    DATABASE_SLAVE_HOST = 'slave.db'
    DATABASE_SLAVE_PORT = DATABASE_PORT
    

    (DATABASE_SLAVE_* settings are required.)

  3. If you use Django sessions with the default DB storage, you will almost certainly want the custom session backend provided with mysql_cluster. It is fully backward compatible with the default one, but it ensures that session data is always accessed over the master connection.

    SESSION_ENGINE = 'mysql_cluster.sessions'
    

USAGE

Mysql_cluster expects switching between the master and the replicas to happen explicitly. It does not switch automatically based on SQL query type, access to a particular table, etc. It is therefore the programmer's responsibility to ensure that:

  • write operations happen while the master connection is active
  • the slave connection is activated for those read operations where it makes sense

In practice, however, this is not that hard.

Middleware

If your project follows the HTTP principle that GET requests don't change the system (other than through side effects), then most of the work is done by the middleware from the package:

MIDDLEWARE_CLASSES = [
    # ...
    'mysql_cluster.middleware.ReplicationMiddleware',
    # ...
]

In essence, it activates the slave connection while handling GET requests and the master connection for POST (as well as PUT, DELETE and other) requests. This is sufficient in most cases.
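The rule described above can be sketched as a single function. This is only an illustration of the decision, not the package's actual code: choose_connection is a hypothetical name, and the real middleware switches the connection itself rather than returning a string.

```python
# Illustrative sketch only; choose_connection is a hypothetical helper,
# not part of mysql_cluster.

def choose_connection(method):
    """Return the name of the connection a request with this method would use."""
    # GET requests are expected not to modify data, so they can read from a replica.
    if method.upper() == 'GET':
        return 'slave'
    # POST, PUT, DELETE and everything else go to the master.
    return 'master'


for method in ('GET', 'POST', 'PUT', 'DELETE'):
    print('%s -> %s' % (method, choose_connection(method)))
```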

However, there are cases when the DB is accessed outside the main logic. Good examples are session creation, writing bookkeeping info, or transparent registration of a user account somewhere inside the system. These things can happen at arbitrary moments, including during GET requests. The solutions here are:

  • If such secondary tasks are performed by some other middleware, that middleware can be placed in the MIDDLEWARE_CLASSES list before ReplicationMiddleware. It will then run outside all the switching logic and use the default connection, which is the master one.

  • In other cases you can explicitly activate the master connection whenever needed (see "Manual control" below).
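For the first option, the ordering might look like the fragment below. The name myproject.middleware.BookkeepingMiddleware is hypothetical and stands for any middleware of your own that writes to the DB on every request.

```python
MIDDLEWARE_CLASSES = [
    # Hypothetical middleware that writes bookkeeping data on every request.
    # Listed before ReplicationMiddleware, so it uses the default (master) connection.
    'myproject.middleware.BookkeepingMiddleware',
    'mysql_cluster.middleware.ReplicationMiddleware',
    # ... the rest of your middleware ...
]
```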

Decorators

Besides ReplicationMiddleware, which works for every request, the backend also provides similar decorators for use with individual views. Usage is fairly simple:

from mysql_cluster.decorators import use_master, use_slave

@use_master
def my_view(request, *args, **kwargs):
    # The master connection is used for all DB operations during
    # execution of the view (unless explicitly overridden).
    pass

@use_slave
def my_other_view(request, *args, **kwargs):
    # Same, but with the slave connection.
    pass

GET after POST

There is a special issue when working with a replication scheme: replicas can lag behind the master DB in receiving updates. In practice this means that after submitting a POST form that redirects to a page with the updated data, this page may be served from a slave replica that hasn't been updated yet, and the user gets the impression that the submit didn't work.

To overcome this problem, both ReplicationMiddleware and the decorators support a special technique where a GET request resulting from a redirect after a POST is explicitly directed to the master connection.
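This text does not say how the redirected GET is recognized, so the following is only one possible approach, sketched under assumptions: the response to a POST that redirects sets a short-lived marker cookie, and a GET that carries the marker is sent to the master. The cookie name and both helper functions are hypothetical, and mysql_cluster may implement this differently.

```python
# Illustrative sketch only: the cookie name and helper functions are
# hypothetical, and mysql_cluster may implement this differently.

FLAG = 'just_updated'  # hypothetical cookie name


def connection_for(method, cookies):
    """Pick a connection for a request, honouring the post-redirect flag."""
    if method != 'GET':
        return 'master'
    # A GET that follows a redirect from a POST carries the flag,
    # so it reads from the master and sees the fresh data.
    if cookies.get(FLAG):
        return 'master'
    return 'slave'


def cookies_after(method, status, cookies):
    """Return the cookies the client holds after the response is sent."""
    cookies = dict(cookies)
    if method == 'POST' and status in (301, 302, 303):
        cookies[FLAG] = '1'      # mark the upcoming redirected GET
    else:
        cookies.pop(FLAG, None)  # the flag is good for one request only
    return cookies


cookies = {}
print(connection_for('POST', cookies))         # the form submission
cookies = cookies_after('POST', 302, cookies)
print(connection_for('GET', cookies))          # the redirected page
cookies = cookies_after('GET', 200, cookies)
print(connection_for('GET', cookies))          # a later, ordinary page view
```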

Manual control

Any code that writes to the DB but may be called at arbitrary moments can explicitly switch the connection to the master DB:

from django.db import connection
connection.use_master()
try:
    # writing to DB
finally:
    connection.revert()

Or, if you use Python 2.5 or later:

from __future__ import with_statement  # needed on Python 2.5 only
from django.db import connection
with connection.use_master():
    # writing to DB
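To see why use_master() can serve both the try/finally form and the with form, here is a small stand-in that only tracks which connection is active. It is a sketch under assumptions, not the real backend: FakeConnection, its active property, its use_slave() method and the _Revert helper are all hypothetical names.

```python
# Illustrative stand-in, not the real backend: it only tracks which
# connection is active so the use_master()/revert() pairing can be seen.

class FakeConnection(object):
    def __init__(self):
        self._stack = ['master']   # the default connection is the master

    @property
    def active(self):
        return self._stack[-1]

    def use_master(self):
        self._stack.append('master')
        return _Revert(self)

    def use_slave(self):
        self._stack.append('slave')
        return _Revert(self)

    def revert(self):
        # Restore whatever was active before the last switch.
        self._stack.pop()


class _Revert(object):
    """Returned by use_*() so the call also works in a with statement."""

    def __init__(self, connection):
        self._connection = connection

    def __enter__(self):
        return self._connection

    def __exit__(self, exc_type, exc_value, traceback):
        self._connection.revert()
        return False  # do not swallow exceptions


connection = FakeConnection()
connection.use_slave()            # e.g. what happens during a GET request
with connection.use_master():
    print(connection.active)      # inside the block
print(connection.active)          # previous state restored
connection.revert()
```

Keeping the previous states on a stack means nested switches unwind in order, so a write helper called during a GET request leaves the slave connection active again once it finishes.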

Statistics

In debug mode the backend adds another key, "db", to the standard query statistics. Its value is either "master" or "slave".

from django.db import connection
print connection.queries[-1]['db']

This is useful during development, when you actually have a single database, to see where your queries would go in a production environment.
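For a quick overview of a whole request, the "db" values can be tallied. The sketch below uses made-up sample data shaped like the entries of connection.queries so that it runs without a database; count_by_db is a hypothetical helper, and collections.Counter requires Python 2.7 or later.

```python
from collections import Counter

# Sample data shaped like connection.queries entries in debug mode,
# with the extra "db" key described above. The SQL and timings are made up.
queries = [
    {'sql': 'SELECT ...', 'time': '0.001', 'db': 'slave'},
    {'sql': 'SELECT ...', 'time': '0.002', 'db': 'slave'},
    {'sql': 'UPDATE ...', 'time': '0.003', 'db': 'master'},
]


def count_by_db(queries):
    """Count how many queries went to each connection."""
    return Counter(q['db'] for q in queries)


counts = count_by_db(queries)
print(dict(counts))
```

In a real project you would pass connection.queries instead of the sample list.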

Installations

Sites that use mysql_cluster