Deployment aware tooling for Django migrations.

A Django application providing database migration tooling to automate the deployment of migrations.

Inspired by a 2015 post from Ludwig Hähne and by experience dealing with migrations at Zapier.

It currently supports only PostgreSQL and SQLite, as they are the only two FOSS core backends that support transactional DDL and this tool is built around that expectation.

Installation

pip install django-syzygy

Usage

Add 'syzygy' to your INSTALLED_APPS

# settings.py
INSTALLED_APPS = [
    ...
    'syzygy',
    ...
]

Set up your deployment pipeline to run migrate --pre-deploy before rolling out your code changes and migrate afterwards to apply the postponed migrations.
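
The ordering described above can be sketched as a small Python wrapper. This is only an illustration: the deploy function, its rollout callable, and the assumption that manage.py sits in the current directory are all hypothetical, and only the migrate --pre-deploy and migrate invocations come from this README.

```python
import subprocess
import sys


def deploy(rollout, run=subprocess.run):
    """Run pre-deploy migrations, roll out the code, then apply postponed ones.

    `rollout` is a hypothetical callable standing in for whatever ships the
    new code; `run` is injectable so the sequencing can be exercised without
    spawning processes.
    """
    manage = [sys.executable, "manage.py"]  # assumes manage.py is in the cwd
    # 1. Apply only the migrations that are prerequisites to the deployment.
    run([*manage, "migrate", "--pre-deploy"], check=True)
    # 2. Roll out the new code while the old code is still compatible.
    rollout()
    # 3. Apply the postponed (post-deployment) migrations.
    run([*manage, "migrate"], check=True)
```

Because check=True is passed, a failing pre-deploy step raises before the rollout is attempted, which is the behavior a pipeline would usually want.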

Concept

When dealing with database migrations in the context of a highly available application managed through continuous deployment, the Django migration framework leaves a lot to be desired in terms of the sequencing of operations it generates.

The automatically generated schema alterations for field additions, removals, renames, and others do not account for deployments where the old and the new versions of the code must co-exist for a short period of time.

For example, adding a field with a default does not persist a database-level default. This prevents INSERT statements issued by the pre-existing code, which is unaware of the newly added field, from succeeding.

Figuring out the proper sequencing of operations is doable but non-trivial and error-prone. Syzygy aims to solve this problem by introducing a notion of prerequisite and postponed migrations with regard to deployment and by generating migrations that are aware of this sequencing.

A migration is assumed to be a prerequisite to deployment unless it contains a destructive operation or the migration has its stage class attribute set to Stage.POST_DEPLOY. When this attribute is defined it bypasses the operation-based heuristics.

e.g. this migration would be considered a prerequisite

from django.db import migrations, models

class Migration(migrations.Migration):
    operations = [
        migrations.AddField('model', 'field', models.IntegerField(null=True)),
    ]

while the following migrations would be postponed

from django.db import migrations

class Migration(migrations.Migration):
    operations = [
        migrations.RemoveField('model', 'field'),
    ]

from django.db import migrations
from syzygy import Stage

class Migration(migrations.Migration):
    stage = Stage.POST_DEPLOY

    operations = [
        migrations.RunSQL(...),
    ]

To take advantage of this new notion of migration stage, the migrate command allows migrations meant to be run before a deployment to be targeted using the --pre-deploy flag.
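
The rule stated above (prerequisite unless destructive, with an explicit stage taking precedence) can be modelled in a few lines of plain Python. This is a toy model and not syzygy's implementation: the Stage enum here is a stand-in for syzygy.Stage, operations are represented by their names only, and the DESTRUCTIVE set is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Stand-in for syzygy.Stage, defined locally to keep the sketch standalone."""
    PRE_DEPLOY = "pre_deploy"
    POST_DEPLOY = "post_deploy"


# Operation names treated as destructive in this sketch (an assumption).
DESTRUCTIVE = {"RemoveField", "DeleteModel"}


@dataclass
class Migration:
    operations: list = field(default_factory=list)  # operation names
    stage: Optional[Stage] = None  # explicit override, if any


def resolve_stage(migration):
    # An explicit stage bypasses the operation-based heuristics.
    if migration.stage is not None:
        return migration.stage
    # Any destructive operation postpones the migration.
    if any(op in DESTRUCTIVE for op in migration.operations):
        return Stage.POST_DEPLOY
    # Everything else is assumed to be a prerequisite.
    return Stage.PRE_DEPLOY
```

With this model, Migration(["AddField"]) resolves to Stage.PRE_DEPLOY, while Migration(["RemoveField"]) and Migration(["RunSQL"], stage=Stage.POST_DEPLOY) both resolve to Stage.POST_DEPLOY, matching the three examples above.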

What it does and doesn’t do

It does

  • Introduce a notion of pre and post-deployment migrations and support their creation, management, and deployment sequencing through adjustments made to the makemigrations and migrate commands.

  • Automatically split operations known to cause deployment sequencing issues into pre and post-deployment stages.

  • Refuse the temptation to guess in the face of ambiguity and force developers to reflect on the sequencing of their operations when dealing with non-trivial changes. It is meant to provide guardrails with safe quality-of-life defaults.

It doesn’t

  • Generate operations that are guaranteed to minimize contention on your database. You should investigate the usage of database specific solutions for that.

  • Allow developers to completely abstract away the notion of sequencing of operations. There are changes that are inherently unsafe or not deployable in an atomic manner and you should be prepared to deal with them.

Specialized operations

Syzygy overrides the makemigrations command to automatically split and organize operations in a way that allows them to safely be applied in pre and post-deployment stages.

Field addition

When adding a field to an existing model Django will generate an AddField operation that roughly translates to the following SQL

ALTER TABLE "author" ADD COLUMN "dob" int NOT NULL DEFAULT 1988;
ALTER TABLE "author" ALTER COLUMN "dob" DROP DEFAULT;

This isn’t safe, as the immediate removal of the database-level DEFAULT prevents the code deployed at the time the migration is applied from inserting new records.

In order to make this change safe, syzygy splits the operation in two: a specialized AddField operation that performs the column addition without the DROP DEFAULT, and a follow-up PostAddField operation that drops the database-level default. The first is marked as Stage.PRE_DEPLOY and the second as Stage.POST_DEPLOY.
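
The effect of keeping the database-level DEFAULT can be observed with the standard library sqlite3 module. This is a sketch under assumptions: the old_code_insert_succeeds helper and the "name" column are made up for the example, and because SQLite has no ALTER COLUMN ... DROP DEFAULT, the state after the default is dropped is simulated by creating the table without one.

```python
import sqlite3


def old_code_insert_succeeds(keep_default):
    """Return True if an INSERT that ignores "dob" works against the new schema."""
    conn = sqlite3.connect(":memory:")
    if keep_default:
        # Pre-deploy state: column added and the DEFAULT left in place.
        conn.execute('CREATE TABLE "author" ("name" text NOT NULL)')
        conn.execute(
            'ALTER TABLE "author" ADD COLUMN "dob" int NOT NULL DEFAULT 1988'
        )
    else:
        # Simulated state after the DEFAULT has been dropped immediately.
        conn.execute(
            'CREATE TABLE "author" ("name" text NOT NULL, "dob" int NOT NULL)'
        )
    try:
        # The pre-existing code does not know about "dob" and omits it.
        conn.execute('INSERT INTO "author" ("name") VALUES (?)', ("Ada",))
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()
    return True
```

Calling the helper with keep_default=True lets the old code's INSERT through, while keep_default=False makes it fail on the NOT NULL constraint, which is the window the PRE_DEPLOY and POST_DEPLOY split is meant to close.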

Field removal

When removing a field from an existing model Django will generate a RemoveField operation that roughly translates to the following SQL

ALTER TABLE "author" DROP COLUMN "dob";

Such an operation cannot be run before deployment because it would cause any SELECT, INSERT, and UPDATE initiated by the pre-existing code to crash, while running it after deployment would cause INSERT crashes in the newly deployed code that _forgot_ the existence of the field, since the column would still be present and NOT NULL without a database-level default until the migration is applied.

In order to make this change safe, syzygy splits the operation in two: a specialized PreRemoveField operation that adds a database-level DEFAULT to the column if a Field.default is present, or makes the field nullable otherwise, and a second vanilla RemoveField operation. The first is marked as Stage.PRE_DEPLOY and the second as Stage.POST_DEPLOY just like any RemoveField.

The presence of a database level DEFAULT or the removal of the NOT NULL constraint ensures a smooth rollout sequence.
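
As a rough illustration of the two-step removal, the helper below renders the kind of PostgreSQL statements each stage might translate to. The removal_plan function and its parameters are hypothetical, and the statements are an approximation for the sake of the example, not the exact SQL emitted by PreRemoveField or RemoveField.

```python
def removal_plan(table, column, default_sql=None):
    """Return (stage, sql) pairs approximating the split field removal.

    `default_sql` is the already-rendered SQL literal for Field.default,
    or None when the field has no default.
    """
    if default_sql is not None:
        # A Field.default exists: persist it as a database-level DEFAULT.
        pre = f'ALTER TABLE "{table}" ALTER COLUMN "{column}" SET DEFAULT {default_sql};'
    else:
        # No default available: make the column nullable instead.
        pre = f'ALTER TABLE "{table}" ALTER COLUMN "{column}" DROP NOT NULL;'
    post = f'ALTER TABLE "{table}" DROP COLUMN "{column}";'
    return [("pre_deploy", pre), ("post_deploy", post)]
```

For instance, removal_plan("author", "dob", "1988") sets a default before deployment, whereas removal_plan("author", "dob") relaxes the NOT NULL constraint; in both cases the column is only dropped in the post-deployment stage.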

Checks

In order to prevent the creation of migrations mixing operations of different stages this package registers system checks. These checks will generate an error for every migration with an ambiguous stage.

e.g. a migration mixing inferred stages would result in a check error

from django.db import migrations, models

class Migration(migrations.Migration):
    operations = [
        migrations.AddField('model', 'other_field', models.IntegerField(null=True)),
        migrations.RemoveField('model', 'field'),
    ]

By default, syzygy should not automatically generate such migrations and you should only run into check failures when manually creating migrations or adding syzygy to a historical project.

For migrations that are part of your project and trigger a failure of this check, it is recommended to manually annotate them with an explicit stage attribute set to a syzygy.Stage member. For third-party migrations you should refer to the following section.
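
The ambiguity rule behind these checks can be sketched as follows. This is a simplified model rather than the checks syzygy registers: the ambiguous_migrations function is hypothetical, and each migration is reduced to an optional explicit stage plus the list of stages inferred for its operations.

```python
def ambiguous_migrations(migrations):
    """Return the names of migrations whose stage cannot be determined.

    `migrations` maps a migration name to a pair of
    (explicit_stage_or_None, [inferred stage of each operation]).
    """
    errors = []
    for name, (explicit_stage, operation_stages) in migrations.items():
        # An explicit annotation always resolves the ambiguity.
        if explicit_stage is not None:
            continue
        # Operations inferred to belong to different stages are ambiguous.
        if len(set(operation_stages)) > 1:
            errors.append(name)
    return errors
```

In this model a migration mixing a "pre" and a "post" operation is reported until it is either split or given an explicit stage.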

Third-party migrations

Until the concept of migration stages is widely adopted, your project might depend on third-party apps containing migrations with an ambiguous sequence of operations.

Since a stage cannot be explicitly assigned by editing these migrations, a fallback or an override stage can be specified through the respective MIGRATION_STAGES_FALLBACK and MIGRATION_STAGES_OVERRIDE settings.

By default third-party app migrations with an ambiguous sequence of operations will fall back to Stage.PRE_DEPLOY but this behavior can be changed by setting MIGRATION_THIRD_PARTY_STAGES_FALLBACK to Stage.POST_DEPLOY or disabled by setting it to None.

Reverts

Migration reverts are also supported and result in inverting the nature of migrations. A migration that is normally considered a prerequisite would then be postponed when reverted.

CI Integration

In order to ensure that no feature branch includes an ambiguous sequence of operations users are encouraged to include a job that attempts to run the migrate --pre-deploy command against a database that only includes the changes from the target branch.

For example, given a feature branch add-shiny-feature and a target branch of main a script would look like

git checkout main
python manage.py migrate
git checkout add-shiny-feature
python manage.py migrate --pre-deploy

Assuming the feature branch contains a sequence of operations that cannot be applied in a single atomic deployment consisting of pre-deployment, deployment, and post-deployment stages, the migrate --pre-deploy command will fail with an AmbiguousPlan exception detailing the ambiguity and resolution paths.

Migration quorum

When deploying migrations to multiple clusters sharing the same database it’s important that:

  1. Migrations are applied only once

  2. Pre-deployment migrations are applied before deployment takes place in any cluster

  3. Post-deployment migrations are only applied once all clusters are done deploying

The built-in migrate command doesn’t offer any guarantees with regard to the serializability of invocations. In other words, naively calling migrate from multiple clusters before or after a deployment could result in attempts to apply some migrations twice.

To circumvent this limitation, Syzygy introduces a --quorum <N:int> flag to the migrate command that allows coordination between clusters to take place.

When specified, the migrate --quorum <N:int> command will wait for at least N invocations of migrate for the planned migrations before proceeding with applying them once, blocking all callers until the operation completes.
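
The coordination described above can be illustrated in-process with threads standing in for clusters. This is only an analogy built on threading.Barrier, not how syzygy's cache-based backend works: the InProcessQuorum class is hypothetical, and it requires exactly N callers where the real flag waits for at least N.

```python
import threading


class InProcessQuorum:
    """Toy quorum: wait for N callers, apply once, release everyone together."""

    def __init__(self, quorum):
        self._arrival = threading.Barrier(quorum)
        self._completion = threading.Barrier(quorum)

    def migrate(self, apply):
        # Block until `quorum` callers have arrived; each caller receives a
        # distinct index, so exactly one of them sees index 0.
        index = self._arrival.wait()
        try:
            if index == 0:
                apply()  # the migrations are applied a single time
        finally:
            # Every caller blocks here until the applying caller is done.
            self._completion.wait()
```

Each "cluster" calls migrate with the same apply callable; whichever caller draws index 0 runs it while the others wait, so the work happens once and no caller returns before it has completed.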

In order to use the --quorum feature you must configure the MIGRATION_QUORUM_BACKEND setting to point to a quorum backend such as the cache-based one provided by Syzygy

MIGRATION_QUORUM_BACKEND = 'syzygy.quorum.backends.cache.CacheQuorum'

or, to point the backend at a specific cache alias,

CACHES = {
    ...,
    'quorum': {
        ...
    },
}
MIGRATION_QUORUM_BACKEND = {
    'backend': 'syzygy.quorum.backends.cache.CacheQuorum',
    'alias': 'quorum',
}

Development

Make your changes, and then run tests via tox:

tox
