Skip to main content

cp for RDS instances

Project description

```
__
/\ \
_ __ \_\ \ ____ ___ _____
/\`'__\/'_` \ /',__\ /'___\/\ '__`\
\ \ \//\ \L\ \/\__, `\ /\ \__/\ \ \L\ \
\ \_\\ \___,_\/\____/ \ \____\\ \ ,__/
\/_/ \/__,_ /\/___/ _______\/____/ \ \ \/
/\______\ \ \_\
\/______/ \/_/
```

Copy one RDS instance onto another, even if the destination instance already
exists. This tool was motivated by the need to keep a writable staging instance
up to date with production data on a regular basis, which allows devs to test
migrations before deploying to production.

Unless your database is very small, this tool is typically much faster than
using `pg_dump`, `mysqldump`, or the like.

See [the docstring](rds_cp/rds_cp.py).

## Installing

This package requires Python 3.

```sh
$ pip install rds_cp
```

## Example usage

```
$ make install
$ AWS_DEFAULT_REGION=us-west-2 \
AWS_ACCESS_KEY_ID=xxx \
AWS_SECRET_ACCESS_KEY=yyy \
RDSCP_SRC_NAME=prod-read-replica \
RDSCP_DEST_NAME=staging \
RDSCP_DEST_INSTANCE_CLASS=db.m3.medium \
rds_cp
```
or equivalently
```
$ AWS_DEFAULT_REGION=us-west-2 \
AWS_ACCESS_KEY_ID=xxx \
AWS_SECRET_ACCESS_KEY=yyy \
rds_cp --src=prod-read-replica --dest=staging --dest-class=db.m3.medium
```

AWS configuration information may be provided in any way that works with
`awscli`, e.g. through environment variables or `~/.aws`.

## Recommendations

### Use a read-replica as `SRC`

Per the [AWS
docs](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateSnapshot.html),
taking snapshots can cause minor service interruption on the underlying RDS
instance:

> During the backup window, storage I/O may be suspended while your data is
> being backed up and you may experience elevated latency. This I/O suspension
> typically lasts for the duration of the snapshot. This period of I/O
> suspension is shorter for Multi-AZ DB deployments, since the backup is taken
> from the standby, but latency can occur during the backup process.

As a result, it's recommended that you aim this tool at a read-replica of
whatever database you want to copy. Read-replicas are easy to configure through
the AWS console.

### Prime the pump before a first run

If you're running this tool for the first time (or even if a considerable
amount of data has been added since the last run), I recommend manually taking
a snapshot of the SRC database beforehand. This is due to how AWS snapshotting
works; the time taken to perform a snapshot is a function of whether other
snapshots exist which contain a sizable subset of that information. So if you
haven't snapshotted in a while, the first snapshot may take a long time.

As you run `rds_cp` more frequently, e.g. on a cron, the time taken for
each run will reduce due to the shrinking size of the snapshot diff.

If `rds_cp` is run and the time to snapshot exceeds a few minutes, an error
will be thrown. So prime the pump first!

## Testing

Run

```
make test
```

I've also packaged a big honking integration test that does live AWS
setup and teardown. It takes about 15 minutes to run, but it's comprehensive.

I *highly* recommend running this in an AZ that doesn't contain other
instances.

```
$ make install
$ AWS_DEFAULT_REGION=us-east-1 AWS_ACCESS_KEY_ID=x AWS_SECRET_ACCESS_KEY=y ./tests/integration_tests.py
```

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rds_cp-0.1.1.tar.gz (7.4 kB view details)

Uploaded Source

Built Distribution

rds_cp-0.1.1-py2-none-any.whl (9.8 kB view details)

Uploaded Python 2

File details

Details for the file rds_cp-0.1.1.tar.gz.

File metadata

  • Download URL: rds_cp-0.1.1.tar.gz
  • Upload date:
  • Size: 7.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for rds_cp-0.1.1.tar.gz
Algorithm Hash digest
SHA256 8e6efac442cf9604a7d806fb4b31daafa3ad4a60e7424cf44851001023512ea5
MD5 1db20a1c5f0f283099add3fbd3fd4c91
BLAKE2b-256 0791f8a0daed9c95795d15905e3ac5f1031486010c795c256c8ef2e0afcc245c

See more details on using hashes here.

File details

Details for the file rds_cp-0.1.1-py2-none-any.whl.

File metadata

File hashes

Hashes for rds_cp-0.1.1-py2-none-any.whl
Algorithm Hash digest
SHA256 240fe5c69f76b895a5f6332235efd2832a8df240dcda6d044c3a4d71b996807c
MD5 152b67566b2585180d7add76e26848fd
BLAKE2b-256 290b9d197c37b206404284b533ef3b7892a7cdbc24c754bb548e3b732d44d9cf

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page