# rds_cp: cp for RDS instances
```
__
/\ \
_ __ \_\ \ ____ ___ _____
/\`'__\/'_` \ /',__\ /'___\/\ '__`\
\ \ \//\ \L\ \/\__, `\ /\ \__/\ \ \L\ \
\ \_\\ \___,_\/\____/ \ \____\\ \ ,__/
\/_/ \/__,_ /\/___/ _______\/____/ \ \ \/
/\______\ \ \_\
\/______/ \/_/
```
Copy one RDS instance onto another, even if the destination instance already
exists. This tool was motivated by the need to keep a writable staging instance
up to date with production data on a regular basis, allowing developers to test
migrations before deploying them to production.
Unless your database is very small, this tool is typically much faster than
using `pg_dump`, `mysqldump`, or the like.
See [the docstring](rds_cp/rds_cp.py).
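The copy is snapshot-based rather than dump-based. Conceptually, a run is
similar to the following `awscli` sequence; this is only a sketch with
illustrative instance and snapshot names, and the real steps (waiting, error
handling, cleanup) live in [rds_cp/rds_cp.py](rds_cp/rds_cp.py):
```sh
# Snapshot the source instance and wait for the snapshot to finish.
$ aws rds create-db-snapshot \
    --db-instance-identifier prod-read-replica \
    --db-snapshot-identifier rds-cp-temp
$ aws rds wait db-snapshot-completed --db-snapshot-identifier rds-cp-temp

# Drop the existing destination instance.
$ aws rds delete-db-instance \
    --db-instance-identifier staging --skip-final-snapshot
$ aws rds wait db-instance-deleted --db-instance-identifier staging

# Restore the snapshot under the destination's name and instance class.
$ aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier staging \
    --db-snapshot-identifier rds-cp-temp \
    --db-instance-class db.m3.medium
$ aws rds wait db-instance-available --db-instance-identifier staging
```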
## Installing
This package requires Python 3.
```sh
$ pip install rds_cp
```
## Example usage
```
$ make install
$ AWS_DEFAULT_REGION=us-west-2 \
AWS_ACCESS_KEY_ID=xxx \
AWS_SECRET_ACCESS_KEY=yyy \
RDSCP_SRC_NAME=prod-read-replica \
RDSCP_DEST_NAME=staging \
RDSCP_DEST_INSTANCE_CLASS=db.m3.medium \
rds_cp
```
or equivalently
```
$ AWS_DEFAULT_REGION=us-west-2 \
AWS_ACCESS_KEY_ID=xxx \
AWS_SECRET_ACCESS_KEY=yyy \
rds_cp --src=prod-read-replica --dest=staging --dest-class=db.m3.medium
```
AWS configuration information may be provided in any way that works with
`awscli`, e.g. through environment variables or `~/.aws`.
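For example, instead of exporting keys in the environment, you can keep them in
the standard `awscli` config files (the values below are placeholders):
```
# ~/.aws/credentials
[default]
aws_access_key_id = xxx
aws_secret_access_key = yyy

# ~/.aws/config
[default]
region = us-west-2
```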
## Recommendations
### Use a read-replica as `SRC`
Per the [AWS
docs](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_CreateSnapshot.html),
taking snapshots can cause minor service interruption on the underlying RDS
instance:
> During the backup window, storage I/O may be suspended while your data is
> being backed up and you may experience elevated latency. This I/O suspension
> typically lasts for the duration of the snapshot. This period of I/O
> suspension is shorter for Multi-AZ DB deployments, since the backup is taken
> from the standby, but latency can occur during the backup process.
As a result, it's recommended that you aim this tool at a read-replica of
whatever database you want to copy. Read-replicas are easy to configure through
the AWS console.
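They can also be created from the command line; for a hypothetical `prod`
instance, something like:
```sh
$ aws rds create-db-instance-read-replica \
    --db-instance-identifier prod-read-replica \
    --source-db-instance-identifier prod
```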
### Prime the pump before a first run
If you're running this tool for the first time (or even if a considerable
amount of data has been added since the last run), I recommend manually taking
a snapshot of the SRC database beforehand. RDS snapshots are incremental: how
long a snapshot takes depends largely on how much data has changed since the
most recent existing snapshot. So if you haven't snapshotted in a while, the
first snapshot may take a long time.
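For example, using the instance name from the usage example above (the snapshot
identifier is arbitrary):
```sh
$ aws rds create-db-snapshot \
    --db-instance-identifier prod-read-replica \
    --db-snapshot-identifier prime-the-pump
$ aws rds wait db-snapshot-completed --db-snapshot-identifier prime-the-pump
```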
As you run `rds_cp` more frequently, e.g. on a cron, the time taken for
each run will reduce due to the shrinking size of the snapshot diff.
If a snapshot takes more than a few minutes during an `rds_cp` run, an error
will be thrown. So prime the pump first!
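Once primed, regular runs keep the snapshot diff small. A crontab entry along
these lines works (credentials and names are placeholders):
```sh
# Refresh staging from the prod read-replica every night at 03:00.
0 3 * * * AWS_DEFAULT_REGION=us-west-2 AWS_ACCESS_KEY_ID=xxx AWS_SECRET_ACCESS_KEY=yyy rds_cp --src=prod-read-replica --dest=staging --dest-class=db.m3.medium
```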
## Testing
Run
```
make test
```
I've also packaged a big honking integration test that does live AWS
setup and teardown. It takes about 15 minutes to run, but it's comprehensive.
I *highly* recommend running this in an AZ that doesn't contain other
instances.
```
$ make install
$ AWS_DEFAULT_REGION=us-east-1 AWS_ACCESS_KEY_ID=x AWS_SECRET_ACCESS_KEY=y ./tests/integration_tests.py
```