HardLink/Deduplication Backups with Python
PyHardLinkBackup
Hardlink/Deduplication Backups with Python.
Backups should be saved as normal files in the filesystem:
accessible without any extra software or extra meta files
non-proprietary format
Create backups with versioning
every backup run creates a complete filesystem snapshot tree
every snapshot tree can be deleted, without affecting the other snapshots
Deduplication with hardlinks:
Store only changed files; all others are added via hardlinks
find duplicate files everywhere (even if files were renamed or moved)
usable under Windows and Linux
current state:
Python 3.4 or newer only
Beta state
Please try, fork and contribute! ;)
Example
$ phlb backup ~/my/important/documents
...start backup, some time later...
$ phlb backup ~/my/important/documents
...
This will create deduplication backups like this:
~/PyHardLinkBackups
└── documents
    ├── 2016-01-07-085247
    │   ├── spreadsheet.ods
    │   ├── brief.odt
    │   └── important_files.ext
    └── 2016-01-07-102310
        ├── spreadsheet.ods
        ├── brief.odt
        └── important_files.ext
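The layout above works because unchanged content is stored once and hardlinked into every snapshot. The following is a minimal sketch of that idea, not PyHardLinkBackup's actual implementation. The function names `file_sha512` and `backup_snapshot`, the in-memory `seen` dict (the real tool keeps its bookkeeping in a Django database), the SHA-512 hash and the chunk size are all assumptions made for illustration.

```python
import hashlib
import os
import shutil


def file_sha512(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never have to fit in memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_snapshot(src_root, snapshot_root, seen):
    """Copy src_root into snapshot_root, hardlinking already-known content.

    `seen` (hypothetical bookkeeping) maps a content hash to the path of a
    file that is already stored in an earlier or the current snapshot.
    Returns a (copied, linked) tuple of file counts.
    """
    copied = linked = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(snapshot_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            content_hash = file_sha512(src)
            known = seen.get(content_hash)
            if known is not None and os.path.exists(known):
                # Same content already stored: add a second name for it.
                os.link(known, dst)
                linked += 1
            else:
                # New or changed content: make a real copy and remember it.
                shutil.copy2(src, dst)
                seen[content_hash] = dst
                copied += 1
    return copied, linked
```

Because the lookup is keyed on content rather than on the file name, a renamed or moved file is still recognized as a duplicate. Each snapshot holds its own directory entries, so deleting one snapshot tree leaves the files in the others readable; the data is freed only when its last link is removed.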
Try out:
on Windows:
install Python 3: https://www.python.org/downloads/
Download the file boot_pyhardlinkbackup.cmd
run boot_pyhardlinkbackup.cmd
If everything works fine, you will get a venv here: %APPDATA%\PyHardLinkBackup
After the venv is created, call these scripts to finalize the setup:
%APPDATA%\PyHardLinkBackup\phlb_edit_config.cmd - Create a config .ini file
%APPDATA%\PyHardLinkBackup\phlb_migrate_database.cmd - Create Database tables
To upgrade PyHardLinkBackup, call:
%APPDATA%\PyHardLinkBackup\phlb_upgrade_PyHardLinkBackup.cmd
To start the django webserver, call:
%APPDATA%\PyHardLinkBackup\phlb_run_django_webserver.cmd
on Linux:
Download the file boot_pyhardlinkbackup.sh
call boot_pyhardlinkbackup.sh
Note: If you do not use Python 3.5+, you must install ‘scandir’, e.g.:
~ $ cd PyHardLinkBackup
~/PyHardLinkBackup $ source bin/activate
(PyHardLinkBackup) ~/PyHardLinkBackup $ pip install scandir
(You need the python3-dev package installed)
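The note above exists because `os.scandir` only became part of the standard library in Python 3.5; on Python 3.4 the third-party `scandir` package provides the same function. A common way to support both is a guarded import, sketched below. The helper `list_files` is hypothetical and only shows the call in use; how PyHardLinkBackup itself performs the import is not shown here.

```python
try:
    from os import scandir  # standard library since Python 3.5
except ImportError:
    from scandir import scandir  # backport package for Python 3.4


def list_files(path):
    """Return the sorted names of regular files directly inside `path`."""
    return sorted(
        entry.name
        for entry in scandir(path)
        if entry.is_file(follow_symlinks=False)
    )
```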
If everything works fine, you will get a venv here: ~/PyHardLinkBackup
After the venv is created, call these scripts to finalize the setup:
~/PyHardLinkBackup/phlb_edit_config.sh - Create a config .ini file
~/PyHardLinkBackup/phlb_migrate_database.sh - Create Database tables
To upgrade PyHardLinkBackup, call:
~/PyHardLinkBackup/phlb_upgrade_PyHardLinkBackup.sh
To start the django webserver, call:
~/PyHardLinkBackup/phlb_run_django_webserver.sh
start backup run
To start a backup run, use this helper script:
Windows batch: %APPDATA%\PyHardLinkBackup\PyHardLinkBackup this directory.cmd
Linux shell script: ~/PyHardLinkBackup/PyHardLinkBackup this directory.sh
Copy this file to a location that should be backed up and just call it to run a backup.
configuration
phlb will use a configuration file named: PyHardLinkBackup.ini
Search order is:
current directory down to root
user directory
e.g.: If the current working directory is /foo/bar/my_files/, then the search path will be:
/foo/bar/my_files/PyHardLinkBackup.ini
/foo/bar/PyHardLinkBackup.ini
/foo/PyHardLinkBackup.ini
/PyHardLinkBackup.ini
~/PyHardLinkBackup.ini (the user home directory under Windows/Linux)
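The search order above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code phlb actually runs: the names `INI_NAME`, `ini_search_paths` and `find_ini` are hypothetical, and the sketch assumes that the first existing file wins.

```python
import os
from pathlib import Path

INI_NAME = "PyHardLinkBackup.ini"


def ini_search_paths(start=None, home=None):
    """List candidate .ini paths: start directory up to the root, then home."""
    start = Path(start) if start is not None else Path.cwd()
    home = Path(home) if home is not None else Path(os.path.expanduser("~"))
    candidates = [start / INI_NAME]
    # Path.parents yields every ancestor, ending with the filesystem root.
    candidates += [parent / INI_NAME for parent in start.parents]
    home_ini = home / INI_NAME
    if home_ini not in candidates:
        candidates.append(home_ini)
    return candidates


def find_ini(start=None, home=None):
    """Return the first candidate that exists as a file, or None."""
    for candidate in ini_search_paths(start, home):
        if candidate.is_file():
            return candidate
    return None
```

Calling `ini_search_paths("/foo/bar/my_files", home="/home/user")` yields the four paths listed above followed by `/home/user/PyHardLinkBackup.ini`; the home directory used here is only an example value.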
Create / edit default .ini
You can just open the editor with the user directory .ini file with:
(PyHardLinkBackup) ~/PyHardLinkBackup $ phlb config
The defaults are stored here: /phlb/config_defaults.ini
run unittests
$ cd PyHardLinkBackup/
~/PyHardLinkBackup $ source bin/activate
(PyHardLinkBackup) ~/PyHardLinkBackup $ manage test
some notes
What is ‘phlb’ ?!?
The phlb executable is similar to Django's manage.py, but it always uses the PyHardLinkBackup settings.
Why in hell do you use django?!?
Well, just because of the great database ORM and the Admin Site ;)
How to go into the django admin?
$ cd PyHardLinkBackup/
~/PyHardLinkBackup $ source bin/activate
(PyHardLinkBackup) ~/PyHardLinkBackup $ phlb runserver
Then just open ‘localhost’ in a browser (Note: --noreload is needed under Windows with venv!)
Windows Development
For notes on setting up a development environment under Windows, please look at: /dev/WindowsDevelopment.creole
History
dev - v0.4.0 - compare v0.3.1…master
Search for PyHardLinkBackup.ini file in every parent directory from the current working dir
increase default chunk size to 20MB
save summary and log file for every backup run
15.01.2016 - v0.3.1 - compare v0.3.0…v0.3.1
fix unittest run under windows
15.01.2016 - v0.3.0 - compare v0.2.0…v0.3.0
database migration needed
Add ‘no_link_source’ to database (e.g. Skip source, if 1024 links created under windows)
14.01.2016 - v0.2.0 - compare v0.1.8…v0.2.0
good unittests coverage that covers the backup process
08.01.2016 - v0.1.8 - compare v0.1.0alpha0…v0.1.8
installable and runnable under Windows
06.01.2016 - v0.1.0alpha0 - d42a5c5
first release on PyPI
29.12.2015 - commit 2ce43
commit ‘Proof of concept’
Built Distributions
Hashes for PyHardLinkBackup-0.4.0-py2.7.egg
Algorithm | Hash digest
---|---
SHA256 | 27b41db6d8a1d298ee1dc2acc467e951f1ad7ef2dfb079e70fbea0d50ebb9831
MD5 | 562aa2de4ad5f828de5b3581589224a1
BLAKE2b-256 | f4cc847dac272cf61e542832bb94e850dbeee73d4ff0801d793b6d7552c050e4
Hashes for PyHardLinkBackup-0.4.0-py2-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | abe4f9d98208ab35e79291cd8fced375e72ad8ae870c504a720af10b1a3dcd1f
MD5 | 33f6445fda2c90a609aeb81c5ffab342
BLAKE2b-256 | 63006f71d8fc6090d3a73b19b4eff88adb68765591d80bcbbf54c48916efc1ba