Unified diff parsing/metadata extraction library.
Simple Python library to parse and interact with unified diff data.
Installing unidiff
$ pip install unidiff
Quick start
>>> import urllib.request
>>> from unidiff import PatchSet
>>> diff = urllib.request.urlopen('https://github.com/matiasb/python-unidiff/pull/3.diff')
>>> encoding = diff.headers.get_charsets()[0]
>>> patch = PatchSet(diff, encoding=encoding)
>>> patch
<PatchSet: [<PatchedFile: .gitignore>, <PatchedFile: unidiff/patch.py>, <PatchedFile: unidiff/utils.py>]>
>>> patch[0]
<PatchedFile: .gitignore>
>>> patch[0].is_added_file
True
>>> patch[0].added
6
>>> patch[1]
<PatchedFile: unidiff/patch.py>
>>> patch[1].added, patch[1].removed
(20, 11)
>>> len(patch[1])
6
>>> patch[1][2]
<Hunk: @@ 109,14 110,21 @@ def __repr__(self):>
>>> patch[2]
<PatchedFile: unidiff/utils.py>
>>> print(patch[2])
diff --git a/unidiff/utils.py b/unidiff/utils.py
index eae63e6..29c896a 100644
--- a/unidiff/utils.py
+++ b/unidiff/utils.py
@@ -37,4 +37,3 @@
 # - deleted line
 # \ No newline case (ignore)
 RE_HUNK_BODY_LINE = re.compile(r'^([- \+\\])')
-
Load unified diff data by instantiating PatchSet with a file-like object as the argument, or use the PatchSet.from_filename class method to read the diff from a file.
A PatchSet is a list of the files updated by the given patch. For each PatchedFile you can get stats (whether it is a new, removed, or modified file; the source/target lines; etc.), and you can access each of its hunks (a PatchedFile also behaves like a list) along with their respective info.
At any point you can get the string representation of the current object, which returns its unified diff data.
For a quick example of what can be done, see the bin/unidiff file.
Also, once installed, unidiff provides a command-line program that displays information from diff data (a file, or stdin). For example:
$ git diff | unidiff
Summary
-------
README.md: +6 additions, -0 deletions
1 modified file(s), 0 added file(s), 0 removed file(s)
Total: 6 addition(s), 0 deletion(s)
Load a local diff file
To instantiate PatchSet from a local file, you can use:
>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8')
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>
Note the (optional) encoding parameter. If it is not specified, unicode input is expected. Alternatively, open the file yourself with the desired encoding:
>>> import codecs
>>> from unidiff import PatchSet
>>> with codecs.open('tests/samples/bzr.diff', 'r', encoding='utf-8') as diff:
...     patch = PatchSet(diff)
...
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>
Finally, you can also instantiate PatchSet by passing any iterable (and an encoding, if needed):
>>> from unidiff import PatchSet
>>> with open('tests/samples/bzr.diff', 'r') as diff:
...     data = diff.readlines()
...
>>> patch = PatchSet(data)
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>
If you don't need to be able to rebuild the original unified diff input, you can pass metadata_only=True (defaults to False), which should help make parsing more efficient:
>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8', metadata_only=True)