Skip to main content

Unified diff parsing/metadata extraction library.

Project description

Simple Python library to parse and interact with unified diff data.

https://www.travis-ci.com/matiasb/python-unidiff.svg?branch=master

Installing unidiff

$ pip install unidiff

Quick start

>>> import urllib.request
>>> from unidiff import PatchSet
>>> diff = urllib.request.urlopen('https://github.com/matiasb/python-unidiff/pull/3.diff')
>>> encoding = diff.headers.get_charsets()[0]
>>> patch = PatchSet(diff, encoding=encoding)
>>> patch
<PatchSet: [<PatchedFile: .gitignore>, <PatchedFile: unidiff/patch.py>, <PatchedFile: unidiff/utils.py>]>
>>> patch[0]
<PatchedFile: .gitignore>
>>> patch[0].is_added_file
True
>>> patch[0].added
6
>>> patch[1]
<PatchedFile: unidiff/patch.py>
>>> patch[1].added, patch[1].removed
(20, 11)
>>> len(patch[1])
6
>>> patch[1][2]
<Hunk: @@ 109,14 110,21 @@ def __repr__(self):>
>>> patch[2]
<PatchedFile: unidiff/utils.py>
>>> print(patch[2])
diff --git a/unidiff/utils.py b/unidiff/utils.py
index eae63e6..29c896a 100644
--- a/unidiff/utils.py
+++ b/unidiff/utils.py
@@ -37,4 +37,3 @@
# - deleted line
# \ No newline case (ignore)
RE_HUNK_BODY_LINE = re.compile(r'^([- \+\\])')
-

Load unified diff data by instantiating PatchSet with a file-like object as argument, or using PatchSet.from_filename class method to read diff from file.

A PatchSet is a list of files updated by the given patch. For each PatchedFile you can get stats (if it is a new, removed or modified file; the source/target lines; etc), besides having access to each hunk (also like a list) and its respective info.

At any point you can get the string representation of the current object, and that will return the unified diff data of it.

As a quick example of what can be done, check bin/unidiff file.

Also, once installed, unidiff provides a command-line program that displays information from diff data (a file, or stdin). For example:

$ git diff | unidiff
Summary
-------
README.md: +6 additions, -0 deletions

1 modified file(s), 0 added file(s), 0 removed file(s)
Total: 6 addition(s), 0 deletion(s)

Load a local diff file

To instantiate PatchSet from a local file, you can use:

>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8')
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

Notice the (optional) encoding parameter. If not specified, unicode input will be expected. Or alternatively:

>>> import codecs
>>> from unidiff import PatchSet
>>> with codecs.open('tests/samples/bzr.diff', 'r', encoding='utf-8') as diff:
...     patch = PatchSet(diff)
...
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

Finally, you can also instantiate PatchSet passing any iterable (and encoding, if needed):

>>> from unidiff import PatchSet
>>> with open('tests/samples/bzr.diff', 'r') as diff:
...     data = diff.readlines()
...
>>> patch = PatchSet(data)
>>> patch
<PatchSet: [<PatchedFile: added_file>, <PatchedFile: modified_file>, <PatchedFile: removed_file>]>

If you don’t need to be able to rebuild the original unified diff input, you can pass metadata_only=True (defaults to False), which should help making the parsing more efficient:

>>> from unidiff import PatchSet
>>> patch = PatchSet.from_filename('tests/samples/bzr.diff', encoding='utf-8', metadata_only=True)

References

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unidiff-0.7.3.tar.gz (19.9 kB view details)

Uploaded Source

Built Distribution

unidiff-0.7.3-py2.py3-none-any.whl (14.3 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file unidiff-0.7.3.tar.gz.

File metadata

  • Download URL: unidiff-0.7.3.tar.gz
  • Upload date:
  • Size: 19.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.8.10

File hashes

Hashes for unidiff-0.7.3.tar.gz
Algorithm Hash digest
SHA256 d5f2e53a9a00db3224a8c36349b5380e0e22d1aec6c694b14fb9483ee93c6205
MD5 5ea66091b7cf13d2e1eb2c2b1bab6a3a
BLAKE2b-256 40ad5e1803762e8cbc9f3f82bd44a3cb246b021f490184d17a78eec2f9982874

See more details on using hashes here.

File details

Details for the file unidiff-0.7.3-py2.py3-none-any.whl.

File metadata

  • Download URL: unidiff-0.7.3-py2.py3-none-any.whl
  • Upload date:
  • Size: 14.3 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.4 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.1 CPython/3.8.10

File hashes

Hashes for unidiff-0.7.3-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 33152364004b455e11afd67e2d2d6bb8f2c20b28a289b443f2c7b84be18c24cb
MD5 fcfdf52acf61a08af6443bc018dd0932
BLAKE2b-256 9ecf8688fb70174fb40b31c838cb1b8195a7b35b548de7aeb04fbb4567617d21

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page