Skip to main content

automatically detect dependencies between git commits

Project description

Code Climate git-deps ========

git-deps is a tool for performing automatic analysis of dependencies between commits in a git repository. Here’s a screencast demonstration:

YouTube screencast

I have blogged about ``git-deps` and related tools <https://blog.adamspiers.org/2018/06/14/git-auto-magic/>`__, and also publically spoken about the tool several times:

Contents

Background theory

It is fairly clear that two git commits within a single repo can be considered “independent” from each other in a certain sense, if they do not change the same files, or if they do not change overlapping parts of the same file(s).

In contrast, when a commit changes a line, it is “dependent” on not only the commit which last changed that line, but also any commits which were responsible for providing the surrounding lines of context, because without those previous versions of the line and its context, the commit’s diff might not cleanly apply (depending on how it’s being applied, of course). So all dependencies of a commit can be programmatically inferred by running git-blame on the lines the commit changes, plus however many lines of context make sense for the use case of this particular dependency analysis.

Therefore the dependency calculation is impacted by a “fuzz” factor parameter (c.f. patch(1)), i.e. the number of lines of context which are considered necessary for the commit’s diff to cleanly apply.

As with many dependency relationships, these dependencies form edges in a DAG (directed acyclic graph) whose nodes correspond to commits. Note that a node can only depend on a subset of its ancestors.

Caveat

It is important to be aware that any dependency graph inferred by git-deps may be semantically incomplete; for example it would not auto-detect dependencies between a commit A which changes code and another commit B which changes documentation or tests to reflect the code changes in commit A. Therefore git-deps should not be used with blind faith. For more details, see the section on Textual vs. semantic (in)dependence below.

Motivation

Sometimes it is useful to understand the nature of parts of this dependency graph, as its nature will impact the success or failure of operations including merge, rebase, cherry-pick etc. Please see the ``USE-CASES.md` file <USE-CASES.md>`__ for more details.

Installation

Please see the ``INSTALL.md` file <INSTALL.md>`__.

Usage

Please see the ``USAGE.md` file <USAGE.md>`__.

Textual vs. semantic (in)dependence

Astute readers will note that textual independence as detected by git-deps is not the same as semantic / logical independence. Textual independence means that the changes can be applied in any order without incurring conflicts, but this is not a reliable indicator of logical independence.

For example a change to a function and corresponding changes to the tests and/or documentation for that function would typically exist in different files. So if those changes were in separate commits within a branch, running git-deps on the commits would not detect any dependency between them even though they are logically related, because changes in different files (or even in different areas of the same files) are textually independent.

So in this case, git-deps would not behave exactly how we might want. And for as long as AI is an unsolved problem, it is very unlikely that it will ever develop totally reliable behaviour. So does that mean git-deps is useless? Absolutely not!

Firstly, when best practices for commit structuring are adhered to, changes which are strongly logically related should be placed within the same commit anyway. So in the example above, a change to a function and corresponding changes to the tests and/or documentation for that function should all be within a single commit. (Although this is not the only valid approach; for a more advanced meta-history grouping mechanism, see `git-dendrify <https://github.com/bennorth/git-dendrify>`__.)

Secondly, whilst textual independence does not imply logical independence, the converse is expected to be more commonly true: logical independence often implies textual independence (or stated another way, textual dependence often implies logical dependence). So while it might not be too uncommon for git-deps to fail to detect the dependency between logically-related changes, it should be rarer that it incorrectly infers a dependency between logically unrelated changes. In other words, its false negatives are generally expected to be more common than its false positives. As a result it is likely to be more useful in determining a lower bound on dependencies than an upper bound. Having said that, more research is needed on this.

Thirdly, it is often unhelpful to allow the quest for the perfect become the enemy of the good - a tool does not have to be perfect to be useful; it only has to be better than performing the same task without the tool.

Further discussion on some of these points can be found in an old thread from the git mailing list.

Ultimately though, “the proof is in the pudding”, so try it out and see!

Development / support / feedback

Please see the ``CONTRIBUTING.md` file <CONTRIBUTING.md>`__.

History

Please see the ``HISTORY.md` file <HISTORY.md>`__.

Credits

Special thanks to SUSE for partially sponsoring the development of this software. Thanks also to everyone who has contributed code, bug reports, and other feedback.

License

Released under GPL version 2 in order to be consistent with `git’s license <https://github.com/git/git/blob/master/COPYING>`__, but I’m open to the idea of dual-licensing if there’s a convincing reason.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

git-deps-1.1.0.zip (8.1 MB view details)

Uploaded Source

File details

Details for the file git-deps-1.1.0.zip.

File metadata

  • Download URL: git-deps-1.1.0.zip
  • Upload date:
  • Size: 8.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.56.2 CPython/3.8.8

File hashes

Hashes for git-deps-1.1.0.zip
Algorithm Hash digest
SHA256 95ca68d7e168045aec04d116c7bb01b2be87704c4fad43aeb0300a597a46bb2f
MD5 6ede8083112f0ce237229ab7d71ca8e4
BLAKE2b-256 afb6a972c2e33471312039eaee79b0f361807aac43a680ea1c69db2c8d87e13a

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page