Server-side implementation of Gitaly protocol for Mercurial

Project description

HGitaly

HGitaly is a Gitaly server for Mercurial.

It implements the subset of the Gitaly gRPC protocol that is relevant for Mercurial repositories, as well as its own HGitaly protocol, with methods that are specific to Mercurial.

It comes in two overlapping variants:

HGitaly proper is written in Python, using the official grpcio library.
RHGitaly is a high-performance partial implementation written in Rust, and based on the tonic gRPC framework.

As of this writing, RHGitaly implements a strict subset of the methods implemented in HGitaly, but some methods may be implemented only in RHGitaly in the future.

Installation

HGitaly (Python)

In what follows, $PYTHON is often the Python interpreter in a virtualenv, but it can be a system-wide one (typical case in containers, strongly discouraged on user systems).

Install Mercurial with Rust parts (for the exact version, refer to the requirements file in the Heptapod main repository sources)
```
$PYTHON -m pip install --no-use-pep517 --global-option --rust Mercurial==6.6.2
```
Install HGitaly itself (check that it does not reinstall Mercurial)
```
$PYTHON -m pip install hgitaly
```

RHGitaly

We distribute a self-contained source tarball. It includes the appropriate hg-core Rust sources.

Fetch the tarball

wget https://download.heptapod.net/rhgitaly/rhgitaly-x.y.z.tgz

Fetch and verify the GPG signature

wget https://download.heptapod.net/rhgitaly/rhgitaly-x.y.z.tgz.asc
gpg --verify rhgitaly-x.y.z.tgz.asc

Build

tar xzf rhgitaly-x.y.z.tgz
cd rhgitaly-x.y.z/rust
cargo build --locked --release

Install wherever you want. Example given for a system-wide installation
```
sudo install -o root -g root target/release/rhgitaly /usr/local/bin
```

Define a service. Example given for systemd, to be adjusted for your needs. Make sure in particular that user and all directories exist, with appropriate permissions.

[Unit]
Description=Heptapod RHGitaly Server

[Service]
User=hgitaly
# HGRCPATH not needed yet but probably will be at some point
Environment=HGRCPATH=/etc/heptapod/heptapod.hgrc
Environment=RHGITALY_LISTEN_URL=unix:/run/heptapod/rhgitaly.socket
Environment=RHGITALY_REPOSITORIES_ROOT=/home/hg/repositories
ExecStartPre=rm -f /run/heptapod/rhgitaly.socket
ExecStart=/usr/local/bin/rhgitaly
Restart=on-failure

[Install]
WantedBy=default.target

External executables

HGitaly needs several other programs to be installed and will run them as separate processes.

By default, it expects to find them on $PATH, but the actual path to each executable can be configured.
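
As a rough illustration of that lookup rule, the following Python sketch resolves an executable either from an explicitly configured path or from $PATH. The helper name `resolve_executable` is hypothetical and is not part of HGitaly; it only mirrors the behaviour described above.

```python
import os
import shutil
import sys


def resolve_executable(configured):
    """Return an absolute path for `configured`, or None if not found.

    Hypothetical helper, not HGitaly code: a bare name such as "tokei"
    is searched on $PATH, whereas a value containing a directory
    separator is taken as the actual path to the executable.
    """
    if os.sep in configured:
        # Explicit path: only check that it is an executable file.
        if os.path.isfile(configured) and os.access(configured, os.X_OK):
            return os.path.abspath(configured)
        return None
    # Bare command name: standard $PATH lookup.
    return shutil.which(configured)


if __name__ == "__main__":
    for name in ("tokei", "license-detector", "git", sys.executable):
        print(name, "->", resolve_executable(name))
```

A result of `None` for one of the names means that the corresponding program would have to be installed, or its path configured explicitly.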

Tokei

Tokei is a programming languages analysis tool written in Rust. It is used by the CommitLanguages method.

Tokei is available in several Linux distributions.

As of this writing, HGitaly supports versions 12.0 and 12.1.

Go license-detector

Usually installed as license-detector, this standalone executable is part of the go-enry suite. Its library version is also used by Gitaly.

It is used in the FindLicense method.

Git

HGitaly can make use of some Git commands that do not involve repositories! This is for example the case of GetPatchID: the git patch-id command does not access any repository. Instead, it computes an identifier from any patch it is given.
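
To make this concrete, here is a minimal Python sketch of running git patch-id as a separate process, with the patch passed on standard input. The function names are hypothetical, and whether HGitaly passes further options to git patch-id is not covered here.

```python
import subprocess


def parse_patch_id(output):
    """Extract the patch id from `git patch-id` output.

    Each output line has the form "<patch id> <commit id>"; an empty
    output means that Git found no patch in its input.
    """
    fields = output.split()
    return fields[0] if fields else None


def compute_patch_id(patch, git="git"):
    """Feed `patch` (bytes) to `git patch-id` on standard input.

    No repository is involved: the command only reads the patch text.
    The `git` argument plays the role of a configurable executable path.
    """
    proc = subprocess.run(
        [git, "patch-id"],
        input=patch,
        stdout=subprocess.PIPE,
        check=True,
    )
    return parse_patch_id(proc.stdout.decode("ascii"))
```

Calling `compute_patch_id` with the bytes of a unified diff returns the identifier as a hexadecimal string, or `None` if Git did not recognize any patch.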

Mercurial

In forthcoming versions, it is probable that HGitaly and/or RHGitaly will invoke Mercurial subprocesses.

This is not yet the case as of this writing (HGitaly 1.1 / Heptapod 1.1).

Configuration

HGitaly's configuration is done the standard way in the Mercurial world: through HGRC files.

In a typical Heptapod installation, these are split into a managed file, for consistency with other components, and another one meant to be edited by the systems administrator (/etc/gitlab/heptapod.hgrc in Omnibus/Docker instances).

Many Mercurial tweaks are interpreted simply because HGitaly internally calls into Mercurial, but HGitaly also gets its own section. Here are the settings available as of HGitaly 1.1:

[hgitaly]
# paths to external executables
tokei-executable = tokei
license-detector-executable = license-detector
git-executable = git

# The default number of worker processes is one plus half the CPU count.
# It can be explicitly set this way:
#workers = 4

# Time to let a worker finish treating its current request, if any, when
# gracefully restarted. Default is high because of backup requests.
worker.graceful-shutdown-timeout-seconds = 300
# Maximum allowed resident size for worker processes (MiB).
# They get gracefully restarted if they cross that threshold
worker.max_rss_mib = 1024
# Interval between memory monitoring of workers (results dumped in logs)
worker.monitoring-interval-seconds = 60

Also heptapod.repositories-root is used if --repositories-root is not passed on the command line.
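
The following Python sketch shows how these settings could be read and how the default number of workers could be derived. It is an illustration only: HGitaly reads its configuration through Mercurial, not through `configparser`, which merely happens to handle this simple fragment. The function names are hypothetical, and rounding half the CPU count down is an assumption.

```python
import configparser
import os

SAMPLE_HGRC = """\
[hgitaly]
tokei-executable = tokei
#workers = 4
worker.graceful-shutdown-timeout-seconds = 300
worker.max_rss_mib = 1024
worker.monitoring-interval-seconds = 60
"""


def default_workers(cpu_count=None):
    """One plus half the CPU count (rounding down is an assumption)."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    return 1 + cpu_count // 2


def read_hgitaly_settings(text):
    """Parse the [hgitaly] section of an HGRC-like text.

    configparser does not support the whole HGRC syntax; it is used
    here for illustration purposes only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["hgitaly"]
    return {
        # Explicit value if present, computed default otherwise.
        "workers": section.getint("workers", fallback=default_workers()),
        # None means "not set in this file".
        "graceful_shutdown_timeout": section.getint(
            "worker.graceful-shutdown-timeout-seconds", fallback=None),
        "max_rss_mib": section.getint("worker.max_rss_mib", fallback=None),
        "monitoring_interval": section.getint(
            "worker.monitoring-interval-seconds", fallback=None),
    }


if __name__ == "__main__":
    print(read_hgitaly_settings(SAMPLE_HGRC))
```

With the sample above, `workers` is commented out, so the computed default applies: for instance 5 on a machine with 8 CPUs.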

Operation

Logging

HGitaly uses Python's standard logging module, and the loggingmod Mercurial extension to emit logs from the Mercurial core and other extensions. Therefore, the logging configuration is done from the Mercurial configuration, typically from one of the Heptapod HGRC files.

The general convention is that all logs emitted by hgitaly.service provide GitLab's correlation_id in the extra dict, making it available in the format string. Here is a minimal example:

[correlation_id=%(correlation_id)s] [%(levelname)s] [%(name)s] %(message)s

Conversely, the format strings for logs emitted outside of hgitaly.service must not use correlation_id, as subpackages such as hgitaly.branch, hgitaly.message, etc. cannot provide a value: it is a hard error to use a format that relies on some extra if the emitter does not provide it.

To summarize the resulting policy:

In hgitaly.service, all logging must be done through hgitaly.logging.LoggerAdapter. Using correlation_id in the format is strongly encouraged.
Outside of hgitaly.service, logging should be self-contained and useful without an obvious link to the calling gRPC method. For instance, a repository inconsistency should be logged at WARNING level, with a message including the path.
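
The mechanism behind this policy can be reproduced with the standard library alone. In the sketch below, the stock `logging.LoggerAdapter` stands in for hgitaly.logging.LoggerAdapter, whose exact behaviour is not described here; the logger names and the correlation id value are made up for the example.

```python
import io
import logging

FORMAT = ("[correlation_id=%(correlation_id)s] "
          "[%(levelname)s] [%(name)s] %(message)s")


def make_logger(name, stream):
    """Create a logger writing to `stream` with the format above."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(FORMAT))
    logger.addHandler(handler)
    return logger


def renders(record):
    """Tell whether `record` can be formatted with FORMAT."""
    try:
        logging.Formatter(FORMAT).format(record)
    except (KeyError, ValueError):
        # The exception type depends on the Python version.
        return False
    return True


# Service side: the adapter provides correlation_id as an extra.
stream = io.StringIO()
service_logger = logging.LoggerAdapter(
    make_logger("hgitaly.service.example", stream),
    {"correlation_id": "abc123"},
)
service_logger.info("handling request")
print(stream.getvalue(), end="")

# Outside of the service: no extra, hence the format cannot be applied.
bare_record = logging.LogRecord(
    "hgitaly.branch", logging.WARNING, "example.py", 0,
    "repository inconsistency", None, None)
print("bare record renders:", renders(bare_record))
```

The first message is rendered with its correlation id, whereas the record built without the extra cannot be formatted at all, which is why the two families of loggers need different format strings.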

Development

Automated tests and Continuous Integration

How to run the tests

Usually, that would be in a virtualenv, but it's not necessary.

  python3 -m pip install -r test-requirements.txt
  ./run-all-tests

Hint: Check the contents of run-all-tests, it's just pytest with a standard set of options (mostly for coverage, see below).

Unit and Mercurial integration tests

These are the main tests. They lie inside the hgitaly and hgext3rd.hgitaly Python packages. The layout follows the style where each subpackage has its own tests package, to facilitate future refactorings.

The Mercurial integration tests are written with the mercurial-testhelpers library. Their duty is to assert that HGitaly works as expected and maintains compatibility with several versions of Mercurial and possibly other dependencies, such as grpcio.

The implicit assumption with these tests is that the test authors actually knew what was expected. HGitaly being meant to be a direct replacement, or rather a translation of Gitaly in Mercurial terms, those expectations are actually a mix of:

Design choices, such as mapping rules between branch/topic combinations and GitLab branches.
Gitaly documentation and source code.
Sampling of Gitaly responses.

Gitaly comparison tests

If an appropriate Gitaly installation is found, run-all-tests will also run the tests from the tests_with_gitaly package. This happens automatically from within an HDK workspace.

These are precisely meant for what the Mercurial integration tests can't do: check that HGitaly responses take the form expected by the various Gitaly clients, by comparing directly with the reference Gitaly implementation.

The comparisons work by using the conversions to Git provided by py-heptapod, which are precisely what HGitaly aims to replace as a means to expose Mercurial content to GitLab.

Once there is no ambiguity with what Gitaly clients expect, the correctness of the implementation, with its various corner cases, should be left to the Mercurial integration tests.

Test coverage

This project is being developed with a strong test coverage policy, enforced by CI: without the Gitaly comparison tests, the coverage has to stay at 100%.

This does not mean that a contribution has to meet this goal to be worthwhile, or even considered. Contributors can expect Maintainers to help them achieve the required 100% coverage mark, especially if they are newcomers. Of course, Contributors cannot expect Maintainers to go as far as writing missing tests for them, even if that can still happen for critical urgent issues.

Selected statements can of course be excluded for good reasons, using # pragma no cover.
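
As a small illustration, here is a made-up module (not taken from HGitaly) in which a command-line entry point is excluded from the coverage measurement. The default exclusion pattern of coverage.py also accepts the spelling `# pragma: no cover`.

```python
def parse_port(value):
    """Return `value` as an integer port number, or None if invalid."""
    try:
        return int(value)
    except ValueError:
        return None


if __name__ == "__main__":  # pragma no cover
    # Not exercised by the test suite, hence excluded from coverage.
    print(parse_port("8080"))
```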

Coverage exclusions depending on the Mercurial version are provided by the coverage plugin of mercurial-testhelpers.

An unexpected drop of coverage in different Mercurial versions is a powerful warning that something not obvious is going wrong, but the Gitaly comparison tests are run in CI against a fixed set of dependencies, hence 100% coverage must be achieved without the Gitaly comparison tests.

On the other hand, Gitaly comparison tests will warn us when we bump upstream GitLab if some critical behaviour has changed.

Tests Q&A and development hints

Doesn't the 100% coverage rule without the Gitaly comparison tests mean writing the same tests twice?

In some cases, yes, but it's limited.

For example, the comparison tests can tell us that FindAllBranchNames is actually expected to return GitLab refs (refs/heads/some-branch), not GitLab branch names. That can be settled with a few, very basic, test cases. There is no need to test all the mapping rules for topics, and even less the various related corner cases in the comparison tests. These, on the other hand, depend strongly on Mercurial internals, and absolutely have to be fully tested continuously against various Mercurial versions.

Also, it is possible to deduplicate scenarios that are almost identical in Mercurial integration tests and Gitaly comparison tests: factorize out the common code in a helper function made available for both. The question is whether it is worth the effort.

Finally, comparison tests should focus on the fact that Gitaly and HGitaly results agree, not on what they contain. In the above example, a comparison for FindAllBranchNames could simply assert equality of the returned sets of branch names. This is a bit less cumbersome, and easier to maintain.
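
A minimal sketch of such an agreement check follows. The two lists are stand-ins for what would be extracted from the actual Gitaly and HGitaly responses, and the helper name is hypothetical.

```python
def assert_same_refs(gitaly_refs, hgitaly_refs):
    """Check that both servers returned the same refs, in any order.

    The assertion is about agreement only: it does not spell out what
    the refs should be.
    """
    assert set(gitaly_refs) == set(hgitaly_refs)


# Stand-ins for names extracted from the two responses.
from_gitaly = ["refs/heads/some-branch", "refs/heads/other-branch"]
from_hgitaly = ["refs/heads/other-branch", "refs/heads/some-branch"]

assert_same_refs(from_gitaly, from_hgitaly)
```

Because the expected values are never hardcoded, such a test does not need to be updated when the test repository content changes, as long as both servers keep agreeing.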

How to reproduce a drop in coverage found by the `compat` CI stage?

These are often due to statements being covered by the Gitaly comparison tests only, leading to 100% coverage in the main stage, but not in the compat stage.

The first thing to do is to run without the Gitaly comparison tests:

SKIP_GITALY_COMPARISON_TESTS=yes ./run-all-tests

(any non-empty value in that environment variable, even no or false, will trigger the skipping)
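
In Python terms, the rule amounts to a truthiness check on the raw string. This is a sketch of the stated behaviour, not the actual code of the test harness, and the function name is hypothetical.

```python
import os


def skip_gitaly_comparison_tests(environ=None):
    """Tell whether the Gitaly comparison tests should be skipped.

    Any non-empty value means "skip": the content is not interpreted,
    so "no" and "false" also trigger the skipping.
    """
    if environ is None:
        environ = os.environ
    return bool(environ.get("SKIP_GITALY_COMPARISON_TESTS"))


if __name__ == "__main__":
    for value in ("yes", "no", "false", ""):
        env = {"SKIP_GITALY_COMPARISON_TESTS": value}
        print(repr(value), "->", skip_gitaly_comparison_tests(env))
```

To run the comparison tests again, the variable has to be unset or empty, not set to a negative-looking value.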

In some rare cases, the drop in coverage could be due to an actual change between Mercurial versions. If that happens, there are good chances that an actual bug is lurking around.

How to run the tests with coverage of the Gitaly comparison tests

./run-all-tests --cov tests_with_gitaly --cov-report html

The HTML report will be nice if you don't have 100% coverage. To display it, just do

firefox htmlcov/index.html

By default, the Gitaly comparison tests themselves are indeed not covered. This is because run-all-tests does not know whether they will be skipped for lack of a Gitaly installation, which would be legitimate.

But they are covered in the CI jobs that launch them, because Gitaly is assumed to be available. For these, the coverage would tell us that something was broken, preventing the tests from running.

How to poke into Gitaly protocol?

The Gitaly comparison tests provide exactly a harness for that: take a test, modify it as needed, insert a pdb breakpoint, and get going.

The big advantage here is that startup of the Gitaly comparison tests is almost instantaneous, especially compared with RSpec, which takes about a minute to start even a completely trivial test.

Of course that will raise the question whether it'll be useful to make true tests of these experiments.

When is a Gitaly comparison test required?

Each time there's a need to be sure of what's expected and it can help answer that question. It doesn't have to do more than that.

When to prefer writing RSpec tests in Heptapod Rails over Gitaly comparison tests in HGitaly?

If you need to make sure that Heptapod Rails, as a Gitaly client, sends the proper requests, because that can depend on specific dispatch code.

For instance, we are currently still converting to Git on the Rails side. A source of bugs would be to send Git commit ids to HGitaly.

Apart from that, it is expected to be vastly more efficient to use Gitaly comparison tests.

The more Heptapod progresses, the less complicated all of this should be.

Updating the Gitaly gRPC protocol

The virtualenv has to be activated

pip install -r dev-requirements.txt
Copy the new proto files from a Gitaly checkout with version matching the wanted GitLab upstream version. Example in a HDK context:
```
cp ../gitaly/proto/*.proto protos/  # we don't want the `go` subdir
```
Run ./generate-stubs
Run the tests: ./run-all-tests
Perform the necessary hg add after close inspection of hg status

Updating the HGitaly specific gRPC protocol

This package defines and implements an additional gRPC protocol, with gRPC services and methods that are specific to Mercurial, or more generally Heptapod.

Protocol specification

The sources are proto files in the protos/ directory, same as for the Gitaly protocol.

They distinguish themselves by this declaration:

package hgitaly;

Each time a change is made to the protocol, the libraries for all provided programming languages have to be regenerated and committed, ideally together with the protocol change.

Python library

It has a special status, being versioned together with the protocol and the server implementation. It is provided as the hgitaly.stub package.

The Python stubs are produced by the same script that takes care of Gitaly proto files:

./generate-stubs

Ruby library

See the separate documentation

Other languages

A Go library will probably be necessary quite soon for Workhorse or perhaps Heptapod Shell.

A Rust library would be nice to have.

Project details

Release history

This version: 2.8.0 (Nov 19, 2024)

Download files

Source Distribution

hgitaly-2.8.0.tar.gz (362.4 kB)

Uploaded Nov 19, 2024 Source

File details

Details for the file hgitaly-2.8.0.tar.gz.

File metadata

Download URL: hgitaly-2.8.0.tar.gz
Upload date: Nov 19, 2024
Size: 362.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.12.7

File hashes

Hashes for hgitaly-2.8.0.tar.gz
Algorithm	Hash digest
SHA256	`b23297178f1dcbcf137b433674dfd2489091dad7fc9e1a46049cb6854d807a03`
MD5	`686f6fca31d76bf47fd8e0c728d386bd`
BLAKE2b-256	`eb8663c8166d9b62dcf810971ef572b7f3872cea85d984d488f7295023cbf0cb`
