Mozilla tools for localization

moz.l10n

This is a library of Python tools and utilities for working with localization files, primarily built for internal use at Mozilla.

The core idea here is to establish Message and Resource as format-independent representations of localizable and localized messages and resources, so that operations like linting and transforms can be applied to them.

The Message and Resource representations are drawn from work done for the Unicode MessageFormat 2 specification and the Message resource specification.

Support for the XML formats (android, xliff) is an optional extra; to enable it, install the package as moz.l10n[xml].

Command-line Tools

For usage details, use each command's --help argument.

l10n-build

Build localization files for release.

Iterates source files as defined by --config, reads localization sources from --base, and writes to --target. For each of the --locales, trims out all comments and any messages not in the source files, and adds an empty file for each source file that is missing from the locale.

l10n-build-file

Build one localization file for release.

Uses the --source file as a baseline, applying --l10n localizations to build --target. Trims out all comments and messages not in the source file.
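The trimming step can be pictured with plain dictionaries. This is only a minimal sketch, not the tool's implementation: build_file is a hypothetical helper, messages are reduced to id/value pairs, and the assumption that output follows source order is mine.

```python
def build_file(source_ids, l10n):
    """Keep only localized messages whose id exists in the source.

    Assumption: output follows the order of the source ids.
    """
    return {msg_id: l10n[msg_id] for msg_id in source_ids if msg_id in l10n}


# Hypothetical data: ids from the source file, and one locale's messages.
source_ids = ["title", "greeting", "farewell"]
l10n = {"greeting": "Hei", "obsolete": "Vanha", "title": "Otsikko"}

built = build_file(source_ids, l10n)
print(built)  # "obsolete" is dropped; "farewell" has no translation yet
```

Messages that exist only in the localization are dropped, and source messages without a translation are simply absent from the result.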

l10n-compare

Compare localizations to their source, which may be

  • a directory (using L10nDiscoverPaths),
  • a TOML config file (using L10nConfigPaths), or
  • a JSON file containing a mapping of file paths to arrays of messages.

l10n-fix

Fix the formatting for localization resources.

If paths is a single directory, it is iterated with L10nConfigPaths if --config is set, or L10nDiscoverPaths otherwise. If paths is not a single directory, its values are treated as glob expressions, with ** support.
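To illustrate what ** support means for a glob expression, the sketch below uses Python's standard glob module on a temporary directory. It is an assumption that l10n-fix matches paths the same way; the file names are made up for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a small, hypothetical locale tree.
    os.makedirs(os.path.join(root, "fr", "browser"))
    for rel in ("fr/main.ftl", "fr/browser/menu.ftl", "fr/notes.txt"):
        with open(os.path.join(root, *rel.split("/")), "w") as f:
            f.write("")

    # "**" matches any number of directory levels when recursive=True.
    pattern = os.path.join(root, "**", "*.ftl")
    matches = sorted(
        os.path.relpath(path, root).replace(os.sep, "/")
        for path in glob.glob(pattern, recursive=True)
    )

print(matches)
```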

moz.l10n.paths

L10nConfigPaths

Wrapper for localization config files.

Supports a subset of the format specified at: https://moz-l10n-config.readthedocs.io/en/latest/fileformat.html

Differences:

  • [build] is ignored
  • [[excludes]] are not supported
  • [[filters]] are ignored
  • [[paths]] must always include both reference and l10n

Does not consider .l10n-ignore files.
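As a rough illustration of the last difference, the sketch below checks an already-parsed config, represented as a plain dictionary, for [[paths]] entries that lack reference or l10n. check_paths, the sample paths, and the {locale} placeholder usage are illustrative assumptions, not part of the library's API.

```python
def check_paths(config):
    """Return an error string for each [[paths]] entry lacking a required key."""
    errors = []
    for index, entry in enumerate(config.get("paths", [])):
        for key in ("reference", "l10n"):
            if key not in entry:
                errors.append(f"paths[{index}] is missing '{key}'")
    return errors


# Hypothetical config contents, as they might look after TOML parsing.
config = {
    "locales": ["de", "fr"],
    "paths": [
        {"reference": "en-US/**/*.ftl", "l10n": "{locale}/**/*.ftl"},
        {"reference": "en-US/extra.ftl"},  # no "l10n" key
    ],
}

errors = check_paths(config)
print(errors)
```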

L10nDiscoverPaths

Automagical localization resource discovery.

Given a root directory, finds the likeliest reference and target directories.

The reference directory has a name like templates, en-US, or en, and contains files with extensions that appear localizable.

The localization target root is a directory with subdirectories named as BCP 47 locale identifiers, such as aa, aa-AA, aa-Aaaa, or aa-Aaaa-AA.

An underscore may also be used as a separator, as in en_US.
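The directory-name shapes listed above can be approximated with a regular expression. This is a simplified sketch and not the pattern the library uses: it only covers language, optional script, and optional two-letter region, with either separator.

```python
import re

# Assumption: a simplified subset of BCP 47 (language[-Script][-REGION]),
# accepting "-" or "_" as the separator.
LOCALE_DIR = re.compile(r"[a-z]{2,3}(?:[-_][A-Z][a-z]{3})?(?:[-_][A-Z]{2})?")


def looks_like_locale(name):
    """True if the whole directory name has a locale-identifier shape."""
    return LOCALE_DIR.fullmatch(name) is not None


names = ["aa", "aa-AA", "aa-Aaaa", "aa-Aaaa-AA", "en_US", "templates", "docs"]
locale_dirs = [name for name in names if looks_like_locale(name)]
print(locale_dirs)
```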

moz.l10n.resources

Parsers and serializers are provided for a number of formats, using common and well-established libraries to take care of the details. A unified API for these is provided, such that FORMAT_parse(text) will always accept str input, and FORMAT_serialize(resource) will always provide a str iterator. All the serializers accept a trim_comments argument which leaves out comments from the serialized result, but additional input types and options vary by format.

The library currently supports the following resource formats:

  • android: Android string resources (strings.xml)
  • dtd: .dtd
  • fluent: Fluent (.ftl)
  • inc: .inc
  • ini: .ini
  • plain_json: Plain JSON (.json)
  • po: Gettext (.po, .pot)
  • properties: .properties
  • webext: WebExtensions (messages.json)
  • xliff: XLIFF 1.2, including Xcode customizations (.xlf, .xliff)

add_entries

def add_entries(
    target: Resource,
    source: Resource,
    *,
    use_source_entries: bool = False
) -> int

Modifies target by adding entries from source that are not already present in target. Standalone comments are not added.

If use_source_entries is set, entries from source override those in target when they differ, and section comments and metadata are also updated from source.

Entries are not copied, so further changes will be reflected in both resources.

Returns a count of added or changed entries and sections.
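The described behaviour can be sketched on a flattened model where a resource is just a dictionary of entries. add_entries_sketch is a hypothetical stand-in that ignores sections, comments, and metadata; it is meant to show the counting and the shared references, not the library's actual code.

```python
def add_entries_sketch(target, source, *, use_source_entries=False):
    """Add missing entries from source to target; return how many changed."""
    count = 0
    for key, entry in source.items():
        if key not in target:
            target[key] = entry  # the same object, not a copy
            count += 1
        elif use_source_entries and target[key] != entry:
            target[key] = entry
            count += 1
    return count


target = {"hello": {"value": "Hello"}}
source = {"hello": {"value": "Hello!"}, "bye": {"value": "Bye"}}

added = add_entries_sketch(target, source)
# Only "bye" was missing, so the existing "hello" is left untouched.
kept_value = target["hello"]["value"]

changed = add_entries_sketch(target, source, use_source_entries=True)
# Now the differing "hello" entry is replaced by the one from source.
```

Because entries are shared rather than copied, a later edit to source["bye"] in this sketch is also visible through target["bye"].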

detect_format

def detect_format(name: str | None, source: bytes | str) -> Format | None

Detect the format of the input based on its file extension and/or contents.

Returns a Format enum value, or None if the input is not recognized.
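A name-only approximation of this detection might look like the following. SketchFormat and guess_format are hypothetical names, the mapping is inferred from the format list above, and the sketch ignores file contents, which the real function may also inspect.

```python
from enum import Enum
from os.path import basename, splitext


class SketchFormat(Enum):
    """Stand-in for the library's Format enum (assumed member names)."""
    android = "android"
    dtd = "dtd"
    fluent = "fluent"
    inc = "inc"
    ini = "ini"
    plain_json = "plain_json"
    po = "po"
    properties = "properties"
    webext = "webext"
    xliff = "xliff"


# Assumption: these well-known file names take priority over extensions.
BY_NAME = {"strings.xml": SketchFormat.android, "messages.json": SketchFormat.webext}
BY_EXT = {
    ".dtd": SketchFormat.dtd, ".ftl": SketchFormat.fluent, ".inc": SketchFormat.inc,
    ".ini": SketchFormat.ini, ".json": SketchFormat.plain_json,
    ".po": SketchFormat.po, ".pot": SketchFormat.po,
    ".properties": SketchFormat.properties,
    ".xlf": SketchFormat.xliff, ".xliff": SketchFormat.xliff,
}


def guess_format(name):
    """Guess a format from the file name alone; None if unrecognized."""
    if not name:
        return None
    base = basename(name)
    if base in BY_NAME:
        return BY_NAME[base]
    return BY_EXT.get(splitext(base)[1].lower())


print(guess_format("browser/menu.ftl"))
```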

iter_resources

def iter_resources(
    root: str,
    dirs: list[str] | None = None,
    ignorepath: str = ".l10n-ignore"
) -> Iterator[tuple[str, Resource[Message, str] | None]]

Iterate through localizable resources under the root directory. Use dirs to limit the search to only some subdirectories under root.

Yields (str, Resource | None) tuples, with the file path and the corresponding Resource, or None for files that could not be parsed as localization resources.

To ignore files, include a .l10n-ignore file in root, or some other location passed in as ignorepath. This file uses gitignore syntax, and its patterns are always resolved relative to the root directory.
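The walk-and-ignore idea can be sketched with the standard library. This is a loose approximation: iter_paths is a hypothetical helper that yields paths only (no parsing), and fnmatch covers just a small part of gitignore syntax.

```python
import fnmatch
import os
import tempfile


def iter_paths(root, ignore_patterns):
    """Yield root-relative file paths that match none of the ignore patterns."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for filename in sorted(filenames):
            full = os.path.join(dirpath, filename)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if not any(fnmatch.fnmatch(rel, pat) for pat in ignore_patterns):
                yield rel


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "en", "tests"))
    for rel in ("en/app.ftl", "en/tests/fixture.ftl"):
        with open(os.path.join(root, *rel.split("/")), "w") as f:
            f.write("key = value\n")
    with open(os.path.join(root, ".l10n-ignore"), "w") as f:
        f.write("# test fixtures are not localizable\nen/tests/*\n")

    # Read patterns relative to root, skipping blank lines and comments.
    with open(os.path.join(root, ".l10n-ignore")) as f:
        patterns = [
            line.strip() for line in f
            if line.strip() and not line.startswith("#")
        ]
    patterns.append(".l10n-ignore")  # do not report the ignore file itself

    found = list(iter_paths(root, patterns))

print(found)
```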

l10n_equal

def l10n_equal(a: Resource, b: Resource) -> bool

Compares the localization-relevant content (id, comment, metadata, message values) of two resources.

Sections with no message entries are ignored, and the order of sections, entries, and metadata is ignored.
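One way to picture an order-insensitive comparison that skips empty sections is to flatten each resource into a dictionary keyed by section and entry id. The data model here is an assumption made for the sketch, which also leaves out comments and metadata for brevity.

```python
def normalize(resource):
    """Flatten sections into {(section_id, entry_id): value}.

    Sections without entries contribute nothing, so they are ignored.
    """
    return {
        (section["id"], key): value
        for section in resource
        for key, value in section["entries"].items()
    }


def equal_sketch(a, b):
    """Compare two resources, ignoring section and entry order."""
    return normalize(a) == normalize(b)


a = [
    {"id": "main", "entries": {"x": "1", "y": "2"}},
    {"id": "empty", "entries": {}},
]
b = [{"id": "main", "entries": {"y": "2", "x": "1"}}]
c = [{"id": "main", "entries": {"x": "changed", "y": "2"}}]

print(equal_sketch(a, b), equal_sketch(a, c))
```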

parse_resource

def parse_resource(
    input: Format | str | None,
    source: str | bytes | None = None
) -> Resource[Message, str]

Parse a Resource from its string representation.

The first argument may be an explicit Format, the file path as a string, or None. For the latter two types, an attempt is made to detect the appropriate format.

If the first argument is a string path, the source argument is optional, as the file will be opened and read.

serialize_resource

def serialize_resource(
    resource: Resource[str, str] | Resource[Message, str],
    format: Format | None = None,
    trim_comments: bool = False
) -> Iterator[str]

Serialize a Resource as its string representation.

If format is set, it overrides the resource.format value.

With trim_comments, all standalone and attached comments are left out of the serialization.
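Because the serializers return a str iterator, callers typically join or stream the chunks. The toy serializer below shows that shape and the effect of trim_comments; the key = value output and the tuple-based input are invented for the sketch and do not correspond to any of the supported formats.

```python
from typing import Iterator


def toy_serialize(items, trim_comments: bool = False) -> Iterator[str]:
    """Yield one line per item; comments are skipped when trim_comments is set."""
    for item in items:
        if item[0] == "comment":
            if not trim_comments:
                yield f"# {item[1]}\n"
        else:
            _, key, value = item
            yield f"{key} = {value}\n"


# Hypothetical resource contents: one comment followed by one entry.
items = [("comment", "Shown on the start page"), ("entry", "title", "Welcome")]

full = "".join(toy_serialize(items))
trimmed = "".join(toy_serialize(items, trim_comments=True))
print(trimmed, end="")
```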

Download files

Source Distribution

moz_l10n-0.5.3.tar.gz (68.7 kB)

Built Distribution

moz.l10n-0.5.3-py3-none-any.whl (93.3 kB, Python 3)
