Skip to main content

A MutableSet that remembers its order, so that every entry has an index.

Project description

Travis Codecov Pypi

An OrderedSet is a custom MutableSet that remembers its order, so that every entry has an index that can be looked up.

Usage examples

An OrderedSet is created and used like a set:

>>> from ordered_set import OrderedSet

>>> letters = OrderedSet('abracadabra')

>>> letters
OrderedSet(['a', 'b', 'r', 'c', 'd'])

>>> 'r' in letters
True

It is efficient to find the index of an entry in an OrderedSet, or find an entry by its index. To help with this use case, the .add() method returns the index of the added item, whether it was already in the set or not.

>>> letters.index('r')
2

>>> letters[2]
'r'

>>> letters.add('r')
2

>>> letters.add('x')
5

OrderedSets implement the union (|), intersection (&), and difference (-) operators like sets do.

>>> letters |= OrderedSet('shazam')

>>> letters
OrderedSet(['a', 'b', 'r', 'c', 'd', 'x', 's', 'h', 'z', 'm'])

>>> letters & set('aeiou')
OrderedSet(['a'])

>>> letters -= 'abcd'

>>> letters
OrderedSet(['r', 'x', 's', 'h', 'z', 'm'])

The __getitem__() and index() methods have been extended to accept any iterable except a string, to perform NumPy-like "fancy indexing".

>>> letters = OrderedSet('abracadabra')

>>> letters[[0, 2, 3]]
OrderedSet(['a', 'r', 'c'])

>>> letters.index(['a', 'r', 'c'])
[0, 2, 3]

This combination of features makes OrderedSet a simple implementation of many of the things that pandas.Index is used for. An OrderedSet can be used as a bi-directional mapping between a sparse vocabulary and dense index numbers.

OrderedSet implements __getstate__ and __setstate__ so it can be pickled, and implements the abstract base classes collections.MutableSet and collections.Sequence.

Authors

OrderedSet was implemented by Rob Speer. Jon Crall contributed changes and tests to make it fit the Python set API.

Comparisons

The original implementation of OrderedSet was a recipe posted to ActiveState Recipes by Raymond Hettiger, released under the MIT license.

Hettiger's implementation kept its content in a doubly-linked list referenced by a dict. As a result, looking up an item by its index was an O(N) operation, while deletion was O(1).

This version makes different trade-offs for the sake of efficient lookups. Its content is a standard Python list instead of a doubly-linked list. This provides O(1) lookups by index at the expense of O(N) deletion, as well as slightly faster iteration.

If you were to use a Python dict as an OrderedSet by ignoring its values, its lookups and deletions would be similar to Hettiger's implementation, with iteration speed similar to this implementation.

Compatibility

OrderedSet is automatically tested on Python 2.7, 3.3, 3.4, 3.5, 3.6, and 3.7. We've checked more informally that it works on PyPy and PyPy3.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ordered-set-3.0.1.tar.gz (9.1 kB view details)

Uploaded Source

File details

Details for the file ordered-set-3.0.1.tar.gz.

File metadata

  • Download URL: ordered-set-3.0.1.tar.gz
  • Upload date:
  • Size: 9.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for ordered-set-3.0.1.tar.gz
Algorithm Hash digest
SHA256 3d6fd7bffbb15f613a9e8a6281bf97c2d67f7bb8677deca8249df2fbdd9cce7b
MD5 a8059c7b99cde0f8dda01ddee6b43c2c
BLAKE2b-256 15678a98b27f5a8f11273d74201ca3e0e7003a6482423cddfcb6774e5de9050d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page