Project description

The main webpage for this project is: https://gitlab.kitware.com/computer-vision/kwarray

The kwarray module implements a small set of pure-Python extensions to numpy and torch.

The kwarray module started as a set of extensions for numpy plus a simplified pandas-like DataFrame object with much faster item, row, and column access. It also includes ArrayAPI, a wrapper that lets the same code operate on either torch tensors or numpy arrays, and a few algorithms such as setcover and mincost_assignment.

Read the docs here: https://kwarray.readthedocs.io/en/master/

The top-level API is:

from kwarray.arrayapi import ArrayAPI
from .algo_assignment import (maxvalue_assignment, mincost_assignment,
                              mindist_assignment,)
from .algo_setcover import (setcover,)
from .dataframe_light import (DataFrameArray, DataFrameLight, LocLight,)
from .fast_rand import (standard_normal, standard_normal32, standard_normal64,
                        uniform, uniform32,)
from .util_averages import (RunningStats, stats_dict,)
from .util_groups import (apply_grouping, group_consecutive,
                          group_consecutive_indices, group_indices,
                          group_items,)
from .util_misc import (FlatIndexer,)
from .util_numpy import (arglexmax, argmaxima, argminima, atleast_nd, boolmask,
                         isect_flags, iter_reduce_ufunc,)
from .util_random import (ensure_rng, random_combinations, random_product,
                          seed_global, shuffle,)
from .util_torch import (one_hot_embedding, one_hot_lookup,)

The ArrayAPI

One of the most useful features in kwarray is kwarray.ArrayAPI, a class that helps bridge numpy and torch. This class consists of static methods that implement part of the numpy API and operate equivalently on either torch.Tensor or numpy.ndarray objects.

This works because every function call checks if the input is a torch tensor or a numpy array and then takes the appropriate action.

Checking the input type on every function call adds overhead. The recommended way to use the array API is therefore the kwarray.ArrayAPI.impl function. It does the check once and returns another object that directly performs the correct operations on subsequent data items of the same type.

The following example demonstrates both modes of usage.

import torch
import numpy as np
from kwarray import ArrayAPI
data1 = torch.rand(10, 10)
data2 = data1.numpy()  # numpy view of the same data
# Method 1: grab the appropriate sub-impl
impl1 = ArrayAPI.impl(data1)
impl2 = ArrayAPI.impl(data2)
result1 = impl1.sum(data1, axis=0)
result2 = impl2.sum(data2, axis=0)
# Compare with a tolerance: the two backends may sum floats in a different order
assert np.allclose(impl1.numpy(result1), impl2.numpy(result2))
# Method 2: choose the impl on the fly
result1 = ArrayAPI.sum(data1, axis=0)
result2 = ArrayAPI.sum(data2, axis=0)
assert np.allclose(ArrayAPI.numpy(result1), ArrayAPI.numpy(result2))
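
The "check once, then reuse" idea behind impl can be illustrated with a small sketch. This is not kwarray's implementation: the class names NumpyBackend, ListBackend, and DispatchSketch are hypothetical, and nested Python lists stand in for the second backend so the sketch runs without torch.

```python
import numpy as np


class NumpyBackend:
    """Operations for numpy.ndarray inputs."""

    @staticmethod
    def sum(data, axis=None):
        return data.sum(axis=axis)


class ListBackend:
    """Stand-in second backend (nested Python lists instead of torch tensors)."""

    @staticmethod
    def sum(data, axis=None):
        return np.asarray(data).sum(axis=axis).tolist()


class DispatchSketch:
    """Hypothetical dispatcher illustrating the two modes of usage."""

    @staticmethod
    def impl(data):
        # Inspect the type once and hand back the matching backend.
        if isinstance(data, np.ndarray):
            return NumpyBackend
        if isinstance(data, list):
            return ListBackend
        raise TypeError(f'unsupported type: {type(data)!r}')

    @staticmethod
    def sum(data, axis=None):
        # Convenience path: the type is re-checked on every call.
        return DispatchSketch.impl(data).sum(data, axis=axis)


arr = np.arange(6).reshape(2, 3)
backend = DispatchSketch.impl(arr)   # the type check happens here, once
col_sums = backend.sum(arr, axis=0)  # later calls skip the check
```

In a tight loop over many arrays of the same type, the first style avoids repeating the isinstance checks on every call.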

Other Notes:

The kwarray.ensure_rng function helps you properly maintain and control local seeded random number generation. This means that you won't clobber the random state of another library, and other libraries won't clobber yours.
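
The pattern of keeping random state local can be sketched with plain numpy. The helper coerce_rng below is hypothetical and only illustrates the idea; it is not kwarray's ensure_rng, and its handling of None (a fresh unseeded generator) is an assumption made for this sketch.

```python
import numpy as np


def coerce_rng(rng=None):
    """Hypothetical helper: accept None, an int seed, or a RandomState."""
    if rng is None:
        # Assumption for this sketch: None means a fresh, unseeded generator.
        return np.random.RandomState()
    if isinstance(rng, (int, np.integer)):
        # An integer seeds a new generator that is independent of np.random.
        return np.random.RandomState(int(rng))
    if isinstance(rng, np.random.RandomState):
        return rng
    raise TypeError(f'cannot coerce {type(rng)!r} to a RandomState')


def sample_noise(n, rng=None):
    """Draw n uniform samples without touching the global numpy state."""
    rng = coerce_rng(rng)
    return rng.rand(n)


first = sample_noise(3, rng=0)
second = sample_noise(3, rng=0)  # same seed, same values
```

Because every function takes an rng argument and draws only from it, a caller can reproduce results by passing a seed, while code that relies on the global np.random state is unaffected.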

DataFrameArray and DataFrameLight implement a subset of the pandas API. They are less powerful, but orders of magnitude faster. The main drawback is that you lose loc, but iloc is available.

uniform32 and standard_normal32 are faster 32-bit random number generators (compared to their 64-bit numpy counterparts).

mincost_assignment is the Munkres / Hungarian algorithm. It solves the assignment problem.
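
To make the assignment problem concrete, here is a brute-force sketch that tries every one-to-one pairing of rows to columns and keeps the cheapest. It is not the Munkres / Hungarian algorithm and not kwarray's code; the function name brute_force_assignment is hypothetical, and the exhaustive search is only practical for very small matrices.

```python
from itertools import permutations


def brute_force_assignment(cost):
    """Find the min-cost one-to-one row-to-column assignment of a square matrix."""
    n = len(cost)
    best_total, best_perm = None, None
    # perm[row] is the column assigned to that row.
    for perm in permutations(range(n)):
        total = sum(cost[row][col] for row, col in enumerate(perm))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return list(enumerate(best_perm)), best_total


cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment, total = brute_force_assignment(cost)
# assignment == [(0, 1), (1, 0), (2, 2)] and total == 5
```

The Hungarian algorithm reaches the same optimum in polynomial time, whereas this sketch checks all n! permutations.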

setcover solves the minimum weighted set cover problem using either an approximate or an exact algorithm.
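
A common approximate strategy for weighted set cover is the greedy rule: repeatedly pick the set with the lowest weight per newly covered item. The sketch below shows that rule in plain Python. Whether kwarray's approximate mode uses exactly this rule is an assumption, and the name greedy_setcover and its arguments are hypothetical.

```python
def greedy_setcover(candidate_sets, weights=None):
    """Greedy approximation: choose sets until every item is covered."""
    if weights is None:
        weights = {key: 1 for key in candidate_sets}
    uncovered = set().union(*candidate_sets.values())
    chosen = []
    while uncovered:
        def cost_per_new_item(key):
            gain = len(candidate_sets[key] & uncovered)
            return weights[key] / gain if gain else float('inf')

        # Ties are broken by dictionary order because min keeps the first minimum.
        best = min(candidate_sets, key=cost_per_new_item)
        chosen.append(best)
        uncovered -= candidate_sets[best]
    return chosen


candidate_sets = {
    'a': {1, 2, 3},
    'b': {2, 4},
    'c': {3, 4},
    'd': {4, 5},
}
cover = greedy_setcover(candidate_sets)
# cover == ['a', 'd']
```

The greedy result is not always optimal, which is why an exact solver is useful when the instance is small enough.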

one_hot_embedding is a fast numpy / torch way to perform one-hot encoding (OHE), a trick often needed in deep learning.
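
For readers unfamiliar with one-hot encoding, a minimal numpy sketch follows. It indexes an identity matrix with the integer labels; the function name one_hot_sketch is hypothetical, and this is not necessarily how kwarray implements it.

```python
import numpy as np


def one_hot_sketch(labels, num_classes):
    """Map integer class labels to rows of an identity matrix."""
    labels = np.asarray(labels)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


encoded = one_hot_sketch([0, 2, 1], num_classes=3)
# encoded has shape (3, 3); each row contains a single 1.0
```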

group_items is a fast way to group a numpy array by another numpy array. For fine grained control we also expose group_indices, which groups the indices of a numpy array, and apply_grouping, which partitions a numpy array by those indices.
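
The two-step grouping described above (first group the indices, then partition the items by those indices) can be sketched in numpy with a stable sort. The function names and the (keys, groupxs) return shape are assumptions made for this illustration, not kwarray's exact signatures.

```python
import numpy as np


def group_indices_sketch(groupids):
    """Return the unique group ids and, for each, the indices that belong to it."""
    groupids = np.asarray(groupids)
    if groupids.size == 0:
        return groupids, []
    # A stable sort keeps the original order of indices within each group.
    order = np.argsort(groupids, kind='stable')
    sorted_ids = groupids[order]
    # Positions where the sorted id changes mark the start of a new group.
    boundaries = np.flatnonzero(np.diff(sorted_ids)) + 1
    keys = sorted_ids[np.r_[0, boundaries]]
    groupxs = np.split(order, boundaries)
    return keys, groupxs


def apply_grouping_sketch(items, groupxs):
    """Partition items using index groups from group_indices_sketch."""
    items = np.asarray(items)
    return [items[idxs] for idxs in groupxs]


groupids = [2, 1, 2, 1, 3]
items = [10, 20, 30, 40, 50]
keys, groupxs = group_indices_sketch(groupids)
grouped = apply_grouping_sketch(items, groupxs)
# keys == [1, 2, 3]; grouped == [[20, 40], [10, 30], [50]]
```

Separating the two steps lets one set of index groups be reused to partition several parallel arrays.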

boolmask effectively inverts np.where: it converts an array of indices into a boolean mask that is True at those indices.
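
The relationship to np.where can be shown with a short numpy sketch. The function boolmask_sketch and its size parameter are hypothetical stand-ins used only for illustration.

```python
import numpy as np


def boolmask_sketch(indices, size):
    """Build a boolean mask of the given size that is True at the given indices."""
    mask = np.zeros(size, dtype=bool)
    mask[np.asarray(indices, dtype=int)] = True
    return mask


indices = np.array([1, 4, 5])
mask = boolmask_sketch(indices, size=7)
# np.where maps the mask back to the (sorted) indices it was built from.
recovered = np.where(mask)[0]
```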

Usefulness:

This is how often I have used each component of this library in my projects:

{
    'ensure_rng': 85,
    'ArrayAPI': 79,
    'DataFrameArray': 21,
    'boolmask': 17,
    'shuffle': 16,
    'argmaxima': 13,
    'group_indices': 12,
    'stats_dict': 9,
    'maxvalue_assignment': 7,
    'seed_global': 7,
    'iter_reduce_ufunc': 5,
    'isect_flags': 5,
    'group_items': 4,
    'one_hot_embedding': 4,
    'atleast_nd': 4,
    'mincost_assignment': 3,
    'standard_normal': 3,
    'arglexmax': 2,
    'DataFrameLight': 1,
    'uniform': 1,
}

Download files

Source Distribution

kwarray-0.5.16.tar.gz (63.7 kB)

Built Distributions

kwarray-0.5.16-py3-none-any.whl (66.6 kB, Python 3)

kwarray-0.5.16-py2.py3-none-any.whl (66.6 kB, Python 2 and Python 3)
