Python bindings and extensions for Velox

Project description

Velox is a C++ database acceleration library which provides reusable, extensible, and high-performance data processing components. These components can be reused to build compute engines focused on different analytical workloads, including batch, interactive, stream processing, and AI/ML. Velox was created by Facebook and it is currently developed in partnership with Intel, ByteDance, and Ahana.

In common usage scenarios, Velox takes a fully optimized query plan as input and performs the described computation. Because Velox does not provide a SQL parser, a dataframe layer, or a query optimizer, it is usually not meant to be used directly by end users; rather, it is mostly used by developers integrating and optimizing their compute engines.

Velox provides the following high-level components:

  • Type: a generic typing system that supports scalar, complex, and nested types, such as structs, maps, arrays, tensors, etc.
  • Vector: an Arrow-compatible columnar memory layout module, which provides multiple encodings, such as Flat, Dictionary, Constant, Sequence/RLE, and Bias, in addition to a lazy materialization pattern and support for out-of-order writes.
  • Expression Eval: a fully vectorized expression evaluation engine that allows expressions to be efficiently executed on top of Vector/Arrow encoded data.
  • Function Packages: sets of vectorized function implementations following the Presto and Spark semantics.
  • Operators: implementation of common data processing operators such as scans, projection, filtering, groupBy, orderBy, shuffle, hash join, unnest, and more.
  • I/O: a generic connector interface that allows different file formats (ORC/DWRF and Parquet) and storage adapters (S3, HDFS, local files) to be used.
  • Network Serializers: an interface where different wire protocols can be implemented, used for network communication, supporting PrestoPage and Spark's UnsafeRow.
  • Resource Management: a collection of primitives for handling computational resources, such as memory arenas and buffer management, tasks, drivers, and thread pools for CPU and thread execution, spilling, and caching.
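
To make the Vector encodings listed above more concrete, here is a minimal Python sketch of what Dictionary and Sequence/RLE encodings do to a column of values. It only illustrates the general technique; the function names are hypothetical and this is not the Velox or pyvelox API.

```python
from typing import Dict, List, Sequence, Tuple


def dictionary_encode(values: Sequence[str]) -> Tuple[List[str], List[int]]:
    """Store each distinct value once and replace the column by indices."""
    dictionary: List[str] = []
    positions: Dict[str, int] = {}
    indices: List[int] = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def dictionary_decode(dictionary: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Rebuild the flat column from a dictionary and its indices."""
    return [dictionary[i] for i in indices]


def run_length_encode(values: Sequence[str]) -> List[Tuple[str, int]]:
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs: List[Tuple[str, int]] = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


# Hypothetical sample column used only for illustration.
column = ["us", "us", "de", "us", "de", "de"]
dictionary, indices = dictionary_encode(column)
runs = run_length_encode(column)
```

A flat encoding would store all six strings; the dictionary form stores two strings plus six small integers, and the run-length form stores one pair per run of equal neighbours.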

Velox is extensible and allows developers to define their own engine-specific specializations, including:

  1. Custom types
  2. Simple and vectorized functions
  3. Aggregate functions
  4. Operators
  5. File formats
  6. Storage adapters
  7. Network serializers

Examples

Examples of extensibility and integration with different component APIs can be found here

Documentation

Developer guides detailing many aspects of the library, in addition to the list of available functions, can be found here.

Getting Started

We provide scripts to help developers set up and install Velox dependencies.

Get the Velox Source

git clone --recursive https://github.com/facebookincubator/velox.git
cd velox
# if you are updating an existing checkout
git submodule sync --recursive
git submodule update --init --recursive

Setting up on macOS

Once you have checked out Velox, on an Intel macOS machine you can set up and then build like so:

$ ./scripts/setup-macos.sh 
$ make

On an M1 macOS machine you can build like so:

$ CPU_TARGET="arm64" ./scripts/setup-macos.sh
$ CPU_TARGET="arm64" make

You can also produce Intel binaries on an M1 by using CPU_TARGET="sse" in the commands above.

Setting up on aarch64 Linux (Ubuntu 20.04 or later)

On an aarch64 based machine, you can build like so:

$ CPU_TARGET="aarch64" ./scripts/setup-ubuntu.sh
$ CPU_TARGET="aarch64" make

Setting up on x86_64 Linux (Ubuntu 20.04 or later)

Once you have checked out Velox, you can set up and build like so:

$ ./scripts/setup-ubuntu.sh 
$ make
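
The platform-specific commands above differ mainly in the CPU_TARGET value. The following Python sketch picks that value from the host platform, assuming only the three cases described in this README; the function name is hypothetical, and the setup scripts themselves remain the source of truth.

```python
import platform
from typing import Optional


def suggest_cpu_target(system: str, machine: str) -> Optional[str]:
    """Return the CPU_TARGET value this README uses for the given host.

    Returns None when the README does not set CPU_TARGET (Intel macOS and
    x86_64 Linux).
    """
    machine = machine.lower()
    if system == "Darwin" and machine == "arm64":
        return "arm64"      # M1 macOS
    if system == "Linux" and machine in ("aarch64", "arm64"):
        return "aarch64"    # aarch64 Linux
    return None


if __name__ == "__main__":
    target = suggest_cpu_target(platform.system(), platform.machine())
    if target is None:
        print("make")
    else:
        print(f'CPU_TARGET="{target}" make')
```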

Building Velox

Run make in the root directory to compile the sources. For development, use make debug to build a non-optimized debug version, or make release to build an optimized version. Use make unittest to build and run tests.

Note that:

  • Velox requires C++17, so the minimum supported compilers are GCC 5.0 and Clang 5.0.
  • Velox requires the CPU to support instruction sets:
    • bmi
    • bmi2
    • f16c
  • Velox tries to use the following (or equivalent) instruction sets where available:
    • On Intel CPUs
      • avx
      • avx2
      • sse
    • On ARM
      • Neon
      • Neon64
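
Before building on x86-64 Linux, it can help to check that the CPU reports the required instruction sets. The sketch below parses the `flags` line of `/proc/cpuinfo`; it assumes a Linux host and assumes that the kernel lists BMI under the name `bmi1`, so either spelling is accepted. The function names are hypothetical.

```python
import os
from typing import Dict, List, Set

# Required instruction sets from the list above, each mapped to the names
# it may appear under in /proc/cpuinfo (the "bmi1" alias is an assumption
# about how Linux spells the flag).
REQUIRED: Dict[str, Set[str]] = {
    "bmi": {"bmi", "bmi1"},
    "bmi2": {"bmi2"},
    "f16c": {"f16c"},
}


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flag set from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_required(flags: Set[str]) -> List[str]:
    """Return the required instruction sets that are not in flags."""
    return sorted(name for name, aliases in REQUIRED.items() if not aliases & flags)


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo", encoding="utf-8") as handle:
        missing = missing_required(parse_cpu_flags(handle.read()))
    print("missing:", ", ".join(missing) if missing else "none")
```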

Building Velox with docker-compose

If you don't want to install the system dependencies required to build Velox, you can also build and run the Velox tests in a Docker container using docker-compose. Use the following commands:

$ docker-compose build ubuntu-cpp
$ docker-compose run --rm ubuntu-cpp

To increase or decrease the number of threads used when building Velox, override the NUM_THREADS environment variable:

$ docker-compose run -e NUM_THREADS=<NUM_THREADS_TO_USE> --rm ubuntu-cpp

Contributing

Check our contributing guide to learn about how to contribute to the project.

Community

The main communication channel with the Velox OSS community is the Velox-OSS Slack workspace. Please reach out to velox@fb.com to get access to the Velox Slack channel.

License

Velox is licensed under the Apache 2.0 License. A copy of the license can be found here.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyvelox-0.0.1a7.tar.gz (10.2 MB)

Uploaded Source

Built Distributions

pyvelox-0.0.1a7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.6 MB)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

pyvelox-0.0.1a7-cp311-cp311-macosx_10_15_x86_64.whl (29.6 MB)

Uploaded CPython 3.11 macOS 10.15+ x86-64

pyvelox-0.0.1a7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.6 MB)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

pyvelox-0.0.1a7-cp310-cp310-macosx_10_15_x86_64.whl (29.6 MB)

Uploaded CPython 3.10 macOS 10.15+ x86-64

pyvelox-0.0.1a7-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.7 MB)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

pyvelox-0.0.1a7-cp39-cp39-macosx_10_15_x86_64.whl (29.6 MB)

Uploaded CPython 3.9 macOS 10.15+ x86-64

pyvelox-0.0.1a7-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.6 MB)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

pyvelox-0.0.1a7-cp38-cp38-macosx_10_15_x86_64.whl (29.6 MB)

Uploaded CPython 3.8 macOS 10.15+ x86-64

pyvelox-0.0.1a7-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (26.7 MB)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

pyvelox-0.0.1a7-cp37-cp37m-macosx_10_15_x86_64.whl (29.6 MB)

Uploaded CPython 3.7m macOS 10.15+ x86-64
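
The wheel file names above encode the Python version, ABI, and platform they were built for. As a rough illustration, the following Python sketch splits such a name into its tags, assuming the standard `name-version-python-abi-platform.whl` layout with no optional build tag; the `WheelTags` type and function name are hypothetical.

```python
from typing import List, NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: List[str]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected number of components in {filename!r}")
    name, version, python_tag, abi_tag, platform = parts
    # A compressed platform tag lists several compatible platforms joined by dots.
    return WheelTags(name, version, python_tag, abi_tag, platform.split("."))


tags = parse_wheel_filename(
    "pyvelox-0.0.1a7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
```

Here `tags.python_tag` is `cp311`, which corresponds to the "CPython 3.11" label shown next to that file.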

File details

Details for the file pyvelox-0.0.1a7.tar.gz.

File metadata

  • Download URL: pyvelox-0.0.1a7.tar.gz
  • Upload date:
  • Size: 10.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.7.16

File hashes

Hashes for pyvelox-0.0.1a7.tar.gz
Algorithm Hash digest
SHA256 c6022deeff08011a83fb03fa984bb1a061318754095f0377114e2a9163390d57
MD5 42fad27083e8def48743f4a6aab3256a
BLAKE2b-256 0788d9599e7a03b6a0a9acb5659800684be491d30744314b6d949b0205651683

See more details on using hashes here.
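
A downloaded file can be checked against the SHA256 digest listed for it. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that large wheels are not loaded into memory at once; the helper names and the local file path in the usage note are assumptions.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, `matches_expected("pyvelox-0.0.1a7.tar.gz", "c6022deeff08011a83fb03fa984bb1a061318754095f0377114e2a9163390d57")` should return True for an intact copy of the source distribution saved under that name in the current directory.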

File details

Details for the file pyvelox-0.0.1a7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 511bf0381d4e8850ef1cb543bdabda88d1cb773c4cdc19856fc854bf88ec7028
MD5 f9cdcdffe81daf54e0f994eb5e6b4d88
BLAKE2b-256 d533481729a3bfec197b817c055b3cef04feb8ff0db18ac8df58b3e73c8c3719

File details

Details for the file pyvelox-0.0.1a7-cp311-cp311-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp311-cp311-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 b7cb03022fb9a31e753bf84504b4039db10c123f972b432b7a3f47694c4caf1c
MD5 2875c4a021d4b89cbc5435efcedaea7e
BLAKE2b-256 f9938b505ba7aef4b773fed5f9d0e5cfcfe0658cedd1e3015ee6592d0f736cfc

File details

Details for the file pyvelox-0.0.1a7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b9813406a5c32faeebfe254926a945e9cd85fee9cf9aea1aa19db6f9a187cebe
MD5 e4afa22fbf0eedd04e516b73094133e1
BLAKE2b-256 17d1042dff589ce4b9ce4505b1c3aa2f1d287f93259ac1d10eef9454731773fe

File details

Details for the file pyvelox-0.0.1a7-cp310-cp310-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp310-cp310-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 f9685ec92fc0ded6026a492a468fe265824eb2aa3ec7d4225e5af90d0ed649e5
MD5 3ac99c08d25546c7bb5fdfd7ae869494
BLAKE2b-256 a8cfe3ded985194af5d7e2d8ac4b4e57aee5f879f4354c3506ae8dd865da3901

File details

Details for the file pyvelox-0.0.1a7-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a79530782fad9b44fe1fff59aafd9af65c1fb337d1eeb81f5ed19dda01dedf6a
MD5 3931093d2ed2a790e0e38771b2968360
BLAKE2b-256 67fc2e036ab2a576d725b0a15c4bf6ee951895d981c81c2c540702f559c844aa

File details

Details for the file pyvelox-0.0.1a7-cp39-cp39-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp39-cp39-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 762b285e12a42ec528ef52dc492a0233eeda4be9ae6eebe2f0e8894e679d3bce
MD5 a593f6fff4560a5f945840a82dbcfcd5
BLAKE2b-256 d368f7a09231c38e36babb958815bf23cc1ece6265d0eeeaf37c73778b74660e

File details

Details for the file pyvelox-0.0.1a7-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c5f2d77fe645c8bcb2b2bff2fca07bf81d322f4d2a11835b672ccc478f2416aa
MD5 24ce2fee80a00dc6d5c74d121ecfcd4a
BLAKE2b-256 f8f410db50f7f042e6ce621907a2b764054f11c42fb52d6cb50a70a077089217

File details

Details for the file pyvelox-0.0.1a7-cp38-cp38-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp38-cp38-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 8dfef6c93c529559f7acac24705f16aa30c2443c39cadbc745abcf8cf7b5380e
MD5 65578d5f64238e600493a0c18603db34
BLAKE2b-256 519ea83aa5022a5d7aa00940a2bf8dfc39bbc9e22770302383e0f291e74d3e06

File details

Details for the file pyvelox-0.0.1a7-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 239e9837a7e7312be56dd4c7c24a64774a6370fca4eb1a267927f667913fe119
MD5 7321697bf85296c8272a4114f63830d7
BLAKE2b-256 10d634bf29d912735bedf583297d8bc6b5be7f5c057e5d0961f8b98e07eb28e9

File details

Details for the file pyvelox-0.0.1a7-cp37-cp37m-macosx_10_15_x86_64.whl.

File hashes

Hashes for pyvelox-0.0.1a7-cp37-cp37m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 c73ebd0e447f3e2624285dbc562a6016697faf82fb9fae15101794a3461a09fc
MD5 3430695f9e5489e00eebd70022b3d623
BLAKE2b-256 91feb5974dcacecd0c0d40d6e9b4f96f5b52b30fb59d429d59b53c6ba723aac9
