Skip to main content

Accera - An optimizing cross-compiler for compute-intensive code

Project description

Accera logo

PyPI package version Python versions MIT License

Problem at Hand

Writing highly optimized compute-intensive code in a traditional programming language is strenuous and time-consuming. Not only does it require advanced engineering skills such as fluency in Assembly language, but a deep understanding of computer architecture is also indispensable. Manual optimization of even the simplest numerical algorithms demands a significant engineering effort. Needless to say, a highly optimized numerical code is often prone to bugs, lacks readability, and offers little to no usability. Code maintenance becomes a nightmare resulting in the reimplementation of the same logic every time an architecture level change is introduced.

Accera: An Optimized Solution

Accera is a compiler that enables you to experiment with loop optimizations without hand-writing Assembly code. With Accera, these problems and impediments can be addressed in an optimized way. It is available as a Python library and supports cross-compiling to a wide range of processor targets.

Accera has THREE primary goals:

  • Performance: To guarantee the fastest implementation for any compute-intensive algorithm.
  • Readability: To ensure effective implementation of algorithms without sacrificing the readability of code.
  • Writability: To provide a user-friendly programming model, designed for agility and maintainability.

Install

To install for Linux, macOS, or Windows (requires Python 3.7-3.10):

pip install accera

See the Install Instructions for more details on installing pre-built Python 3 packages and how to build Accera from the source.

Quickstart

In this example, we will:

  • Implement matrix multiplication with a ReLU activation (matmul + ReLU), commonly used in machine learning algorithms.
  • Generate two implementations: a naive algorithm and loop-based transformations.
  • Compare the execution time of both implementations.

Run in your browser

Binder

No installation is required. This will launch a Jupyter notebook with the quickstart example running in the cloud.

Run on your machine

  1. Create a Python 3 script called quickstart.py:

    import accera as acc
    
    # define placeholder inputs/output
    A = acc.Array(role=acc.Role.INPUT, shape=(512, 512))
    B = acc.Array(role=acc.Role.INPUT, shape=(512, 512))
    C = acc.Array(role=acc.Role.INPUT_OUTPUT, shape=(512, 512))
    
    # implement the logic for matmul and relu
    matmul = acc.Nest(shape=(512, 512, 512))
    i1, j1, k1 = matmul.get_indices()
    @matmul.iteration_logic
    def _():
        C[i1, j1] += A[i1, k1] * B[k1, j1]
    
    relu = acc.Nest(shape=(512, 512))
    i2, j2 = relu.get_indices()
    @relu.iteration_logic
    def _():
        C[i2, j2] = acc.max(C[i2, j2], 0.0)
    
    package = acc.Package()
    
    # fuse the i and j indices of matmul and relu, add to the package
    schedule = acc.fuse(matmul.create_schedule(), relu.create_schedule(), partial=2)
    package.add(schedule, args=(A, B, C), base_name="matmul_relu_fusion_naive")
    
    # transform the schedule, add to the package
    i, j, f, k = schedule.get_indices()
    ii, jj = schedule.tile({
        i: 16,
        j: 16
    }) # loop tiling
    schedule.reorder(j, i, f, k, jj, ii) # loop reordering
    plan = schedule.create_plan()
    plan.unroll(ii) # loop unrolling
    package.add(plan, args=(A, B, C), base_name="matmul_relu_fusion_transformed")
    
    # build a dynamically-linked package (a .dll or .so) that exports both functions
    print(package.build(name="hello_accera", format=acc.Package.Format.HAT_DYNAMIC))
    
  2. Ensure that you have a compiler in your PATH:

    • Windows: Install Microsoft Visual Studio and run vcvars64.bat to setup the command prompt.
    • Linux/macOS: Install gcc

    Don't have a compiler handy? We recommend trying Accera in your browser instead Binder

  3. Install Accera:

    pip install accera
    
  4. Generate the library that implements two versions of matmul + ReLU:

    python quickstart.py
    
  5. To consume and compare the library functions, create a file called benchmark.py in the same location:

    import hatlib as hat
    import numpy as np
    
    # load the package
    _, functions = hat.load("hello_accera.hat")
    
    # call one of the functions with test inputs
    A_test = np.random.rand(512, 512).astype(np.float32)
    B_test = np.random.rand(512, 512).astype(np.float32)
    C_test = np.zeros((512, 512)).astype(np.float32)
    C_numpy = np.maximum(C_test + A_test @ B_test, 0.0)
    
    matmul_relu = functions["matmul_relu_fusion_transformed"]
    matmul_relu(A_test, B_test, C_test)
    
    # check correctness
    np.testing.assert_allclose(C_test, C_numpy, atol=1e-3)
    
    # benchmark all functions
    hat.run_benchmark("hello_accera.hat", batch_size=5, min_time_in_sec=5)
    
  6. Run the benchmark to get the execution time results:

    python benchmark.py
    

Next Steps

The Manual is the best introductory resource for the Accera Python programming model.

In particular, the schedule transformations describe how you can experiment with different loop transformations with just a few lines of Python code.

Finally, the .hat format is just a C header file containing the metadata. Learn more about the HAT format and benchmarking.

How it works

In a nutshell, Accera takes the Python code that defines the loop schedule and algorithm while converting it into MLIR intermediate representation (IR). Accera's compiler then takes this IR through a series of MLIR pipelines to perform transformations. The result is a binary library with a C header file. The library implements the algorithms that are defined in Python and it is compatible with the target.

To peek into the stages of IR transformation that Accera does, try replacing format=acc.Package.Format.HAT_DYNAMIC with format=acc.Package.Format.MLIR_DYNAMIC in quickstart.py, re-run the script, and search the _tmp subfolder for the intermediate *.mlir files. We plan to document these IR constructs in the future.

Documentation

Get familiar with Accera's concepts and Python constructs in the Documentation page.

Tutorials

Step-by-step examples are available on the Tutorials page. We're working on adding more complementary examples and tutorials.

Contributions

Accera is a research platform-in-progress that can certainly benefit from your contributions. We would love your feedback, recommendations, and feature requests. Not to mention that we are excited to answer your questions. Let’s collaborate! Please file a Github issue or send us a pull request. Please review the Microsoft Code of Conduct to learn more.

Credits

Accera is built using several open source libraries, including: LLVM, pybind11, toml++, tomlkit, vcpkg, pyyaml, and HAT. For testing, we used numpy and catch2.

License

This project is released under the MIT License.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

accera-1.2.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.0 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

accera-1.2.26-cp310-cp310-macosx_11_0_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.10 macOS 11.0+ x86-64

accera-1.2.26-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.0 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

accera-1.2.26-cp39-cp39-macosx_11_0_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.9 macOS 11.0+ x86-64

accera-1.2.26-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.0 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

accera-1.2.26-cp38-cp38-macosx_10_15_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.8 macOS 10.15+ x86-64

accera-1.2.26-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (31.0 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

accera-1.2.26-cp37-cp37m-macosx_10_15_x86_64.whl (33.6 MB view details)

Uploaded CPython 3.7m macOS 10.15+ x86-64

File details

Details for the file accera-1.2.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 983db22af7c2f5b24e439ef3211af9e5353bed39b616627948b3c231501cd301
MD5 30e54fa9c894deed4ba383420e0fb62c
BLAKE2b-256 7b67af020d64f6db39cbbcf710242e45c5c0f3087b52c21722b34db859e064a6

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp310-cp310-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp310-cp310-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 2a06c3268699b0cd8caa3471d1afd76f0d69ff81596b8e08ef2f67d2432c78c9
MD5 7140e98d28c1be57a159d464d4136a06
BLAKE2b-256 78716827fe55832f23b322ae583db5f434c23494c05468edec0e306e0d0f0c9a

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 db1d74cfa4e6f78943f230c85f75687e93e385768148defe2ff928c266f524fc
MD5 570e17a53b4801d3e48b1d2f1d5b894c
BLAKE2b-256 113e7d3dfb487ff180c19b33c9bf34a0f46ba8bbc65c5d2b5356748c27b8c0c1

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp39-cp39-macosx_11_0_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp39-cp39-macosx_11_0_x86_64.whl
Algorithm Hash digest
SHA256 10d4e665f77b885bd66768265495a46ad0d98437cad31f8316acb697d3af70a0
MD5 c788bc936f73bc7941027a685a4ee78b
BLAKE2b-256 0b7e29603d1b713d00c66ba8426547a6bd9275f999e3f4c53699c4066714e91f

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ade978ccd6ed8fc0a7dd4b5d0d9083fa6d9f7f721b108d864cd794f83b7bd5ab
MD5 fc0498a87af6f1eb260d33d3483dc4e3
BLAKE2b-256 22b71564d5b402314dc7e91fef3287cc05d5cefddd21cd7e678197b9636b2c1a

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp38-cp38-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp38-cp38-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 eea3e14e7baf2c45a82bd91bf0710fb475873b2a18bb5630bde02d13d7176089
MD5 df717a52e0ad19733ce40e81c6844018
BLAKE2b-256 fc8eb8a35ac751e5a6d671b1e626d3b74cf956c5294e3e9dd1c26d7f26dcf678

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 b6726a27081e01b388804ebeb62a78f924744b0d7f8a9b1e64bb0dd3a2b00cb2
MD5 05137ac9954d18e1afa2b23044e82a49
BLAKE2b-256 32570c10473ef24400e9c504d9534d2d8560f90740a9d7a1a280679d3b073523

See more details on using hashes here.

File details

Details for the file accera-1.2.26-cp37-cp37m-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for accera-1.2.26-cp37-cp37m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 ec924787081c29f71a7cbcce0611afd2f6d0aa372b0f9ca58c12baae3b8365d5
MD5 b29fbac4b7cbb0c475420b7f6f8df6c6
BLAKE2b-256 e8eb0c7e6535de7c81cf73f8f0cfc7f2b868ddc6659715704a6a2fa1c482a0d2

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page