
Torch-TensorRT is a package which allows users to automatically compile PyTorch and TorchScript modules to TensorRT while remaining in PyTorch

Project description

Torch-TensorRT

Easily achieve the best inference performance for any PyTorch model on the NVIDIA platform.



Torch-TensorRT brings the power of TensorRT to PyTorch. Cut inference latency by up to 5x compared to eager execution with just one line of code.

Installation

Stable versions of Torch-TensorRT are published on PyPI

pip install torch-tensorrt

Nightly versions of Torch-TensorRT are published on the PyTorch package index

pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu124

Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container which has all dependencies with the proper versions and example notebooks included.

For more advanced installation methods, please see here

Quickstart

Option 1: torch.compile

You can use Torch-TensorRT anywhere you use torch.compile:

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
x = torch.randn((1, 3, 224, 224)).cuda() # define what the inputs to the model will look like

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # compiled on first run

optimized_model(x) # this will be fast!
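Backend-specific settings can be passed through torch.compile's `options` dictionary. A sketch only: the option names below ("min_block_size", "truncate_long_and_double") are assumptions drawn from Torch-TensorRT's Dynamo settings, so verify them against the documentation for your installed version:

```python
# Assumed option names; check the Torch-TensorRT docs for your version.
backend_options = {
    "min_block_size": 2,               # assumed: smallest subgraph worth offloading to TensorRT
    "truncate_long_and_double": True,  # assumed: downcast 64-bit inputs TensorRT cannot handle
}

# On a CUDA machine this would be wired up as (requires torch and a GPU):
# optimized_model = torch.compile(model, backend="tensorrt", options=backend_options)
```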

Option 2: Export

If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).

Step 1: Optimize + serialize

import torch
import torch_tensorrt

model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here

trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs) # PyTorch only supports Python runtime for an ExportedProgram. For C++ deployment, use a TorchScript file
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)

Step 2: Deploy

Deployment in PyTorch:

import torch
import torch_tensorrt

inputs = [torch.randn((1, 3, 224, 224)).cuda()] # your inputs go here

# You can run this in a new python session!
model = torch.export.load("trt.ep").module()
# model = torch_tensorrt.load("trt.ep").module() # this also works
model(*inputs)

Deployment in C++:

#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"

auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});

Further resources

Platform Support

Platform              Support
Linux AMD64 / GPU     Supported
Windows / GPU         Supported (Dynamo only)
Linux aarch64 / GPU   Native compilation supported on JetPack-4.4+ (use v1.0.0 for the time being)
Linux aarch64 / DLA   Native compilation supported on JetPack-4.4+ (use v1.0.0 for the time being)
Linux ppc64le / GPU   Not supported

Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.

Dependencies

The following dependencies were used to verify the test cases. Torch-TensorRT may work with other versions, but the tests are not guaranteed to pass.

  • Bazel 6.3.2
  • Libtorch 2.5.0.dev (latest nightly) (built with CUDA 12.4)
  • CUDA 12.4
  • TensorRT 10.3.0.26
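Since other version combinations are untested, it can help to flag drift from the verified set up front rather than failing deep inside a build. A minimal standard-library sketch (the pinned versions in TESTED are illustrative, taken from the list above):

```python
# Compare installed package versions against a tested baseline.
from importlib import metadata

TESTED = {"tensorrt": "10.3.0.26", "torch": "2.5.0"}


def version_mismatches(tested=TESTED):
    """Return {package: (tested, installed)} for every package that differs.

    An installed value of None means the package is not present at all.
    """
    out = {}
    for pkg, want in tested.items():
        try:
            have = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            have = None
        if have != want:
            out[pkg] = (want, have)
    return out


print(version_mismatches())
```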

Deprecation Policy

Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:

  • Deprecation notices are communicated in the Release Notes.
  • Deprecated API functions carry a statement in the source documenting when they were deprecated.
  • Deprecated methods and classes issue deprecation warnings at runtime, if they are used.
  • Torch-TensorRT provides a 6-month migration period after the deprecation, during which APIs and tools continue to work.
  • After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
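Because deprecated methods warn at runtime, a test suite or CI job can record those warnings during the migration period instead of letting them scroll by. A standard-library sketch (calls_deprecated_api is a stand-in, not a Torch-TensorRT function):

```python
# Capture DeprecationWarnings so deprecated calls are surfaced explicitly.
import warnings


def calls_deprecated_api():
    # Stand-in for a call into a deprecated API.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    calls_deprecated_api()

deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(f"{len(deprecations)} deprecation warning(s) hit")
```

Running a suite with `python -W error::DeprecationWarning` turns the same warnings into hard failures, which is one way to enforce migration before the 6-month window closes.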

Contributing

Take a look at the CONTRIBUTING.md

License

The Torch-TensorRT license can be found in the LICENSE file. It is licensed under a BSD-style license.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

  • torch_tensorrt-2.5.0-cp312-cp312-win_amd64.whl (2.9 MB): CPython 3.12, Windows x86-64
  • torch_tensorrt-2.5.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl (3.6 MB): CPython 3.12, manylinux glibc 2.17+ / 2.34+, x86-64
  • torch_tensorrt-2.5.0-cp311-cp311-win_amd64.whl (2.9 MB): CPython 3.11, Windows x86-64
  • torch_tensorrt-2.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl (3.6 MB): CPython 3.11, manylinux glibc 2.17+ / 2.34+, x86-64
  • torch_tensorrt-2.5.0-cp310-cp310-win_amd64.whl (2.9 MB): CPython 3.10, Windows x86-64
  • torch_tensorrt-2.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl (3.6 MB): CPython 3.10, manylinux glibc 2.17+ / 2.34+, x86-64
  • torch_tensorrt-2.5.0-cp39-cp39-win_amd64.whl (2.9 MB): CPython 3.9, Windows x86-64
  • torch_tensorrt-2.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl (3.6 MB): CPython 3.9, manylinux glibc 2.17+ / 2.34+, x86-64

File details

torch_tensorrt-2.5.0-cp312-cp312-win_amd64.whl
  SHA256       b9e7ee1510e644106e9fb6f194a227a1394d8646fea049e5dba572f99dc38b24
  MD5          923ef5334e0be429d72f836b2e25adb5
  BLAKE2b-256  e070ed10a8ce0f30bc938a53fcb0066fe2fb34dac33c6f240c24d3d9276444c1

torch_tensorrt-2.5.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl
  SHA256       d317a244284f5fe2f57b78ec895ff6da0b23ff7b208a5f3001a61a77f3b9ec35
  MD5          027b179949a0bfb2afdaa8d76c4340ee
  BLAKE2b-256  3b63f2d800a2353c1b03d6000ff686f30dae4fa746fdf8dd85b6d4245e5593ff

torch_tensorrt-2.5.0-cp311-cp311-win_amd64.whl
  SHA256       a893b933d6432c8a07ce0bf9b394c5b200ba3d620250bef4d89b0a734189cd84
  MD5          ccfdaab726a319eebc18a056c3855a87
  BLAKE2b-256  10762345bee0199424846d5708254ed1c71293f4825b15c6b824d7fae32aac55

torch_tensorrt-2.5.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl
  SHA256       1b408fe06ba0e855cb6a142a2ac27815ab45f43c05d3e3880c8602429cddb27f
  MD5          6408b1f067b61f75ad9f9f599a9420c1
  BLAKE2b-256  87c611f4e6300bd135dc7f8160670f6b2b10618e5112c6ee368dfcc4f7dc5cfc

torch_tensorrt-2.5.0-cp310-cp310-win_amd64.whl
  SHA256       fca7984736394cfc4685460cdde5a3b0e697bd23a7aeba91e66daa0619c91170
  MD5          926c43f170e9afcbd4237c1b7fcee45a
  BLAKE2b-256  5f2b4662215a1b7ac311dea83ebd2dd7accb72618231d4600ea172e370c1766d

torch_tensorrt-2.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl
  SHA256       3b059b1e024e1ae0f37ab2da32f15077e53a18294368f74c8d92ebc24fb1c5f3
  MD5          c38228b27343f568bf4318b266461bca
  BLAKE2b-256  13490bd42291b2bd6bdc64e51299e731be09f3e8b8bbee939d943240c7b63dca

torch_tensorrt-2.5.0-cp39-cp39-win_amd64.whl
  SHA256       b0a2ed43fb047ed7294c3213c52367676d6581483f6fae045d744ce2549d7781
  MD5          02d67545bf99f2b995369eaad6b0703b
  BLAKE2b-256  4eea2dba63cf0f929abe83ea915f34ba973a9029fe9b26015e50b17670c79b0a

torch_tensorrt-2.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_34_x86_64.whl
  SHA256       6076cac847127bfea3cce3bb50aa7a7465510c0bf3eb085bd49dbfde1998471a
  MD5          7d2fdc7fc385ae5ebb8ec5838dde75ad
  BLAKE2b-256  0ac3dc2c0580d4ee49714e1dca7199dba065168759d0e375838d9f31b4cc5855

See more details on using hashes here.
