Torch Simple Timing
A simple yet versatile package to time CPU/GPU/Multi-GPU ops.
- "I want to time operations once"
- That's what a
Clock
is for
- That's what a
- "I want to time the same operations multiple times"
- That's what a
Timer
is for
- That's what a
In simple terms:
- A Clock is an object (and context manager) that computes the elapsed time between its start() (or __enter__) and stop() (or __exit__)
- A Timer will internally manage clocks so that you can focus on readability and not data structures
Installation
pip install torch_simple_timing
How to use
A Clock
from torch_simple_timing import Clock
import torch

t = torch.rand(2000, 2000)
gpu = torch.cuda.is_available()

# As a context manager
with Clock(gpu=gpu) as context_clock:
    torch.inverse(t @ t.T)

# Or with explicit start() / stop()
clock = Clock(gpu=gpu).start()
torch.inverse(t @ t.T)
clock.stop()

print(context_clock.duration)  # 0.29688501358032227
print(clock.duration)          # 0.292896032333374
More examples, including how to easily share data structures using a store, can be found in the documentation.
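As a rough sketch of the store idea (the name= and store= keyword arguments are assumed here, so check the documentation for the exact signature), several clocks can write their durations into one shared dict:

from torch_simple_timing import Clock
import torch

store = {}
t = torch.rand(1000, 1000)

for _ in range(3):
    # Assumption: each stopped clock appends its duration to store["inverse"]
    with Clock(name="inverse", store=store, gpu=torch.cuda.is_available()):
        torch.inverse(t @ t.T)

print(store.get("inverse"))  # expected: one duration per iteration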
A Timer
from torch_simple_timing import Timer
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

X = torch.rand(5000, 5000, device=device)
Y = torch.rand(5000, 100, device=device)
model = torch.nn.Linear(5000, 100).to(device)
optimizer = torch.optim.Adam(model.parameters())

gpu = device.type == "cuda"
timer = Timer(gpu=gpu)

for epoch in range(10):
    timer.clock("epoch").start()
    for b in range(50):
        x = X[b * 100 : (b + 1) * 100]
        y = Y[b * 100 : (b + 1) * 100]
        optimizer.zero_grad()

        with timer.clock("forward", ignore=epoch > 0):
            p = model(x)

        loss = torch.nn.functional.cross_entropy(p, y)

        with timer.clock("backward", ignore=epoch > 0):
            loss.backward()

        optimizer.step()
    timer.clock("epoch").stop()

stats = timer.stats()
# use stats for display and/or logging
# wandb.summary.update(stats)
print(timer.display(stats=stats, precision=5))
epoch : 0.25064 ± 0.02728 (n=10)
forward : 0.00226 ± 0.00526 (n=50)
backward : 0.00209 ± 0.00387 (n=50)
A decorator
You can also use a decorator to time functions without much overhead in your code:
from torch_simple_timing import timeit, get_global_timer, reset_global_timer
import torch

# Use the function name as the timer name
@timeit(gpu=True)
def train():
    x = torch.rand(1000, 1000, device="cuda" if torch.cuda.is_available() else "cpu")
    return torch.inverse(x @ x)

# Use a custom name
@timeit("test")
def test_cpu():
    return torch.inverse(torch.rand(1000, 1000) @ torch.rand(1000, 1000))

if __name__ == "__main__":
    for _ in range((epochs := 10)):
        train()
    test_cpu()

    timer = get_global_timer()
    print(timer.display())
    reset_global_timer()
Prints:
train : 0.045 ± 0.007 (n=10)
test : 0.046 (n= 1)
By default, the @timeit decorator takes at least a name, uses gpu=False, and relies on the global timer (torch_simple_timing.TIMER). You can pass your own timer with @timeit(name, timer=timer).
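For instance, a minimal sketch of routing timings to your own Timer instead of the global one (illustrative only, with hypothetical function and variable names):

from torch_simple_timing import Timer, timeit
import torch

my_timer = Timer(gpu=False)

# Record this function's durations in my_timer rather than the global timer
@timeit("matmul", timer=my_timer)
def matmul():
    return torch.rand(500, 500) @ torch.rand(500, 500)

for _ in range(5):
    matmul()

print(my_timer.display())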
See more in the docs.