MLPerf Inference LoadGen python bindings

Overview

Introduction

  • The LoadGen is a reusable module that efficiently and fairly measures the performance of inference systems.
  • It generates traffic for scenarios as formulated by a diverse set of experts in the MLCommons working group.
  • The scenarios emulate the workloads seen in mobile devices, autonomous vehicles, robotics, and cloud-based setups.
  • Although the LoadGen is not model- or dataset-aware, its strength lies in being reusable alongside logic that is.

Integration Example and Flow

The following steps describe how the LoadGen can be integrated into an inference system, resembling how some of the MLPerf reference models are implemented.

  1. Benchmark knows the model, dataset, and preprocessing.
  2. Benchmark hands dataset sample IDs to LoadGen.
  3. LoadGen starts generating queries of sample IDs.
  4. Benchmark creates requests to backend.
  5. Result is post-processed and forwarded to LoadGen.
  6. LoadGen outputs logs for analysis.
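The six steps above can be sketched in a few lines of plain Python. This is only an illustration of the control flow, not the LoadGen API: `ToyLoadGen`, `Benchmark`, `QuerySample`, and `run_demo` are hypothetical names invented for this sketch, and the "model" is a trivial function.

```python
from dataclasses import dataclass


@dataclass
class QuerySample:
    id: int     # response ID chosen by the load generator
    index: int  # index of the sample in the dataset


class ToyLoadGen:
    """Hypothetical stand-in for the LoadGen: issues queries, records completions."""

    def __init__(self, sample_indices):
        self.sample_indices = list(sample_indices)  # step 2: sample IDs handed over
        self.log = []                               # step 6: record kept for analysis

    def start_test(self, issue_query):
        # Step 3: generate one query containing every sample.
        samples = [QuerySample(id=1000 + n, index=idx)
                   for n, idx in enumerate(self.sample_indices)]
        issue_query(samples, self.query_samples_complete)

    def query_samples_complete(self, responses):
        self.log.extend(responses)


class Benchmark:
    """Step 1: the benchmark owns the model, dataset, and pre/post-processing."""

    def __init__(self, dataset, model, postprocess):
        self.dataset = dataset
        self.model = model
        self.postprocess = postprocess

    def issue_query(self, samples, complete):
        responses = []
        for sample in samples:
            raw = self.model(self.dataset[sample.index])            # step 4: backend request
            responses.append((sample.id, self.postprocess(raw)))    # step 5: post-process
        complete(responses)                                         # step 5: forward result


def run_demo():
    dataset = {0: 1.0, 1: 2.0, 2: 3.0}
    bench = Benchmark(dataset, model=lambda x: x * 2, postprocess=int)
    loadgen = ToyLoadGen(dataset.keys())
    loadgen.start_test(bench.issue_query)
    return loadgen.log


print(run_demo())
```

The model and dataset are injected into `Benchmark`, so the load generator never needs to know about either of them.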

Scope of the LoadGen's Responsibilities

In Scope

  • Provide a reusable C++ library with Python bindings.
  • Implement the traffic patterns of the MLPerf Inference scenarios and modes.
  • Record all traffic generated and received for later analysis and verification.
  • Summarize the results and whether performance constraints were met.
  • Target high-performance systems with efficient multi-thread friendly logging utilities.
  • Generate trust via a shared, well-tested, and community-hardened code base.

Out of Scope

The LoadGen is:

  • NOT aware of the ML model it is running against.
  • NOT aware of the data formats of the model's inputs and outputs.
  • NOT aware of how to score the accuracy of a model's outputs.
  • NOT aware of MLPerf rules regarding scenario-specific constraints.

Limiting the scope of the LoadGen in this way keeps it reusable across different models and datasets without modification. Using composition and dependency injection, the user can define their own model, datasets, and metrics.

Additionally, not hardcoding MLPerf-specific test constraints, like test duration and performance targets, allows users to use the LoadGen unmodified for custom testing and continuous integration purposes.

Submission Considerations

Upstream all local modifications

  • As a rule, no local modifications to the LoadGen's C++ library are allowed for submission.
  • Please upstream early and often to keep the playing field level.

Choose your TestSettings carefully!

  • Since the LoadGen is oblivious to the model, it can't enforce the MLPerf requirements for submission, such as target percentiles and latencies.
  • For verification, the values in TestSettings are logged.
  • To help make sure your settings are spec compliant, use TestSettings::FromConfig in conjunction with the relevant config file provided with the reference models.
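To give a feel for what such a config file contains, here is a small sketch that parses lines of the form `model.scenario.key = value`, where `*` stands for "any model". This is not the LoadGen's own parser: `parse_config` and `lookup` are hypothetical helpers, the values are made up, and the precedence shown (exact model first, then the wildcard) is an assumption of this sketch. For real submissions, rely on TestSettings::FromConfig.

```python
EXAMPLE_CONFIG = """
# Illustrative values only; not taken from any official config file.
*.Server.target_latency = 10
resnet50.Server.target_latency = 15
*.Offline.min_duration = 600000
"""


def parse_config(text):
    """Parse 'model.scenario.key = value' lines into a dictionary."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        model, scenario, name = key.split(".", 2)
        settings[(model, scenario, name)] = value
    return settings


def lookup(settings, model, scenario, name):
    """Return the value for an exact model match, else the '*' wildcard entry."""
    for candidate in (model, "*"):
        if (candidate, scenario, name) in settings:
            return settings[(candidate, scenario, name)]
    return None


settings = parse_config(EXAMPLE_CONFIG)
print(lookup(settings, "resnet50", "Server", "target_latency"))
```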

Responsibilities of a LoadGen User

Implement the Interfaces

  • Implement the SystemUnderTest and QuerySampleLibrary interfaces and pass them to the StartTest function.
  • Call QuerySamplesComplete for every sample received by SystemUnderTest::IssueQuery.
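One common way to satisfy the "complete every sample" obligation is to have the query-issuing call enqueue work and let worker threads report completions. The sketch below shows that pattern in plain Python. `QueueSUT`, `drain`, and the `complete` callback are hypothetical names for this illustration; they are not LoadGen functions.

```python
import queue
import threading


class QueueSUT:
    """Toy SUT: issue_query only enqueues; workers complete each sample."""

    def __init__(self, predict, complete, num_workers=2):
        self.predict = predict      # injected model function
        self.complete = complete    # stand-in for the completion call
        self.tasks = queue.Queue()
        self.workers = [threading.Thread(target=self._worker, daemon=True)
                        for _ in range(num_workers)]
        for worker in self.workers:
            worker.start()

    def issue_query(self, samples):
        # Return quickly: the actual work happens on the worker threads.
        for sample in samples:
            self.tasks.put(sample)

    def _worker(self):
        while True:
            item = self.tasks.get()
            try:
                if item is None:    # shutdown sentinel
                    return
                sample_id, sample_index = item
                # Every sample that was issued gets exactly one completion.
                self.complete([(sample_id, self.predict(sample_index))])
            finally:
                self.tasks.task_done()

    def drain(self):
        """Block until every queued sample has been completed."""
        self.tasks.join()

    def shutdown(self):
        for _ in self.workers:
            self.tasks.put(None)
        for worker in self.workers:
            worker.join()


completed = []
lock = threading.Lock()


def record(responses):
    with lock:
        completed.extend(responses)


sut = QueueSUT(predict=lambda index: index * index, complete=record)
sut.issue_query([(100 + n, n) for n in range(5)])
sut.drain()
sut.shutdown()
print(sorted(completed))
```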

Assess Accuracy

  • Process the mlperf_log_accuracy.json output by the LoadGen to determine the accuracy of your system.
  • For the official models, Python scripts will be provided by the MLPerf model owners for you to do this automatically.
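As a rough illustration of what such a script does, the sketch below decodes an accuracy log and compares predictions against ground truth. It assumes each log entry carries a `qsl_idx` field and a hex-encoded `data` payload, and that the SUT wrote a single little-endian 32-bit integer label per sample. The payload encoding is chosen by the SUT, so that part is an assumption. The two sample entries and the ground-truth labels are made up.

```python
import json
import struct

# Made-up entries in the assumed shape of mlperf_log_accuracy.json.
SAMPLE_LOG = """[
  {"seq_id": 0, "qsl_idx": 3, "data": "02000000"},
  {"seq_id": 1, "qsl_idx": 7, "data": "05000000"}
]"""


def decode_predictions(log_text):
    """Map each dataset index to the int32 label stored in its payload."""
    predictions = {}
    for entry in json.loads(log_text):
        raw = bytes.fromhex(entry["data"])
        (label,) = struct.unpack("<i", raw)  # assumption: one little-endian int32
        predictions[entry["qsl_idx"]] = label
    return predictions


def accuracy(predictions, ground_truth):
    """Fraction of logged samples whose predicted label matches the truth."""
    correct = sum(1 for idx, label in predictions.items()
                  if ground_truth[idx] == label)
    return correct / len(predictions)


predictions = decode_predictions(SAMPLE_LOG)
print(predictions, accuracy(predictions, {3: 2, 7: 4}))
```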

For templates of how to do the above in detail, refer to code for the demos, tests, and reference models.

LoadGen over the Network

For reference, at a high level a submission is structured as follows:

The LoadGen implementation is common to all submissions, while the QSL (“Query Sample Library”) and SUT (“System Under Test”) are implemented by submitters. QSL is responsible for loading the data and includes untimed preprocessing.

A submission over the network introduces a new component, the “QDL” (Query Dispatch Library), which sits between the LoadGen and the SUT.

QDL is a proxy for a load balancer that dispatches queries to the SUT over a physical network, receives the responses, and passes them back to the LoadGen. It is implemented by the submitter. The interface of the QDL is the same as the API to the SUT.

In scenarios using QDL, data may be compressed in QSL at the choice of the submitter in order to reduce network transmission time. Decompression is part of the timed processing in SUT. A set of approved standard compression schemes will be specified for each benchmark; additional compression schemes must be approved in advance by the Working Group.
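The split between untimed compression and timed decompression can be sketched as follows. `zlib` is used purely for illustration and is not meant to suggest an approved scheme; `qsl_prepare_sample` and `sut_process` are hypothetical function names.

```python
import zlib


def qsl_prepare_sample(raw: bytes) -> bytes:
    """QSL side (untimed): compress the sample before it crosses the network."""
    return zlib.compress(raw)


def sut_process(payload: bytes, infer):
    """SUT side (timed): decompress first, then run inference on the raw bytes."""
    data = zlib.decompress(payload)
    return infer(data)


raw_sample = bytes(range(256)) * 4
payload = qsl_prepare_sample(raw_sample)
# The toy "inference" here just reports how many bytes it received.
print(len(payload), sut_process(payload, infer=len))
```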

All communication between LoadGen/QSL and SUT is via QDL, and all communication between QDL and SUT must pass over a physical network.

QDL implements the protocol to transmit queries over the network and receive responses. It also implements decompression of any response returned by the SUT, where compression of responses is allowed. Performing any part of the timed preprocessing or inference in QDL is specifically disallowed. Currently no batching is allowed in QDL, although this may be revisited in future.

MLPerf over the Network will run in the Server and Offline scenarios. All LoadGen modes are expected to work as is, with only insignificant changes. These include running the test in performance mode, accuracy mode, find peak performance mode, and compliance mode. The same applies to power measurements.

QDL details

The Query Dispatch Library is implemented by the submitter and interfaces with the LoadGen through the same API as a SUT. All MLPerf Inference SUTs implement the mlperf::SystemUnderTest class, which is defined in system_under_test.h. The QDL implements the mlperf::QueryDispatchLibrary class, which inherits from mlperf::SystemUnderTest, has the same API, and supports all existing mlperf::SystemUnderTest methods. It is declared in a separate header file, query_dispatch_library.h. Because StartTest takes a mlperf::SystemUnderTest, a mlperf::QueryDispatchLibrary object passed as the sut argument is implicitly upcast to its base class.
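The inheritance relationship described above can be mirrored in Python to show why a QDL can be used wherever a SUT is expected. The classes below only borrow the C++ class names for readability. They are not the LoadGen bindings, and `LoopbackQDL`, `start_test`, and the returned name string are hypothetical.

```python
from abc import ABC, abstractmethod


class SystemUnderTest(ABC):
    """Python mirror of the SUT interface, for illustration only."""

    @abstractmethod
    def name(self): ...

    @abstractmethod
    def issue_query(self, samples): ...

    @abstractmethod
    def flush_queries(self): ...


class QueryDispatchLibrary(SystemUnderTest):
    """Same API as SystemUnderTest; it adds no new methods."""


class LoopbackQDL(QueryDispatchLibrary):
    """Toy QDL that 'dispatches' to an in-process function instead of a network."""

    def __init__(self, remote_infer, complete):
        self.remote_infer = remote_infer
        self.complete = complete

    def name(self):
        return "loopback-network-sut"

    def issue_query(self, samples):
        responses = [(sample_id, self.remote_infer(index))
                     for sample_id, index in samples]
        self.complete(responses)

    def flush_queries(self):
        pass  # nothing is buffered in this sketch


def start_test(sut, samples):
    """Accepts anything that is a SystemUnderTest, including a QDL."""
    if not isinstance(sut, SystemUnderTest):
        raise TypeError("sut must implement SystemUnderTest")
    sut.issue_query(samples)
    sut.flush_queries()
    return sut.name()


received = []
qdl = LoopbackQDL(remote_infer=lambda index: index + 1, complete=received.extend)
print(start_test(qdl, [(1, 10), (2, 20)]), received)
```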

QDL Query issue and response over the network

The QDL gets the queries from the LoadGen through

void IssueQuery(const std::vector<QuerySample>& samples)

The QDL dispatches the queries to the SUT over the physical media. The exact method and implementation are submitter-specific and are not specified by MLCommons. The submitter implementation includes all methods required to serialize the query, load-balance, drive it through the operating system and network interface card, and send it to the SUT.

The QDL receives the query responses over the network from the SUT. The exact method and implementation are submitter-specific and are not specified by MLCommons. The submitter implementation includes all methods required to receive the network data from the network interface card, pass it through the operating system, deserialize the query response, and provide it back to the LoadGen through query completion by:

struct QuerySampleResponse {
  ResponseId id;
  uintptr_t data;
  size_t size;
};
void QuerySamplesComplete(QuerySampleResponse* responses, 
                          size_t response_count);
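Since the wire format is left to the submitter, the following is only one possible sketch of the serialize and deserialize steps, using fixed-width little-endian fields. The layout (a 32-bit count, then 64-bit ID and index pairs for queries, and ID plus length-prefixed payload for responses) is an assumption made up for this example.

```python
import struct

COUNT = struct.Struct("<I")       # number of records in the message
SAMPLE = struct.Struct("<QQ")     # (response ID, dataset index)
RESP_HEAD = struct.Struct("<QI")  # (response ID, payload length in bytes)


def encode_query(samples):
    """Serialize a list of (id, index) pairs for transmission to the SUT."""
    body = b"".join(SAMPLE.pack(sample_id, index) for sample_id, index in samples)
    return COUNT.pack(len(samples)) + body


def decode_query(buf):
    """Inverse of encode_query, as the SUT side would run it."""
    (count,) = COUNT.unpack_from(buf, 0)
    return [SAMPLE.unpack_from(buf, COUNT.size + k * SAMPLE.size)
            for k in range(count)]


def encode_responses(responses):
    """Serialize a list of (id, payload bytes) pairs for the trip back to the QDL."""
    parts = [COUNT.pack(len(responses))]
    for response_id, payload in responses:
        parts.append(RESP_HEAD.pack(response_id, len(payload)))
        parts.append(payload)
    return b"".join(parts)


def decode_responses(buf):
    """Inverse of encode_responses, as the QDL would run it before completion."""
    (count,) = COUNT.unpack_from(buf, 0)
    offset = COUNT.size
    responses = []
    for _ in range(count):
        response_id, length = RESP_HEAD.unpack_from(buf, offset)
        offset += RESP_HEAD.size
        responses.append((response_id, buf[offset:offset + length]))
        offset += length
    return responses


wire = encode_responses([(1, b"cat"), (2, b"")])
print(decode_query(encode_query([(1, 10), (2, 20)])), decode_responses(wire))
```

After decoding, the QDL would hand each (ID, payload) pair back to the LoadGen through the completion call shown above.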

QDL Additional Methods

In addition, the QDL needs to implement the following methods, which the SUT interface provides to the LoadGen:

const std::string& Name();

The Name function returns a known string that identifies the SUT as an over-the-network benchmark.

void FlushQueries();

It is not specified here how the QDL would query and configure the SUT to execute the above methods. The QDL responds to the LoadGen after receiving its own response from the SUT.

Example

Refer to the LON demo for a reference example illustrating usage of the LoadGen over the network.
