
tokmon is a CLI utility to monitor OpenAI token usage and costs

Project description

tokmon 🔤🧐 - CLI utility to monitor OpenAI token costs

tokmon enables you to monitor your program's OpenAI API token usage.

You use tokmon just like you would use the time utility, but instead of execution time you get token usage and cost.

Quick install

pip install tokmon

Make sure the installation worked by running:

tokmon --help

To uninstall, run pip uninstall tokmon

How to use tokmon

Warning: This is a debugging tool. It is not intended to be used in any consequential setting. Use your best judgement; you're on your own!

Prepend tokmon to your normal program invocation like so:

$ tokmon ./my_gpt_program --my_arg "hi"

Run and use your program just like you would normally (arguments and all). Interactive usage is supported as well.

After your program finishes running (or you Ctrl-C it), tokmon will automatically generate a cost report that looks like this:

Short usage summary

tokmon cost report:
================================================================================
Monitored invocation: ./python_example.py -i
Models: ['gpt-3.5-turbo-0301']
Total Usage: {'total_prompt_tokens': 49, 'total_completion_tokens': 44, 'total_tokens': 93}
Pricing: {'gpt-3.5-turbo-0301': {'prompt_cost': 0.002, 'completion_cost': 0.002, 'per_tokens': 1000}}
Total Cost: $0.000186
================================================================================

Writing cost summary to JSON file: /tmp/tokmon_usage_summary_1681426650.json
  • If your program uses multiple OpenAI models in the same invocation, their respective usages will be reflected in the report.
  • You can run multiple instances of tokmon simultaneously. Each invocation will generate a separate usage report.
  • Pass --json_out /your/path/report.json to get a detailed breakdown plus the conversation history in JSON format (see the example below).
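
For example, an invocation that captures the detailed report might look like this (the report path is illustrative):

$ tokmon --json_out /tmp/tokmon_report.json ./my_gpt_program --my_arg "hi"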

Full usage and cost summary (JSON)

{
    "total_cost": 0.0019199999999999998,
    "total_usage": {
        "total_prompt_tokens": 18,
        "total_completion_tokens": 23,
        "total_tokens": 41
    },
    "pricing_data": "{'gpt-4-0314': {'prompt_cost': 0.03, 'completion_cost': 0.06, 'per_tokens': 1000}}",
    "models": [
        "gpt-4-0314"
    ],
    "raw_data": [
        {
            "model": "gpt-4-0314",
            "usage": {
                "prompt_tokens": 18,
                "completion_tokens": 23,
                "total_tokens": 41
            },
            "cost": 0.0019199999999999998,
            "messages": [
                {
                    "role": "system",
                    "content": "You're a helpful assistant."
                },
                {
                    "role": "user",
                    "content": "hello"
                },
                {
                    "role": "assistant",
                    "content": "Hello! How can I help you today? If you have any questions or need assistance, feel free to ask."
                }
            ]
        }
    ]
}
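
Because the report is plain JSON, it is easy to post-process. Below is a minimal, illustrative Python sketch (not part of tokmon) that loads a report and prints per-model costs; the report path is hypothetical and the field names match the example above:

import json

# Load a tokmon JSON report (path is illustrative)
with open("/tmp/tokmon_report.json") as f:
    report = json.load(f)

print(f"Total cost: ${report['total_cost']:.6f}")

# Each entry in raw_data corresponds to one monitored API call
for entry in report["raw_data"]:
    usage = entry["usage"]
    print(f"{entry['model']}: {usage['total_tokens']} tokens -> ${entry['cost']:.6f}")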

How it works

tokmon uses the mitmproxy library to intercept HTTP requests and responses between your program and the OpenAI API. It then processes the request and response data to calculate token usage and cost based on tokmon/pricing.json.
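
Conceptually, the interception looks like a small mitmproxy addon. The sketch below only illustrates the approach (it is not tokmon's actual code) and assumes non-streaming responses that carry a usage field:

import json
from mitmproxy import http

OPENAI_HOST = "api.openai.com"

class UsageCollector:
    """Illustrative addon: sum the usage field of OpenAI API responses."""

    def __init__(self):
        self.total_tokens = 0

    def response(self, flow: http.HTTPFlow) -> None:
        if flow.request.pretty_host != OPENAI_HOST:
            return
        try:
            body = json.loads(flow.response.get_text())
        except (ValueError, TypeError):
            return  # not a JSON body (e.g. a streaming response)
        usage = body.get("usage")
        if usage:
            self.total_tokens += usage.get("total_tokens", 0)

addons = [UsageCollector()]

An addon like this could be run manually with mitmdump -s usage_collector.py; tokmon sets up the equivalent interception automatically when it launches your program.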

tokmon works for programs written in Python or Node.js (using OpenAI's clients), and for curl (when run directly, not e.g. from inside a bash script).

If you manually install mitmproxy's CA certificate, it should work for all executables (note: this hasn't been tested).

In most cases, tokmon relies on the usage field in OpenAI's API responses for token counts. For streaming requests, however, tokmon uses OpenAI's tiktoken library directly to count tokens, since as of this writing OpenAI's API does not return usage data for streaming requests (reference).
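
For reference, counting tokens with tiktoken looks roughly like this (the model name and text are just examples):

import tiktoken

# Pick the tokenizer that matches the model being monitored
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

completion_text = "Hello! How can I help you today?"
print(len(enc.encode(completion_text)))  # number of completion tokens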

pricing.json

The pricing data was extracted from OpenAI's website with the help of ChatGPT.

By default, tokmon uses the tokmon/pricing.json file bundled with its package:

{   
    "last_updated": "2023-04-12",
    "data_sources": [
        "https://openai.com/pricing",
        "https://platform.openai.com/docs/models/model-endpoint-compatibility"
    ],
    "gpt-4": {"prompt_cost": 0.03, "completion_cost": 0.06, "per_tokens": 1000},
    "gpt-4-0314": {"prompt_cost": 0.03, "completion_cost": 0.06, "per_tokens": 1000},
    "gpt-4-32k": {"prompt_cost": 0.06, "completion_cost": 0.12, "per_tokens": 1000},
    "gpt-4-32k-0314": {"prompt_cost": 0.06, "completion_cost": 0.12, "per_tokens": 1000},
    "gpt-3.5-turbo": {"prompt_cost": 0.002, "completion_cost": 0.002, "per_tokens": 1000},
    "gpt-3.5-turbo-0301": {"prompt_cost": 0.002, "completion_cost": 0.002, "per_tokens": 1000},
    "text-davinci-003": {"cost": 0.02, "per_tokens": 1000},
    "text-curie-001": {"cost": 0.002, "per_tokens": 1000},
    "text-babbage-001": {"cost": 0.0005, "per_tokens": 1000},
    "text-ada-001": {"cost": 0.0004, "per_tokens": 1000},
    "text-embedding-ada-002": {"cost": 0.0004, "per_tokens": 1000}
}

You can override the default pricing with: tokmon --pricing /path/to/your/custom_pricing.json ...

This pricing JSON is incomplete (it is missing DALL-E, among others), may be incorrect, and may go out of date.

For best results, make sure to check that you have the latest pricing.
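
Given this schema, the cost arithmetic is straightforward: prompt and completion tokens are each billed at their per-1000-token rate. A minimal sketch of that calculation (illustrative, not tokmon's internal code):

import json

def chat_cost(pricing: dict, model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of a single chat completion, using the pricing.json schema above."""
    entry = pricing[model]
    return (prompt_tokens * entry["prompt_cost"]
            + completion_tokens * entry["completion_cost"]) / entry["per_tokens"]

with open("pricing.json") as f:
    pricing = json.load(f)

# Matches the sample report above: 49 prompt + 44 completion tokens on gpt-3.5-turbo-0301
print(chat_cost(pricing, "gpt-3.5-turbo-0301", 49, 44))  # ≈ 0.000186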

Current Limitations

  1. Event streaming: tokmon buffers Server-Sent Events (SSE) until the data: [DONE] chunk is received. If the monitored program leverages event streaming, its behavior will be modified.
    • Status: looking into it. Pull requests welcome.

Contributing

If you'd like to contribute to the project, please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your changes.
  3. Make your changes and test them.
  4. Submit a pull request with a clear description of your changes and any relevant information.

Warning

  1. tokmon comes without any warranty or guarantee whatsoever.
  2. tokmon was tested on macOS only. It might not work on other platforms.
  3. This tool may not work as intended, have unknown side effects, may output incorrect information, or not work at all.
  4. The pricing data in pricing.json may go out of date.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tokmon-0.1.3.tar.gz (15.3 kB)

Uploaded Source

Built Distribution

tokmon-0.1.3-py3-none-any.whl (14.7 kB)

Uploaded Python 3

File details

Details for the file tokmon-0.1.3.tar.gz.

File metadata

  • Download URL: tokmon-0.1.3.tar.gz
  • Upload date:
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.3

File hashes

Hashes for tokmon-0.1.3.tar.gz

  • SHA256: 51dee31e2004d0b817bd14ae190854837f39eed77a554fbedbecacfeaf06a9ab
  • MD5: d2a2c23b41277e2cec395e763ed1c60b
  • BLAKE2b-256: b1911b1b0bc217fd185aa515e19075ac9387a3dc11b55324ee61f217b6ddd7e7

See more details on using hashes here.

File details

Details for the file tokmon-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: tokmon-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 14.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.11.3

File hashes

Hashes for tokmon-0.1.3-py3-none-any.whl

  • SHA256: 22462eddc32bd33671d09e41958426277e0dbda0f5d3695d0161d3e6a2f73751
  • MD5: 963a2c30ca4adef3d7462f97a893d863
  • BLAKE2b-256: e651b17c91819230b0a56c5e034766fa7c5c1695ec5b8e846a5af856f46525d2

See more details on using hashes here.
