
A napari plugin to process and analyse images with chatGPT.


Home of Omega, a napari-aware autonomous LLM-based agent specialized in image processing and analysis.


A napari plugin that leverages OpenAI's Large Language Model ChatGPT to implement Omega, a napari-aware agent capable of performing image processing and analysis tasks in a conversational manner.

This repository started as a 'weekend project' by Loic A. Royer, who leads a research group at the Chan Zuckerberg Biohub. It leverages OpenAI's ChatGPT API via the LangChain Python library, as well as napari, a fast, interactive, multi-dimensional image viewer for Python, another project initially started by Loic and Juan Nunez-Iglesias.

What is Omega?

Omega is an LLM-based and tool-armed autonomous agent that demonstrates the potential for Large Language Models (LLMs) to be applied to image processing, analysis and visualization. Can LLM-based agents write image processing code and napari widgets, correct their own coding mistakes, perform follow-up analysis, and control the napari viewer? The answer appears to be yes.

The preprint can be downloaded here: 10.5281/zenodo.8240289

In this video, I ask Omega to segment an image using the SLIC algorithm. It makes a first attempt using the implementation in scikit-image but fails because of a nonexistent 'multichannel' parameter. Realizing that, Omega tries again, and this time succeeds:

https://user-images.githubusercontent.com/1870994/235768559-ca8bfa84-21f5-47b6-b2bd-7fcc07cedd92.mp4

After loading a sample 3D image of cell nuclei in napari, I ask Omega to segment the nuclei using the Otsu method. My first request was very vague, so it just segmented foreground versus background. I then ask it to split the foreground into distinct segments, one per connected component. Omega makes a rookie mistake by forgetting to import numpy as 'np'. No problem; it notices, tries again, and succeeds:

https://user-images.githubusercontent.com/1870994/235769990-a281a118-1369-47aa-834a-b491f706bd48.mp4
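The "notice the error, try again" behaviour seen in both videos can be pictured as a simple retry loop. The sketch below is hypothetical and is not Omega's actual implementation: `run_with_retries` is an invented helper, and the two hard-coded code strings stand in for successive LLM answers, the first of which forgets an import.

```python
def run_with_retries(attempts, max_tries=3):
    """Execute candidate code strings in order until one runs cleanly.

    `attempts` stands in for successive LLM answers (hypothetical); a real
    agent would send each collected error message back to the model.
    """
    errors = []
    for code in attempts[:max_tries]:
        namespace = {}
        try:
            exec(code, namespace)  # runs arbitrary code: sandbox in practice
            return namespace, errors
        except Exception as exc:
            errors.append(f"{type(exc).__name__}: {exc}")
    raise RuntimeError("all attempts failed:\n" + "\n".join(errors))


# First answer forgets the import; second answer fixes it.
first_try = "result = math.sqrt(16)"
second_try = "import math\nresult = math.sqrt(16)"

namespace, errors = run_with_retries([first_try, second_try])
print(namespace["result"])  # 4.0
print(errors)               # the NameError raised by the first attempt
```

Because `exec` runs whatever it is given, a loop like this carries the same risks described in the disclaimer further down.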

In this video, one of my favorites, I ask Omega to make a 'Max color projection widget.' It is not a trivial task, but it manages!

https://github.com/royerlab/napari-chatgpt/assets/1870994/bb9b35a4-d0aa-4f82-9e7c-696ef5859a2f

As LLMs continue to improve, Omega will become even more adept at handling complex image processing and analysis tasks. The current version of ChatGPT, 3.5, has a knowledge cutoff of 2021, which means that it lacks nearly two years of knowledge of the napari API and its usage, as well as of the latest versions of popular libraries like scikit-image, OpenCV, numpy, scipy, etc. Despite this, you can see in the videos above that it is quite capable. While ChatGPT 4.0 is a significant upgrade, it is not yet widely available.

Omega could eventually help non-experts process and analyze images, especially in the bioimage domain. It is also potentially valuable for educational purposes, as it could assist in teaching image processing and analysis, making it more accessible. Although ChatGPT, which powers Omega, may not yet be on par with an expert image analyst or computer vision expert, it is just a matter of time...

Omega holds a conversation with the user and uses different tools to answer questions, download and operate on images, write widgets for napari, and more.

Omega's Built-in AI-Augmented Code Editor

The Omega AI-Augmented Code Editor is a new feature within Omega, designed to enhance Omega's user experience. This editor is not just a text editor; it's a powerful interface that interacts with the Omega dialogue agent to generate, optimize, and manage code for advanced image analysis tasks.

Key Features

  • Code Highlighting and Completion: For ease of reading and writing, the code editor comes with built-in syntax highlighting and intelligent code completion features.
  • LLM-Augmented Tools: The editor is equipped with AI tools that assist in commenting, cleaning up, fixing, modifying, and performing safety checks on the code.
  • Persistent Code Snippets: Users can save and manage snippets of code, preserving their work across multiple napari sessions.
  • Network Code Sharing (Code-Drop): Facilitates the sharing of code snippets across the local network, empowering collaborative work and knowledge sharing.

Usage Scenarios

  • Widget Creation: Expert users can create widgets using the Omega dialogue agent and retain them for future use.
  • Collaboration: Share custom widgets with colleagues or the community, even if they don't have access to an API key.
  • Learning: New users can learn from the AI-augmented suggestions, improving their coding skills in Python and image analysis workflows.

You can find more information in the corresponding wiki page.


Omega's Installation instructions:

Assuming you have a Python environment with a working napari installation, you can simply:

pip install napari-chatgpt

Or just install the plugin from napari's plugin installer.

For detailed instructions and variations, check this page of our wiki.
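To confirm that the installation worked, one option is to query the installed package versions from Python. This is a small sketch using only the standard library; `installed_version` is a helper name introduced here for illustration, not part of napari or napari-chatgpt.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Check both the viewer and the plugin in the current environment.
for name in ("napari", "napari-chatgpt"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```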

Requirements:

You need an OpenAI key; there is no way around this. I have been experimenting with other models, including open-source models, but right now the best results, by far, are obtained with ChatGPT 4 (and, to a lesser extent, 3.5). Check here for details on how to get your OpenAI key. In particular, check this for how to gain access to GPT-4 models.

Usage:

Check this page of our wiki for details on how to start Omega.

Tips, Tricks, and Example prompts:

Check our guide on how to prompt Omega and some examples here.

Video Demos:

You can check the original release videos here. You can also find the latest preprint videos on Vimeo.

How does Omega work?

Check our preprint here: 10.5281/zenodo.8240289 and our wiki page on Omega's design and architecture.

Cost:

Developing the initial version of Omega cost me $13.97, hardly a fortune. OpenAI pricing on ChatGPT 4 is very reasonable at 0.01 dollars per 1K tokens, which works out to roughly $1 per 75,000 words (1K tokens is about 750 words).
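The arithmetic behind such an estimate is easy to check. The sketch below uses the quoted price of $0.01 per 1K tokens and assumes the common rule of thumb of about 0.75 English words per token; that ratio is an approximation, and the 50,000-token session is a made-up example.

```python
PRICE_PER_1K_TOKENS = 0.01  # dollars, the figure quoted above
WORDS_PER_TOKEN = 0.75      # assumption: rough rule of thumb for English text


def words_per_dollar(price_per_1k=PRICE_PER_1K_TOKENS,
                     words_per_token=WORDS_PER_TOKEN):
    """Approximate number of words that one dollar buys."""
    tokens_per_dollar = 1000 / price_per_1k
    return tokens_per_dollar * words_per_token


def cost_of_tokens(tokens, price_per_1k=PRICE_PER_1K_TOKENS):
    """Cost in dollars of processing `tokens` tokens."""
    return tokens / 1000 * price_per_1k


print(round(words_per_dollar()))          # 75000
print(round(cost_of_tokens(50_000), 2))   # 0.5 (hypothetical 50K-token session)
```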

Note: you can limit the burn rate to a certain amount of dollars per month, just in case you let Omega think over the weekend and forget to stop it (don't worry, this is actually not possible).

Disclaimer:

Do not use this software lightly; it will download libraries of its own volition and write any code that it deems necessary; it might actually do what you ask, even if it is a very bad idea. Also, beware that it might misunderstand what you ask and then do something bad in ways that elude you. For example, it is unwise to use Omega to delete 'some' files from your system; it might end up deleting more than that if you are unclear in your request.
Omega is generally safe as long as you do not make dangerous requests. To be on the safe side, especially if your experiments with Omega could be problematic, I recommend running this software inside a sandboxed virtual machine.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Contributing

Contributions are extremely welcome. Tests can be run with tox; please ensure that the coverage at least stays the same before you submit a pull request.

License

Distributed under the terms of the BSD-3 license, "napari-chatgpt" is free and open-source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.
