Skip to main content

Automatically measure the ellipticity of cucurbituril macrocycles!

Project description

ElliptiCB[n]

Automatically measure the ellipticity of cucurbituril macrocycles

Run on Google Colab

Run the program in the cloud without installing any software.

Open In Colab

Description

ElliptiCB[n] is a collaboration between the Pluth and Harms labs at the University of Oregon

Arman Garcia, Michael Shavlik PhD, Mike Harms PhD, Mike Pluth PhD

A manuscript describing the software is forthcoming.

ElliptiCB[n] performs the following steps

  1. Extract the coordinates of all C, N, O, and H atoms from an xyz file.

  2. Identify separate molecules by finding strongly-connected components.

  3. Identify macrocycles using patterns of bonds, cycle connectivity and cycle size.

  4. Use a Principal Component Analysis to calculate the variance along both major axes of the central cycle.

  5. Calculate ellipticity. This is done by two methods:

    A. pca_ellip: $(V_{ax1}-V_{ax2})/V_{ax1}$ where $V_{ax1}$ is the PCA variance on the longest axis (length) and $V_{ax2}$ is the PCA variance on the second-longest axis (width).

    B. orig_ellip: Use the perimeter and largest carbon-to-centroid distance to infer ellipticity.

  6. Generate outputs, which include annotated structures and a spreadsheet with ellipticities.

Input

ElliptiCB[n] takes molecular structures in XYZ format. The first two lines are ignored. We assume the coordinates are in angstroms. XYZ files can be generated from other structure formats using software like Open Babel.

Output

Output of a calculation using the structure HUXMAR as input.

Ellipticity table

id size pca_ellip orig_ellip nearby_atoms bad_protons
0 20 0.121369 0.123284 2 0
1 10 0.018906 0.066456 2 0
2 14 0.075018 0.087605 10 0
  • "id": identity of the cycle (matching the visualization and table)
  • "size": number of equatorial glycoluril sp3 carbons in the macrocycle
  • "pca_ellip": ellipticity calculated by Principle Component Analysis
  • "orig_ellip" ellipticity calculated the centroid, nearest carbon, and perimeter.
  • "nearby_atoms": number of atoms from a different molecule within GUEST_SEARCH_RADIUS of the macrocycle centroid. (Useful for identifying guests).
  • "bad_protons": number of equatorial protons facing into rather than out of the macrocycle.

Annotated structure

A screenshot of the output follows. The actual output of the code is interactive. An example is here.

img

Local installation

ElliptiCB[n] can be installed locally and used as a command line tool.

To install using pip

On a terminal, run:

pip install ellipticbn

To install from source:

On a terminal, run:

git clone https://github.com/harmslab/ellipticbn.git
cd ellipticbn
python -m pip install . pyproject.toml

Run from the command line

ElliptiCB[n] takes one or more .xyz files as inputs. Assuming that HUMAR.xyz is in the working directory, running this command:

$> ellipticbn HUMXAR.xyz

Would generate the following output:

Analyzing ./CB10/HUMXAR.xyz.
3 macrocycles identified.

Calculating ellipticities for 3 macrocycles.

Results:
    id  size  pca_ellip  orig_ellip  nearby_atoms  bad_protons
0  0.0    20   0.121369    0.123284             2            0
1  1.0    10   0.018906    0.066456             2            0
2  2.0    14   0.075018    0.087605            10            0

Saving plot to ./HUMXAR.xyz.html

It will also generate HUXMAR.xyz.html (the visualization) and HUXMAR.xyz.xlsx (the ellipticity table) in the current directory.

You can also run the program on multiple xyz files:

$> ellipticbn HUMXAR.xyz LAZPIM.xyz

Would generate the following output:

Analyzing HUMXAR.xyz.
3 macrocycles identified.

Calculating ellipticities for 3 macrocycles.

Results:
    id  size  pca_ellip  orig_ellip  nearby_atoms  bad_protons
0  0.0    20   0.121369    0.123284             2            0
1  1.0    10   0.018906    0.066456             2            0
2  2.0    14   0.075018    0.087605            10            0

Saving plot to ./HUMXAR.xyz.html

Analyzing LAZPIM.xyz.
1 macrocycles identified.

Calculating ellipticities for 1 macrocycles.

Results:
    id  size  pca_ellip  orig_ellip  nearby_atoms  bad_protons
0  0.0    20    0.29813    0.212848            20            0

Saving plot to ./LAZPIM.xyz.html

In addition to the visualization html and individual ellipiticty files, this call would generate a single spreadsheet ("summary.xlsx") that has all calculated ellipticities:

id size pca_ellip orig_ellip nearby_atoms bad_protons file
0 20 0.121369 0.123284 2 0 HUMXAR.xyz
1 10 0.018906 0.066456 2 0 HUMXAR.xyz
2 14 0.075018 0.087605 10 0 HUMXAR.xyz
0 20 0.29813 0.212848 20 0 LAZPIM.xyz

One can also change the parameters used in the calculation. To see the available options, type the following in a terminal:

ellipticbn --help

As of this writing (version 2.0.1), this gives the following output:

usage: ellipticbn [-h] [--min_num_carbons MIN_NUM_CARBONS]
                  [--max_num_carbons MAX_NUM_CARBONS]
                  [--guest_search_radius GUEST_SEARCH_RADIUS]
                  [--summary_file SUMMARY_FILE] [--output_dir OUTPUT_DIR]
                  [--overwrite] [--version]
                  filename [filename ...]

    Run an ElliptiCbn calculation.

    Parameters
    ----------
    filename : str or list
        xyz file name (or list of xyz files) to read
    min_num_carbons : int, default=10
        reject any macrocycle with a central cycle that has less than
        min_num_carbons
    max_num_carbons : int, default=20
        reject any macrocycle with a central cycle that has more than
        max_num_carbons
    guest_search_radius : float, default=4
        look for guest atoms within this radius (in angstroms) of the atom
        centroid.
    summary_file : str, default="summary.xlsx"
        write all cycles to this single summary file if there is more than one
        xyz file specified.
    output_dir : str, default="."
        write output to output_dir.
    overwrite : bool, default=False
        overwrite existing output files


positional arguments:
  filename

options:
  -h, --help            show this help message and exit
  --min_num_carbons MIN_NUM_CARBONS
  --max_num_carbons MAX_NUM_CARBONS
  --guest_search_radius GUEST_SEARCH_RADIUS
  --summary_file SUMMARY_FILE
  --output_dir OUTPUT_DIR
  --overwrite
  --version             show program's version number and exit

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ellipticbn-2.1.2.tar.gz (17.7 kB view details)

Uploaded Source

Built Distribution

ellipticbn-2.1.2-py2.py3-none-any.whl (16.2 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file ellipticbn-2.1.2.tar.gz.

File metadata

  • Download URL: ellipticbn-2.1.2.tar.gz
  • Upload date:
  • Size: 17.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for ellipticbn-2.1.2.tar.gz
Algorithm Hash digest
SHA256 aad146504baa035753bf269cc24a7e5194be0a824a974679cb5e24305c86c4da
MD5 432b1aee1ebc465695e3157b744e901e
BLAKE2b-256 a8aff82c625e646b2e1f9eb30408ecf763fff5ecefc188785e404607e347a2e0

See more details on using hashes here.

File details

Details for the file ellipticbn-2.1.2-py2.py3-none-any.whl.

File metadata

  • Download URL: ellipticbn-2.1.2-py2.py3-none-any.whl
  • Upload date:
  • Size: 16.2 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for ellipticbn-2.1.2-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 e7fa6bff76a4a091578b6cf29d887aebfb7adc7773e6904c706e298cb3b13970
MD5 93c8bfe200fd8dd2b4fc61e915b6d47e
BLAKE2b-256 ea0427da6822463e0f06f32b2e68402605421f2bae8c1380f21ff7121241a55b

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page