Lazily one-hot encoding bed sequences using Keras Sequence.
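As a rough illustration of what one-hot encoding a nucleotide sequence means, here is a minimal NumPy sketch. It is not the package's implementation: the `one_hot_encode` helper, the `ACGT` column order and the all-zero handling of unknown symbols are assumptions made for this example. "Lazily" means that this kind of encoding is performed batch by batch, when a batch is requested, rather than for the whole BED file up front.

```python
import numpy as np

# Column order is an assumption made for this sketch; the package may use another.
NUCLEOTIDES = "ACGT"


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return a (len(sequence), 4) array with a single 1 per known nucleotide."""
    encoded = np.zeros((len(sequence), len(NUCLEOTIDES)), dtype=np.float32)
    for position, nucleotide in enumerate(sequence.upper()):
        column = NUCLEOTIDES.find(nucleotide)
        if column >= 0:  # unknown symbols such as "N" stay all-zero
            encoded[position, column] = 1.0
    return encoded


encoded = one_hot_encode("ACGTN")
print(encoded.shape)  # (5, 4)
```

A 200-nucleotide window encoded this way has shape (200, 4), which is the input shape used by the models in the usage examples.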

How do I install this package?

As usual, just install it using pip:

pip install keras_bed_sequence

Tests Coverage

Because coverage tools sometimes report slightly different results, here are three of them:

Coveralls Coverage, SonarCloud Coverage, Code Climate

Usage examples

The following examples are tested within the package test suite.

Classification task example

Let’s start by building an extremely simple classification task model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

model = Sequential([
    Flatten(),
    Dense(1)
])
model.compile(
    optimizer="nadam",
    loss="MSE"
)

Next, we load the training data as Keras Sequences, combining inputs and outputs in a MixedSequence object:

import numpy as np
from keras_mixed_sequence import MixedSequence
from keras_bed_sequence import BedSequence

batch_size = 32
bed_sequence = BedSequence(
    "hg19",
    "path/to/bed/files.bed",
    batch_size
)
# Placeholder: replace with your own labels, one per region in the BED file.
y = the_output_values
mixed_sequence = MixedSequence(
    x=bed_sequence,
    y=y,
    batch_size=batch_size
)
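To make the pairing idea concrete, here is a minimal sketch of a Keras-style sequence that returns an input batch and its matching target batch for a given index. The `PairedBatches` class is hypothetical and written only for illustration; it is not how MixedSequence is implemented, and it works on plain lists rather than on BedSequence objects.

```python
import math


class PairedBatches:
    """Hypothetical stand-in for a Keras-style Sequence pairing inputs with targets."""

    def __init__(self, x, y, batch_size):
        if len(x) != len(y):
            raise ValueError("x and y must contain the same number of samples")
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, index):
        # Slice inputs and targets with the same bounds so they stay aligned.
        start = index * self.batch_size
        stop = start + self.batch_size
        return self.x[start:stop], self.y[start:stop]


batches = PairedBatches(x=list(range(70)), y=[v % 2 for v in range(70)], batch_size=32)
print(len(batches))        # 3 batches: 32 + 32 + 6 samples
print(len(batches[2][0]))  # 6
```

Because batches are built only when indexed, nothing is encoded or loaded until the training loop asks for it.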

Finally, we train the model on the resulting MixedSequence:

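# Note: fit_generator is deprecated since TensorFlow 2.1,
# where model.fit accepts Sequence objects directly.
model.fit_generator(
    mixed_sequence,
    steps_per_epoch=mixed_sequence.steps_per_epoch,
    epochs=2,
    verbose=0,
    shuffle=True
)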

Auto-encoding task example

Let’s start by building an extremely simple auto-encoding task model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Reshape, Conv2DTranspose

model = Sequential([
    Reshape((200, 4, 1)),
    Conv2D(16, kernel_size=3, activation="relu"),
    Conv2DTranspose(1, kernel_size=3, activation="relu"),
    Reshape((200, 4))  # target shape excludes the batch axis
])
model.compile(
    optimizer="nadam",
    loss="MSE"
)
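The layer sizes above work because, with Keras' default "valid" padding and stride 1, a convolution shrinks each spatial dimension by `kernel_size - 1` and a transposed convolution grows it back by the same amount. The helper names below are made up for this sketch; it only checks the arithmetic and does not build a model.

```python
def conv_output(size: int, kernel: int) -> int:
    """Output length of a stride-1 convolution with "valid" padding."""
    return size - kernel + 1


def conv_transpose_output(size: int, kernel: int) -> int:
    """Output length of a stride-1 transposed convolution with "valid" padding."""
    return size + kernel - 1


window = (200, 4)  # (sequence length, nucleotides)
after_conv = tuple(conv_output(s, 3) for s in window)
after_transpose = tuple(conv_transpose_output(s, 3) for s in after_conv)
print(after_conv)       # (198, 2)
print(after_transpose)  # (200, 4)
```

The decoder therefore restores the original 200 by 4 window, and the final Reshape only has to drop the trailing channel axis.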

Next, we load the training data as Keras Sequences, using the same BedSequence as both input and output of a MixedSequence object:

import numpy as np
from ucsc_genomes_downloader import Genome
from keras_mixed_sequence import MixedSequence
from keras_bed_sequence import BedSequence

batch_size = 32
bed_sequence = BedSequence(
    Genome("hg19", chromosomes=["chr1"]),
    "path/to/bed/files.bed",
    batch_size
)
mixed_sequence = MixedSequence(
    x=bed_sequence,
    y=bed_sequence,
    batch_size=batch_size
)
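The models in these examples reshape every input to (200, 4, 1), so they presumably expect all regions in the BED file to span 200 nucleotides. The following sketch shows one way to check this with the standard library before training. The `read_bed_regions` helper and the in-memory example data are hypothetical; the sketch relies only on the BED convention that the first three columns are chromosome, 0-based start and exclusive end.

```python
import io


def read_bed_regions(handle):
    """Yield (chrom, start, end) from the first three columns of a BED file."""
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments and optional header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


# Hypothetical in-memory BED content used only for this illustration.
example = io.StringIO("chr1\t1000\t1200\nchr1\t5000\t5200\n")
regions = list(read_bed_regions(example))
# BED intervals are half-open, so the region length is end - start.
lengths = {end - start for _, start, end in regions}
print(lengths)  # {200}
```

With a real file, `open("path/to/bed/files.bed")` would take the place of the `io.StringIO` object.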

Finally, we train the model on the resulting MixedSequence:

model.fit_generator(
    mixed_sequence,
    steps_per_epoch=mixed_sequence.steps_per_epoch,
    epochs=2,
    verbose=0,
    shuffle=True
)

Multi-task example (classification + auto-encoding)

Let’s start by building an extremely simple multi-task model:

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Conv2D, Reshape, Flatten, Conv2DTranspose, Input

inputs = Input(shape=(200, 4))

flattened = Flatten()(inputs)

output1 = Dense(
    units=1,
    activation="relu",
    name="output1"
)(flattened)

hidden = Reshape((200, 4, 1))(inputs)
hidden = Conv2D(16, kernel_size=3, activation="relu")(hidden)
hidden = Conv2DTranspose(1, kernel_size=3, activation="relu")(hidden)
output2 = Reshape((200, 4), name="output2")(hidden)

model = Model(
    inputs=inputs,
    outputs=[output1, output2],
    name="my_model"
)

model.compile(
    optimizer="nadam",
    loss="MSE"
)

Next, we load the training data as Keras Sequences, mapping each named model output to its target in a MixedSequence object:

import os
import numpy as np
from keras_mixed_sequence import MixedSequence
from keras_bed_sequence import BedSequence

batch_size = 32
bed_sequence = BedSequence(
    "hg19",
    "{cwd}/test.bed".format(
        cwd=os.path.dirname(os.path.abspath(__file__))
    ),
    batch_size
)
y = np.random.randint(
    2,
    size=(bed_sequence.samples_number, 1)
)
mixed_sequence = MixedSequence(
    bed_sequence,
    {
        "output1": y,
        "output2": bed_sequence
    },
    batch_size
)

Finally, we train the model on the resulting MixedSequence:

model.fit_generator(
    mixed_sequence,
    steps_per_epoch=mixed_sequence.steps_per_epoch,
    epochs=2,
    verbose=0,
    shuffle=True
)
