
Cloud AI Platform API client library

Project description


Vertex AI: Google Vertex AI is an integrated suite of machine learning tools and services for building and using ML models with AutoML or custom code. It offers both novices and experts the best workbench for the entire machine learning development lifecycle.

Quick Start

In order to use this library, you first need to go through the following steps:

  1. Select or create a Cloud Platform project.

  2. Enable billing for your project.

  3. Enable the Vertex AI API.

  4. Set up Authentication.

Installation

Install this library in a virtualenv using pip. virtualenv is a tool to create isolated Python environments. The basic problem it addresses is one of dependencies and versions, and indirectly permissions.

With virtualenv, it’s possible to install this library without needing system install permissions, and without clashing with the installed system dependencies.

Mac/Linux

pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install google-cloud-aiplatform

Windows

pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\activate
<your-env>\Scripts\pip.exe install google-cloud-aiplatform

Overview

This section provides a brief overview of the Vertex SDK for Python. You can also reference the notebooks in vertex-ai-samples for examples.

Importing

SDK functionality can be used from the root of the package:

from google.cloud import aiplatform

Initialization

Initialize the SDK to store common configurations that you use with the SDK.

aiplatform.init(
    # your Google Cloud Project ID or number
    # environment default is used if not set
    project='my-project',

    # the Vertex AI region you will use
    # defaults to us-central1
    location='us-central1',

    # Google Cloud Storage bucket in the same region as location
    # used to stage artifacts
    staging_bucket='gs://my_staging_bucket',

    # custom google.auth.credentials.Credentials
    # environment default creds used if not set
    credentials=my_credentials,

    # customer managed encryption key resource name
    # will be applied to all Vertex AI resources if set
    encryption_spec_key_name=my_encryption_key_name,

    # the name of the experiment to use to track
    # logged metrics and parameters
    experiment='my-experiment',

    # description of the experiment above
    experiment_description='my experiment description'
)
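
In the block above, my_credentials can be any google.auth.credentials.Credentials instance. A minimal sketch of building one from a service-account key file (the key path below is a placeholder) might look like this:

from google.oauth2 import service_account

# Hypothetical path to a service-account key file; if the credentials
# argument is omitted, Application Default Credentials are used instead.
my_credentials = service_account.Credentials.from_service_account_file(
    '/path/to/service-account-key.json')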

Datasets

Vertex AI provides managed tabular, text, image, and video datasets. In the SDK, datasets can be used downstream to train models.

To create a tabular dataset:

my_dataset = aiplatform.TabularDataset.create(
    display_name="my-dataset", gcs_source=['gs://path/to/my/dataset.csv'])

You can also create and import a dataset in separate steps:

from google.cloud import aiplatform

my_dataset = aiplatform.TextDataset.create(
    display_name="my-dataset")

my_dataset.import_data(
    gcs_source=['gs://path/to/my/dataset.csv'],
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.multi_label_classification
)

To get a previously created Dataset:

dataset = aiplatform.ImageDataset('projects/my-project/locations/us-central1/datasets/{DATASET_ID}')

Vertex AI supports a variety of dataset schemas. References to these schemas are available under the aiplatform.schema.dataset namespace. For more information on the supported dataset schemas please refer to the Preparing data docs.
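
For example, creating an image dataset and importing data with one of these schema references might look like the sketch below; the import file path is a placeholder, and the schema constant shown is illustrative, so check the aiplatform.schema.dataset.ioformat namespace for the exact name your task needs:

from google.cloud import aiplatform

# Create an empty image dataset, then import single-label classification data.
my_image_dataset = aiplatform.ImageDataset.create(display_name="my-image-dataset")

my_image_dataset.import_data(
    gcs_source=['gs://path/to/my/image_import_file.csv'],
    import_schema_uri=aiplatform.schema.dataset.ioformat.image.single_label_classification
)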

Training

The Vertex SDK for Python allows you to train Custom and AutoML Models.

You can train custom models using a custom Python script, custom Python package, or container.

Preparing Your Custom Code

Vertex AI custom training enables you to train on Vertex AI datasets and produce Vertex AI models. To do so, your script must adhere to the following contract:

It must read datasets from the environment variables populated by the training service:

os.environ['AIP_DATA_FORMAT']  # provides format of data
os.environ['AIP_TRAINING_DATA_URI']  # uri to training split
os.environ['AIP_VALIDATION_DATA_URI']  # uri to validation split
os.environ['AIP_TEST_DATA_URI']  # uri to test split

Please visit Using a managed dataset in a custom training application for a detailed overview.

It must write the model artifact to the environment variable populated by the training service:

os.environ['AIP_MODEL_DIR']
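
Putting both requirements together, a minimal training_script.py might look like the sketch below; the data loading and model code are placeholders, and only the environment-variable handling reflects the contract described above:

import os

# Read the dataset splits provided by the training service.
data_format = os.environ['AIP_DATA_FORMAT']          # e.g. csv or jsonl
training_data_uri = os.environ['AIP_TRAINING_DATA_URI']
validation_data_uri = os.environ['AIP_VALIDATION_DATA_URI']
test_data_uri = os.environ['AIP_TEST_DATA_URI']

# ... load the data, then build and train your model here ...

# Write the trained model artifact to the directory provided by the service,
# for example: tf.saved_model.save(model, os.environ['AIP_MODEL_DIR'])
model_dir = os.environ['AIP_MODEL_DIR']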

Running Training

job = aiplatform.CustomTrainingJob(
    display_name="my-training-job",
    script_path="training_script.py",
    container_uri="gcr.io/cloud-aiplatform/training/tf-cpu.2-2:latest",
    requirements=["gcsfs==0.7.1"],
    model_serving_container_image_uri="gcr.io/cloud-aiplatform/prediction/tf2-cpu.2-2:latest",
)

model = job.run(my_dataset,
                replica_count=1,
                machine_type="n1-standard-4",
                accelerator_type='NVIDIA_TESLA_K80',
                accelerator_count=1)

In the code block above, my_dataset is the managed dataset created in the Datasets section above. The model variable is a managed Vertex AI model that can be deployed or exported.
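
Training can also be launched without blocking the current thread; a brief sketch, assuming the sync argument accepted by the run method:

# Launch the training job asynchronously.
model = job.run(my_dataset,
                replica_count=1,
                machine_type="n1-standard-4",
                sync=False)

# Block until training finishes and the model resource is available.
model.wait()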

AutoMLs

The Vertex SDK for Python supports AutoML tabular, image, text, video, and forecasting.

To train an AutoML tabular model:

dataset = aiplatform.TabularDataset('projects/my-project/locations/us-central1/datasets/{DATASET_ID}')

job = aiplatform.AutoMLTabularTrainingJob(
  display_name="train-automl",
  optimization_prediction_type="regression",
  optimization_objective="minimize-rmse",
)

model = job.run(
    dataset=dataset,
    target_column="target_column_name",
    training_fraction_split=0.6,
    validation_fraction_split=0.2,
    test_fraction_split=0.2,
    budget_milli_node_hours=1000,
    model_display_name="my-automl-model",
    disable_early_stopping=False,
)
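
The other AutoML job classes follow the same pattern. For example, an AutoML image classification job might look like the sketch below; the class and argument names mirror the tabular job above, but verify them against the SDK reference before use:

image_dataset = aiplatform.ImageDataset('projects/my-project/locations/us-central1/datasets/{DATASET_ID}')

# Illustrative AutoML image classification training job.
image_job = aiplatform.AutoMLImageTrainingJob(
    display_name="train-automl-image",
    prediction_type="classification",
)

image_model = image_job.run(
    dataset=image_dataset,
    model_display_name="my-automl-image-model",
    budget_milli_node_hours=8000,
)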

Models

To deploy a model:

endpoint = model.deploy(machine_type="n1-standard-4",
                        min_replica_count=1,
                        max_replica_count=5,
                        accelerator_type='NVIDIA_TESLA_K80',
                        accelerator_count=1)

To upload a model:

model = aiplatform.Model.upload(
    display_name='my-model',
    artifact_uri="gs://python/to/my/model/dir",
    serving_container_image_uri="gcr.io/cloud-aiplatform/prediction/tf2-cpu.2-2:latest",
)

To get a model:

model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')
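
If you don't have the model's resource name, you can look it up by listing models in the initialized project; a short sketch, assuming the list classmethod and a display_name filter:

# List models in the current project and location, filtered by display name.
models = aiplatform.Model.list(filter='display_name="my-model"')
for model in models:
    print(model.resource_name)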

Please visit Importing models to Vertex AI for a detailed overview.

Batch Prediction

To create a batch prediction job:

model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')

batch_prediction_job = model.batch_predict(
  job_display_name='my-batch-prediction-job',
  instances_format='csv',
  machine_type='n1-standard-4',
  gcs_source=['gs://path/to/my/file.csv'],
  gcs_destination_prefix='gs://path/to/my/batch_prediction/results/'
)

You can also create a batch prediction job asynchronously by including the sync=False argument:

batch_prediction_job = model.batch_predict(..., sync=False)

# wait for resource to be created
batch_prediction_job.wait_for_resource_creation()

# get the state
batch_prediction_job.state

# block until job is complete
batch_prediction_job.wait()
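
Once the job has finished (for example after the wait() call above), you can read back the generated result files; the sketch below assumes the iter_outputs helper and a Cloud Storage destination:

# Iterate over the result files written to the Cloud Storage destination prefix.
for output_blob in batch_prediction_job.iter_outputs():
    print(output_blob.name)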

Endpoints

To get predictions from endpoints:

endpoint.predict(instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])
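
The call returns a prediction response object; a minimal sketch of reading the per-instance outputs, assuming the predictions attribute on the returned object:

prediction_response = endpoint.predict(
    instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])

# One entry per instance, in the same order as the request.
print(prediction_response.predictions)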

To create an endpoint:

endpoint = aiplatform.Endpoint.create(display_name='my-endpoint')

To deploy a model to a created endpoint:

model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')

endpoint.deploy(model,
                min_replica_count=1,
                max_replica_count=5,
                machine_type='n1-standard-4',
                accelerator_type='NVIDIA_TESLA_K80',
                accelerator_count=1)
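
If you deploy more than one model to the same endpoint, you can control how requests are split between them; a brief sketch, assuming the traffic_percentage parameter of deploy:

# Send 30% of requests to the newly deployed model and keep the remaining
# 70% on the models already deployed to this endpoint.
endpoint.deploy(model,
                traffic_percentage=30,
                machine_type='n1-standard-4',
                min_replica_count=1,
                max_replica_count=5)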

To undeploy models from an endpoint:

endpoint.undeploy_all()

To delete an endpoint:

endpoint.delete()

Explainable AI: Get Metadata

To get metadata from TensorFlow 1 models:

import tensorflow as tf
from google.cloud.aiplatform.explain.metadata.tf.v1 import saved_model_metadata_builder

builder = saved_model_metadata_builder.SavedModelMetadataBuilder(
    'gs://python/to/my/model/dir', tags=[tf.saved_model.tag_constants.SERVING]
)
generated_md = builder.get_metadata()

To get metadata from TensorFlow 2 models:

from google.cloud.aiplatform.explain.metadata.tf.v2 import saved_model_metadata_builder

builder = saved_model_metadata_builder.SavedModelMetadataBuilder('gs://python/to/my/model/dir')
generated_md = builder.get_metadata()
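
The generated metadata can be reviewed before it is attached to a model; a small sketch, assuming get_metadata() returns a JSON-serializable dictionary:

import json

# Persist the generated explanation metadata for inspection; the output
# file name is illustrative.
with open('explanation_metadata.json', 'w') as f:
    json.dump(generated_md, f, indent=2)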

Next Steps
