Vertex AI and AutoML: A Case Study
Vertex AI Vision is an AI-powered platform to ingest, analyze and store video data. Vertex AI Vision lets users build and deploy applications with a simplified user interface.
Using Vertex AI Vision, you can build end-to-end computer vision solutions by leveraging its integration with other major components, namely Live Video Analytics, data streams, and Vision Warehouse. The Vertex AI Vision API lets you build a high-level app from low-level APIs, and create and update a high-level workflow that combines multiple individual API calls. You can then execute the workflow as a unit by making a single deploy request to the Vertex AI Vision platform server.
Using Vertex AI Vision, you can:
1. Ingest real-time video data.
2. Analyze data for insights using general and custom vision AI models.
3. Store insights and metadata in Vision Warehouse for simplified querying.
Vertex AI Vision workflow:
The steps you complete to use Vertex AI Vision are as follows:
1. Ingest real-time data: Vertex AI Vision's architecture allows you to quickly and conveniently set up real-time video ingestion infrastructure in the public cloud.
2. Analyze Data: After data is ingested, Vertex AI Vision's framework provides you with easy access and orchestration of a large and growing portfolio of general, custom, and specialized analysis models.
3. Store and query output: After your app analyzes your data, you can send this information to a storage destination (Vision Warehouse or BigQuery) or receive the data live. With Vision Warehouse, you can send your app output to a warehouse that generalizes your search across multiple data types and use cases.
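As a sketch of the BigQuery destination mentioned above, the snippet below builds and runs a query over app output exported to BigQuery. The table name, column names, and confidence threshold are assumptions for illustration, not the actual export schema.

```python
"""Minimal sketch: querying app output that a Vertex AI Vision app exported
to BigQuery. Table and column names below are assumptions for illustration."""


def build_annotation_query(table: str, min_confidence: float) -> str:
    """Build a SQL query over a hypothetical annotations table."""
    return (
        "SELECT ingestion_time, annotation "
        f"FROM `{table}` "
        f"WHERE confidence >= {min_confidence:.2f} "
        "ORDER BY ingestion_time DESC LIMIT 100"
    )


def fetch_recent_annotations(table: str, min_confidence: float = 0.5):
    """Run the query; requires google-cloud-bigquery and GCP credentials."""
    from google.cloud import bigquery  # lazy import keeps the module importable

    client = bigquery.Client()
    return list(client.query(build_annotation_query(table, min_confidence)))


if __name__ == "__main__":
    print(build_annotation_query("my-project.vision.annotations", 0.5))
```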
Building a scalable MLOps system with Vertex AI AutoML and Pipeline
When you build a Machine Learning (ML) product, consider at least two MLOps scenarios. First, the model is replaceable, as breakthrough algorithms are introduced in academia or industry. Second, the model itself has to evolve with the data in the changing world.
Google can handle both scenarios with the services provided by Vertex AI. For example:
1. The AutoML capability automatically identifies the best model based on your budget, data, and settings.
2. You can easily manage the dataset with Vertex Managed Datasets by creating a new dataset or adding data to an existing dataset.
3. You can build an ML pipeline with Vertex Pipelines to automate a series of steps that start with importing a dataset and end with deploying a model.
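The three capabilities above can be sketched with the Vertex AI Python SDK (google-cloud-aiplatform). The project, region, bucket, display names, and span-folder convention here are assumptions for illustration:

```python
"""Hedged sketch: Managed Dataset creation plus AutoML training with the
Vertex AI Python SDK. Names and the span-folder layout are assumptions."""


def gcs_source_for_span(bucket: str, span: int) -> str:
    """Path convention assumed in this article: one folder per dataset span."""
    return f"gs://{bucket}/span-{span}/data.csv"


def train_automl_for_span(project: str, region: str, bucket: str, span: int):
    """Create a Managed Dataset from the span folder, then train with AutoML."""
    from google.cloud import aiplatform  # lazy import; needs GCP credentials

    aiplatform.init(project=project, location=region)

    dataset = aiplatform.ImageDataset.create(
        display_name=f"my-dataset-span-{span}",  # assumed naming scheme
        gcs_source=gcs_source_for_span(bucket, span),
        import_schema_uri=aiplatform.schema.dataset.ioformat.image.single_label_classification,
    )
    job = aiplatform.AutoMLImageTrainingJob(
        display_name="automl-image-train",
        prediction_type="classification",
    )
    # AutoML searches for the best model within the given budget
    # (expressed in milli node hours).
    return job.run(dataset=dataset, budget_milli_node_hours=8000)
```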
Components:
Vertex AI is at the heart of this system, leveraging Vertex Managed Datasets, AutoML, Predictions, and Pipelines. We can create and manage a dataset as it grows using Vertex Managed Datasets. Vertex AutoML selects the best model without requiring you to know much about modeling. Vertex Predictions creates an endpoint (a REST API) with which the client communicates.
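A client talking to such a Vertex Predictions endpoint might look like the following sketch. The endpoint ID and the instance payload shape are assumptions, since the expected request schema depends on the deployed model:

```python
"""Minimal sketch of a client calling a deployed Vertex Predictions endpoint.
The endpoint ID and payload shape are assumptions for illustration."""


def format_instances(rows):
    """Shape raw client data into the JSON instances the endpoint expects
    (the key name 'values' is an assumed schema)."""
    return [{"values": list(row)} for row in rows]


def predict(project: str, region: str, endpoint_id: str, rows):
    """POST instances to the endpoint's REST API via the SDK."""
    from google.cloud import aiplatform  # lazy import; needs GCP credentials

    aiplatform.init(project=project, location=region)
    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_id)
    return endpoint.predict(instances=format_instances(rows))
```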
This simple, fully managed, yet reasonably complete end-to-end MLOps workflow moves from a dataset to training a model that gets deployed. The workflow can be written programmatically with Vertex Pipelines. Vertex Pipelines outputs the specification for an ML pipeline, allowing you to re-run the pipeline whenever and wherever you want. You specify when and how to trigger the pipeline using Cloud Functions and Cloud Storage.
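The compile-once, re-run-anywhere idea can be sketched with the KFP SDK and the Vertex AI Python SDK. The pipeline name, the `span` parameter, and the spec path are assumptions:

```python
"""Hedged sketch: compile a pipeline spec, then re-run it on Vertex
Pipelines on demand. Names and parameters are assumptions."""


def pipeline_spec_path(name: str) -> str:
    """Local file where the reusable pipeline specification is written."""
    return f"{name}.json"


def compile_pipeline(pipeline_func, name: str) -> str:
    """Serialize the pipeline into a spec file you can re-run anywhere."""
    from kfp import compiler  # lazy import; requires the kfp package

    path = pipeline_spec_path(name)
    compiler.Compiler().compile(pipeline_func=pipeline_func, package_path=path)
    return path


def run_pipeline(project: str, region: str, spec_path: str, span: int):
    """Submit a run of the compiled spec, e.g. when a new data span arrives."""
    from google.cloud import aiplatform  # lazy import; needs GCP credentials

    aiplatform.init(project=project, location=region)
    aiplatform.PipelineJob(
        display_name="automl-mlops-pipeline",
        template_path=spec_path,
        parameter_values={"span": span},  # assumed pipeline parameter
    ).submit()
```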
Cloud Functions is a serverless way to deploy your code in Google Cloud. In this particular project, it triggers the pipeline by listening for changes at the specified Cloud Storage location. Specifically, when a new dataset is added (for example, a new span number is created), the pipeline is triggered to train on the dataset, and a new model is deployed.
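A minimal sketch of such a Cloud Function (fired on object finalize in the watched bucket) might look like this. The project, spec path, and span-folder naming are assumptions:

```python
"""Sketch of a GCS-triggered Cloud Function that launches the pipeline when
a new span folder appears. Project, spec path, and naming are assumptions."""

import re


def parse_span(object_name: str):
    """Extract the span number from paths like 'span-3/data.csv', else None."""
    match = re.match(r"span-(\d+)/", object_name)
    return int(match.group(1)) if match else None


def trigger_pipeline(event, context):
    """Entry point: runs on every new object finalized in the bucket."""
    span = parse_span(event["name"])
    if span is None:
        return  # ignore objects outside span folders

    from google.cloud import aiplatform  # lazy import; needs GCP credentials

    aiplatform.init(project="my-project", location="us-central1")  # assumed
    aiplatform.PipelineJob(
        display_name=f"train-span-{span}",
        template_path="gs://my-bucket/specs/pipeline.json",  # assumed
        parameter_values={"span": span},
    ).submit()
```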
Workflow:
In this MLOps system, you prepare the dataset with either Vertex Dataset's built-in user interface (UI) or any external tool of your preference. You then upload the prepared dataset into the designated GCS bucket, under a new folder named SPAN-NUMBER. Cloud Functions detects the change in the GCS bucket and triggers the Vertex Pipeline to run the jobs, from AutoML training to endpoint deployment.
Inside the Vertex Pipeline, the first step checks whether a dataset was created previously. If the dataset is new, the Vertex Pipeline creates a new Vertex Dataset by importing the data from the GCS location and emits the corresponding artifact. Otherwise, it adds the additional data to the existing Vertex Dataset and emits an artifact.
When the Vertex Pipeline recognizes the dataset as new, it trains a new AutoML model and deploys it by creating a new endpoint. If the dataset isn't new, it retrieves the model ID from Vertex Model and determines whether a new AutoML model or an updated AutoML model is needed. The second branch checks whether an AutoML model has been created before; if it hasn't, the branch creates a new model. Also, when a model is trained, the corresponding component emits an artifact as well.
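The two-level branch described above can be sketched as plain Python decision logic; in the real pipeline, each branch is expressed as a conditional pipeline step. The branch names here are assumptions:

```python
"""Plain-Python sketch of the pipeline's branching logic. In the actual
Vertex Pipeline these branches are conditional steps; names are assumptions."""


def decide_branch(dataset_exists: bool, model_exists: bool) -> str:
    """Mirror the two-level branch on dataset and model state."""
    if not dataset_exists:
        # Brand-new dataset: train an AutoML model, create a new endpoint.
        return "train_and_deploy_new"
    if not model_exists:
        # Dataset known but no model yet: the second branch creates one.
        return "train_new_model"
    # Dataset and model both exist: update the existing AutoML model.
    return "update_existing_model"


if __name__ == "__main__":
    for flags in [(False, False), (True, False), (True, True)]:
        print(flags, "->", decide_branch(*flags))
```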
Conclusion:
Many people underestimate the capability of AutoML, but it is a great alternative for app and service developers who have little ML background. Vertex AI is a great platform that provides AutoML as well as Pipeline features to automate the ML workflow. In this article, we have demonstrated how to set up and run a basic MLOps workflow, from data ingestion, to training a model based on the previously achieved best one, to deploying the model to the Vertex AI platform. With this, your ML model can automatically adapt to changes in a new dataset. What's left for you to implement is a model monitoring system to detect data/model drift.