Introduction to MLflow Model Registry

In the previous post we saw how to track an ML model with MLflow's tracking APIs. In this one we'll see how to use the MLflow Model Registry.

The MLflow Model Registry component is a centralized model store, set of APIs, and UI, to collaboratively manage the full lifecycle of an MLflow Model. It provides model lineage (which MLflow experiment and run produced the model), model versioning, model aliasing, model tagging, and annotations.


Imagine you're employed as an engineer at a company, where your manager has assigned you the responsibility of devising a system for training models and storing them in a centralized repository. This repository will serve as a crucial resource for DevOps, enabling them to retrieve and deploy your models into production seamlessly.

To do this, you can use MLflow's tracking and model registry components. The model registry stores your models in a central location, which makes them easy to access and manage.

Let's delve into how you can successfully execute this task using MLflow.


For this illustration, let's consider a scenario where we develop a random forest classifier model tailored to classify different types of iris flowers.
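The snippets in this post use X_train, X_test, y_train and y_test without showing where they come from. A minimal sketch of the data preparation, assuming scikit-learn's bundled iris dataset and an 80/20 split (the helper name, the split ratio and the seed are illustrative choices, not taken from the original code):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


def load_iris_split(test_size=0.2, seed=42):
    """Load the iris dataset and split it into train and test sets."""
    X, y = load_iris(return_X_y=True)
    # stratify keeps the three species balanced across both splits
    return train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )


X_train, X_test, y_train, y_test = load_iris_split()
```

Fixing the seed keeps the split reproducible, so runs that differ only in hyperparameters are evaluated on the same test set.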

Hyperparameters

We'll establish the hyperparameters for our random forest classifier.

n_estimators = 10
max_depth = 5

Model Tracking

During training, we'll use MLflow tracking to record the run. For an in-depth look at MLflow tracking, refer to the blog post dedicated to that topic.

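# tracking
import mlflow
from sklearn.ensemble import RandomForestClassifier

with mlflow.start_run(experiment_id=experiment_id) as run:
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth
    )

    # training
    model.fit(X=X_train, y=y_train)

    # predictions
    predict = model.predict(X=X_test)

    # record the hyperparameters with the run
    mlflow.log_params({"n_estimators": n_estimators, "max_depth": max_depth})

    # log the model under log_model_path so the registry can reference it later
    mlflow.sklearn.log_model(model, log_model_path)

    # keep the run ID for the registration step
    run_id = run.info.run_id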

Model Registry

Having successfully trained and tracked our model, the next step is to register it with the MLflow model registry. This action ensures that the model is stored in a centralized location, typically where your MLflow server is hosted. Below is the code snippet to register your model:

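from mlflow.tracking import MlflowClient

client = MlflowClient()

# create an empty registered model (no versions yet)
client.create_registered_model(name=registered_model_name)
# create a version from the model logged in the run
client.create_model_version(
    name=registered_model_name,
    source=f"runs:/{run_id}/{log_model_path}",
    run_id=run_id
)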

We assign a specific name under which our model will be registered.

client.create_registered_model creates a registered model with no versions attached, and client.create_model_version then adds a version to it. The first version of a registered model is numbered 1. The registered model and its version are visible in the MLflow UI.
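Note that create_registered_model raises an error when the name is already taken, so running the registration snippet above a second time fails. A hedged sketch of a re-runnable variant follows. The helper name register_run_model is hypothetical, and client is assumed to be an MlflowClient (or any object exposing the same three methods):

```python
def register_run_model(client, name, run_id, artifact_path):
    """Register the model logged by a run, creating the registered model if needed.

    `client` is assumed to behave like mlflow.tracking.MlflowClient.
    """
    # look the name up first, since creating an existing name raises an error
    existing = client.search_registered_models(filter_string=f"name='{name}'")
    if not list(existing):
        client.create_registered_model(name=name)

    # every call adds a new version under the same registered model name
    return client.create_model_version(
        name=name,
        source=f"runs:/{run_id}/{artifact_path}",
        run_id=run_id,
    )
```

Against a real server this would be called as register_run_model(MlflowClient(), registered_model_name, run_id, log_model_path). MLflow also provides mlflow.register_model(model_uri, name), which creates the registered model when it is missing and adds a version in a single call.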


With the first task done, your manager assigns a new one: implement a way to update the model so that the latest version, with all bug fixes, is always available.

The model registry helps here too. It versions registered models: once you've fixed an issue and retrained, you register the result under the same name and it receives a new version number. Your DevOps team can then retrieve the latest version and deploy it to production.

Registering a new version of the model is straightforward, as the code below shows:

# register a new version from the new run
client.create_model_version(
    name=registered_model_name,
    source=f"runs:/{run_id}/{log_model_path}",
    run_id=run_id
)

In the provided code snippet, registered_model_name corresponds to the name of our previously registered model.

The run_id refers to the updated model's new run ID.

The MLflow UI reflects these changes after updating the model version.
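Once several versions exist, the DevOps side needs a way to pick the newest one. A minimal sketch under stated assumptions: the helper name latest_model_uri is hypothetical, and client is again assumed to behave like an MlflowClient.

```python
def latest_model_uri(client, name):
    """Return a models:/ URI that points at the newest version of `name`.

    `client` is assumed to behave like mlflow.tracking.MlflowClient.
    """
    versions = client.search_model_versions(f"name='{name}'")
    if not versions:
        raise ValueError(f"no versions registered for {name!r}")
    # compare versions numerically: as strings, "10" would sort before "2"
    newest = max(int(v.version) for v in versions)
    return f"models:/{name}/{newest}"
```

The returned URI can be passed to mlflow.pyfunc.load_model to load that version for inference. Resolving the version number explicitly, rather than loading whatever is newest at call time, also gives the deployment a concrete version to record.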

The complete code for our imaginary task is accessible here.