Automatic Model Deployment with MLflow and GitHub Actions

Hello There!

Miguel Arquez Abdala

~4 min read · May 4, 2024 (Updated: May 4, 2024) · Free: Yes

In this article, I'm going to show you how to deploy/promote a machine-learning model from the MLflow model registry using GitHub Actions. Here are some assumptions I make for the sake of a controlled environment:

I'm selecting the best model based on an evaluation metric (e.g. R2, precision, recall, MAP@K, etc.). This means that whenever the deployment runs, it selects the run with the best score (and does nothing if the best model is already the deployed model). This is probably not the best idea in some cases; you may prefer to run a shadow deployment or an A/B test first, before promoting a model to production.
I'll use MLflow 2.12.1. Specifically, I'm interested in the Aliases feature that MLflow has offered since v2.8.0. The MLflow documentation covers this feature, which gives us the flexibility to organize, deploy, and promote models.

source: mlflow UI update

As we don't use the classic Staging, Production, and Archived labels from the MLflow model registry, we're going to assign our own label, called champion.
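To make the alias idea concrete, here is a minimal sketch of how a consumer could load whichever version currently holds the champion alias. The helper names `alias_uri` and `load_champion` are hypothetical (they are not part of MLflow); the `models:/<name>@<alias>` URI form and `mlflow.pyfunc.load_model` are the MLflow pieces the sketch relies on.

```python
def alias_uri(model_name: str, alias: str = "champion") -> str:
    """Build the registry URI that resolves to the version behind an alias."""
    return f"models:/{model_name}@{alias}"


def load_champion(model_name: str, alias: str = "champion"):
    """Load the aliased model as a generic pyfunc model (hypothetical helper)."""
    # Imported lazily so building the URI does not require MLflow to be installed.
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(alias_uri(model_name, alias))
```

Because the caller only knows the alias, moving champion to another version changes what gets loaded without touching the consuming code.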

Nice. Now we can start writing the Python code with the promotion logic and the GitHub Action that automates it. The logic and rules for these steps are the following:

Search for the best run in an experiment or a list of experiments.
Select the min/max score according to your ML metric (in this case we'll use R2 from a regression model, so higher is better).
Assign the champion label to the new best model (if any new model is better than the current champion).
Run this Python script in a GitHub Action, which in my opinion could be triggered in four different ways:
- Push to the main branch
- Scheduled
- Triggered by an external event (e.g. a model degradation alert)
- Manually
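The first two steps above can be sketched by letting the tracking server do the sorting. This is only an illustration under a few assumptions: the helper names `order_by_clause` and `find_best_run` are hypothetical, the metric name is plain enough not to need quoting, and `higher_is_better` has to be set by you (True for R2, False for an error metric such as RMSE).

```python
from typing import List, Optional, Tuple


def order_by_clause(metric_name: str, higher_is_better: bool = True) -> str:
    """Build the order_by expression for mlflow.search_runs."""
    direction = "DESC" if higher_is_better else "ASC"
    return f"metrics.{metric_name} {direction}"


def find_best_run(
    experiment_names: List[str],
    metric_name: str = "r2",
    higher_is_better: bool = True,
) -> Optional[Tuple[str, float]]:
    """Return (run_id, score) of the best run, or None if nothing qualifies."""
    # Imported lazily so order_by_clause can be used without MLflow installed.
    import mlflow

    runs = mlflow.search_runs(
        experiment_names=experiment_names,
        order_by=[order_by_clause(metric_name, higher_is_better)],
        max_results=1,
    )
    column = f"metrics.{metric_name}"
    # No runs at all, or no run ever logged this metric.
    if runs.empty or column not in runs.columns:
        return None
    best = runs.iloc[0]
    return best["run_id"], float(best[column])
```

Asking for a single sorted result avoids downloading every run just to pick one, which starts to matter once an experiment holds many runs.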

MLflow Python Module

NOTE: Keep in mind I'm assuming you already have your model experiments tracked with MLflow. If not, check the hello-world example from the MLflow docs and come back here after you finish that reading.

Make sure you have your credentials for the MLflow server. I'd recommend DagsHub as a free MLflow server you can experiment with. MLflow will look for the MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD environment variables. I also defined MLFLOW_TRACKING_URI as an environment variable that I read later in the Python code.
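A missing variable usually shows up as a confusing authentication or connection error, so it can help to fail early. The following is a small sketch of such a check; the `REQUIRED_VARS` tuple and the `missing_mlflow_vars` helper are names I made up for illustration, not something MLflow provides.

```python
import os
from typing import List, Mapping

# The three variables this article relies on.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_mlflow_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

At the top of the deployment script you could call `missing_mlflow_vars()` and stop with a clear message if the returned list is not empty.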

import os
from typing import List, Optional, Tuple

import mlflow
from loguru import logger
from mlflow import MlflowClient
from mlflow.entities.model_registry.model_version import ModelVersion


MLFLOW_TRACKING_URI = os.getenv("MLFLOW_TRACKING_URI")
MLFLOW_EXPERIMENT = "MLflow Medium Article Experiment"  # Defined previously by you
MLFLOW_MODEL_NAME = "regression_model_medium_article"

CHAMPION_MODEL_ALIAS = "champion"


class ModelManager:
    """Model manager for automatic deployment."""

    def __init__(
        self,
        mlflow_tracking_uri: str,
        mlflow_experiment_name: str,
        mlflow_model_name: str,
    ):
        self.mlflow_tracking_uri = mlflow_tracking_uri
        self.mlflow_experiment_name = mlflow_experiment_name
        self.mlflow_model_name = mlflow_model_name
        # Point the registry client at the same server used for tracking.
        self.client = MlflowClient(tracking_uri=mlflow_tracking_uri)

    def search_best_model(
        self, experiment_names: Optional[List[str]] = None, metric_name: str = "r2"
    ) -> Tuple[str, str]:
        """Return the run ID and model artifact URI of the best run."""
        logger.info("Searching best model...")
        # Fall back to the manager's own experiment when none is given.
        names = experiment_names or [self.mlflow_experiment_name]
        runs_ = mlflow.search_runs(experiment_names=names)
        # idxmax picks the highest score, which suits R2; use idxmin for error metrics.
        best_run = runs_.loc[runs_[f"metrics.{metric_name}"].idxmax()]

        return best_run["run_id"], f"{best_run['artifact_uri']}/model"

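    def promote_model(self, run_id: str, artifact_uri: str, model_name: str) -> ModelVersion:
        """Register the run's model as a new version in the model registry."""
        # A runs:/ URI takes a path relative to the run's artifact root, so keep
        # only the last segment ("model") of the full artifact URI.
        artifact_path = artifact_uri.rstrip("/").rsplit("/", 1)[-1]
        return mlflow.register_model(
            model_uri=f"runs:/{run_id}/{artifact_path}", name=model_name
        )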

    def run_deploy(
        self,
        run_id: str,
        artifact_uri: str,
        model_name: str,
    ) -> None:
        """Register the best run and point the champion alias at it if needed."""
        # Note: this registers a new version on every call, even if the champion stays the same.
        _new_model = self.promote_model(run_id, artifact_uri, model_name)
        logger.info(f"Registered model version: {_new_model}")
        # Compare as strings so the check works whether the version is "1" or 1.
        if str(_new_model.version) == "1":
            logger.info("First model version, setting as champion model.")
            self.client.set_registered_model_alias(
                model_name, CHAMPION_MODEL_ALIAS, _new_model.version
            )
        else:
            logger.info(
                f"New model version: v{_new_model.version}. Verifying if the model is different from the current champion."
            )
            champ_model = self.client.get_model_version_by_alias(
                model_name, CHAMPION_MODEL_ALIAS
            )
            # Use the run_id argument rather than a module-level variable.
            if run_id == champ_model.run_id:
                logger.info(
                    "Best model found is already champion model, no need to update. Exiting."
                )
            else:
                logger.info("Best model is not champion model. Promoting new model.")
                self.client.set_registered_model_alias(
                    model_name, CHAMPION_MODEL_ALIAS, _new_model.version
                )

if __name__ == "__main__":
    logger.info("Starting Automatic Model Deployment...")
    mlflow.set_tracking_uri(uri=MLFLOW_TRACKING_URI)
    # Select the experiment by its own name, not by the registered model name.
    mlflow.set_experiment(experiment_name=MLFLOW_EXPERIMENT)
    manager = ModelManager(
        mlflow_tracking_uri=MLFLOW_TRACKING_URI,
        mlflow_experiment_name=MLFLOW_EXPERIMENT,
        mlflow_model_name=MLFLOW_MODEL_NAME,
    )

    best_run_id, best_run_art_uri = manager.search_best_model(
        experiment_names=[MLFLOW_EXPERIMENT]
    )

    manager.run_deploy(
        run_id=best_run_id,
        artifact_uri=best_run_art_uri,
        model_name=MLFLOW_MODEL_NAME,
    )

    logger.info("Automatic Deployment applied successfully.")

To test the .py file you can run:

python mlflow_model_deployment.py
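One thing to watch in the script above is that it registers a new model version on every run, even when the best run is already the champion. A possible refinement, sketched here under assumptions, is to look up the champion first and register only when it would change. The function names are hypothetical, the model is assumed to be logged under the "model" artifact path, and treating every MlflowException as "no champion yet" is a simplification.

```python
from typing import Optional


def get_champion_run_id(model_name: str, alias: str = "champion") -> Optional[str]:
    """Return the run ID behind `alias`, or None if the model or alias is missing."""
    # Imported lazily so the decision helper below has no MLflow dependency.
    from mlflow import MlflowClient
    from mlflow.exceptions import MlflowException

    try:
        return MlflowClient().get_model_version_by_alias(model_name, alias).run_id
    except MlflowException:
        # Assumption: any MLflow error here means "no champion yet". A stricter
        # version could inspect the exception's error_code before deciding.
        return None


def needs_promotion(best_run_id: str, champion_run_id: Optional[str]) -> bool:
    """True when there is no champion yet or the best run differs from it."""
    return champion_run_id is None or champion_run_id != best_run_id


def deploy_if_needed(best_run_id: str, model_name: str, alias: str = "champion") -> bool:
    """Register and promote `best_run_id` only when the champion would change."""
    if not needs_promotion(best_run_id, get_champion_run_id(model_name, alias)):
        return False

    import mlflow
    from mlflow import MlflowClient

    # Assumes the model was logged under the "model" artifact path.
    new_version = mlflow.register_model(f"runs:/{best_run_id}/model", model_name)
    MlflowClient().set_registered_model_alias(model_name, alias, new_version.version)
    return True
```

With this ordering, repeated runs of the workflow leave the registry untouched until a better run actually appears.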

Automatic Deployment GitHub Action

For the GitHub Action you need, of course, a GitHub repository. Assuming you have already solved the modeling part, you can commit the Python code to the root of the repository as mlflow_model_deployment.py and create the GitHub Action in the .github/workflows folder as a .yml file. Make sure you save your MLflow credentials and host URL in GitHub Variables and Secrets.

Additionally, I'm triggering the Action with two events, push to main and manual dispatch, but feel free to change them if you need a better trigger for your model deployment.

# Automatic deployment workflow
name: ML Auto Deployment

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  shadow_deployment:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install mlflow==2.12.1 loguru

      - name: Run Automatic deployment
        env:
          MLFLOW_TRACKING_URI: ${{ vars.MLFLOW_TRACKING_URI }}
          MLFLOW_TRACKING_USERNAME: ${{ secrets.MLFLOW_TRACKING_USERNAME }}
          MLFLOW_TRACKING_PASSWORD: ${{ secrets.MLFLOW_TRACKING_PASSWORD }}
        run: |
          python mlflow_model_deployment.py
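Since the workflow accepts workflow_dispatch, an external system (for example a monitoring job that detects model degradation) could start it through GitHub's REST API. Below is a rough sketch using only the standard library. The function names are hypothetical, the owner, repository, and workflow file name are placeholders you would supply, and the token is assumed to have permission to run Actions on the repository.

```python
import json
import urllib.request


def build_dispatch_request(
    owner: str, repo: str, workflow_file: str, token: str, ref: str = "main"
) -> urllib.request.Request:
    """Build the POST request that asks GitHub to run a workflow on `ref`."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": ref}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def trigger_deployment(owner: str, repo: str, workflow_file: str, token: str) -> int:
    """Send the dispatch request and return the HTTP status code."""
    request = build_dispatch_request(owner, repo, workflow_file, token)
    with urllib.request.urlopen(request) as response:
        # GitHub answers 204 No Content when the dispatch is accepted.
        return response.status
```

Keeping the request construction separate from the network call makes it easy to check the URL and payload before wiring this into an alerting system.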

And there it is! One simple example you can use or improve for your model deployments with the MLflow model registry and the capabilities of GitHub Actions! I hope you enjoy replicating this exercise as much as I enjoyed writing it.

#mlflow #github-actions #mlops #deployment
