In this article, I'll show you how to deploy and promote a machine-learning model from the MLflow model registry using GitHub Actions. Here are the assumptions I make for the sake of a controlled environment:
- I select the best model based on an evaluation metric (e.g. R2, precision, recall, MAP@K). This means that whenever the deployment runs, it selects the run with the best possible score (and does nothing if the best model is already the deployed model). This is not always the best idea; in some cases you may want to run a shadow deployment or A/B test first, before promoting a model to production.
- I'll use MLflow 2.12.1. Specifically, I'm interested in the aliases feature MLflow introduced in v2.8.0, which gives us the flexibility to organize, deploy, and promote models; the MLflow documentation covers it in detail.
- Since we don't use the classic Staging, Production, and Archived stages from the MLflow model registry, we'll assign our own alias called champion.
Nice! Now we can start writing the Python code with the logic for promoting the model, plus the GitHub Action for automation. The logic and rules for these steps are the following:
- Search for the best run in an experiment (or a list of experiments)
- Select the min/max score according to your ML metric (in this case, R2 from a regression model)
- Assign the champion alias to the new best model (if any new model is better than the current champion)
- Run this Python script in a GitHub Action, which in my opinion could be triggered in four different ways:
  - Push to the main branch
  - On a schedule
  - By an external event (e.g. a model-degradation alert)
  - Manually
MLflow Python Module
NOTE: Keep in mind I'm assuming you already have your model experiments tracked with MLflow. If not, check the hello-world example in the MLflow docs and come back here after you finish that reading.
Make sure you have your credentials for the MLflow server. I'd recommend DagsHub as a free MLflow server you can experiment with. MLflow looks for the `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` environment variables. I also defined `MLFLOW_TRACKING_URI` as an environment variable that I can read later in the Python code.
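For example, on Linux/macOS you can export the three variables in your shell before running the script. The values below are placeholders; DagsHub tracking URIs follow the `https://dagshub.com/<user>/<repo>.mlflow` pattern.

```shell
# Placeholder values -- replace with your own server and credentials.
export MLFLOW_TRACKING_URI="https://dagshub.com/your-user/your-repo.mlflow"
export MLFLOW_TRACKING_USERNAME="your-user"
export MLFLOW_TRACKING_PASSWORD="your-token"
```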
```python
import os
from typing import List, Tuple

import mlflow
from loguru import logger
from mlflow import MlflowClient
from mlflow.entities.model_registry.model_version import ModelVersion

MLFLOW_TRACKING_URI = os.getenv("MLFLOW_TRACKING_URI")
MLFLOW_EXPERIMENT = "MLflow Medium Article Experiment"  # Defined previously by you
MLFLOW_MODEL_NAME = "regression_model_medium_article"
CHAMPION_MODEL_ALIAS = "champion"


class ModelManager:
    """Model Manager for automatic deployment."""

    def __init__(
        self,
        mlflow_tracking_uri: str,
        mlflow_experiment_name: str,
        mlflow_model_name: str,
    ):
        self.mlflow_tracking_uri = mlflow_tracking_uri
        self.mlflow_experiment_name = mlflow_experiment_name
        self.mlflow_model_name = mlflow_model_name
        self.client = MlflowClient()

    def search_best_model(
        self, experiment_names: List[str], metric_name: str = "r2"
    ) -> Tuple[str, str]:
        """Search the best run ID of the given experiments."""
        logger.info("Searching best model...")
        runs_ = mlflow.search_runs(experiment_names=experiment_names)
        # R2 is higher-is-better; use idxmin() instead for error metrics like RMSE.
        best_run = runs_.loc[runs_[f"metrics.{metric_name}"].idxmax()]
        return best_run["run_id"], f"{best_run['artifact_uri']}/model"

    def promote_model(self, run_id: str, model_name: str) -> ModelVersion:
        """Register the model artifact logged under the given run."""
        return mlflow.register_model(
            model_uri=f"runs:/{run_id}/model", name=model_name
        )

    def run_deploy(self, run_id: str, model_name: str) -> None:
        """Register the best model and point the champion alias at it."""
        _new_model = self.promote_model(run_id, model_name)
        if _new_model.version == "1":
            logger.info("First model version, setting as champion model.")
            self.client.set_registered_model_alias(
                model_name, CHAMPION_MODEL_ALIAS, _new_model.version
            )
        else:
            logger.info(
                f"New model version: v{_new_model.version}, "
                "verifying if the model differs from the current champion."
            )
            champ_model = self.client.get_model_version_by_alias(
                model_name, CHAMPION_MODEL_ALIAS
            )
            if run_id == champ_model.run_id:
                logger.info(
                    "Best model found is already the champion model, "
                    "no need to update. Exiting."
                )
            else:
                logger.info("Best model is not the champion model, promoting new model.")
                self.client.set_registered_model_alias(
                    model_name, CHAMPION_MODEL_ALIAS, _new_model.version
                )


if __name__ == "__main__":
    logger.info("Starting Automatic Model Deployment...")
    mlflow.set_tracking_uri(uri=MLFLOW_TRACKING_URI)
    mlflow.set_experiment(experiment_name=MLFLOW_EXPERIMENT)
    manager = ModelManager(
        mlflow_tracking_uri=MLFLOW_TRACKING_URI,
        mlflow_experiment_name=MLFLOW_EXPERIMENT,
        mlflow_model_name=MLFLOW_MODEL_NAME,
    )
    best_run_id, best_run_art_uri = manager.search_best_model(
        experiment_names=[MLFLOW_EXPERIMENT]
    )
    manager.run_deploy(run_id=best_run_id, model_name=MLFLOW_MODEL_NAME)
    logger.info("Automatic Deployment applied successfully.")
```
To test the .py file, you can run:

```shell
python mlflow_model_deployment.py
```
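Once a run holds the champion alias, downstream code can load it through MLflow's alias-based model URI, `models:/<name>@<alias>`, available since v2.8.0. A small sketch: the helper function below is my own illustration (not part of the deployment script), and the actual load call is commented out because it needs a live tracking server with the registered model.

```python
def champion_model_uri(name: str, alias: str = "champion") -> str:
    # MLflow resolves models:/<name>@<alias> to whichever version
    # the alias currently points at.
    return f"models:/{name}@{alias}"

uri = champion_model_uri("regression_model_medium_article")
print(uri)  # models:/regression_model_medium_article@champion

# With a tracking server configured, you could then load it:
# import mlflow
# model = mlflow.pyfunc.load_model(uri)
```

Because consumers resolve the alias at load time, promoting a new champion requires no change on their side.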
Automatic Deployment GitHub Action
For the GitHub Action you need, of course, a GitHub repository. Assuming you already solved the modeling part, commit the Python code at the root of the repository as `mlflow_model_deployment.py` and create the GitHub Action as a .yml file in the `.github/workflows` folder. Make sure you save your MLflow credentials and host URL as GitHub variables and secrets.
Additionally, I trigger the Action with two events, push to main and manual dispatch, but feel free to change them if you need a better trigger for your model deployment.
```yaml
# Automatic deployment workflow
name: ML Auto Deployment

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  auto_deployment:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install mlflow==2.12.1 loguru

      - name: Run Automatic deployment
        env:
          MLFLOW_TRACKING_URI: ${{ vars.MLFLOW_TRACKING_URI }}
          MLFLOW_TRACKING_USERNAME: ${{ secrets.MLFLOW_TRACKING_USERNAME }}
          MLFLOW_TRACKING_PASSWORD: ${{ secrets.MLFLOW_TRACKING_PASSWORD }}
        run: |
          python mlflow_model_deployment.py
```
And there it is! A simple example you can use or improve for your model deployments with the MLflow model registry and the capabilities of GitHub Actions. I hope you enjoy replicating this exercise as much as I enjoyed writing it.