An MLflow tracking server contains two components: a backend store, which holds experiment and run metadata such as parameters and metrics, and an artifact store, which holds files such as models and other logged artifacts.
In this section, we will show you how to store the backend data in a database. In a production-ready system, the backend data should not stay on the local filesystem: storing it in a database keeps it in a safe place and makes it straightforward to back up.
MLflow uses a SQLAlchemy database URI as its connection string. You can check the following page for the list of supported databases:
The SQLAlchemy documentation also explains how to create the database engine:
In this tutorial, we will use PostgreSQL as our example.
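For reference, the backend store URI follows the standard SQLAlchemy pattern dialect+driver://username:password@host:port/database. The following is a minimal sketch of pointing mlflow server at such a URI from the command line, assuming a PostgreSQL instance is reachable at <endpoint-url> with the default postgres user and database; the password, host, and local artifact path are placeholders:

$ mlflow server \
    --backend-store-uri postgresql+psycopg2://postgres:changeme@<endpoint-url>:5432/postgres \
    --default-artifact-root ./mlruns \
    --host 0.0.0.0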
Step 1: Install MinIO storage to store the MLflow artifacts:
$ docker run -d -p 9000:9000 -p 9001:9001 \
    -e MINIO_ACCESS_KEY=minioaccount \
    -e MINIO_SECRET_KEY=miniopassword \
    -v /mnt/data:/data \
    -v /mnt/config:/root/.minio \
    --name minio-storage-platform \
    minio/minio server /data --console-address ":9001"
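To confirm that the container started correctly and the S3 API is reachable, you can check the container status and MinIO's liveness endpoint; <IP-address> is a placeholder for the Docker host:

$ docker ps --filter name=minio-storage-platform
$ curl -I http://<IP-address>:9000/minio/health/live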
Step 2: Open a web browser and log in to the MinIO platform at <IP-address>:9000.
Step 3: Create the mlflow bucket.
Step 4: You can see the result on the Buckets page.
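If you prefer the command line to the web console, the same bucket can be created with the MinIO client; the alias name local is arbitrary, and the credentials must match the ones passed to the container in Step 1:

$ mc alias set local http://<IP-address>:9000 minioaccount miniopassword
$ mc mb local/mlflow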
Step 1: Use a Dockerfile to build the MLflow server environment.
# Base image with a slim Python 3.8 runtime
FROM python:3.8.7-slim
# Install the MySQL and PostgreSQL drivers plus MLflow with its extras
RUN pip install PyMySQL==1.0.2 && \
    pip install psycopg2-binary==2.8.6 && \
    pip install mlflow[extras]==1.30.0
# Create a writable local directory for run data
RUN mkdir -p /data/mlflow/mlruns && \
    chmod 777 /data/mlflow/mlruns
# Start the tracking server: artifacts go to the mlflow bucket,
# metadata goes to the PostgreSQL backend store
ENTRYPOINT ["mlflow", "server", \
    "--host", "0.0.0.0", \
    "--default-artifact-root", "s3://mlflow", \
    "--backend-store-uri", \
    "postgresql+psycopg2://postgres:changeme@<endpoint-url>:5432/postgres" \
    ]
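A sketch of building and running this image, assuming the Dockerfile above is in the current directory; MLFLOW_S3_ENDPOINT_URL, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY point the server's S3 client at the MinIO instance and must match the MinIO credentials, while the image name and published port are placeholders (MLflow listens on 5000 by default):

$ docker build -t mlflow-server .
$ docker run -d -p 5000:5000 \
    -e MLFLOW_S3_ENDPOINT_URL=http://<IP-address>:9000 \
    -e AWS_ACCESS_KEY_ID=minioaccount \
    -e AWS_SECRET_ACCESS_KEY=miniopassword \
    --name mlflow-server \
    mlflow-server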