# Deployment

## Docker Deployment
The project includes a Docker setup for deploying the inference API. The Docker image automatically downloads the best model from WandB during the build process.
### How It Works

The Dockerfile (`Dockerfile` in the project root) is configured to:

- Install all project dependencies
- Download the model with the "best" alias from WandB (`best_model:best`)
- Save the model to `models/best_model.pt` inside the container
- Start the FastAPI inference server on port 8080
The model is downloaded from WandB artifacts during the Docker build, ensuring the container always uses the best-performing model based on validation accuracy.
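The steps above correspond to a Dockerfile along the following lines. This is an illustrative sketch only, not the project's actual Dockerfile: the base image, the download script name (`download_model.py`), and the app module path (`app.main:app`) are assumptions.

```dockerfile
# Illustrative sketch — the real Dockerfile may differ.
FROM python:3.11-slim

WORKDIR /app
COPY . .

# Install all project dependencies
RUN pip install --no-cache-dir .

# The WandB API key is supplied at build time via --build-arg
ARG WANDB_API_KEY

# Hypothetical script that fetches the best_model:best artifact
# from WandB and saves it to models/best_model.pt
RUN python download_model.py --artifact best_model:best --out models/best_model.pt

# Start the FastAPI inference server on port 8080
EXPOSE 8080
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]
```

Because the download happens in a `RUN` step, the model is baked into the image at build time rather than fetched at container startup.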
### Building the Docker Image

To build the Docker image, you need to provide your WandB API key as a build argument:

```bash
docker build -t pneumonia-api:latest \
    --build-arg WANDB_API_KEY=your_wandb_api_key_here \
    -f Dockerfile .
```
To get your WandB API key:
- Visit: https://wandb.ai/authorize
- Or check your WandB settings
Note: The Docker image tag (`:latest`, `:best`, etc.) is just a label and doesn't affect which model is downloaded. The Dockerfile always downloads `best_model:best` from WandB regardless of the tag name.
### Running the Container

Start the container:

```bash
docker run -d -p 8080:8080 --name pneumonia-api pneumonia-api:latest
```

- `-d`: Run in detached mode (background)
- `-p 8080:8080`: Map port 8080 from the container to the host
- `--name pneumonia-api`: Name for the container
### Checking Container Status

```bash
# Check if the container is running
docker ps

# View container logs
docker logs pneumonia-api

# Stop the container
docker stop pneumonia-api

# Remove the container (after stopping)
docker rm pneumonia-api
```
### Running Inference

Once the container is running, you can make predictions:

1. Check API health:

   ```bash
   curl http://localhost:8080/health
   ```

2. Make a prediction:

   ```bash
   curl -X POST "http://localhost:8080/predict" \
     -H "accept: application/json" \
     -H "Content-Type: multipart/form-data" \
     -F "file=@path/to/your/image.jpeg"
   ```

Example with a test image:

```bash
curl -X POST "http://localhost:8080/predict" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@data/raw/chest_xray/test/NORMAL/IM-0031-0001.jpeg"
```
The response will be:

```json
{
  "prediction": "NORMAL",
  "class_index": 0
}
```
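For scripted use, the same calls can be made from Python with just the standard library. The sketch below is illustrative: `wait_for_health` and `parse_prediction` are hypothetical helper names, not part of the project, and the class list assumes the index order implied by the example response (0 = NORMAL).

```python
import json
import time
import urllib.error
import urllib.request


def wait_for_health(base_url="http://localhost:8080", timeout=30.0):
    """Poll the /health endpoint until the container answers or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(0.5)  # container not up yet; retry
    return False


def parse_prediction(body):
    """Validate a /predict response body and return the predicted label."""
    payload = json.loads(body)
    label, idx = payload["prediction"], payload["class_index"]
    # Assumed index order, consistent with the example response above
    classes = ["NORMAL", "PNEUMONIA"]
    if classes[idx] != label:
        raise ValueError(f"label {label!r} does not match class_index {idx}")
    return label


print(parse_prediction('{"prediction": "NORMAL", "class_index": 0}'))  # prints NORMAL
```

`wait_for_health` is handy right after `docker run`, since the server takes a moment to load the model before it starts answering requests.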
## GCP VM Setup and Syncing
This project includes scripts to set up and sync the project with a Google Cloud Platform (GCP) VM instance for training.
### Initial Setup

1. Set up the repository on the VM:

   ```bash
   ./setup_vm_git.sh
   ```

   This script will:
   - Install git on the VM (if needed)
   - Clone the repository from GitHub to `~/mlops_project` on the VM
   - Install the `uv` package manager
   - Configure PATH for `uv`
2. Authenticate with Google Cloud Storage:

   SSH into the VM and authenticate with your user credentials to access the DVC remote:

   ```bash
   gcloud compute ssh --zone "europe-west1-d" "instance-20260113-110032" --project "machineoperationproject"
   gcloud auth application-default login
   ```

   This is a one-time setup; credentials will persist across sessions.
3. Install dependencies on the VM:

   ```bash
   cd ~/mlops_project
   export PATH="$HOME/.local/bin:$PATH"  # If not already in PATH
   uv sync
   ```
### Syncing Changes

After making changes locally and pushing to GitHub, sync the VM with:

```bash
./sync_vm.sh
```
This script will:
- Pull the latest code changes from GitHub
- Pull the latest data files using DVC from Google Cloud Storage
Note: The sync script uses user credentials (set up via `gcloud auth application-default login`) for GCS access. If you encounter authentication errors, re-authenticate on the VM with that same command.
### Scripts Overview

- `setup_vm_git.sh` - Initial setup: clones the repository and installs dependencies
- `sync_vm.sh` - Regular syncing: pulls the latest code and data from GitHub and GCS
Both scripts are designed to be run from your local terminal and will remotely execute commands on the VM via SSH.