Model in a Container

Yaser Marey · January 20, 2020


In this post, I will describe a recipe for ML Software Engineers who want to publish their ML model built with Python using FastAPI and containerize it using Docker. This is one of the steps towards making the model ready to be deployed to the production environment, whether it is on-premise or on the cloud.

FastAPI is a modern, high-performance Python web framework that is perfect for building RESTful APIs. It can handle both synchronous and asynchronous requests and has built-in support for data validation, JSON serialization, authentication and authorization, and OpenAPI.

At the same time, Docker is a containerization technology that has changed the way we ship our applications forever. Now, we deploy containers that reuse most of the underlying host operating system and add only the needed layers for our software to run! Containers also help us to horizontally scale our application by starting and stopping additional instances depending on the current demand.

Create ML Model

We will not start from scratch but rather use one of the datasets provided with the scikit-learn library and build a simple regression model on top of it. We will take the Boston Housing dataset, which contains information about houses in Boston and was originally part of the UCI Machine Learning Repository. It has 506 samples and 13 feature variables. The objective is to predict the price of a house from the given features. Here is our simple linear regression to achieve that:

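import os
import pickle

import numpy as np
import sklearn.datasets as datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

MODEL_PATH = './models/lr_reg_boston.h5'

def fit():
    # Note: load_boston was removed in scikit-learn 1.2, so this needs an older release.
    house = datasets.load_boston()
    # Hold out 20% of the samples; the test split is not used further in this script.
    train_x, test_x, train_y, test_y = train_test_split(house.data, house.target, test_size=0.2, random_state=1)
    lr = LinearRegression()
    lr.fit(train_x, train_y)
    # Create the models folder if it is missing and close the file after writing.
    os.makedirs(os.path.dirname(MODEL_PATH), exist_ok=True)
    with open(MODEL_PATH, 'wb') as model_file:
        pickle.dump(lr, model_file)

def predict(x_to_predict):
    with open(MODEL_PATH, 'rb') as model_file:
        lm_pickled = pickle.load(model_file)
    # scikit-learn expects a 2-D array of shape (n_samples, n_features).
    result = lm_pickled.predict(np.reshape(x_to_predict, (1, -1)))
    return result

if __name__ == "__main__":
    fit()
    house_to_evaluate = [0.62739, 0., 8.14, 0., 0.538, 5.834, 56.5, 4.4986, 4., 307., 21., 395.62, 8.47]
    price = predict(house_to_evaluate)
    print(f"result of prediction of {house_to_evaluate} is {price[0]} 1000s USD")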

Our code has two functions, fit and predict. The fit function loads the dataset, uses scikit-learn to split the data into train and test sets, trains a linear regression model, and saves the resulting model to disk. The predict function loads the trained model and passes it a vector of feature values for a sample house to predict its price. Save this code to boston_housing.py and run it with python boston_housing.py to train and generate the model.
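The script splits off a test set but never scores the model on it. Here is a minimal sketch of how that held-out score and the pickle round trip could be checked. It assumes synthetic data from make_regression as a stand-in with the same shape as the housing data (506 samples, 13 features), because load_boston is no longer shipped with recent scikit-learn releases; the numbers it prints say nothing about real house prices.

```python
import pickle

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in (an assumption for illustration, NOT the housing data).
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=1)
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2, random_state=1)

lr = LinearRegression().fit(train_x, train_y)

# R^2 on the held-out 20% estimates how well the model generalizes.
r2 = lr.score(test_x, test_y)
print(f"held-out R^2: {r2:.3f}")

# Round-trip the model through pickle in memory, as the script does through a file.
restored = pickle.loads(pickle.dumps(lr))

# predict() expects shape (n_samples, n_features), hence the reshape to (1, -1).
one_house = np.reshape(test_x[0], (1, -1))
same = np.allclose(lr.predict(one_house), restored.predict(one_house))
print(f"restored model agrees with the original: {same}")
```

The same lr.score call could be added to fit() with the real test_x and test_y before the model is saved.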

Publishing Linear Regression Model with FastAPI

FastAPI is our API server of choice for the reasons we explained earlier. Here is the code:

import uvicorn
from fastapi import FastAPI

import boston_housing

app = FastAPI()

@app.get("/")
def index():
    # Simple liveness check.
    return 'ML Model API is alive!'

@app.post("/predict")
def predict(house_to_evaluate: list):
    # The request body carries the vector of feature values.
    prediction = boston_housing.predict(house_to_evaluate)
    return str(prediction)

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

The predict function handles HTTP POST requests. It expects the request payload to contain a vector of feature values, and it uses the model we trained and saved earlier to predict the price. Save this code to app.py, the module name that the Dockerfile below refers to.
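To make the expected payload concrete, here is a small client sketch using only the standard library. It assumes the server listens on http://localhost:8000 and that the endpoint accepts a bare JSON array as the request body, which is how FastAPI normally treats a single list-typed parameter. The helper names build_predict_request and send are hypothetical and not part of the original code.

```python
import json
import urllib.request


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request whose body is a JSON array."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and decode the JSON response; needs a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# The same sample house used in boston_housing.py.
house_to_evaluate = [0.62739, 0.0, 8.14, 0.0, 0.538, 5.834, 56.5,
                     4.4986, 4.0, 307.0, 21.0, 395.62, 8.47]
request = build_predict_request(house_to_evaluate)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With the API server running, the prediction could be fetched with:
# print(send(request))
```

Because the endpoint returns str(prediction), the decoded response is a string rather than a number, so a client would still have to parse it.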

Containerize your FastAPI ML Model

Now, let’s create a Dockerfile for our model. Once the model runs correctly, all we have to do is write a Dockerfile for it; we can then use it to build a Docker image and spin up a container from that image.

We will keep our Dockerfile in the build context, which is the folder containing the application and its dependencies. It should read as follows:

FROM python:3.7.6
WORKDIR .
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY ./ .
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

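Here we start from the base image python:3.7.6. We copy requirements.txt into the image and install the dependencies with pip install; doing this before copying the rest of the code lets Docker reuse the cached dependency layer when only the code changes. We then copy all our files from the current directory to the image root directory, and EXPOSE tells Docker that the container listens on port 8000. Finally, the CMD instruction starts our application with uvicorn.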

One caveat: in production you would typically run the app under gunicorn, with uvicorn as its worker class, rather than under uvicorn alone, so the CMD instruction needs to change accordingly.
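One possible way to make that change is sketched below, as an assumption rather than the setup used in this post. Gunicorn reads its settings from a Python file, here given the hypothetical name gunicorn.conf.py. The CMD would then become something like CMD ["gunicorn", "-c", "gunicorn.conf.py", "app:app"], with gunicorn added to requirements.txt. The worker class path below is the one historically shipped with uvicorn; check it against the version you install.

```python
# gunicorn.conf.py (hypothetical file name, passed to gunicorn with -c)
import multiprocessing

# Listen on the same address and port that the Dockerfile exposes.
bind = "0.0.0.0:8000"

# Rule of thumb from the gunicorn documentation: (2 x cores) + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1

# Run each worker as a uvicorn ASGI worker so FastAPI can be served.
# Assumption: this class path matches the installed uvicorn release.
worker_class = "uvicorn.workers.UvicornWorker"
```

The worker count is only a starting point; inside a container with a CPU limit, a smaller fixed number may be more appropriate.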

Build and Run

Now that we have created our Dockerfile, we can build an image for our model with the docker build command. We start Docker Desktop and, from the root folder of our model, issue the following command in the terminal:

docker build -t housing_price_pred_model .

If we list the Docker images, we find:

(venv) G:\0Yaser\repos\ml_model_containerized>docker image ls
REPOSITORY                        TAG                 IMAGE ID            CREATED             SIZE
housing_price_pred_model          latest              cc6b1383c057        52 seconds ago      1.29GB

And to run a container based on our image we use the following command:

docker run -d --name housing_price_pred_model_container -p 8000:8000 housing_price_pred_model

(venv) G:\0Yaser\repos\ml_model_containerized>docker container ls
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS                    NAMES
a181597082ed        housing_price_pred_model   "python app.py"     7 seconds ago       Up 6 seconds        0.0.0.0:8000->8000/tcp   housing_price_pred_model_container

Now we can go to http://localhost:8000 to check that our API server is alive. The index route should answer with the message 'ML Model API is alive!'.
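The same liveness check can be scripted, which is handy right after docker run while the container is still starting. The sketch below polls the index route with the standard library until it answers with HTTP 200 or a deadline passes. The function name wait_until_alive and its default values are assumptions for illustration.

```python
import time
import urllib.request


def wait_until_alive(url="http://localhost:8000/", timeout=30.0, interval=1.0):
    """Poll url until it answers with HTTP 200; return False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Connection refused, timeouts, and HTTP error statuses all land
            # here (URLError and HTTPError are OSError subclasses); retry.
            pass
        time.sleep(interval)
    return False


# Example usage once the container is running:
# print(wait_until_alive())
```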