AINize + Gradio: Easiest Automatic ML Model Deployment
Machine learning is great. You can train models to do a lot of fancy things, like analyzing text for you or generating images from descriptions. In their raw form, however, models simply take numbers in and return numbers, which are hard for a human to interpret. Most of the time, to show off your model, you need a demo page that can interactively take input from users and show them the output. This used to be quite difficult: you would need to
- Build an inference server (REST or gRPC)
- Deploy it to a backend server
- Build a frontend page and connect it to the inference server
For AI developers and researchers, this can be too much extra work. However, I stumbled upon two awesome projects that make the whole process much easier!
What are they?
Gradio
Gradio is an open-source project that generates a simple frontend to demonstrate an ML model wrapped in an inference function. It also sets up a REST API that can be called via HTTP requests. This is the same library behind many of the demos on the Hugging Face Hub.
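To get a feel for how little code it takes, here is a minimal hello-world sketch of my own (any Python function can be wrapped, not just a model):

```python
import gradio as gr

# any plain Python function can serve as the "model"
def greet(name):
    return f"Hello, {name}!"

# "text" is a shortcut for a default Textbox component
gr.Interface(fn=greet, inputs="text", outputs="text").launch()
```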
AINize
AINize is a free service that automatically deploys a connected GitHub repository based on the repository's Dockerfile. You can use it to serve anything, really! (It doesn't even have to be AI- or ML-related: anything that can be Dockerized works.)
Step 1. Set Up the Gradio Service
A sample Gradio interface looks like this:
```python
import gradio as gr

iface = gr.Interface(
    fn=model_infer_fn,
    inputs=[
        gr.Textbox(lines=2, placeholder="text here...", label='sentence'),
        gr.Slider(minimum=0.1,
                  maximum=1.0,
                  step=0.1,
                  value=0.5,
                  label='threshold'),
    ],
    outputs=[
        gr.Textbox(label='prediction'),
    ],
    title='Demo',
    theme='peach',
)
iface.launch()
```
This creates an interface with:
- 2 inputs: 1 string input and 1 slider input (float values)
- 1 output panel: a textbox
You can take a look at the different input and output options in the official documentation.
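When you run this, Gradio prints a local URL (http://127.0.0.1:7860 by default) where the demo is served; passing `share=True` to `launch()` additionally creates a temporary public link.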
You also need to define `model_infer_fn`, which takes the inputs (one string and one float) and returns a string. A simple example would be:
```python
model = Model()  # some model you trained

def model_infer_fn(text, threshold):
    pred = model(text)
    return 'positive' if pred >= threshold else 'negative'
```
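If you don't have a trained model handy, here is a hedged stand-in sketch that uses a Hugging Face `transformers` sentiment pipeline in place of `Model()` (the `"uncertain"` label is my own choice, not part of the pipeline):

```python
from transformers import pipeline

# downloads a small default sentiment model on first use
clf = pipeline("sentiment-analysis")

def model_infer_fn(text, threshold):
    result = clf(text)[0]  # e.g. {'label': 'POSITIVE', 'score': 0.998}
    if result["score"] >= threshold:
        return result["label"].lower()
    return "uncertain"
```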
Personally, I like to create a `ModelInterface` class for this:
```python
class ModelInterface:
    def __init__(self, *args, **kwargs):
        self.model = ModelClass(*args, **kwargs)

    def infer(self, text, threshold):
        pred = self.model(text)
        # process output: text, image, plot, etc.
        output = 'positive' if pred >= threshold else 'negative'
        return output

    def interpret(self, *args):
        """
        Returns the contribution of each input component.
        """
        pass

...
model = ModelInterface(args.model_name)
iface = gr.Interface(
    fn=model.infer,
    ...
```
Putting the `gr.Interface` call in a main function together with `ModelInterface`, the `serve.py` file would look something like this:
```python
import argparse

import gradio as gr

# your trained model class
from models import ModelClass


class ModelInterface:
    def __init__(self, *args, **kwargs):
        self.model = ModelClass(*args, **kwargs)

    def infer(self, text, threshold):
        pred = self.model(text)
        # process output: text, image, plot, etc.
        output = 'positive' if pred >= threshold else 'negative'
        return output

    def interpret(self, *args):
        """
        Returns the contribution of each input component.
        """
        pass


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", type=str)
    args = parser.parse_args()

    model = ModelInterface(args.model_name)
    iface = gr.Interface(
        fn=model.infer,
        inputs=[
            gr.Textbox(lines=2, placeholder="text here...", label='sentence'),
            gr.Slider(minimum=0.1,
                      maximum=1.0,
                      step=0.1,
                      value=0.5,
                      label='threshold'),
        ],
        outputs=[
            gr.Textbox(label='prediction'),
        ],
        title='Demo',
        theme='peach',
        interpretation=model.interpret,
    )
    iface.launch(server_name="0.0.0.0")


if __name__ == '__main__':
    main()
```
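Note the `server_name="0.0.0.0"` in `launch()`: inside a Docker container, Gradio's default binding to 127.0.0.1 would make the app unreachable from outside, so binding to all interfaces is what lets AINize route traffic to it.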
Step 2. Dockerize
AINize works with Dockerized repositories and finds the entry point from the `Dockerfile` in the repository's root directory. We can set up the `Dockerfile` as follows:
```dockerfile
# base image from Docker Hub
FROM pytorch/pytorch:1.8.1-cuda10.2-cudnn7-devel

# install required packages
RUN pip install pip --upgrade
RUN pip install gradio Jinja2

# set up the working directory
COPY . /app
WORKDIR /app

# gradio uses port 7860 by default
EXPOSE 7860

# run serve.py
CMD ["python", "serve.py"]
```
Step 3. Deploy to AINize
Go to the AINize dashboard.
- Copy and paste the URL of your GitHub repo into the search bar.
- Select the branch to deploy.
- Wait until the setup is over.
- In the deployment dashboard, click the pencil button to edit the environment variables and add 7860 to the ports.
Voila! You have now deployed your service to the web, with both the frontend demo and the backend REST API set up!
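As a quick sanity check, you can also hit the REST API directly. Here is a hedged sketch using Python's `requests`; the URL is a placeholder for whatever endpoint AINize assigns you, and the exact route depends on your Gradio version (older releases use `/api/predict/`, Gradio 3.x uses `/run/predict`):

```python
import requests

# placeholder: replace with the endpoint AINize assigns to your deployment
URL = "https://your-app.ainize.ai"

resp = requests.post(
    f"{URL}/api/predict/",  # or f"{URL}/run/predict" on Gradio 3.x
    json={"data": ["This movie was fantastic!", 0.5]},  # one entry per input component
)
print(resp.json()["data"])  # e.g. ['positive']
```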