
Step 6 – Deploying and Launching the Inference Engine (RTiC)

Installing the Inference Engine Server and Registering Your Model

Runtime Inference Container (RTiC) is Deci’s proprietary containerized deep-learning run-time inference engine that turns a model into a siloed, efficient run-time server. Deci RTiC optimizes your model’s run-time production performance and enables you to interact with it using simple API commands.

RTiC consists of a server container and a Python client. The following describes how to deploy the RTiC server Docker image on any machine, how to run it, and how to send inference requests to it using Deci’s RTiC Python client.

Deci provides two inference engine servers with identical functionality – one for installation on a CPU machine and one for a GPU machine.

To deploy the RTiC inference engine Docker image –

(1) Verify compliance with the RTiC Prerequisites.

(2) Open the Deci Lab, which is displayed by default when you launch Deci, or click the Lab tab at the top of the page.

(3) Select the Optimized Version model to be deployed and click the Deploy button.

The following window is displayed. It provides a series of simple copy/paste instructions that enable you to pull the Deci components. These steps are described below –

(4) Step 1 – Log into the Deci Docker Registry – Log into the Deci Docker registry in order to pull the RTiC inference engine. To log in, click the Copy icon in Step 1 to copy the login command, which includes your private credentials for accessing the Deci Docker registry.

Run this command in a CLI terminal on the machine on which you intend to deploy the RTiC inference engine. This can be any machine, such as your local machine.
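For orientation only, the copied command has the general form of a standard docker login. The username and token shown here are placeholders – use the exact command copied from the platform:

# Illustrative form only – use the exact command copied from the Deci platform,
# which contains your private credentials.
docker login --username deci-clients --password <YOUR_PRIVATE_TOKEN>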

(5) Step 2 – Pull the RTiC Server Docker Image – Pull the RTiC inference engine Docker image by copying the command from Step 2, as shown below –

This command is already prepared and customized for the target hardware environment of your model.

Note – Pulling the Docker image may take several minutes, during which the progress of the pull is displayed on your screen.

Note – The example shown above is intended for a CPU target hardware environment. When you use the Deci platform, this command is prepared for you to copy as-is, according to your target hardware environment, such as a GPU.
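Because the platform-generated command is specific to your account and hardware, the following is only a generic sketch of what such a pull command looks like; the repository path and tag are placeholders:

# Placeholder image name and tag – the real values appear in Step 2 of the Deploy window.
docker pull <deci-registry>/<rtic-image>:<cpu-or-gpu-tag>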

(6) Step 3 – Run the Deci RTiC Inference Engine and Register the Optimized Model – To run the RTiC inference engine with the selected optimized model inside, copy/paste the command from Step 3, as shown below –

The code snippet provided in this window (shown above) launches the RTiC inference engine and registers the Deci optimized model in the RTiC server so that it is available for inference by the Python client (described below).
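As a rough sketch (the actual image name, model parameters, and any GPU flags are generated for you by the platform), such a command typically maps the server port and starts the container in the background, for example:

# Illustrative only – the command copied from Step 3 already contains the correct
# image name and registers your optimized model; port 8000 matches the client
# default used later in this guide.
docker run -d -p 8000:8000 <deci-registry>/<rtic-image>:<tag>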

RTiC Prerequisites

  • Login credentials to the Deci-clients DockerHub account (provided by the Deci platform).
  • Server machine (you can verify these versions with the checks shown after this list) –
    • Docker Container Runtime (version 19.03 or higher)
    • (Optional, for GPU inference) – NVIDIA driver with support for CUDA 10.2
    • Python 3.x
  • Client machine – can be the same machine as the server
    • Python 3.x
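You can verify the server-machine prerequisites with standard version checks, for example:

docker --version     # expect Docker version 19.03 or higher
nvidia-smi           # GPU inference only – confirms the NVIDIA driver is installed
python3 --version    # expect Python 3.x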

Installing and Deploying the Inference Python Client

A single Deci inference Python client is provided for making calls to the Deci RTiC inference engine server.

The following describes how to install and deploy this Python client (without a docker). You can then send API requests from your application using the Python client, which will communicate with the RTiC inference engine in order to make inference requests of your optimized model.

To install and deploy the RTiC client –

(1) Install the RTiC Python client package so that it can be accessed by your application.

python3 -m pip install deci-client

(2) In order to communicate with the server, initialize a new DeciClient object, as follows.

from deci_client import DeciClient

# DEFAULT RTiC SERVER CONFIG
RTIC_SERVER_PORT = 8000
RTIC_SERVER_IP_ADDRESS = 'localhost'

client = DeciClient(rtic_host=RTIC_SERVER_IP_ADDRESS,
                    rtic_port=RTIC_SERVER_PORT)

You can change these default values to match the host address and port on which your RTiC server is running.

(3) In order to verify that your model has been registered with the RTiC inference engine and that you can access it from your client, use the following command –

client.rtic.get_model(model_name='Your_Model_Name')

Where Your_Model_Name is the name of the model as it appears in the Deploy Model window, shown below –

The following is an example of an RTiC Python client response to a request from your application –

>>> client.rtic.get_model(model_name='Your_Model_Name')
{'data': {'inference_framework': 'tf',
          'inference_hw': 'gpu',
          'model_name': 'My_Resnet-50_ONNX',
          'model_uuid': '95df873e-ce27-43a5-896c-9427e8d27cfd',
          'state': 'available'},
 'message': 'Successfully fetched the model.',
 'success': True}
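Once the model is reported as available, you can send inference requests through the same client. The snippet below is only a sketch – the predict method name, its parameters, and the input shape (a single 224x224 RGB image) are assumptions that may differ for your deci-client version and model:

import numpy as np

# Hypothetical inference call – adjust the input shape to what your model expects.
model_input = np.random.rand(1, 3, 224, 224).astype(np.float32)
response = client.rtic.predict(model_name='Your_Model_Name',
                               model_input=model_input)
print(response)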
