Infery
Introduction
Welcome to Infery, Deci.AI's inference library, intended to take your trained neural networks seamlessly to production. Infery provides the ability to convert deep neural networks between frameworks with a single line of code, along with the stable environment needed to do so. It then enables optimized synchronous and asynchronous inference aimed at fully utilizing your available hardware resources - whether they are local or remote.
Getting Started
Infery offers numerous capabilities aimed at streamlining the workflow of taking a trained model to production. Before you can make use of them, you must first install Infery:
pip install --index-url=https://[USERNAME]:[TOKEN]@deci.jfrog.io/artifactory/api/pypi/deciExternal/simple infery==x.x.x
After installing Infery, you may use its CLI to install requirements for different features, such as inference or compilation:
infery install --inference pytorch --compile onnx2tensorrt
- For more in-depth instructions refer to the installation guide.
Having installed Infery, you may now explore any of the functionalities below. First, import Infery and load your model:
import infery
from infery import FrameworkType

# Load an ONNX model. If the framework is not inferred correctly, pass the framework=FrameworkType.XXX kwarg.
onnx_model = infery.Model("/path/to/model.onnx")
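If Infery does not detect the framework automatically, you can pass it explicitly. The snippet below is a minimal sketch: the path is hypothetical, and FrameworkType.ONNX is assumed to be the relevant enum member.
# Explicitly tell Infery which framework the model file uses (enum member name assumed)
onnx_model = infery.Model("/path/to/model.onnx", framework=FrameworkType.ONNX)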
Model Analysis
- Basic analysis:
# Get metadata about the model
metadata = onnx_model.metadata
- For more in-depth instructions refer to the analysis guide.
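As a small illustration (assuming the metadata object has a readable string representation), you can print it to inspect what Infery parsed from the model:
# Print the parsed model metadata (readable repr assumed)
print(onnx_model.metadata)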
Model Compilation
- Basic compilation:
trt_model = onnx_model.compile(target_framework=FrameworkType.TENSORRT)
- For more in-depth instructions refer to the compilation guide.
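Putting the pieces together, the sketch below loads an ONNX model, compiles it to TensorRT, and pushes dummy inputs through the compiled model to verify that it runs. It only uses calls shown on this page, but treat it as a sketch rather than a verified recipe; the model path is hypothetical.
import infery
from infery import FrameworkType

# Load the source model (hypothetical path)
onnx_model = infery.Model("/path/to/model.onnx")
# Compile it to TensorRT
trt_model = onnx_model.compile(target_framework=FrameworkType.TENSORRT)
# Sanity-check the compiled model with dummy inputs
output = trt_model.predict(onnx_model.make_dummy_inputs())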
Model Inference
- Basic inference:
# Run random noise (dummy inputs) through the model
output = trt_model.predict(onnx_model.make_dummy_inputs())
# Benchmark the model's performance
benchmark_result = trt_model.benchmark(batch_size=1)
- For more in-depth instructions refer to the inference guide.
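For a slightly fuller performance picture, you might benchmark at several batch sizes. The sketch below assumes benchmark() accepts batch_size as shown above and that the returned result prints a human-readable summary:
# Compare performance across batch sizes (readable repr of the result assumed)
for batch_size in (1, 8, 32):
    benchmark_result = trt_model.benchmark(batch_size=batch_size)
    print(f"batch_size={batch_size}: {benchmark_result}")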
Model Serving
- For more in-depth instructions refer to the quickstart.
Support and Compatibility
Compilation
Below is a support matrix of the conversions Infery can perform. To check whether Infery can compile (convert) from framework X to framework Y, check the cell at row X and column Y. Frameworks that appear as columns but not as rows cannot be converted from (i.e. they can only be converted to):
From \ To | ONNX | TensorRT | TensorFlow2 | TFLite | TFJS | OpenVino | CoreML | Keras | PyTorch | TorchScript | TensorFlow1 |
---|---|---|---|---|---|---|---|---|---|---|---|
PyTorch | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
TensorFlow2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
Keras | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
ONNX | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
Inference
Below is a support matrix of the frameworks with which Infery can perform inference:
PyTorch | ONNX | TensorRT | TensorFlow 2 | TFLite | TFJS | Keras | OpenVino | CoreML | TensorFlow 1 | TorchScript |
---|---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
Serving
Below is a support matrix of the protocols with which Infery can serve models:
gRPC - Remote | gRPC - Shared Memory | HTTP - Remote | HTTP - Shared Memory |
---|---|---|---|
✅ | ✅ | ✅ | ✅ |