Infery

Introduction

Welcome to Infery, Deci.AI's inference library intended to take your trained neural networks seamlessly to production. Infery provides the ability to convert deep neural networks between frameworks with a single line of code, along with the stable environment needed to do so. It then enables optimal synchronous and asynchronous inference aimed at fully utilizing your available hardware resources, whether they are local or remote.

Getting Started

Infery offers numerous capabilities aimed at optimizing the workflow of taking a trained model to production. To streamline your deployment process, first install Infery:

pip install --index-url=https://[USERNAME]:[TOKEN]@deci.jfrog.io/artifactory/api/pypi/deciExternal/simple infery==x.x.x
where x.x.x is the Infery version, for example 4.0.4.

After installing Infery, you may use its CLI to install requirements for different features, such as inference or compilation:

infery install --inference pytorch --compile onnx2tensorrt

With Infery installed, you can now explore any of the functionalities below. First, import Infery and load your model:

from infery import *

# Load an ONNX model. If the framework is not inferred correctly, pass the framework=FrameworkType.XXX kwarg.
onnx_model = infery.Model("/path/to/model.onnx")
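
If automatic framework detection fails, the framework can be stated explicitly. This is a minimal sketch; FrameworkType.ONNX is assumed here as the enum member for ONNX models (only FrameworkType.TENSORRT appears explicitly on this page):

# Explicitly declare the source framework instead of relying on automatic detection
onnx_model = infery.Model("/path/to/model.onnx", framework=FrameworkType.ONNX)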

Model Analysis

  • Basic analysis (see the short sketch after this list):
    # Get metadata about the model
    metadata = onnx_model.metadata
    
  • For more in-depth instructions refer to the analysis guide.
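
As a quick, minimal sketch, the metadata object can simply be printed; the exact fields it exposes are not listed on this page and will depend on the model and framework:

# Assuming onnx_model was loaded as shown above
metadata = onnx_model.metadata
print(metadata)  # the exact fields depend on the model and framework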

Model Compilation

  • Basic compilation (a sketch of another conversion path follows this list):
    trt_model = onnx_model.compile(target_framework=FrameworkType.TENSORRT)
    
  • For more in-depth instructions refer to the compilation guide.
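
The same compile call is expected to cover the other conversion paths listed in the support matrix below. The following is a hedged sketch of compiling a loaded PyTorch model to ONNX; FrameworkType.ONNX and the .pt path are assumptions, as only FrameworkType.TENSORRT appears on this page:

# Load a PyTorch model and convert it to ONNX (enum member assumed)
torch_model = infery.Model("/path/to/model.pt")
onnx_from_torch = torch_model.compile(target_framework=FrameworkType.ONNX)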

Model Inference

  • Basic inference (an end-to-end sketch combining these calls follows this list):
    # Run random noise (dummy inputs) through the model
    output = trt_model.predict(onnx_model.make_dummy_inputs())
    
    # Benchmark the model's performance
    benchmark_result = trt_model.benchmark(batch_size=1)
    
  • For more in-depth instructions refer to the inference guide.
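
Putting the pieces together, below is a minimal end-to-end sketch that uses only the calls shown on this page: load an ONNX model, compile it to TensorRT, run dummy inputs through it, and benchmark it:

from infery import *

# Load an ONNX model (pass framework=FrameworkType.XXX if detection fails)
onnx_model = infery.Model("/path/to/model.onnx")

# Compile the model to TensorRT
trt_model = onnx_model.compile(target_framework=FrameworkType.TENSORRT)

# Run random noise (dummy inputs) through the compiled model
output = trt_model.predict(onnx_model.make_dummy_inputs())

# Benchmark the compiled model's performance
benchmark_result = trt_model.benchmark(batch_size=1)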

Model Serving

  • For more in-depth instructions refer to the serving quickstart.

Support and Compatibility

Compilation

Below is a summary of the conversion support matrix: Infery compiles (converts) models from the source frameworks to the target frameworks listed below. Frameworks that appear only as targets cannot be converted from (i.e. they can only be converted to):

Source frameworks (matrix rows): PyTorch, TensorFlow2, Keras, ONNX
Target frameworks (matrix columns): ONNX, TensorRT, TensorFlow2, TFLite, TFJS, OpenVino, CoreML, Keras, PyTorch

Inference

Below is a list of the frameworks with which Infery can perform inference:

PyTorch, ONNX, TensorRT, TensorFlow 2, TFLite, TFJS, Keras, OpenVino, CoreML

Serving

Below is a list of the protocols with which Infery can serve models:

  • gRPC - Remote
  • gRPC - Shared Memory
  • HTTP - Remote
  • HTTP - Shared Memory