Welcome to Infery, Deci.AI's inference library for taking your trained neural networks to production seamlessly. Infery lets you convert deep neural networks between frameworks with a single line of code, and provides the stable environment needed to do so. It then enables optimized synchronous and asynchronous inference aimed at fully utilizing your available hardware resources, whether they are local or remote.

Getting Started

Infery offers numerous features aimed at optimizing the workflow required to take a trained model to production. Before you can streamline the deployment process, you must first install Infery:

pip install --index-url=https://[USERNAME]:[TOKEN] infery==x.x.x
Here, x.x.x is the Infery version, for example 4.0.4.
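
To confirm which version actually landed in your environment, a minimal sketch using only Python's standard library might look like the following. It assumes the installed distribution is named infery, as in the pip command above; the helper name installed_version is illustrative and is not part of Infery.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "infery") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"infery version: {found}" if found else "infery is not installed")
```

Comparing the returned string with the x.x.x you requested is a quick way to catch an install that went to a different interpreter or virtual environment.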

After installing Infery, you may use its CLI to install requirements for different features, such as inference or compilation:

infery install --inference pytorch --compile onnx2tensorrt

With Infery installed, you can now use any of the features below. First, import Infery and load your model:

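# The star import alone may not bind the name `infery` used below, so import the package as well.
import infery
from infery import *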

# Load an ONNX model. If the framework is not inferred correctly, please pass framework=FrameworkType.XXX kwarg.
onnx_model = infery.Model("/path/to/model.onnx")

Model Analysis

  • Basic analysis:
    # Get metadata about the model
    metadata = onnx_model.metadata
  • For more in-depth instructions refer to the analysis guide.

Model Compilation

  • Basic compilation:
    trt_model = onnx_model.compile(target_framework=FrameworkType.TENSORRT)
  • For more in-depth instructions refer to the compilation guide.

Model Inference

  • Basic inference:
    # Run random noise (dummy inputs) through the model
    output = trt_model.predict(onnx_model.make_dummy_inputs())
    # Benchmark the model's performance
    benchmark_result = trt_model.benchmark(batch_size=1)
  • For more in-depth instructions refer to the inference guide.
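
If you want a framework-agnostic cross-check alongside benchmark(), one option is to time the prediction call yourself. The sketch below uses only the standard library; time_callable is a hypothetical helper, not part of Infery, and simple wall-clock timing like this does not account for device synchronization or other details that a dedicated benchmark may handle.

```python
import statistics
import time
from typing import Any, Callable, Dict


def time_callable(fn: Callable[[], Any], repetitions: int = 100, warmup: int = 10) -> Dict[str, float]:
    """Call fn repeatedly and return simple latency statistics in milliseconds."""
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")

    # Warm-up calls are executed but not measured.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(repetitions):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.fmean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }
```

With the objects from the snippet above, this could be called as inputs = onnx_model.make_dummy_inputs() followed by time_callable(lambda: trt_model.predict(inputs)).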

Model Serving

  • For more in-depth instructions refer to the quickstart guide.

Support and Compatibility


Below is a support matrix of the conversions Infery can perform. To check whether Infery can compile (or convert) from framework X to framework Y, check the tile at row X and column Y. Frameworks that have a column but no row cannot be converted from (i.e. they can only be converted to):

Frameworks in the matrix: ONNX, TensorRT, TensorFlow2, TFLite, TFJS, OpenVino, CoreML, Keras, PyTorch, TorchScript, TensorFlow1
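
The reading rule above (the row is the source framework, the column is the target) can be sketched as a small lookup. The framework names and entries below are placeholders chosen purely for illustration; they are not Infery's actual support data, and can_convert and target_only are hypothetical helpers.

```python
from typing import Dict, Set

# Placeholder matrix: each key is a source framework (a row) and each value is
# the set of target frameworks (columns) marked as supported in that row.
# "A", "B" and "C" are made-up names, NOT real Infery support data.
SUPPORT_MATRIX: Dict[str, Set[str]] = {
    "A": {"B", "C"},
    "B": {"C"},
    # "C" has a column but no row, so it can only be converted to.
}


def can_convert(matrix: Dict[str, Set[str]], source: str, target: str) -> bool:
    """Return True if the tile at (row=source, column=target) is marked supported."""
    return target in matrix.get(source, set())


def target_only(matrix: Dict[str, Set[str]]) -> Set[str]:
    """Return frameworks that appear as a column but never as a row."""
    columns: Set[str] = set().union(*matrix.values())
    return columns - set(matrix)
```

In this placeholder data, can_convert(SUPPORT_MATRIX, "A", "C") is true, while any conversion starting from "C" is rejected because "C" has no row.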


Below is a support matrix of the frameworks with which Infery can perform inference:

Frameworks in the matrix: PyTorch, ONNX, TensorRT, TensorFlow 2, TFLite, TFJS, Keras, OpenVino, CoreML, TensorFlow 1, TorchScript


Below is a support matrix of the protocols Infery can serve with:

Protocols in the matrix: gRPC (Remote), gRPC (Shared Memory), HTTP (Remote), HTTP (Shared Memory)