INFERY (from the word inference) is Deci’s proprietary deep-learning runtime inference engine. It turns a model into a self-contained, efficient runtime server and lets you run the model from a Python package. INFERY enables efficient inference and seamless deployment on any hardware, addressing the complex challenges of making deep-learning models production-ready.
INFERY benefits include:
- Simplifies Deployment: INFERY packages AI models into a Python package built for scalability and rapid deployment.
- Improves Latency and Throughput: INFERY accelerates inference and optimizes models for any given target hardware (CPU or GPU).
- Runs Anywhere: INFERY offers model portability across common frameworks and across multiple hardware types, platforms, and production hosts.
- Reduces Cost-to-Serve: INFERY cuts total cost of ownership by up to 80% by maximizing hardware utilization, and enables pipelining and performance scaling of multiple models on a single host.
- Measures Your Model's Performance in Production: INFERY reveals how your models actually behave in production, so you can debug or scale when required.
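The workflow implied by the benefits above (load a packaged model, run inference, measure latency and throughput) can be sketched schematically. This is a minimal illustration of the measurement idea only: the `StubModel` class and `benchmark` helper are hypothetical stand-ins, not INFERY's actual API.

```python
import time
import statistics

class StubModel:
    """Illustrative stand-in for a loaded inference model (not INFERY's API)."""

    def predict(self, batch):
        # Placeholder "inference": sum each input row.
        return [sum(row) for row in batch]

def benchmark(model, batch, repetitions=100):
    """Measure mean latency (ms) and throughput (items/sec) of model.predict."""
    latencies = []
    for _ in range(repetitions):
        start = time.perf_counter()
        model.predict(batch)
        latencies.append(time.perf_counter() - start)
    mean_latency = statistics.mean(latencies)
    return {
        "mean_latency_ms": mean_latency * 1000,
        "throughput_items_per_sec": len(batch) / mean_latency,
    }

model = StubModel()
stats = benchmark(model, batch=[[1.0, 2.0, 3.0]] * 8)
print(f"latency: {stats['mean_latency_ms']:.3f} ms, "
      f"throughput: {stats['throughput_items_per_sec']:.0f} items/sec")
```

In a real deployment the same pattern applies, with the stub replaced by a model loaded through INFERY and the batch shaped to the model's input; reporting both latency and throughput per batch size is what lets you compare target hardware (CPU vs. GPU) on equal terms.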
See also:
- Supported Deep Learning Frameworks
- Quickstart with INFERY
- Running Inference with INFERY
- Measuring the Model's Performance in INFERY