Step 4 – Optimizing Your Model

Now you are going to make the magic happen!

To optimize your model –

(1) Display the Deci Lab, which opens by default when you launch Deci, or click the Lab tab at the top left of the page. The following displays –


(2) Click on the row of the model to be optimized, as shown above.

(3) Click the Optimize button. The following displays –


Alternatively, you can –

  • First, click the Optimize button and then select the baseline model from the Model dropdown menu.
    – OR –
  • Hover over the baseline model and then click the Optimize icon on the right of its row.

(4) Define the following in order to enable Deci to optimize the model for exactly what you need –

  • TARGET HARDWARE – Select the target hardware environments for which Deci is to optimize the model. The following options are provided –
  • SELECT BATCH SIZE – From the dropdown menu, select the production inference batch size for which Deci is to optimize the model. Deci automatically benchmarks the selected batch size on each CPU target hardware, and all batch sizes up to the selected batch size on each GPU target hardware.

(5) Click the Next button. The following displays –


(6) Define the optimization goals so that Deci can optimize the model for exactly what you need, as follows –

  • Main Focus – Select the primary goal for which the model will be optimized as either –
    Throughput – Optimizes the model to process the most requests in a given time.
    – OR –
    Latency – Optimizes the model to return each response as quickly as possible.
  • Additional Goal – This option enables you to select a secondary goal, such as to optimize the model’s size.
  • Quantization Level – Specify whether the model should be quantized and, if so, to which level. The lower the level you select here, the more accuracy may be compromised. Only 16-bit and 32-bit are available in the free version.
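
To make the difference between the two Main Focus options concrete, the following is a minimal sketch that measures both quantities for an arbitrary Python callable. It is not part of the Deci platform and does not reflect how Deci benchmarks models; the names measure and fake_model are hypothetical and exist only for illustration.

```python
import time


def measure(predict, batch, batch_size, runs=20):
    """Return (mean latency in seconds per call, throughput in samples per second).

    `predict` is any callable that processes one batch. This helper is a
    hypothetical illustration, not a Deci API.
    """
    predict(batch)  # one warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(runs):
        predict(batch)
    elapsed = time.perf_counter() - start
    latency = elapsed / runs                    # time per request (per batch)
    throughput = (runs * batch_size) / elapsed  # samples processed per second
    return latency, throughput


def fake_model(batch):
    """Stand-in for a real model: sleeps briefly, then returns dummy outputs."""
    time.sleep(0.002)
    return [0 for _ in batch]


latency, throughput = measure(fake_model, list(range(8)), batch_size=8, runs=5)
print(f"latency: {latency * 1000:.2f} ms per batch")
print(f"throughput: {throughput:.1f} samples per second")
```

A larger batch size usually raises throughput while also raising the latency of each call, which is why the two goals are offered as alternatives.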

(7) Click the Start Optimization button. The following displays –

(8) Optimization typically takes between 3 and 10 minutes (depending on the model’s size). The optimized model is then displayed in the Lab underneath the uploaded baseline model, and an email is sent to notify you that it is ready.


The Deci benchmark process runs in the background so that you can view the Deci score and the model’s performance metrics and benchmarks, for example, as shown below.


Note – The Deci platform ensures that the accuracy of the optimized model is within a statistical 1% deviation of the baseline model’s accuracy. For example, if the accuracy of the baseline model that you upload is 76.13, then Deci ensures an accuracy of at least 75.37 (76.13 × 0.99, rounded). Currently, the accuracy shown for the optimized model version is the same value that you declared when uploading the baseline model. Until the next release of the Deci platform (coming soon), you must verify this accuracy on your own.
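
Because the accuracy currently has to be verified by hand, a small check such as the following may help. It is a sketch under stated assumptions, not a Deci tool: it treats the 1% deviation as a relative drop (which matches the 76.13 to 75.37 example above), it ignores the statistical aspect of the guarantee, and it assumes a classification model scored by top-1 accuracy. All function names are hypothetical.

```python
def min_guaranteed_accuracy(baseline_accuracy, max_relative_drop=0.01):
    """Lowest accuracy still within the allowed relative drop (assumed 1%)."""
    return baseline_accuracy * (1.0 - max_relative_drop)


def top1_accuracy(predictions, labels):
    """Percentage of predictions equal to their labels (assumes top-1 scoring)."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)


def within_guarantee(optimized_accuracy, baseline_accuracy):
    """True if the optimized model's accuracy meets the assumed 1% floor."""
    return optimized_accuracy >= min_guaranteed_accuracy(baseline_accuracy)


# Worked example from the note: baseline accuracy of 76.13.
floor = min_guaranteed_accuracy(76.13)
print(round(floor, 2))  # 75.37

# Replace these toy lists with the optimized model's outputs on your own
# validation set and the matching ground-truth labels.
measured = top1_accuracy([1, 2, 3, 4], [1, 2, 3, 0])
print(measured, within_guarantee(measured, 76.13))
```

Run the optimized model on the same validation set that produced the baseline figure, so that the two accuracies are comparable.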