Measuring a Model's Performance in INFERY

To measure a model’s performance using INFERY, run the model.benchmark command from your application, as follows –

model.benchmark(batch_size=8,
                input_dims=(3,224,224),
                repetitions=100)

The command takes the following parameters –

  • batch_size is the batch size for which the measurement will be made. This should be the batch size that the model is configured to handle.
  • input_dims is the shape of a single input sample to be used to measure the model’s performance, given without the batch dimension (for example, (3,224,224)). This should be the shape that the model is configured to handle.
  • repetitions (optional) is the number of times the measurement is repeated in order to improve accuracy. The measurement that is presented in the Deci platform is the average of the measurements taken across these repetitions.
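The role of repetitions can be illustrated with a minimal, library-independent sketch. It is not INFERY's implementation: `benchmark_sketch` and `fake_batch` are hypothetical names introduced here, and the workload is a stand-in for a real forward pass. It only shows the general idea of timing the same batch several times and reporting the average and the variance.

```python
import statistics
import time


def benchmark_sketch(run_batch, repetitions=100):
    """Time run_batch() `repetitions` times and summarize the timings.

    Hypothetical helper for illustration only; not part of INFERY.
    """
    timings_ms = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_batch()
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.fmean(timings_ms)
    # Population variance of the per-batch timings around their mean.
    variance = statistics.pvariance(timings_ms, mu=mean_ms)
    return {
        "mean_ms": mean_ms,
        "variance": variance,
        "repetitions": len(timings_ms),
    }


def fake_batch():
    # Stand-in workload; a real benchmark would run the model on one batch.
    sum(i * i for i in range(10_000))


result = benchmark_sketch(fake_batch, repetitions=20)
print(result)
```

With more repetitions, a single slow or fast run has less influence on the reported average.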

The following is an example of a request –

model.benchmark(batch_size=24,
                input_dims=(3,224,224))

The following is an example of a response –

infery_manager -INFO- Benchmarking the model in batch size 24 and dimensions (3, 224, 224)...
<ModelBenchmarks: {
    "batch_inf_time": "8.98 ms",
    "batch_inf_time_variance": "0.08 ms",
    "memory": "2362.00 mb",
    "throughput": "2671.70 fps",
    "sample_inf_time": "0.37 ms",
    "batch_size": 24
}>

The following describes the response parameters –
'batch_size' – Specifies the batch size that was used for the benchmark.
'batch_inf_time' – Specifies the latency for the entire batch.
'sample_inf_time' – Specifies the latency for a single sample within the batch. This is equivalent to batch_inf_time divided by batch_size.
'memory' – Specifies the memory footprint that the model uses during inference.
'throughput' – Specifies the number of samples that are processed per second. In the example response above, this is approximately batch_size divided by batch_inf_time.
'batch_inf_time_variance' – Specifies the variance of the batch inference times during the benchmark. If the variance is high, we recommend increasing the number passed to 'repetitions' to make the benchmarks more reliable.
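The relationship between these fields can be checked by hand. The following sketch uses only the numbers from the example response above; the variable names are chosen here for illustration and are not INFERY attributes.

```python
# Values copied from the example response.
batch_size = 24
batch_inf_time_ms = 8.98

# Latency of one sample: batch latency divided by the batch size.
sample_inf_time_ms = batch_inf_time_ms / batch_size

# Samples processed per second: batch size divided by batch latency in seconds.
samples_per_second = batch_size / (batch_inf_time_ms / 1000.0)

print(f"sample_inf_time: {sample_inf_time_ms:.2f} ms")
print(f"throughput: {samples_per_second:.2f} samples per second")
```

The computed latency per sample matches the reported 0.37 ms. The computed throughput is within about one sample per second of the reported 2671.70 fps, a gap that is consistent with batch_inf_time being displayed to two decimal places.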

