NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support
This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA.

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art library of model optimization techniques including quantization, sparsity, and pruning. These techniques reduce model complexity and enable downstream inference frameworks like NVIDIA […]
