TensorRT Dynamic Shape
https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/work-dynamic-shapes.html
The following sections provide greater detail; however, here is an overview of the steps for building an engine with dynamic shapes:
- Specify each runtime dimension of an input tensor by using -1 as a placeholder for the dimension.
- Specify one or more optimization profiles at build time that specify the permitted range of dimensions for inputs with runtime dimensions and the dimensions for which the auto-tuner will optimize. For more information, refer to the Optimization Profiles section.
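To make those two steps concrete for myself, here is a minimal pure-Python sketch (it does not import TensorRT) of what a profile has to satisfy relative to a network input declared with -1 placeholders. The helper names `runtime_dims` and `check_profile` are hypothetical, and the rules encoded here (min <= opt <= max per dimension, fixed dimensions must match the declared value) are my reading of the docs, not a copy of TensorRT's own validation.

```python
def runtime_dims(network_shape):
    """Return the indices of dimensions declared as -1 (resolved at runtime)."""
    return [i for i, d in enumerate(network_shape) if d == -1]


def check_profile(network_shape, min_shape, opt_shape, max_shape):
    """Return True if (min, opt, max) looks like a consistent profile.

    Assumed rules (hypothetical helper, not a TensorRT API):
    - all three shapes have the same rank as the declared input shape
    - min <= opt <= max in every dimension
    - a dimension that is not -1 in the network must keep its declared value
    """
    shapes = (min_shape, opt_shape, max_shape)
    if any(len(s) != len(network_shape) for s in shapes):
        return False
    for declared, lo, opt, hi in zip(network_shape, min_shape, opt_shape, max_shape):
        if not (lo <= opt <= hi):
            return False
        if declared != -1 and not (lo == opt == hi == declared):
            return False
    return True


# Example: only axis 1 is a runtime dimension.
declared = (1, -1, 128)
print(runtime_dims(declared))
print(check_profile(declared, (1, 1, 128), (1, 8, 128), (1, 32, 128)))
```

The example shape (1, -1, 128) is only an illustration; the real declared shape comes from the ONNX model.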
From this NVIDIA thread: https://forums.developer.nvidia.com/t/dynamic-batch-size/240410/2
So I believe the ONNX model needs to support dynamic shapes already?
- https://discuss.pytorch.org/t/dynamic-input-for-onnx-js-using-a-pytorch-trained-model/93981
- Yes, you can do this, read the documentation here: https://docs.pytorch.org/docs/stable/onnx_torchscript.html#torch.onnx.export
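A minimal sketch of what that export could look like, assuming the input is called raw_inputs_action (the name used in the trtexec command in these notes) and that axis 1 is the one that varies. The output name "output", the axis label "seq_len", the assumption that the output is also dynamic along axis 1, and the helper name `export_dynamic` are all made up for illustration and depend on the actual model.

```python
INPUT_NAME = "raw_inputs_action"
OUTPUT_NAME = "output"  # assumption: the real output name depends on the model

# Axis 1 is marked dynamic; every axis not listed here stays fixed in the ONNX graph.
DYNAMIC_AXES = {
    INPUT_NAME: {1: "seq_len"},
    OUTPUT_NAME: {1: "seq_len"},  # assumption: output varies along the same axis
}


def export_dynamic(model, example_input, path="your_model.onnx"):
    """Export `model` to ONNX with axis 1 of the input left dynamic."""
    # Imported lazily so the axis mapping above can be inspected without PyTorch.
    import torch

    torch.onnx.export(
        model,
        (example_input,),
        path,
        input_names=[INPUT_NAME],
        output_names=[OUTPUT_NAME],
        dynamic_axes=DYNAMIC_AXES,
    )
```

After exporting, the dynamic axis should show up as a named (non-numeric) dimension on the input when the model is inspected, which is what TensorRT then sees as -1.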
trtexec \
--onnx=your_model.onnx \
--minShapes=raw_inputs_action:1x1x... \
--optShapes=raw_inputs_action:1x8x... \
--maxShapes=raw_inputs_action:1x32x...
optShapes is the shape the auto-tuner optimizes for; minShapes and maxShapes bound the range of shapes the engine will accept.
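Since the three flags share one format (input name, a colon, then dimensions joined by "x"), a small sketch for generating them from tuples can save typos. The helper names `shape_flag` and `trtexec_args` are hypothetical, and the trailing 128 is borrowed from the Python example in these notes, not from the elided command above. As far as I know, multiple inputs are comma-separated within a single flag, which is what this assumes.

```python
def shape_flag(flag, shapes):
    """Format {"x": (1, 8, 128)} as "--<flag>=x:1x8x128" (inputs comma-separated)."""
    spec = ",".join(
        f"{name}:{'x'.join(str(d) for d in dims)}" for name, dims in shapes.items()
    )
    return f"--{flag}={spec}"


def trtexec_args(onnx_path, min_shapes, opt_shapes, max_shapes):
    """Build the trtexec argument list for one dynamic-shape profile."""
    return [
        "trtexec",
        f"--onnx={onnx_path}",
        shape_flag("minShapes", min_shapes),
        shape_flag("optShapes", opt_shapes),
        shape_flag("maxShapes", max_shapes),
    ]


args = trtexec_args(
    "your_model.onnx",
    {"raw_inputs_action": (1, 1, 128)},
    {"raw_inputs_action": (1, 8, 128)},
    {"raw_inputs_action": (1, 32, 128)},
)
print(" ".join(args))
```

The list form can be passed straight to `subprocess.run(args, check=True)` if trtexec is on the PATH.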
What if I want multiple optimal shapes?
Then you can add multiple optimization profiles to the builder config, each with its own opt shape.
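import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# EXPLICIT_BATCH is needed on TensorRT 8.x; TensorRT 10 deprecates the flag and ignores it.
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("your_model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        # Surface parser errors instead of building from a partially parsed network
        errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
        raise RuntimeError("ONNX parse failed:\n" + "\n".join(errors))
config = builder.create_builder_config()

# Profile 1: optShape = 1x8x128
profile1 = builder.create_optimization_profile()
profile1.set_shape("raw_inputs_action", min=(1, 1, 128), opt=(1, 8, 128), max=(1, 32, 128))
config.add_optimization_profile(profile1)

# Profile 2: optShape = 1x16x128
profile2 = builder.create_optimization_profile()
profile2.set_shape("raw_inputs_action", min=(1, 1, 128), opt=(1, 16, 128), max=(1, 32, 128))
config.add_optimization_profile(profile2)

# Build one engine containing both profiles. build_engine() was removed in
# TensorRT 10; build_serialized_network() is available from 8.0 onward.
serialized_engine = builder.build_serialized_network(network, config)
engine = trt.Runtime(logger).deserialize_cuda_engine(serialized_engine)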
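With more than one profile in the engine, something has to decide which profile to activate for a given input shape at runtime. As far as I know, TensorRT does not choose for you: the execution context is told the profile index (in the Python API via `context.set_optimization_profile_async(index, stream_handle)`), and the input shape must fall inside that profile's min/max range. Below is a minimal pure-Python sketch of one possible selection rule: among the profiles whose range contains the shape, take the one whose opt shape is closest. The "closest opt" heuristic and the helper name `pick_profile` are my own assumptions, not a TensorRT rule; the shapes mirror the two profiles above.

```python
# Same (min, opt, max) triples as the two profiles above, in the order they were added.
PROFILES = [
    {"min": (1, 1, 128), "opt": (1, 8, 128), "max": (1, 32, 128)},
    {"min": (1, 1, 128), "opt": (1, 16, 128), "max": (1, 32, 128)},
]


def pick_profile(shape, profiles=PROFILES):
    """Return the index of the in-range profile whose opt shape is nearest to `shape`.

    Distance is the sum of absolute per-dimension differences; ties go to the
    profile that was added first. Raises ValueError if no profile covers `shape`.
    """
    best_index, best_distance = None, None
    for index, profile in enumerate(profiles):
        in_range = len(shape) == len(profile["min"]) and all(
            lo <= d <= hi for d, lo, hi in zip(shape, profile["min"], profile["max"])
        )
        if not in_range:
            continue
        distance = sum(abs(d - o) for d, o in zip(shape, profile["opt"]))
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance
    if best_index is None:
        raise ValueError(f"no optimization profile covers shape {shape}")
    return best_index


print(pick_profile((1, 10, 128)))  # nearer to opt 1x8x128
print(pick_profile((1, 14, 128)))  # nearer to opt 1x16x128
```

The returned index is what would be handed to the execution context before setting the input shape and running inference.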