I have a YOLOv4 ONNX model, and because of its two dynamic axes (batch and number of boxes) I am unable to do batch inference, while single-example inference works.

Model input/output: [model screenshot omitted]

So is there a way to copy the whole input buffer to the device, run inference on slices of it in a loop, and then copy the outputs back, so as to maximize throughput?

E.g. array [40,512,512,3] -> CopyToGPU -> loop(inference on [1,512,512,3]) -> CopyOutputToCPU
Replies: 1 comment

Hey, maybe you can use the binding API: https://onnxruntime.ai/docs/api/c/struct_ort_api.html#a9a53edebf4ef062a41b0e74f9c6763ec