The AMD Inference Server is in active development, and this is a tentative and non-exhaustive roadmap of the features we would like to add. It is subject to change based on our own assessment and on feedback from the community, both of which may affect which features take priority. More detailed information about ongoing and completed work can be found in the :ref:`change log <Changelog>` and the GitHub roadmap.
- gRPC support (series of commits starting in :commit:`37a6aad`)
- GPU support (:pr:`34`)
The theme for 2023 is ease-of-use and performance. These two prongs are related: both are ways of engaging users and driving development. Ease-of-use means improving documentation and expanding testing with different models and devices, which in turn provides guides showing users how to do the same. Making it easier to install and get started is a big part of that too. As test coverage expands, the question inevitably arises: how does it compare to the alternatives? Thus, measuring performance and reporting results in a consistent, reproducible manner becomes important. The quality of those results should then guide what internal changes are needed to improve performance. Having these results as a point of comparison also helps maintain the numbers over time.
- Benchmarking with MLPerf
- Refactor memory model
- Enable installation without Docker
- Expanded testing with models in the Vitis AI model zoo