-
Is the FP16 code optimized with AVX or other vector instructions on CPU? Thanks. MlasHalfGemmKernel<MLAS_HALF_GEMM_KERNEL_DEFAULT> doesn't seem to use any vector instructions.
Replies: 2 comments 1 reply
-
Your understanding is correct: it is not optimized for AVX. We have an optimized kernel for ARM NEON chips and are in the process of refining it:
https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/mlas/lib/halfgemm_kernel_neon.cpp
-
The OpenVINO execution provider seems to have FP16 support for CPU (CPU_FP16), but honestly I do not know the details.
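For readers wondering what a kernel without vector instructions looks like in practice, here is a minimal sketch of a scalar FP16 GEMM: each half-precision element is decoded to float, multiplied and accumulated one element at a time, and the result is rounded back to FP16. This is not the MLAS implementation. The names `HalfToFloat`, `FloatToHalf`, and `HalfGemmScalar` are hypothetical, and the row-major layout and float accumulation are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: decode an IEEE 754 binary16 bit pattern into a float.
float HalfToFloat(std::uint16_t h) {
    const std::uint32_t sign = (h >> 15) & 0x1u;
    const int exp = (h >> 10) & 0x1F;
    const std::uint32_t man = h & 0x3FFu;
    float value;
    if (exp == 0) {
        // Zero or subnormal: man * 2^-24.
        value = std::ldexp(static_cast<float>(man), -24);
    } else if (exp == 31) {
        value = man ? std::numeric_limits<float>::quiet_NaN()
                    : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + man) * 2^(exp - 25).
        value = std::ldexp(static_cast<float>(man | 0x400u), exp - 25);
    }
    return sign ? -value : value;
}

// Hypothetical helper: encode a float as binary16, assuming the default
// round-to-nearest-even floating-point rounding mode.
std::uint16_t FloatToHalf(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const float a = std::fabs(f);
    std::uint32_t mag;
    if (std::isnan(f)) {
        mag = 0x7E00u;  // quiet NaN
    } else if (a >= 65520.0f) {
        mag = 0x7C00u;  // rounds to infinity
    } else if (a < std::ldexp(1.0f, -14)) {
        // Subnormal range: multiples of 2^-24.
        mag = static_cast<std::uint32_t>(std::nearbyint(std::ldexp(a, 24)));
    } else {
        int e;
        const float m = std::frexp(a, &e);  // a = m * 2^e, m in [0.5, 1)
        const std::uint32_t q =
            static_cast<std::uint32_t>(std::nearbyint(std::ldexp(m, 11)));
        // q lies in [1024, 2048]; q == 2048 carries into the exponent field.
        mag = (static_cast<std::uint32_t>(e + 14) << 10) + (q - 1024u);
    }
    return static_cast<std::uint16_t>(sign | mag);
}

// Scalar reference: C[M x N] = A[M x K] * B[K x N], row-major,
// FP16 values stored as raw uint16_t bit patterns, accumulation in float.
void HalfGemmScalar(std::size_t M, std::size_t N, std::size_t K,
                    const std::uint16_t* A, const std::uint16_t* B,
                    std::uint16_t* C) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                acc += HalfToFloat(A[i * K + k]) * HalfToFloat(B[k * N + j]);
            }
            C[i * N + j] = FloatToHalf(acc);
        }
    }
}
```

The inner loop handles one element per iteration and pays a conversion on every load. A vectorized kernel would instead convert and multiply several elements per instruction, which is the kind of optimization the NEON kernel linked above provides on ARM.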