Use this model with vLLM on L40 or 4090 (SM89)
#7 by Mephisto1484
The vLLM build provided by Alibaba's team has a bug on SM89 GPUs such as the L40 or 4090: it does not include the Marlin-AWQ-MoE kernel that this model requires.
Solution: following https://github.com/vllm-project/vllm/pull/26755 and https://github.com/vllm-project/vllm/pull/28294, modify CMakeLists.txt to add SM89 support, then recompile vLLM and install it.
The official vLLM team has merged the fix for this bug, but Alibaba's team has not yet picked it up in their vLLM build. Installing from the official vLLM source code has not been tested, so it is unclear whether that avoids the issue.
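
If you are not sure whether your card is affected, a quick check of the CUDA compute capability can tell you. The snippet below is only a sketch: the helper names (`sm_tag`, `is_sm89`, `detect_capability`) are made up for illustration, and it assumes PyTorch is installed and that the GPU of interest is device 0. It only reports the architecture; it does not check whether the kernel is actually present in your vLLM build.

```python
from typing import Optional, Tuple


def sm_tag(major: int, minor: int) -> str:
    """Format a compute capability such as (8, 9) as 'SM89'."""
    return f"SM{major}{minor}"


def is_sm89(capability: Tuple[int, int]) -> bool:
    """True for compute capability 8.9 (e.g. L40, 4090)."""
    return tuple(capability) == (8, 9)


def detect_capability(device: int = 0) -> Optional[Tuple[int, int]]:
    """Return (major, minor) for the given CUDA device, or None if unavailable."""
    try:
        import torch  # assumption: PyTorch is installed alongside vLLM
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    elif is_sm89(cap):
        print(f"{sm_tag(*cap)}: this GPU is affected; a vLLM build with SM89 support is needed.")
    else:
        print(f"{sm_tag(*cap)}: this GPU is not SM89, so this particular issue should not apply.")
```

If it reports SM89, rebuild vLLM with the CMakeLists.txt change described above.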