| title: README | |
| emoji: 📊 | |
| colorFrom: blue | |
| colorTo: blue | |
| sdk: static | |
| pinned: true | |
| license: apache-2.0 | |
| thumbnail: >- | |
| https://cdn-uploads.huggingface.co/production/uploads/66f8caead3186746f4524419/Nwp5bcZfu_D51MUNCN3oO.png | |
| short_description: 'MoM: Specialized Models for Intelligent Routing' | |
|  | |
| **One fabric. Many minds.** We're introducing **MoM** (Mixture of Models), a family of specialized routing models that power the intelligent decision-making of vLLM Semantic Router (vLLM-SR). | |
| + vLLM Semantic Router 👉 [project link](https://github.com/vllm-project/semantic-router) | |
| <!-- truncate --> | |
| ## Why MoM? | |
| vLLM-SR solves a critical problem: **how to route LLM requests to the right model at the right time**. Not every query needs the same resources—"What's the weather?" shouldn't cost as much as "Analyze this legal contract." | |