SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens
Paper: arXiv:2403.18647
This is the 13B model from the paper "SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens".