Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods • Paper • 2510.07143 • Published Oct 8 • 12
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time • Paper • 2408.13233 • Published Aug 23, 2024 • 24