Persona Vectors: Monitoring and Controlling Character Traits in Language Models Paper • 2507.21509 • Published Jul 29 • 32
LoX: Low-Rank Extrapolation Robustifies LLM Safety Against Fine-tuning Paper • 2506.15606 • Published Jun 18 • 1
Safe and Robust Watermark Injection with a Single OoD Image Paper • 2309.01786 • Published Sep 4, 2023
MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models Paper • 2502.14302 • Published Feb 20 • 9
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN Paper • 2412.13795 • Published Dec 18, 2024 • 20
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11, 2024 • 33
PDEgym Collection A collection of datasets of solutions to partial differential equations. • 21 items • Updated May 30, 2024 • 9