ojus1 committed on
Commit
f341bc8
·
verified ·
1 Parent(s): 78278c4

Update README.md

Files changed (1)
  1. README.md +13 -7
README.md CHANGED
@@ -17,7 +17,7 @@ tags:
 
  MiniGuard-v0.1 is a compact content safety classifier fine-tuned from [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B). It classifies both user inputs (prompt classification) and LLM responses (response classification), outputting whether a given prompt or response is safe or unsafe, along with the violated safety categories.
 
- MiniGuard-v0.1 achieves **~99% of Nemotron-Guard-8B's benchmark accuracy** with **13x fewer parameters**.
+ MiniGuard-v0.1 achieves **~99% of Nemotron-Guard-8B's benchmark accuracy** with **13x fewer parameters** and **outperforms Qwen3Guard-Gen-8B** (a specialized 8B safety model) **by 9 percentage points** on production data.
 
 
  ## Compatibility
@@ -56,14 +56,16 @@ Dataset - English subset test split of [nvidia/Nemotron-Safety-Guard-Dataset-v3]
 
  Evaluated on out-of-distribution production data containing real user queries. Cost is estimated from H200 GPU pricing ($7.91/hour) at maximum concurrency with a P95 latency SLA of <500 ms.
 
- | Metric | MiniGuard-v0.1 | Nemotron-Guard-8B-v3 |
- |--------|----------------|----------------------|
- | Relative Macro F1 | 91.1% | 100% |
- | Cost per 1M requests | **$15.54** | $46.93 |
- | Cost Savings | **67%** | - |
+ | Model | Parameters | Rel. Macro F1 | Cost per 1M requests | Cost Savings |
+ |--------|------------|---------------|----------------------|--------------|
+ | **MiniGuard-v0.1** | **0.6B** | **91.1%** | **$15.54** | **67%** |
+ | Qwen3Guard-Gen-0.6B | 0.6B | 72.1% | - | - |
+ | Qwen3Guard-Gen-4B | 4B | 78.0% | - | - |
+ | Qwen3Guard-Gen-8B | 8B | 82.1% | - | - |
+ | Nemotron-Guard-8B-v3 | 8B | 100% | $46.93 | baseline |
 
 
- MiniGuard-v0.1 achieves 91.1% relative performance on out-of-distribution data while costing **67% less** to serve.
+ MiniGuard-v0.1 achieves 91.1% relative Macro F1 on out-of-distribution data while costing **67% less** to serve. **Notably, the 0.6B fine-tuned model outperforms every Qwen3Guard model by 9 to 19 percentage points**, including the 8B version (82.1%), which suggests that targeted fine-tuning can be more effective than simply using a larger pretrained safety model.
 
  ### Ablation Study
 
@@ -77,6 +79,10 @@ Impact of techniques on out-of-distribution production data (Relative Macro F1 c
  | + Targeted Synthetic Data | 0.6B | 87.2% | +1.6% |
  | + Soup (top-3) [MiniGuard-v0.1] | 0.6B | 92.3% | +5.1% |
  | + FP8 | 0.6B | 91.1% | -1.2% |
+ | **Comparison baselines** | | | |
+ | Qwen3Guard-Gen-0.6B | 0.6B | 72.1% | - |
+ | Qwen3Guard-Gen-4B | 4B | 78.0% | - |
+ | Qwen3Guard-Gen-8B | 8B | 82.1% | - |
  | Nemotron-Guard-8B-v3 | 8B | 100% | reference |
 
  #### In-Distribution
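
The cost figures in the second hunk can be cross-checked with a few lines of arithmetic. The sketch below is not part of the commit. It recomputes the 67% savings from the two per-million-request costs and, under the added assumption that each cost reflects a single H200 billed at $7.91/hour, derives the sustained throughput those costs imply. The function names are hypothetical.

```python
# Cross-check of the README's cost table (values copied from the diff).
H200_USD_PER_HOUR = 7.91          # GPU price stated in the README
MINIGUARD_USD_PER_1M = 15.54      # MiniGuard-v0.1 cost per 1M requests
NEMOTRON_USD_PER_1M = 46.93       # Nemotron-Guard-8B-v3 cost per 1M requests


def cost_savings(candidate_cost: float, baseline_cost: float) -> float:
    """Fraction of the baseline cost saved by the candidate."""
    return 1.0 - candidate_cost / baseline_cost


def implied_requests_per_second(cost_per_million: float,
                                usd_per_hour: float = H200_USD_PER_HOUR) -> float:
    """Throughput implied by a per-1M-request cost.

    Assumption (not stated in the README): the cost corresponds to one GPU
    running continuously at the quoted hourly price.
    """
    gpu_seconds = cost_per_million / usd_per_hour * 3600.0
    return 1_000_000 / gpu_seconds


if __name__ == "__main__":
    savings = cost_savings(MINIGUARD_USD_PER_1M, NEMOTRON_USD_PER_1M)
    print(f"Cost savings: {savings:.1%}")
    for name, cost in [("MiniGuard-v0.1", MINIGUARD_USD_PER_1M),
                       ("Nemotron-Guard-8B-v3", NEMOTRON_USD_PER_1M)]:
        rps = implied_requests_per_second(cost)
        print(f"{name}: ~{rps:.0f} requests/s implied")
```

The savings figure depends only on the ratio of the two costs, so it holds regardless of the single-GPU assumption. The implied throughput numbers do depend on it and should be read as rough estimates.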