Spaces:
Paused
Paused
| # Randomness Configuration Guide | |
| ## Quick Answer to Your Question | |
| **Yes, it's fine to have randomness!** By default, the script now uses **random seeds**, so results will vary each run. This is actually **better** because it shows the true stochastic nature of learning. | |
| ## How It Works Now | |
| ### Default Behavior (Random - Results Vary) | |
| ```bash | |
| python compare_strategies.py | |
| ``` | |
| - Uses current time as seed | |
| - **Results will be different each run** | |
| - Better for seeing variance and stochasticity | |
| ### Deterministic Mode (Same Results Every Time) | |
| ```bash | |
| python compare_strategies.py --deterministic | |
| ``` | |
| - Uses fixed seed=42 | |
| - **Results will be identical every run** | |
| - Good for debugging and reproducibility | |
| ### Variance Analysis (Multiple Runs) | |
| ```bash | |
| python compare_strategies.py --runs 10 | |
| ``` | |
| - Runs 10 times with different seeds | |
| - Shows mean Β± standard deviation | |
| - Best for robust evaluation | |
| ## Why This Matters | |
| The learning process has natural randomness: | |
| - **Random strategy**: Obviously random! π² | |
| - **Student learning**: Stochastic answers (probabilistic) | |
| - **Teacher strategy**: RL exploration adds variance | |
| Seeing this variance is important because: | |
| 1. **Single runs can be lucky/unlucky** | |
| 2. **Variance shows robustness** (lower variance = more reliable) | |
| 3. **Real-world performance will vary** | |
| ## Example: Seeing the Difference | |
| **Run 1:** | |
| ``` | |
| Teacher: Final Acc: 0.773 | |
| Random: Final Acc: 0.653 | |
| ``` | |
| **Run 2 (different seed):** | |
| ``` | |
| Teacher: Final Acc: 0.789 | |
| Random: Final Acc: 0.641 | |
| ``` | |
| **Run 3 (different seed):** | |
| ``` | |
| Teacher: Final Acc: 0.761 | |
| Random: Final Acc: 0.667 | |
| ``` | |
| This variance is **normal and expected**! The teacher should still outperform on average. | |
| ## Best Practices | |
| 1. **For development/testing**: Use `--deterministic` for consistent debugging | |
| 2. **For evaluation**: Use `--runs 10` to see robust statistics | |
| 3. **For quick checks**: Default (random) is fine - just run multiple times manually | |
| ## All Options | |
| ```bash | |
| python compare_strategies.py [OPTIONS] | |
| Options: | |
| --seed SEED Use specific seed (e.g., --seed 123) | |
| --deterministic Use seed=42 (reproducible, same every time) | |
| --iterations N Train for N iterations (default: 500) | |
| --runs N Run N times for variance analysis | |
| ``` | |
| ## Summary | |
| β **Default now has randomness** - results vary (this is good!) | |
| β **Use --deterministic** if you want identical results | |
| β **Use --runs N** for proper variance analysis | |
| β **Variance is expected** - shows realistic behavior | |
| The stochastic nature is actually a feature, not a bug! It shows the true variability in learning. | |