train-modle / DEPLOYMENT_GUIDE.md

fokan

Initial clean commit: Multi-Modal Knowledge Distillation Platform

ab4e093 4 months ago


	# Deployment Guide for Hugging Face Spaces

	This guide provides step-by-step instructions for deploying the Multi-Modal Knowledge Distillation application to Hugging Face Spaces.

	## 📋 Pre-Deployment Checklist

	✅ Project Structure Complete
	- All required files and directories are present
	- Python syntax validation passed
	- Frontend files are properly structured

	✅ Configuration Validated
	- `requirements.txt` contains all necessary dependencies
	- `spaces_config.yaml` is properly configured
	- API endpoints are implemented and accessible

	✅ Documentation Complete
	- Comprehensive README.md with usage instructions
	- API documentation included
	- Troubleshooting guide provided
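
The structural checks in this list can be automated. The following is a minimal sketch, assuming the file layout listed under "Upload Project Files" in this guide; the function names are illustrative and not part of the project.

```python
import ast
from pathlib import Path

# File list taken from the "Upload Project Files" step of this guide.
REQUIRED_FILES = [
    "app.py",
    "requirements.txt",
    "spaces_config.yaml",
    "README.md",
    "src/__init__.py",
    "src/model_loader.py",
    "src/distillation.py",
    "src/utils.py",
    "templates/index.html",
    "static/css/style.css",
    "static/js/main.js",
]


def find_missing_files(root: Path) -> list:
    """Return the required files that do not exist under root."""
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


def find_syntax_errors(root: Path) -> list:
    """Return one message per .py file under root that fails to parse."""
    errors = []
    for path in sorted(root.rglob("*.py")):
        try:
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors.append(f"{path.relative_to(root)}: line {exc.lineno}: {exc.msg}")
    return errors
```

Calling both functions with `Path(".")` from the project root and checking that each returns an empty list covers the first checklist group.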

	## 🚀 Deployment Steps

	### Step 1: Create Hugging Face Space

	1. Go to Hugging Face Spaces
	   - Visit [https://huggingface.co/spaces](https://huggingface.co/spaces)
	   - Click "Create new Space"

	2. Configure Space Settings
	   - Space name: `multi-modal-knowledge-distillation` (or your preferred name)
	   - License: MIT
	   - SDK: Gradio
	   - Hardware: T4 small (minimum) or T4 medium (recommended)
	   - Visibility: Public or Private (your choice)

	3. Initialize Repository
	   - Choose "Initialize with README"
	   - Click "Create Space"
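
The same Space can also be created from a script instead of the web form. This is a sketch that assumes the third-party `huggingface_hub` package (not mentioned in this guide) is installed and that you are logged in; `space_settings` and `create_space` are hypothetical helper names. The license is not set here, and hardware is left for Step 3.

```python
def space_settings(name: str, private: bool = False) -> dict:
    """Build create_repo keyword arguments matching the settings in Step 1."""
    return {
        "repo_id": name,        # created under your own account by default
        "repo_type": "space",
        "space_sdk": "gradio",  # SDK chosen in this guide
        "private": private,
        "exist_ok": True,       # do not fail if the Space already exists
    }


def create_space(name: str, private: bool = False) -> str:
    """Create the Space and return its URL. Needs network access and a token."""
    from huggingface_hub import HfApi  # third-party; imported lazily

    api = HfApi()  # picks up the token from a previous login or HF_TOKEN
    return str(api.create_repo(**space_settings(name, private)))
```

For example, `create_space("multi-modal-knowledge-distillation")` would create a public Gradio Space with the name used above.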

	### Step 2: Upload Project Files

	Upload all the following files to your Space repository:

	#### Core Application Files
	```
	app.py # Main FastAPI application
	requirements.txt # Python dependencies
	spaces_config.yaml # Hugging Face Spaces configuration
	README.md # Project documentation
	.gitignore # Git ignore rules
	```

	#### Source Code
	```
	src/
	├── __init__.py # Package initialization
	├── model_loader.py # Model loading utilities
	├── distillation.py # Knowledge distillation engine
	└── utils.py # Utility functions
	```

	#### Frontend Files
	```
	templates/
	└── index.html # Main web interface

	static/
	├── css/
	│   └── style.css # Application styles
	└── js/
	    └── main.js # Frontend JavaScript
	```

	#### Runtime Directories (created automatically, no upload needed)
	```
	uploads/ # Uploaded model files
	models/ # Trained models
	temp/ # Temporary files
	logs/ # Application logs
	```
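
Uploading can be scripted as well. The sketch below assumes the third-party `huggingface_hub` package and skips the runtime directories listed above, since the guide says they are created automatically; `ignore_patterns` and `upload_project` are hypothetical helper names.

```python
# Runtime directories from this guide; the app creates them, so skip them.
RUNTIME_DIRS = ["uploads", "models", "temp", "logs"]


def ignore_patterns() -> list:
    """Glob patterns for paths that should not be uploaded to the Space."""
    return [f"{name}/*" for name in RUNTIME_DIRS] + ["*.pyc"]


def upload_project(folder: str, repo_id: str) -> None:
    """Upload the project folder to the Space in a single commit."""
    from huggingface_hub import HfApi  # third-party; imported lazily

    HfApi().upload_folder(
        folder_path=folder,
        repo_id=repo_id,  # e.g. "username/multi-modal-knowledge-distillation"
        repo_type="space",
        ignore_patterns=ignore_patterns(),
        commit_message="Upload project files",
    )
```

Pushing with `git` to the Space repository is an equivalent alternative; the script only saves the manual file selection.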

	### Step 3: Configure Hardware

	1. Go to Space Settings
	   - Click on "Settings" tab in your Space
	   - Navigate to "Hardware" section

	2. Select Hardware
	   - Minimum: T4 small (16GB RAM, 1x T4 GPU)
	   - Recommended: T4 medium (32GB RAM, 1x T4 GPU)
	   - For large models: A10G small or larger

	3. Apply Changes
	   - Click "Update hardware"
	   - Your Space will restart with new hardware
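
Hardware can also be requested programmatically. This sketch maps the three tiers above to the hardware identifiers used by the third-party `huggingface_hub` package; the tier keys and helper names are assumptions made for illustration.

```python
# Guide tier -> hardware identifier accepted by the Hub API.
HARDWARE_TIERS = {
    "minimum": "t4-small",
    "recommended": "t4-medium",
    "large-models": "a10g-small",
}


def hardware_for(tier: str) -> str:
    """Translate a tier name from this guide into a hardware identifier."""
    if tier not in HARDWARE_TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {sorted(HARDWARE_TIERS)}")
    return HARDWARE_TIERS[tier]


def request_hardware(repo_id: str, tier: str = "recommended") -> None:
    """Ask the Hub to move the Space to the chosen hardware (restarts the Space)."""
    from huggingface_hub import HfApi  # third-party; imported lazily

    HfApi().request_space_hardware(repo_id=repo_id, hardware=hardware_for(tier))
```

Paid hardware requires billing to be set up on the account, so the request may be rejected otherwise.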

	### Step 4: Monitor Deployment

	1. Build Process
	   - Watch the "Logs" tab for build progress
	   - Build typically takes 5-10 minutes
	   - Dependencies will be installed automatically

	2. Common Build Issues
	   - PyTorch installation: May take several minutes
	   - CUDA compatibility: Ensure PyTorch version supports your hardware
	   - Memory issues: Upgrade hardware if needed

	3. Successful Deployment
	   - Space status shows "Running"
	   - Application is accessible via the Space URL
	   - Health check endpoint responds correctly
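
Instead of watching the status by hand, a script can poll until the Space is running or has failed. This is a sketch: `wait_until_running` is a hypothetical helper that takes any callable returning the current stage, and the stage names are the ones reported by the Hub as far as I know.

```python
import time
from typing import Callable

# Stages that mean the build or start-up will not succeed without a fix.
FAILED_STAGES = {"BUILD_ERROR", "CONFIG_ERROR", "RUNTIME_ERROR", "NO_APP_FILE"}


def wait_until_running(
    get_stage: Callable[[], str],
    timeout: float = 900.0,
    interval: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_stage() until it reports RUNNING or a failed stage.

    Returns the final stage; raises TimeoutError if neither is reached in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        stage = get_stage()
        if stage == "RUNNING" or stage in FAILED_STAGES:
            return stage
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Space still in stage {stage!r} after {timeout} s")
        sleep(interval)
```

With `huggingface_hub` installed, `get_stage` could be `lambda: HfApi().get_space_runtime(repo_id).stage`. The default timeout of 15 minutes leaves some margin over the 5-10 minute build time quoted above.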

	## 🔧 Configuration Options

	### Environment Variables

	You can set these in your Space settings:

	```bash
	# Server Configuration
	PORT=7860 # Default port (usually not needed)
	HOST=0.0.0.0 # Default host

	# Resource Limits
	MAX_FILE_SIZE=5368709120 # 5 GiB max file size (5 * 1024^3 bytes)
	MAX_MODELS=10 # Maximum teacher models
	MAX_TRAINING_TIME=3600 # 1 hour training limit

	# GPU Configuration
	CUDA_VISIBLE_DEVICES=0 # GPU device selection
	```
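
On the application side, these variables are typically read once at startup with the defaults shown above. The sketch below is an assumption about how that could look; the `Settings` class and `load_settings` function are illustrative and not taken from the project's `app.py`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    max_file_size: int      # bytes
    max_models: int         # maximum number of teacher models
    max_training_time: int  # seconds


def load_settings(env=None) -> Settings:
    """Read settings from the environment, falling back to the guide's defaults."""
    env = os.environ if env is None else env
    return Settings(
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "7860")),
        max_file_size=int(env.get("MAX_FILE_SIZE", str(5 * 1024**3))),
        max_models=int(env.get("MAX_MODELS", "10")),
        max_training_time=int(env.get("MAX_TRAINING_TIME", "3600")),
    )
```

Passing a plain dictionary as `env` makes the function easy to test without touching the real environment.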

	### Hardware Recommendations

	| Use Case | Hardware | RAM | GPU | Cost |
	|----------|----------|-----|-----|------|
	| Demo/Testing | CPU Basic | 16GB | None | Free |
	| Small Models | T4 small | 16GB | T4 | Low |
	| Production | T4 medium | 32GB | T4 | Medium |
	| Large Models | A10G small | 24GB | A10G | High |

	## 🧪 Testing Your Deployment

	### 1. Health Check
	```bash
	curl https://username-your-space-name.hf.space/health
	```
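
The same check can be run from Python, for example in a deployment script. This sketch uses only the standard library and treats any HTTP 200 response from `/health` as healthy; the response body is not inspected because its format is not described in this guide.

```python
import urllib.request


def is_healthy(base_url: str, timeout: float = 10.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers URLError, HTTPError (non-2xx), refused connections and timeouts.
        return False
```

A private Space would additionally need an authorization header, which this sketch does not send.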

	### 2. Web Interface
	- Visit your Space URL
	- Test file upload functionality
	- Verify model selection works
	- Check training configuration options

	### 3. API Endpoints
	Test key endpoints:
	- `GET /` - Main interface
	- `POST /upload` - File upload
	- `GET /models` - List models
	- `WebSocket /ws/{session_id}` - Real-time updates

	## 🐛 Troubleshooting

	### Build Failures

	PyTorch Installation Issues:
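	```text
	# requirements.txt: pin a PyTorch build that matches the CUDA version of your hardware.
	# The +cu118 wheels are served from the PyTorch index, not from PyPI.
	--extra-index-url https://download.pytorch.org/whl/cu118
	torch==2.1.0+cu118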
	```
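
To see which build actually got installed, a short diagnostic can be run inside the Space. This is a sketch; `describe_torch` is a hypothetical helper, and it returns a fixed set of keys even when PyTorch is missing so the output is easy to log.

```python
def describe_torch() -> dict:
    """Report the installed PyTorch version, its CUDA build, and GPU visibility."""
    try:
        import torch  # third-party; may be absent during local checks
    except ImportError:
        return {"installed": False, "version": None, "cuda_build": None, "gpu_available": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),
    }
```

If `cuda_build` is `None` on GPU hardware, a CPU-only wheel was installed and the pin in `requirements.txt` is the first thing to check.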

	Memory Issues During Build:
	- Upgrade to higher hardware tier
	- Pin dependencies to lighter versions where possible
	- Remove unnecessary packages

	### Runtime Issues

	Out of Memory:
	- Increase hardware tier
	- Reduce batch size in training
	- Implement model sharding
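
Reducing the batch size can be automated by retrying a training step with half the batch after each out-of-memory error. The following is a generic sketch, not the project's training loop; `run_with_backoff` and its parameters are hypothetical.

```python
from typing import Callable, Type


def run_with_backoff(
    step: Callable[[int], None],
    batch_size: int,
    min_batch_size: int = 1,
    oom_error: Type[BaseException] = MemoryError,
) -> int:
    """Call step(batch_size), halving the batch size after each OOM error.

    Returns the batch size that succeeded; re-raises the error once the
    minimum batch size has also failed.
    """
    while True:
        try:
            step(batch_size)
            return batch_size
        except oom_error:
            if batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)
```

With PyTorch on a GPU, `oom_error=torch.cuda.OutOfMemoryError` would be the natural choice, and calling `torch.cuda.empty_cache()` before retrying may help release cached memory.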

	Model Loading Failures:
	- Check file format compatibility
	- Verify Hugging Face model exists
	- Ensure sufficient disk space

	WebSocket Connection Issues:
	- Check browser compatibility
	- Verify firewall settings
	- Try refreshing the page

	### Performance Issues

	Slow Training:
	- Upgrade to GPU hardware
	- Increase batch size if GPU memory allows
	- Use mixed precision training

	High Memory Usage:
	- Monitor system resources
	- Implement automatic cleanup
	- Reduce model cache size

	## 📊 Monitoring and Maintenance

	### Logs and Monitoring
	- Check Space logs regularly
	- Monitor resource usage
	- Set up alerts for failures

	### Updates and Maintenance
	- Keep dependencies updated
	- Monitor for security issues
	- Regular cleanup of temporary files
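
Regular cleanup of temporary files can be done with a small age-based sweep. This sketch assumes the `temp/` directory from this guide and a caller-chosen maximum age; `remove_stale_files` is a hypothetical helper.

```python
import time
from pathlib import Path
from typing import Optional


def remove_stale_files(directory: Path, max_age_seconds: float, now: Optional[float] = None) -> list:
    """Delete files under directory last modified more than max_age_seconds ago.

    Returns the list of removed paths. Directories themselves are left in place.
    """
    now = time.time() if now is None else now
    removed = []
    if not directory.is_dir():
        return removed
    # Materialize the listing first so deletion does not disturb the traversal.
    for path in sorted(directory.rglob("*")):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

For example, `remove_stale_files(Path("temp"), max_age_seconds=3600)` removes temporary files older than one hour; how often to run it depends on upload volume.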

	### Scaling Considerations
	- Monitor user load
	- Consider multiple Space instances
	- Implement load balancing if needed

	## 🔒 Security Best Practices

	### File Upload Security
	- Validate all uploaded files
	- Implement size limits
	- Scan for malicious content
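
A basic validation step might look like the sketch below. The size limit mirrors `MAX_FILE_SIZE` from this guide; the extension allowlist is an assumption, since the guide does not state which model formats the application accepts.

```python
from pathlib import PurePosixPath

MAX_FILE_SIZE = 5 * 1024**3  # 5 GiB, same value as MAX_FILE_SIZE above
# Assumed allowlist of model file extensions; adjust to the formats you support.
ALLOWED_EXTENSIONS = {".pt", ".pth", ".bin", ".safetensors"}


def validate_upload(filename: str, size_bytes: int) -> str:
    """Return a safe file name for storage, or raise ValueError."""
    # Keep only the last path component so "../../x" cannot escape uploads/.
    name = PurePosixPath(filename.replace("\\", "/")).name
    if not name or name.startswith("."):
        raise ValueError("invalid file name")
    suffix = PurePosixPath(name).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"file type {suffix!r} is not allowed")
    if not 0 < size_bytes <= MAX_FILE_SIZE:
        raise ValueError("file is empty or exceeds the size limit")
    return name
```

An extension check is not a content scan: pickle-based formats such as `.pt` and `.bin` can execute code when loaded, so untrusted files in those formats still need separate handling.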

	### API Security
	- Implement rate limiting
	- Validate all inputs
	- Use HTTPS only
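
Rate limiting can be sketched with a per-client sliding window. This is an in-memory illustration for a single process, not a drop-in component of the application; the class name and limits are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow(client_ip)` and answer with HTTP 429 when it returns `False`. With several instances, the counters would need shared storage instead of process memory.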

	### Resource Protection
	- Monitor resource usage
	- Implement timeouts
	- Automatic cleanup procedures
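
Timeouts for asynchronous work can be enforced with `asyncio.wait_for`. The sketch below is generic; `run_with_timeout` is a hypothetical helper, and the limit would typically come from `MAX_TRAINING_TIME`.

```python
import asyncio


async def run_with_timeout(coro, limit_seconds: float):
    """Await coro and return (True, result), or (False, None) on timeout.

    asyncio.wait_for cancels the coroutine once the limit has passed.
    """
    try:
        return True, await asyncio.wait_for(coro, timeout=limit_seconds)
    except asyncio.TimeoutError:
        return False, None
```

Cancellation only takes effect at `await` points. A CPU-bound training loop running in a thread or subprocess needs its own stop signal, such as a flag checked between batches.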

	## 📈 Performance Optimization

	### Model Loading
	- Cache frequently used models
	- Implement lazy loading
	- Use model compression
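
Caching and lazy loading can be combined in a small least-recently-used cache: a model is loaded on first request and the oldest entry is evicted when the cache is full. This is a sketch with a caller-supplied `loader`; it does not reflect the project's `model_loader.py`.

```python
from collections import OrderedDict
from typing import Any, Callable


class ModelCache:
    """Keep at most max_items loaded models, evicting the least recently used."""

    def __init__(self, loader: Callable[[str], Any], max_items: int = 2):
        self._loader = loader
        self._max_items = max_items
        self._items = OrderedDict()

    def get(self, name: str) -> Any:
        if name in self._items:
            self._items.move_to_end(name)  # mark as most recently used
            return self._items[name]
        model = self._loader(name)  # lazy: load only on first request
        self._items[name] = model
        if len(self._items) > self._max_items:
            self._items.popitem(last=False)  # evict the least recently used
        return model
```

A small `max_items` keeps memory bounded, which also addresses the "Reduce model cache size" advice in the troubleshooting section.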

	### Training Optimization
	- Use mixed precision
	- Implement gradient checkpointing
	- Optimize batch sizes

	### Frontend Performance
	- Minimize JavaScript bundle
	- Optimize CSS delivery
	- Use CDN for static assets

	## 🎯 Success Metrics

	Your deployment is successful when:

	✅ Functionality
	- All API endpoints respond correctly
	- File uploads work without errors
	- Training completes successfully
	- Model downloads work properly

	✅ Performance
	- Page loads in < 3 seconds
	- Training starts within 30 seconds
	- Real-time updates work smoothly
	- Resource usage is within limits

	✅ User Experience
	- Interface is responsive on all devices
	- Error messages are clear and helpful
	- Progress tracking works accurately
	- Documentation is accessible

	## 📞 Support and Resources

	- Hugging Face Spaces Documentation: [https://huggingface.co/docs/hub/spaces](https://huggingface.co/docs/hub/spaces)
	- FastAPI Documentation: [https://fastapi.tiangolo.com/](https://fastapi.tiangolo.com/)
	- PyTorch Documentation: [https://pytorch.org/docs/](https://pytorch.org/docs/)

	---

	Your Multi-Modal Knowledge Distillation application is now ready for production deployment! 🎉