E-Commerce Fraud Detection Model
Model Description
This is an ensemble fraud detection system trained on 1.47M e-commerce transactions with a 5.01% fraud rate.
Architecture
Weighted Ensemble Strategy (70%-30%)
- Stage 1 - Recall Specialists (70% weight): Logistic Regression + Random Forest
- Stage 2 - Precision Specialists (30% weight): Neural Network + XGBoost
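One way to read the 70%-30% weighting above is as a weighted average of the fraud probabilities produced by the two stages. The sketch below assumes each stage's score is the plain mean of its two models' probabilities, which this card does not confirm. `combine_scores` and the example probabilities are hypothetical.

```python
def combine_scores(p_lr, p_rf, p_nn, p_xgb, recall_weight=0.7, precision_weight=0.3):
    """Blend per-model fraud probabilities into one score.

    Assumption: each stage's score is the unweighted mean of its two models.
    """
    recall_stage = (p_lr + p_rf) / 2.0      # Stage 1: Logistic Regression + Random Forest
    precision_stage = (p_nn + p_xgb) / 2.0  # Stage 2: Neural Network + XGBoost
    return recall_weight * recall_stage + precision_weight * precision_stage


# Made-up probabilities for a single transaction (illustration only).
score = combine_scores(p_lr=0.80, p_rf=0.70, p_nn=0.20, p_xgb=0.10)
print(round(score, 3))
```

Under this reading, the recall-oriented models dominate the final score, and the precision-oriented models mainly pull it down when they disagree.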
Performance Metrics
| Model | Accuracy | Precision | Recall | F1-Score | AUC-ROC |
|---|---|---|---|---|---|
| Logistic Regression | 0.5723 | 0.0988 | 0.9273 | 0.1786 | 0.8619 |
| Random Forest | 0.6203 | 0.1075 | 0.8999 | 0.1920 | 0.8712 |
| Neural Network | 0.9569 | 0.7013 | 0.2442 | 0.3623 | 0.8748 |
| XGBoost | 0.9558 | 0.6632 | 0.2389 | 0.3513 | 0.8459 |
| Stacking Ensemble | 0.8973 | 0.2640 | 0.5868 | 0.3642 | 0.8731 |
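The F1-Score column is the harmonic mean of precision and recall, so each row can be checked from the other two columns. The snippet below is a small consistency check on the numbers in the table. Differences in the fourth decimal place are expected because the inputs are already rounded.

```python
# (precision, recall, reported F1) copied from the table above.
reported = {
    "Logistic Regression": (0.0988, 0.9273, 0.1786),
    "Random Forest": (0.1075, 0.8999, 0.1920),
    "Neural Network": (0.7013, 0.2442, 0.3623),
    "XGBoost": (0.6632, 0.2389, 0.3513),
    "Stacking Ensemble": (0.2640, 0.5868, 0.3642),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
for name, (p, r, f1) in reported.items():
    print(f"{name}: reported {f1:.4f}, recomputed {recomputed[name]:.4f}")
```

With a fraud rate of about 5%, a model that flags nothing would reach roughly 95% accuracy if the evaluation set has the same fraud rate. The Accuracy column therefore says less here than Recall, Precision, and F1-Score.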
Key Features
- 52 engineered features, including:
  - Transaction patterns (amount, quantity, frequency)
  - Customer behavior (account age, transaction history)
  - Temporal features (time-based patterns)
  - Risk indicators (unusual patterns, high-value flags)
  - Interaction features (multi-dimensional risk signals)
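To make the feature categories above concrete, here is a minimal sketch that derives one feature of each kind from a single raw transaction. The field names (`amount`, `quantity`, `account_age_days`, `timestamp`), the thresholds, and the function `engineer_features` are all hypothetical. The actual 52 features are not listed in this card.

```python
from datetime import datetime


def engineer_features(tx, high_value_threshold=500.0):
    """Derive a few illustrative features from one raw transaction dict.

    All field names and thresholds are hypothetical.
    """
    hour = tx["timestamp"].hour
    is_high_value = int(tx["amount"] > high_value_threshold)
    is_new_account = int(tx["account_age_days"] < 30)
    return {
        # Transaction pattern: average amount per purchased item
        "amount_per_item": tx["amount"] / max(tx["quantity"], 1),
        # Temporal feature: purchase made between midnight and 06:00
        "is_night": int(hour < 6),
        # Risk indicator: high-value flag
        "is_high_value": is_high_value,
        # Interaction feature: new account combined with a high-value purchase
        "new_account_high_value": is_new_account * is_high_value,
    }


example = {
    "amount": 900.0,
    "quantity": 3,
    "account_age_days": 12,
    "timestamp": datetime(2024, 1, 5, 2, 30),
}
features = engineer_features(example)
print(features)
```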
Training
- Resampling: ADASYN (1:1 balance)
- GPU Acceleration: RAPIDS cuML, PyTorch, XGBoost
- Threshold Optimization: F-beta score optimization
- Validation: Stratified K-Fold Cross-Validation
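Threshold optimization with an F-beta score can be sketched as a search over candidate decision thresholds, keeping the one with the highest score. The version below is plain Python and is not the training code for this model. The value `beta=2.0`, the function names, and the toy data are assumptions, since the card does not state which beta was used.

```python
def fbeta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(y_true, y_proba, beta=2.0):
    """Return (threshold, score) maximising F-beta over the observed probabilities."""
    best = (0.5, -1.0)
    for t in sorted(set(y_proba)):
        y_pred = [int(p >= t) for p in y_proba]
        tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
        fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
        fn = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        score = fbeta(precision, recall, beta)
        if score > best[1]:  # strict comparison keeps the lowest threshold on ties
            best = (t, score)
    return best


# Toy labels and probabilities (illustration only).
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_proba = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
threshold, score = best_threshold(y_true, y_proba, beta=2.0)
print(threshold, round(score, 4))
```

In practice the threshold would be chosen on validation folds rather than on the data used for the final evaluation, so that the reported metrics are not tuned to the test set.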
Usage
Warning: a GPU environment with CUDA installed is required.
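```python
import joblib

# Load the four base models, the ensemble, and the fitted scaler
lr_model = joblib.load("lr_model.pkl")
rf_model = joblib.load("rf_model.pkl")
nn_model = joblib.load("nn_model.pkl")
xgb_model = joblib.load("xgb_model.pkl")
ensemble_model = joblib.load("ensemble_model.pkl")
# The scaler is loaded but not applied below; apply it first if the models expect scaled inputs
scaler = joblib.load("scaler.pkl")

# Prepare your data
df = ...
# Features are every column except the label.
# Note: Index.difference returns the column names in sorted order.
X = df[df.columns.difference(['Is Fraudulent'])].copy()
y = df['Is Fraudulent'].copy()

# Predict with ensemble
fraud_proba = ensemble_model.predict_proba(X)[:, 1]  # probability of the positive class (column 1)
fraud_pred = ensemble_model.predict(X)               # hard class labels

# Evaluate predictions
# evaluate_models is not defined in this snippet; it is assumed to come from the project's own code
evaluate_models(
    [lr_model, rf_model, nn_model, xgb_model, ensemble_model],
    X,
    y,
    ['Logistic Regression', 'Random Forest', 'Neural Network', 'XGBoost', 'Stacking Ensemble'],
)
```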
License
MIT License
Contact
COMPSCI 4AL3 - Group 34
Viransh Shah, Ellen Xiong