E-Commerce Fraud Detection Model

Model Description

This is an ensemble fraud detection system trained on 1.47M e-commerce transactions with a 5.01% fraud rate.

Architecture

Weighted Ensemble Strategy (70%-30%)

  • Stage 1 - Recall Specialists (70% weight): Logistic Regression + Random Forest
  • Stage 2 - Precision Specialists (30% weight): Neural Network + XGBoost
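
One way to read the 70%-30% weighting is as a weighted average of per-stage fraud probabilities. The sketch below assumes that each stage's score is the plain mean of its two models' probabilities and that the two stage scores are then blended 0.7/0.3. The function name `blend_scores` and the within-stage averaging are illustrative assumptions, not the project's confirmed implementation.

```python
def blend_scores(p_lr, p_rf, p_nn, p_xgb, recall_weight=0.7, precision_weight=0.3):
    """Blend four per-transaction fraud probabilities into a single score.

    Assumption: each stage contributes the mean of its two models.
    """
    recall_stage = (p_lr + p_rf) / 2.0       # Logistic Regression + Random Forest
    precision_stage = (p_nn + p_xgb) / 2.0   # Neural Network + XGBoost
    return recall_weight * recall_stage + precision_weight * precision_stage


# Hypothetical probabilities for one transaction: the recall specialists
# are alarmed, the precision specialists are not.
score = blend_scores(p_lr=0.9, p_rf=0.8, p_nn=0.2, p_xgb=0.1)
print(round(score, 3))
```

With these weights the recall-oriented models dominate the blended score, so the decision threshold applied afterwards largely determines how many of their false positives survive.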

Performance Metrics

| Model | Accuracy | Precision | Recall | F1-Score | AUC-ROC |
|---|---|---|---|---|---|
| Logistic Regression | 0.5723 | 0.0988 | 0.9273 | 0.1786 | 0.8619 |
| Random Forest | 0.6203 | 0.1075 | 0.8999 | 0.1920 | 0.8712 |
| Neural Network | 0.9569 | 0.7013 | 0.2442 | 0.3623 | 0.8748 |
| XGBoost | 0.9558 | 0.6632 | 0.2389 | 0.3513 | 0.8459 |
| Stacking Ensemble | 0.8973 | 0.2640 | 0.5868 | 0.3642 | 0.8731 |
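
The F1-Score column is the harmonic mean of precision and recall, which is why the high-recall models still score low: their precision is near 0.10. As a quick sanity check, the sketch below recomputes F1 from the reported precision and recall. The values are copied from the table, and small differences in the last digit are expected because the inputs are already rounded.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "Logistic Regression": (0.0988, 0.9273, 0.1786),
    "Random Forest": (0.1075, 0.8999, 0.1920),
    "Neural Network": (0.7013, 0.2442, 0.3623),
    "XGBoost": (0.6632, 0.2389, 0.3513),
    "Stacking Ensemble": (0.2640, 0.5868, 0.3642),
}

recomputed = {name: f1_from(p, r) for name, (p, r, _) in reported.items()}
for name, (_, _, f1) in reported.items():
    print(f"{name}: reported {f1:.4f}, recomputed {recomputed[name]:.4f}")
```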

Key Features

  • 52 engineered features including:
    • Transaction patterns (amount, quantity, frequency)
    • Customer behavior (account age, transaction history)
    • Temporal features (time-based patterns)
    • Risk indicators (unusual patterns, high-value flags)
    • Interaction features (multi-dimensional risk signals)
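
To make the feature categories concrete, here is a small sketch of how a few such features could be derived from a single transaction record. Every field name (`amount`, `quantity`, `account_age_days`, `hour`), every derived feature name, and both thresholds are hypothetical. The card does not list the actual 52 features.

```python
def engineer_features(txn, high_value_threshold=500.0, new_account_days=30):
    """Derive a handful of illustrative features from one transaction dict.

    All field names and thresholds here are assumptions for illustration.
    """
    amount = float(txn["amount"])
    quantity = max(int(txn["quantity"]), 1)  # guard against division by zero

    is_high_value = int(amount >= high_value_threshold)              # risk indicator
    is_new_account = int(txn["account_age_days"] < new_account_days)  # customer behavior

    return {
        "amount": amount,
        "amount_per_item": amount / quantity,        # transaction pattern
        "is_high_value": is_high_value,
        "is_new_account": is_new_account,
        "is_night": int(txn["hour"] < 6),            # temporal feature
        # Interaction feature: fires only when both risk signals are present.
        "high_value_new_account": is_high_value * is_new_account,
    }


example = engineer_features(
    {"amount": 750.0, "quantity": 3, "account_age_days": 5, "hour": 2}
)
print(example)
```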

Training

  • Resampling: ADASYN (1:1 balance)
  • GPU Acceleration: RAPIDS cuML, PyTorch, XGBoost
  • Threshold Optimization: F-beta score optimization
  • Validation: Stratified K-Fold Cross-Validation
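
Threshold optimization against an F-beta score can be sketched as a simple grid search: compute precision and recall at each candidate threshold and keep the one with the highest F-beta. The version below is a minimal pure-Python illustration. The function names, the candidate grid, `beta=2.0`, and the toy data are all assumptions, since the card does not state which beta or search procedure was used.

```python
def fbeta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom


def best_threshold(y_true, scores, beta=2.0, candidates=None):
    """Return (threshold, f_beta) maximizing F-beta over a grid of thresholds."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    best_t, best_f = 0.5, -1.0
    for t in candidates:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f = fbeta(precision, recall, beta)
        if f > best_f:  # strict comparison keeps the lowest best threshold
            best_t, best_f = t, f
    return best_t, best_f


# Toy labels (1 = fraud) and model scores, for illustration only.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
threshold, score = best_threshold(y_true, scores, beta=2.0)
print(threshold, round(score, 4))
```

The threshold should be chosen on validation data that has not been resampled, so that precision reflects the real class balance.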

Usage

Warning: this example requires a GPU environment with CUDA installed.
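
The usage snippet in this section calls `evaluate_models`, which this card does not define. A minimal sketch of what such a helper might look like is given here, assuming each model exposes a scikit-learn style `predict(X)` that returns 0/1 labels. The body and the returned dictionary layout are assumptions, and AUC-ROC is left out for brevity.

```python
def evaluate_models(models, X, y, names):
    """Print and return accuracy, precision, recall and F1 for each model.

    Hypothetical stand-in for the project's helper of the same name.
    """
    y_true = [int(v) for v in y]
    results = {}
    for model, name in zip(models, names):
        y_pred = [int(v) for v in model.predict(X)]
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        accuracy = (tp + tn) / len(y_true) if y_true else 0.0

        results[name] = {
            "accuracy": accuracy,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
        print(f"{name}: acc={accuracy:.4f} prec={precision:.4f} rec={recall:.4f} f1={f1:.4f}")
    return results
```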

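```python
import joblib
import numpy as np

# Load models
lr_model = joblib.load("lr_model.pkl")
rf_model = joblib.load("rf_model.pkl")
nn_model = joblib.load("nn_model.pkl")
xgb_model = joblib.load("xgb_model.pkl")
ensemble_model = joblib.load("ensemble_model.pkl")
# Note: the scaler is loaded but not applied below. If the models were trained
# on scaled features, apply scaler.transform to X before predicting.
scaler = joblib.load("scaler.pkl")

# Prepare your data
df = ...

# Separate the features from the 'Is Fraudulent' label.
# Note: Index.difference returns the column names sorted, so check that this
# order matches the feature order used in training.
X = df[df.columns.difference(['Is Fraudulent'])].copy()
y = df['Is Fraudulent'].copy()

# Predict with ensemble
fraud_proba = ensemble_model.predict_proba(X)[:, 1]  # probability of class 1
fraud_pred = ensemble_model.predict(X)               # predicted class labels

# Evaluate predictions
# Note: evaluate_models is a project helper that is not defined in this snippet.
evaluate_models(
    [lr_model, rf_model, nn_model, xgb_model, ensemble_model],
    X,
    y,
    ['Logistic Regression', 'Random Forest', 'Neural Network', 'XGBoost', 'Stacking Ensemble'],
)
```
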
License

MIT License

Contact

COMPSCI 4AL3 - Group 34

Viransh Shah ([email protected]) Ellen Xiong ([email protected])
