LusakaLang Multi-Task Model (Language + Sentiment + Topic)
Model Description
LusakaLang-MultiTask is a unified transformer model built on top of bert-base-multilingual-cased, designed to perform three tasks simultaneously:
- Language Identification
- Sentiment Analysis
- Topic Classification
The model integrates three fine-tuned LusakaLang checkpoints:
- Kelvinmbewe/mbert_Lusaka_Language_Analysis
- Kelvinmbewe/mbert_LusakaLang_Sentiment_Analysis
- Kelvinmbewe/mbert_LusakaLang_Topic
All tasks share a single mBERT encoder, with three independent classifier heads.
This architecture improves efficiency, reduces memory footprint, and enables consistent predictions across tasks.
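The shared-encoder design can be sketched in PyTorch. This is an illustrative sketch, not the released implementation: the class name and the head sizes (number of languages, sentiment classes, and topics) are assumptions.

```python
import torch
import torch.nn as nn


class LusakaLangMultiTask(nn.Module):
    """Illustrative sketch: one shared encoder, three independent linear heads.

    Head sizes (n_langs, n_sentiments, n_topics) are assumed for this example.
    """

    def __init__(self, encoder, n_langs=4, n_sentiments=3, n_topics=5):
        super().__init__()
        self.encoder = encoder  # e.g. a bert-base-multilingual-cased model
        hidden = encoder.config.hidden_size
        self.lang_head = nn.Linear(hidden, n_langs)
        self.sent_head = nn.Linear(hidden, n_sentiments)
        self.topic_head = nn.Linear(hidden, n_topics)

    def forward(self, input_ids, attention_mask=None):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] pooled representation
        return {
            "language": self.lang_head(cls),
            "sentiment": self.sent_head(cls),
            "topic": self.topic_head(cls),
        }
```

Because all three heads read the same `[CLS]` vector, one forward pass through the encoder serves all three tasks, which is where the efficiency gain comes from.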
Why This Model Matters
Zambian communication is multilingual, fluid, and highly contextβdependent.
A single message may include:
- English
- Bemba
- Nyanja
- Slang
- Code-switching
- Cultural idioms
- Indirect emotional cues
This model is designed specifically for that environment.
It excels at:
- Identifying the dominant language or code-switching
- Detecting sentiment polarity in culturally nuanced text
- Classifying topics such as:
  - driver behaviour
  - payment issues
  - app performance
  - customer support
  - ride availability
Training Architecture
The model uses:
- Shared Encoder: mBERT
- Head 1: Language classifier
- Head 2: Sentiment classifier
- Head 3: Topic classifier
This multi-task setup improves generalization and reduces inference cost.
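A common way to train a shared encoder with several heads is to sum one cross-entropy loss per task. The card does not document the actual loss weighting, so the unweighted sum below is an assumption for illustration:

```python
import torch
import torch.nn.functional as F


def multitask_loss(logits: dict, labels: dict) -> torch.Tensor:
    """Joint loss: unweighted sum of per-task cross-entropies (assumed weighting)."""
    return sum(
        F.cross_entropy(logits[task], labels[task])
        for task in ("language", "sentiment", "topic")
    )
```

Backpropagating this single scalar updates all three heads and the shared encoder together, so the encoder learns features useful across tasks.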
Performance Summary
Language Identification
| Metric | Score |
|---|---|
| Accuracy | 0.97 |
| Macro-F1 | 0.96 |
Sentiment Analysis (Epoch 30, Final Checkpoint)
| Metric | Score |
|---|---|
| Accuracy | 0.9322 |
| Macro-F1 | 0.9216 |
| Negative F1 | 0.8649 |
| Neutral F1 | 0.95 |
| Positive F1 | 0.95 |
Topic Classification
| Metric | Score |
|---|---|
| Accuracy | 0.91 |
| Macro-F1 | 0.90 |
How to Use This Model
Load the Multi-Task Model
`torch.load` cannot read a file directly from a Hub repo path, so the checkpoint must be downloaded first (this assumes the weights are stored as `model.pt` in the repo, as the original snippet suggests):

```python
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download
import torch

tokenizer = AutoTokenizer.from_pretrained("Kelvinmbewe/LusakaLang-MultiTask")

# torch.load needs a local file, so fetch the checkpoint from the Hub first
weights_path = hf_hub_download("Kelvinmbewe/LusakaLang-MultiTask", "model.pt")
model = torch.load(weights_path, map_location="cpu")
model.eval()
```
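The usage examples below call `predict_language`, `predict_sentiment`, and `predict_topic`, which are not defined in this card. A minimal, hypothetical sketch of such a helper, assuming the model returns a dict of logits keyed by task name:

```python
import torch


def make_predict(model, tokenizer, task, label_names):
    """Build a predict function for one task head.

    Hypothetical helper: assumes `model(**encoded)` returns a dict of logits
    keyed by task name ("language", "sentiment", "topic").
    """

    def predict(texts):
        encoded = tokenizer(texts, padding=True, truncation=True,
                            return_tensors="pt")
        with torch.no_grad():
            logits = model(**encoded)[task]
        return [label_names[i] for i in logits.argmax(dim=-1).tolist()]

    return predict
```

For example, `predict_sentiment = make_predict(model, tokenizer, "sentiment", ["Negative", "Neutral", "Positive"])`, where the label names are assumptions based on the sentiment classes reported above.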
```python
predict_language([
    "Ndeumfwa bwino lelo",                         # Bemba: "I am feeling well today"
    "Galimoto inachedwa koma driver anali bwino",  # Nyanja: "The car was late but the driver was good"
    "The service was terrible today",
])

predict_sentiment([
    "Driver was rude and unprofessional",
    "Ndimvela bwino lelo",                 # Nyanja: "I feel good today"
    "The ride was okay, nothing special",
])

predict_topic([
    "Payment failed but money was deducted",
    "Support siyankhapo, waited long",     # Nyanja: "Support did not respond"
    "Driver was over speeding",
])
```
Citation

```bibtex
@misc{LusakaLangMultiTask,
  author = {Kelvin Mbewe},
  title  = {LusakaLang Multi-Task Model},
  year   = {2025},
  url    = {https://huggingface.co/Kelvinmbewe/LusakaLang-MultiTask}
}
```
```
                   ┌────────────────────────────────────────┐
                   │       Input Text (Any Language)        │
                   └────────────────────────────────────────┘
                                       │
                                       ▼
                   ┌────────────────────────────────────────┐
                   │        Tokenizer (mBERT-based)         │
                   └────────────────────────────────────────┘
                                       │
                                       ▼
                   ┌────────────────────────────────────────┐
                   │       Shared mBERT Encoder Layer       │
                   │     (bert-base-multilingual-cased)     │
                   └────────────────────────────────────────┘
                                       │
                                       ▼
                   ┌────────────────────────────────────────┐
                   │      [CLS] Pooled Representation       │
                   └────────────────────────────────────────┘
                                       │
            ┌──────────────────────────┼──────────────────────────┐
            ▼                          ▼                          ▼
┌────────────────────────┐ ┌────────────────────────┐ ┌────────────────────────┐
│     Language Head      │ │     Sentiment Head     │ │       Topic Head       │
│     (Kelvinmbewe/      │ │     (Kelvinmbewe/      │ │     (Kelvinmbewe/      │
│     mbert_Lusaka_      │ │   mbert_LusakaLang_    │ │   mbert_LusakaLang_    │
│   Language_Analysis)   │ │  Sentiment_Analysis)   │ │         Topic)         │
└────────────────────────┘ └────────────────────────┘ └────────────────────────┘
            │                          │                          │
            ▼                          ▼                          ▼
┌────────────────────────┐ ┌────────────────────────┐ ┌────────────────────────┐
│     Language Label     │ │    Sentiment Label     │ │      Topic Label       │
│ (e.g., Bemba, Nyanja,  │ │   (Negative/Neutral/   │ │   (Driver, Payment,    │
│ English, Code-Switch)  │ │       Positive)        │ │     Support, etc.)     │
└────────────────────────┘ └────────────────────────┘ └────────────────────────┘
```