LusakaLang Multi‑Task Model (Language + Sentiment + Topic)

Model Description

LusakaLang‑MultiTask is a unified transformer model built on top of bert-base-multilingual-cased, designed to perform three tasks simultaneously:

  1. Language Identification
  2. Sentiment Analysis
  3. Topic Classification

The model integrates three fine‑tuned LusakaLang checkpoints:

  • Kelvinmbewe/mbert_Lusaka_Language_Analysis
  • Kelvinmbewe/mbert_LusakaLang_Sentiment_Analysis
  • Kelvinmbewe/mbert_LusakaLang_Topic

All tasks share a single mBERT encoder, with three independent classifier heads.
This architecture improves efficiency, reduces memory footprint, and enables consistent predictions across tasks.
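
The efficiency claim can be made concrete with rough parameter arithmetic. This is a back-of-envelope sketch: the 178M figure for mBERT base and the head class counts are assumptions for illustration, not numbers taken from the checkpoints.

```python
# Back-of-envelope comparison: three separate fine-tuned mBERT models
# vs one shared encoder with three linear heads.
ENCODER = 178_000_000  # assumed approximate size of bert-base-multilingual-cased

def head_params(n_classes, hidden=768):
    # A linear classifier head: weight matrix plus bias vector.
    return hidden * n_classes + n_classes

# Assumed class counts: 4 languages, 3 sentiments, 5 topics.
heads = head_params(4) + head_params(3) + head_params(5)
separate = 3 * ENCODER + heads   # three standalone fine-tuned models
shared = ENCODER + heads         # this multi-task design

print(f"separate: {separate:,} params, shared: {shared:,} params")
```

The heads are negligible next to the encoder, so sharing the encoder cuts the footprint to roughly a third of three standalone models.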


Why This Model Matters

Zambian communication is multilingual, fluid, and highly context‑dependent.
A single message may include:

  • English
  • Bemba
  • Nyanja
  • Slang
  • Code‑switching
  • Cultural idioms
  • Indirect emotional cues

This model is designed specifically for that environment.

It excels at:

  • Identifying the dominant language or code‑switching
  • Detecting sentiment polarity in culturally nuanced text
  • Classifying topics such as:
    • driver behaviour
    • payment issues
    • app performance
    • customer support
    • ride availability

Training Architecture

The model uses:

  • Shared Encoder: mBERT
  • Head 1: Language classifier
  • Head 2: Sentiment classifier
  • Head 3: Topic classifier

This multi‑task setup improves generalization and reduces inference cost.
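
The shared-encoder design can be sketched as a small PyTorch module. This is an illustration of the head wiring only, assuming a 768-dimensional pooled [CLS] vector and assumed class counts; the real model wraps bert-base-multilingual-cased as the encoder.

```python
import torch
import torch.nn as nn

class MultiTaskHeads(nn.Module):
    """Sketch of three independent classifier heads over one pooled vector."""

    def __init__(self, hidden=768, n_lang=4, n_sent=3, n_topic=5):
        super().__init__()
        self.language = nn.Linear(hidden, n_lang)
        self.sentiment = nn.Linear(hidden, n_sent)
        self.topic = nn.Linear(hidden, n_topic)

    def forward(self, pooled):
        # One encoder pass produces `pooled`; all three heads read from it.
        return {
            "language": self.language(pooled),
            "sentiment": self.sentiment(pooled),
            "topic": self.topic(pooled),
        }
```

Because the heads are independent linear layers, inference cost is one encoder forward pass regardless of how many tasks are queried.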


Performance Summary

Language Identification

Metric     Score
Accuracy   0.97
Macro‑F1   0.96

Sentiment Analysis (Epoch 30, Final Checkpoint)

Metric       Score
Accuracy     0.9322
Macro‑F1     0.9216
Negative F1  0.8649
Neutral F1   0.95
Positive F1  0.95

Topic Classification

Metric     Score
Accuracy   0.91
Macro‑F1   0.90
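
As a reference for the Macro‑F1 figures above: macro-F1 is the unweighted mean of per-class F1 scores, so minority classes (such as Negative in the sentiment results) weigh as much as majority ones. A minimal pure-Python sketch of the metric:

```python
def macro_f1(y_true, y_pred):
    """Macro-F1: unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```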

How to Use This Model

Load the Multi‑Task Model

from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("Kelvinmbewe/LusakaLang-MultiTask")

# Download the multi-task checkpoint from the Hub and load it locally;
# torch.load cannot read a Hub repo path directly.
weights_path = hf_hub_download("Kelvinmbewe/LusakaLang-MultiTask", filename="model.pt")
model = torch.load(weights_path, map_location="cpu")
model.eval()

def predict(texts, head):
    # Tokenize a batch, run the shared encoder once, and read logits from the
    # requested head: "language", "sentiment", or "topic".
    # Assumes the saved module returns a dict of per-head logits.
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc)[head]
    return logits.argmax(dim=-1).tolist()

predict([
    "Ndeumfwa bwino lelo",
    "Galimoto inachedwa koma driver anali bwino",
    "The service was terrible today"
], head="language")

predict([
    "Driver was rude and unprofessional",
    "Ndimvela bwino lelo",
    "The ride was okay, nothing special"
], head="sentiment")

predict([
    "Payment failed but money was deducted",
    "Support siyankhapo, waited long",
    "Driver was over speeding"
], head="topic")
Citation

@misc{LusakaLangMultiTask,
  author = {Kelvin Mbewe},
  title = {LusakaLang Multi-Task Model},
  year = {2025},
  url = {https://huggingface.co/Kelvinmbewe/LusakaLang-MultiTask}
}

Architecture Diagram

                         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                         β”‚         Input Text (Any Language)     β”‚
                         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                         β”‚
                                         β–Ό
                         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                         β”‚        Tokenizer (mBERT-based)        β”‚
                         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                         β”‚
                                         β–Ό
                         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                         β”‚      Shared mBERT Encoder Layer       β”‚
                         β”‚  (bert-base-multilingual-cased)       β”‚
                         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                         β”‚
                                         β–Ό
                         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                         β”‚        [CLS] Pooled Representation    β”‚
                         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                         β”‚
         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
         β”‚                               β”‚                               β”‚
         β–Ό                               β–Ό                               β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Language Head        β”‚     β”‚  Sentiment Head        β”‚     β”‚   Topic Head            β”‚
β”‚ (Kelvinmbewe/         β”‚     β”‚ (Kelvinmbewe/          β”‚     β”‚ (Kelvinmbewe/           β”‚
β”‚  mbert_Lusaka_        β”‚     β”‚  mbert_LusakaLang_     β”‚     β”‚  mbert_LusakaLang_      β”‚
β”‚  Language_Analysis)   β”‚     β”‚  Sentiment_Analysis)   β”‚     β”‚  Topic)                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                               β”‚                               β”‚
         β–Ό                               β–Ό                               β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Language Label       β”‚     β”‚  Sentiment Label       β”‚     β”‚  Topic Label            β”‚
β”‚ (e.g., Bemba, Nyanja, β”‚     β”‚ (Negative/Neutral/     β”‚     β”‚ (Driver, Payment,       β”‚
β”‚  English, Code‑Switch)β”‚     β”‚  Positive)             β”‚     β”‚  Support, etc.)         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Model size: 0.2B parameters (F32, Safetensors format)