# 🧪 **Day 02 – Sentiment Analysis & Zero-Shot Beyond Default with Hugging Face 🤗**

This notebook contains all the code experiments for **Day 2** of my *30 Days of GenAI* challenge.

For detailed commentary and discoveries, see 👉 [Day 2 Log](https://huggingface.co/Musno/30-days-of-genai/blob/main/logs/day2.md)


---

## 📌 What’s Covered Today

- 🔍 Comparing the **default Hugging Face sentiment pipeline** with fine-tuned Arabic models
- 🧪 Testing **multiple Arabic sentiment models**, including dialect support
- 🏆 Identifying the most accurate model for Arabic sentiment tasks
- 🌍 Exploring **zero-shot classification** in multilingual and cross-lingual settings
- 🧠 Evaluating how different models handle **Arabic inputs**, **mixed label languages**, and **right-to-left (RTL)** alignment issues
- ✅ Highlighting top-performing models for real-world, multi-language use cases

Let’s dive in and benchmark some models! 🚀

---


In [60]:
from transformers import pipeline

### 🥇 Best Arabic Sentiment Model – `CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment`

After testing multiple Arabic sentiment models, this one stood out: it gave confident, correct predictions on both **Modern Standard Arabic** and **dialects** (such as Egyptian).

For clarity, only the top-performing model is included here. Others showed noticeably lower accuracy or poor dialect support.

Let's load it and run a quick test. 🧪👇


In [65]:
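# Load the Arabic sentiment model through the standard sentiment-analysis pipeline
classifier = pipeline("sentiment-analysis", model="CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment")

# The same kind of positive sentence in several languages and registers
english = classifier("I love you")
arabic = classifier("أنا بحبك")  # colloquial Arabic
arabic_dialect = classifier("الواد سواق التوك توك جارنا عسل")  # Egyptian dialect
arabic_formal = classifier("أنا أحبك")  # Modern Standard Arabic
french = classifier("je t'aime")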

print(english)
print(arabic)
print(arabic_dialect)
print(arabic_formal)
print(french)

Device set to use cpu


[{'label': 'positive', 'score': 0.616008996963501}]
[{'label': 'positive', 'score': 0.9781275987625122}]
[{'label': 'positive', 'score': 0.973617434501648}]
[{'label': 'positive', 'score': 0.9768486022949219}]
[{'label': 'positive', 'score': 0.5023519396781921}]
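
The English and French scores above (about 0.62 and 0.50) are far lower than the three Arabic ones, so the "positive" label on those two inputs is not very trustworthy. As a minimal sketch, a small helper can flag such cases. The function name `flag_low_confidence` and the 0.9 cutoff are my own assumptions for illustration, not part of the pipeline API.

```python
# Hypothetical helper: lists inputs whose top prediction looks shaky.
# The 0.9 cutoff is an illustrative assumption, not a tuned value.
def flag_low_confidence(results, threshold=0.9):
    """Return the names of inputs whose top score falls below `threshold`.

    `results` maps a name to the list returned by the sentiment pipeline,
    e.g. {"english": [{"label": "positive", "score": 0.61}]}.
    """
    return [name for name, preds in results.items()
            if preds[0]["score"] < threshold]


# Scores copied from the cell output above (rounded to three decimals).
results = {
    "english": [{"label": "positive", "score": 0.616}],
    "arabic": [{"label": "positive", "score": 0.978}],
    "arabic_dialect": [{"label": "positive", "score": 0.974}],
    "arabic_formal": [{"label": "positive", "score": 0.977}],
    "french": [{"label": "positive", "score": 0.502}],
}

print(flag_low_confidence(results))
```

With the scores from this run, only the English and French inputs fall below the cutoff.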


## 🔎 Zero‑Shot Classification with `MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`

This multilingual NLI model performed best across my Day 2 tests. The five scenarios below check how well it handles different input/label language pairings, including mixed-language labels and RTL alignment.
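
For context, the `zero-shot-classification` pipeline treats the input as an NLI premise and turns each candidate label into a hypothesis using `hypothesis_template` (default `"This example is {}."`). In single-label mode the entailment logits are then normalized across labels with a softmax. The sketch below mimics those two steps in plain Python. The logits are invented for illustration and are not model output, and the helper names are my own.

```python
import math

# Default template used by the zero-shot pipeline when none is passed.
DEFAULT_TEMPLATE = "This example is {}."


def build_hypotheses(labels, template=DEFAULT_TEMPLATE):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]


def softmax(logits):
    """Normalize entailment logits so the label scores sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["education", "sports", "food"]
hypotheses = build_hypotheses(labels)

# Invented entailment logits, for illustration only (not model output).
fake_logits = [3.0, 1.0, 0.0]
scores = softmax(fake_logits)

for hypothesis, score in zip(hypotheses, scores):
    print(f"{hypothesis} -> {score:.3f}")
```

This is why the scores in every output below sum to roughly 1. When all labels are Arabic, an Arabic template could be passed through `hypothesis_template`; I did not test whether that helps with this model.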

---

### 1️⃣ Arabic Input → Arabic Labels 
**Goal:** Check pure Arabic performance. 
**What to expect:** High confidence (95–99%) on clear MSA and dialectal sentences.

---

In [49]:
classifier = pipeline("zero-shot-classification", model="MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7")
classifier(
 "أنا أحب تعلم الذكاء الاصطناعي",
 candidate_labels=["تعليم", "رياضة", "طعام"]
)

Device set to use cpu


{'sequence': 'أنا أحب تعلم الذكاء الاصطناعي',
 'labels': ['تعليم', 'رياضة', 'طعام'],
 'scores': [0.9606460332870483, 0.029150988906621933, 0.010203025303781033]}

In [38]:
output = classifier(
 "أنا أحب تعلم الذكاء الاصطناعي",
 candidate_labels=["تعليم", "رياضة", "طعام"]
)
for label, score in zip(output['labels'], output['scores']):
 print(f"{label}: {score:.3f}")

تعليم: 0.961
رياضة: 0.029
طعام: 0.010


### 2️⃣ Arabic Input → English Labels 
**Goal:** See if the model can map Arabic text into English categories. 
**What to expect:** Strong confidence (~85–90%) on the correct label, showing true cross-lingual zero-shot ability.

---

In [39]:
classifier(
 "أنا أحب تعلم الذكاء الاصطناعي",
 candidate_labels=["education", "sports", "politics"]
)

{'sequence': 'أنا أحب تعلم الذكاء الاصطناعي',
 'labels': ['education', 'politics', 'sports'],
 'scores': [0.8637387156486511, 0.07514170557260513, 0.06111961230635643]}
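
Note that the pipeline returns labels sorted by score, highest first, not in the order they were passed in: `politics` comes before `sports` above even though `sports` was listed first. A minimal sketch for lining the scores back up with the original candidate order, using a hypothetical helper `scores_in_candidate_order`:

```python
def scores_in_candidate_order(output, candidate_labels):
    """Return (label, score) pairs in the order the labels were passed in."""
    by_label = dict(zip(output["labels"], output["scores"]))
    return [(label, by_label[label]) for label in candidate_labels]


# Output copied from the cell above.
output = {
    "sequence": "أنا أحب تعلم الذكاء الاصطناعي",
    "labels": ["education", "politics", "sports"],
    "scores": [0.8637387156486511, 0.07514170557260513, 0.06111961230635643],
}

ordered = scores_in_candidate_order(output, ["education", "sports", "politics"])
for label, score in ordered:
    print(f"{label}: {score:.3f}")
```

This is handy when comparing several runs side by side, since each run may rank the labels differently.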

### 3️⃣ English Input → English Labels 
**Goal:** Benchmark against defaults on an all‑English task. 
**What to expect:** Solid confidence (80–85%) on the correct label, exceeding the default pipeline.

---

In [50]:
classifier(
 "I love learning AI",
 candidate_labels=["education", "sports", "food"]
)

{'sequence': 'I love learning AI',
 'labels': ['education', 'sports', 'food'],
 'scores': [0.8431005477905273, 0.11592638492584229, 0.04097312316298485]}

### 4️⃣ English Input → Arabic Labels 
**Goal:** Reverse the second test: English text, Arabic label set. 
**What to expect:** Reliable mapping (~80–85% confidence on the correct label), far above the default pipeline's ~30%.

---

In [41]:
classifier(
 "I love learning AI",
 candidate_labels=["طعام", "تعليم", "رياضة"]
)

{'sequence': 'I love learning AI',
 'labels': ['تعليم', 'رياضة', 'طعام'],
 'scores': [0.8412964940071106, 0.09517763555049896, 0.06352593004703522]}

In [42]:
output = classifier("I love learning AI", candidate_labels=["طعام", "تعليم", "رياضة"])
for label, score in zip(output['labels'], output['scores']):
 print(f"{label}: {score:.3f}")

تعليم: 0.841
رياضة: 0.095
طعام: 0.064
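
As a quick sanity check, the "What to expect" ranges for the first four scenarios can be compared with the top scores actually observed. This is only a sketch: the ranges are read off the notes above as inclusive bounds, and `within_expectation` is a hypothetical helper.

```python
# (low, high) bounds taken from each scenario's "What to expect" note,
# paired with the top score observed in that scenario's output.
scenarios = {
    "Arabic -> Arabic": ((0.95, 0.99), 0.9606460332870483),
    "Arabic -> English": ((0.85, 0.90), 0.8637387156486511),
    "English -> English": ((0.80, 0.85), 0.8431005477905273),
    "English -> Arabic": ((0.80, 0.85), 0.8412964940071106),
}


def within_expectation(bounds, score):
    """True when `score` lies inside the inclusive (low, high) range."""
    low, high = bounds
    return low <= score <= high


report = {name: within_expectation(bounds, score)
          for name, (bounds, score) in scenarios.items()}

for name, ok in report.items():
    print(f"{name}: {'within range' if ok else 'outside range'}")
```

All four observed scores land inside their expected ranges for these single-sentence tests.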


### 5️⃣ Mixed Labels (Arabic + English) 
**Goal:** Stress‑test the model with a combined RTL/LTR label set. 
**What to expect:** 
- Correct top-label selection
- Labels displayed in a consistent order (English on the left, Arabic on the right)
- Each score still attached to the right label despite RTL rendering

---

Let’s run the test! 🚀

In [43]:
classifier(
 "أنا أحب تعلم الذكاء الاصطناعي",
 candidate_labels=["education", "رياضة", "طعام"]
)

{'sequence': 'أنا أحب تعلم الذكاء الاصطناعي',
 'labels': ['education', 'رياضة', 'طعام'],
 'scores': [0.9260158538818359, 0.05480289086699486, 0.019181348383426666]}
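
When Arabic and English labels share one line, bidirectional text layout can make the displayed order look scrambled even though the underlying lists are fine. One option, sketched below, is to print one label per line and wrap each label in Unicode directional isolates (U+2068 and U+2069). Whether a given notebook or terminal honors these control characters varies, so treat this as something to try rather than a guaranteed fix. `format_scores` is a hypothetical helper.

```python
# Unicode directional isolate controls.
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction follows the wrapped text
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the isolate


def format_scores(output):
    """Format each label/score pair, isolating the label's text direction."""
    return [f"{FSI}{label}{PDI}: {score:.3f}"
            for label, score in zip(output["labels"], output["scores"])]


# Labels and scores copied from the cell output above.
output = {
    "labels": ["education", "رياضة", "طعام"],
    "scores": [0.9260158538818359, 0.05480289086699486, 0.019181348383426666],
}

lines = format_scores(output)
for line in lines:
    print(line)
```

The isolates only affect rendering; the label strings and scores themselves are unchanged.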