Update README.md

---
base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
pipeline_tag: text-generation
tags:
- text-generation
- sql-generation
- llama
- lora
- peft
- unsloth
- transformers
license: apache-2.0
language:
- en
---

# SQL-Genie (LLaMA-3.1-8B Fine-Tuned)

## 🧠 Model Overview

**SQL-Genie** is a fine-tuned version of **LLaMA-3.1-8B**, specialized for converting **natural language questions into SQL queries**.

The model was trained using **parameter-efficient fine-tuning (LoRA)** on a structured SQL instruction dataset, enabling strong SQL generation performance while remaining lightweight and affordable to train on limited compute such as Google Colab.

- **Developed by:** dhashu
- **Base model:** `unsloth/meta-llama-3.1-8b-bnb-4bit`
- **License:** Apache-2.0
- **Training stack:** Unsloth + Hugging Face TRL

---

## ⚙️ Training Methodology

This model was trained using **LoRA (Low-Rank Adaptation)** via the **PEFT** framework.

### Key Details
- Base model loaded in **4-bit quantization** for memory efficiency
- **Base weights frozen**
- **LoRA adapters** applied to:
  - Attention layers (`q_proj`, `k_proj`, `v_proj`, `o_proj`)
  - Feed-forward layers (`gate_proj`, `up_proj`, `down_proj`)
- Fine-tuned using **Supervised Fine-Tuning (SFT)**

This approach allows efficient specialization without full model retraining.
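
A minimal sketch of a comparable Unsloth + TRL setup is shown below; the hyperparameters, formatting helper, and training length are illustrative assumptions, not the recorded configuration of this model:

```python
# Illustrative LoRA + SFT recipe with Unsloth and TRL -- a sketch of a
# comparable setup, NOT the exact configuration used for SQL-Genie.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base model; these weights stay frozen during training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach trainable LoRA adapters to the attention and feed-forward projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                         # illustrative LoRA rank
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Render each record into an instruction-style "text" column
# (full prompt template in the Dataset section below).
def to_text(ex):
    return {"text": f"### Input: {ex['question']}\n"
                    f"### Context: {ex['context']}\n"
                    f"### SQL Response:\n{ex['answer']}"}

dataset = load_dataset("b-mc2/sql-create-context", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",    # older TRL API; newer releases use SFTConfig
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        max_steps=60,             # illustrative; tune for a real run
        output_dir="outputs",
    ),
)
trainer.train()
```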

---

## 📊 Dataset

The model was trained on a subset of the **`b-mc2/sql-create-context`** dataset, which includes:

- Natural language questions
- Database schema / context
- Corresponding SQL queries

Each sample was formatted as an **instruction-style prompt** to improve reasoning and structured output.
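
A sketch of that formatting step; the exact template is inferred from the inference example later in this card, so treat its wording as an assumption:

```python
# Hypothetical formatter: renders one b-mc2/sql-create-context record
# (fields: question, context, answer) as an instruction-style prompt.
def format_sample(example: dict) -> dict:
    text = (
        "Below is an input question, context is given to help. Generate a SQL response.\n"
        f"### Input: {example['question']}\n"
        f"### Context: {example['context']}\n"
        f"### SQL Response:\n{example['answer']}"
    )
    return {"text": text}

# Illustrative record -> training text:
example = {
    "question": "How many department heads are older than 56?",
    "context": "CREATE TABLE head (age INTEGER)",
    "answer": "SELECT COUNT(*) FROM head WHERE age > 56",
}
print(format_sample(example)["text"])
```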

---

## 📈 Performance & Efficiency

- 🚀 **2× faster fine-tuning** using Unsloth
- 💾 **Low VRAM usage** via 4-bit quantization
- 🧠 Improved SQL syntax and schema understanding
- ⚡ Suitable for real-time inference and lightweight deployments

---

## 🧩 Model Variants

This repository contains a **merged model**:

### 🔹 Merged 4-bit Model
- LoRA adapters merged into base weights
- No PEFT required at inference time
- Ready-to-use single checkpoint
- Optimized for easy deployment
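
For context, a merged checkpoint like this is typically exported at the end of an Unsloth training run. A minimal sketch, assuming Unsloth's `save_pretrained_merged` helper and the `model`/`tokenizer` from the training sketch above (the directory name is a placeholder):

```python
# Sketch: fold the LoRA adapters into the base weights and save a single
# standalone checkpoint ("sql-genie-merged" is a placeholder directory).
model.save_pretrained_merged(
    "sql-genie-merged",
    tokenizer,
    save_method="merged_16bit",  # Unsloth also offers 4-bit merged exports
)
```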

---

## ▶️ How to Use (Inference)

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "dhashu/sql-genie-full"

# Load the merged checkpoint in 4-bit (requires bitsandbytes)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_4bit=True,
)

# Instruction-style prompt in the same format used during fine-tuning
prompt = """Below is an input question, context is given to help. Generate a SQL response.
### Input: List all employees hired after 2020
### Context: CREATE TABLE employees(id, name, hire_date)
### SQL Response:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,   # temperature has no effect without sampling
    temperature=0.7,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
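
With `do_sample=True` the output varies between runs; for deterministic SQL generation, drop the `do_sample` and `temperature` arguments so `generate` decodes greedily.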