---
title: Fathom R1 Chat
emoji: 🤖
colorFrom: indigo
colorTo: blue
sdk: docker
pinned: false
---
# Fathom R1 Chat — Full-stack (React + FastAPI)

A ChatGPT-style UI built with React, backed by a FastAPI server that runs `FractalAIResearch/Fathom-R1-14B` via `transformers`.
## Run with Docker (GPU)

Requires an NVIDIA GPU and the NVIDIA Container Toolkit.
```bash
docker build -t fathom-r1-chat .
docker run --gpus all -p 8000:8000 \
  -e MODEL_ID=FractalAIResearch/Fathom-R1-14B \
  -e QUANTIZE=auto \
  fathom-r1-chat
# Open http://localhost:8000
```
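The actual loading logic lives in the backend; as a rough sketch only, assuming `QUANTIZE` accepts values like `4bit`/`8bit`/`none` and `auto` picks based on available VRAM (these value names and the threshold are assumptions, not the app's real behavior), the env vars might be wired up like this:

```python
# Hypothetical sketch of how the backend could honor MODEL_ID / QUANTIZE.
import os
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = os.environ.get("MODEL_ID", "FractalAIResearch/Fathom-R1-14B")
quantize = os.environ.get("QUANTIZE", "auto")

if quantize == "auto":
    # Assumption: fall back to 4-bit when the GPU has under ~30 GB total.
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    quantize = "4bit" if total_gb < 30 else "none"

quant_config = None
if quantize == "4bit":
    # NF4 4-bit keeps a 14B model within 16-24 GB GPUs.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
elif quantize == "8bit":
    quant_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,  # None -> plain bf16 weights
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```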
## Notes

- The model is derived from DeepSeek-R1-Distill-Qwen-14B and targets 16K-token context. Use the tokenizer's chat template (sketch after this list).
- For long answers, bump `max_new_tokens` in the request.
- If you need private Hugging Face Hub access, pass `-e HUGGING_FACE_HUB_TOKEN=...`.
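Since the model expects its tokenizer's chat template, the backend presumably formats the incoming `messages` list along these lines; a minimal sketch (the example message is illustrative), not the app's actual code:

```python
# Sketch: turning an OpenAI-style messages list into model input using
# the tokenizer's built-in chat template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FractalAIResearch/Fathom-R1-14B")

messages = [
    {"role": "user", "content": "Prove that sqrt(2) is irrational."},
]

# apply_chat_template inserts the model's expected role tags and the
# trailing assistant prompt, so generation starts in the right place.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(prompt, return_tensors="pt")
```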
## Dev mode (run separately)

```bash
# backend
cd backend
python3 -m venv .venv && source .venv/bin/activate
pip install --index-url https://download.pytorch.org/whl/cu121 torch==2.4.1+cu121
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```

```bash
# frontend (new terminal)
cd frontend
npm ci
npm run dev
```
## API

`POST /api/chat` with `{ messages: [{role, content}, ...], max_new_tokens, temperature, top_p }` → `{ reply, model }`
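For example, assuming the server from the Docker step is listening on localhost:8000, you can exercise the endpoint with `requests` (field names follow the schema above; the sampling values are just illustrative):

```python
# Minimal client for POST /api/chat; response shape per the schema above.
import requests

resp = requests.post(
    "http://localhost:8000/api/chat",
    json={
        "messages": [{"role": "user", "content": "What is 17 * 23?"}],
        "max_new_tokens": 512,
        "temperature": 0.6,
        "top_p": 0.95,
    },
    timeout=300,  # generation on a 14B model can take a while
)
resp.raise_for_status()
data = resp.json()
print(data["model"], ":", data["reply"])
```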
## Hardware

- 14B-parameter model; for comfortable generation plan on 24–40 GB of VRAM, or use 4/8-bit quantization on 16–24 GB GPUs (back-of-envelope estimate below).
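Those ranges follow from simple weight-size arithmetic; the KV cache, activations, and CUDA overhead come on top of the figures printed here:

```python
# Back-of-envelope weight memory for a 14B-parameter model at common
# precisions. Real usage is higher once cache and overhead are added.
params = 14e9
for label, bytes_per_param in [("fp16/bf16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    gb = params * bytes_per_param / 1e9
    print(f"{label:>9}: ~{gb:.0f} GB of weights")
# fp16/bf16 ~28 GB -> wants a 40 GB-class GPU with overhead
# 8-bit     ~14 GB -> fits 24 GB GPUs
# 4-bit     ~7 GB  -> fits 16 GB GPUs
```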
## License

- MIT for both: the model card states MIT, and this template is MIT as well.