Research / README.md
Proff12's picture
Updated Readme
b9d0394 verified
metadata
title: Fathom R1 Chat
emoji: 🤖
colorFrom: indigo
colorTo: blue
sdk: docker
pinned: false

Fathom R1 Chat — Full‑stack (React + FastAPI)

ChatGPT‑style UI on React + a FastAPI backend that calls FractalAIResearch/Fathom-R1-14B via transformers.

Run with Docker (GPU)

Requires an NVIDIA GPU + NVIDIA Container Toolkit.

docker build -t fathom-r1-chat .
docker run --gpus all -p 8000:8000           -e MODEL_ID=FractalAIResearch/Fathom-R1-14B           -e QUANTIZE=auto           fathom-r1-chat
# Open http://localhost:8000

Notes

  • Model is derived from DeepSeek-R1-Distill-Qwen-14B and targets 16K context usage. Use the tokenizer chat template.
  • For long answers, bump max_new_tokens in the request.
  • If you need private HF access, pass -e HUGGING_FACE_HUB_TOKEN=....

Dev mode (run separately)

# backend
cd backend
python3 -m venv .venv && source .venv/bin/activate
pip install --index-url https://download.pytorch.org/whl/cu121 torch==2.4.1+cu121
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# frontend (new terminal)
cd frontend
npm ci
npm run dev

API

  • POST /api/chat with { messages: [{role, content}, ...], max_new_tokens, temperature, top_p }{ reply, model }

Hardware

  • 14B parameter model; for comfortable generation use >=24–40 GB VRAM or 4/8‑bit quantization on 16–24 GB GPUs.

License

  • MIT (model card states MIT) and this template is MIT.