Spaces:

FractalAI
/

Research

Sleeping

App Files Files Community

Research / README.md

Proff12

Updated Readme

b9d0394 verified 3 months ago

preview code

raw

history blame contribute delete

1.51 kB

metadata

title: Fathom R1 Chat
emoji: 🤖
colorFrom: indigo
colorTo: blue
sdk: docker
pinned: false

Fathom R1 Chat — Full‑stack (React + FastAPI)

ChatGPT‑style UI on React + a FastAPI backend that calls FractalAIResearch/Fathom-R1-14B via transformers.

Run with Docker (GPU)

Requires an NVIDIA GPU + NVIDIA Container Toolkit.

docker build -t fathom-r1-chat .
docker run --gpus all -p 8000:8000           -e MODEL_ID=FractalAIResearch/Fathom-R1-14B           -e QUANTIZE=auto           fathom-r1-chat
# Open http://localhost:8000

Notes

Model is derived from DeepSeek-R1-Distill-Qwen-14B and targets 16K context usage. Use the tokenizer chat template.
For long answers, bump max_new_tokens in the request.
If you need private HF access, pass -e HUGGING_FACE_HUB_TOKEN=....

Dev mode (run separately)

# backend
cd backend
python3 -m venv .venv && source .venv/bin/activate
pip install --index-url https://download.pytorch.org/whl/cu121 torch==2.4.1+cu121
pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# frontend (new terminal)
cd frontend
npm ci
npm run dev

API

POST /api/chat with { messages: [{role, content}, ...], max_new_tokens, temperature, top_p } → { reply, model }

Hardware

14B parameter model; for comfortable generation use >=24–40 GB VRAM or 4/8‑bit quantization on 16–24 GB GPUs.

License

MIT (model card states MIT) and this template is MIT.