S1-Base-1.5-32B-128K


Model Introduction

This repository contains S1-Base-1.5-32B-128K, a general scientific large language model built by post-training (SFT + GRPO) the scientific foundation model S1-Base-32B. It preserves scientific reasoning capability while significantly improving long-context understanding and reasoning, as well as complex instruction following in scientific research scenarios. The model supports a context length of 128K.

Model Weights

The S1-Base-1.5-32B-128K model is open-sourced under the Apache 2.0 license. You can download the model weights from our Huggingface or ModelScope.

| Model Name | Huggingface Link | ModelScope Link |
| --- | --- | --- |
| S1-Base-1.5-32B-128K | Download | Download |

Model Evaluation

To comprehensively validate the capabilities of S1-Base-1.5-32B-128K, we conducted systematic evaluations across three core competencies: long context ability, instruction following ability, and scientific reasoning ability. The results are shown in the table below.

| Benchmark | S1-Base-1.5-32B-128K | S1-Base-32B | Qwen3-32B | GLM-Z1-32B-0414 |
| --- | --- | --- | --- | --- |
| CLongEval | 52.95 | 44.97 | 47.71 | 32.11 |
| InfiniteBench | 40.76 | 37.54 | 40.14 | 30.45 |
| IFEval | 86.88 | 76.53 | 85.00 | 84.87 |
| GPQA | 70.77 | 69.44 | 66.04 | 55.81 |
| ChemBench | 62.30 | 63.60 | 61.81 | 55.85 |
| LLM-MSE | 88.61 | 91.26 | 88.50 | 80.97 |
| LAB bench | 36.18 | 41.52 | 34.45 | 29.89 |
| AIME2024 | 81.46 | 81.25 | 80.63 | 79.37 |
| AIME25 | 71.25 | 69.58 | 67.50 | 51.25 |

Key Highlights:

  • 📜 Enhanced Long Context Reasoning: The model leads among base models and similar-sized models on public long-context benchmarks such as CLongEval and InfiniteBench, with significant improvements in custom long-text evaluations for real-world scenarios involving papers and web pages.
  • 🎯 Improved Complex Instruction Following: Training used a scientific-literature instruction-following task system covering four major categories (document understanding, structured generation, information extraction, and chart comprehension), combined with multi-dimensional constraints on length, format, and content. The model maintains its lead on benchmarks such as IFEval.
  • 🔬 Stable Scientific Reasoning Capability: The model shows significant advantages on GPQA, a comprehensive scientific capability evaluation benchmark covering biology, physics, and chemistry. Performance on other scientific task evaluation benchmarks remains stable without significant fluctuations due to context expansion.
  • 👍 User Feedback Data Flywheel: Model performance and user experience in real-world scenarios are continuously optimized by incorporating like/dislike feedback from users of the ScienceOne platform.

Deployment

We recommend using vLLM to deploy S1-Base for efficient inference and OpenAI-compatible API services.

Quick start command example:

```shell
pip install vllm
vllm serve <your_s1_model_path> --served-model-name s1-base-1.5-32b-128k
```
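To use the full 128K context window, the serving command may need an explicit maximum model length. A minimal sketch, assuming a multi-GPU node (the GPU count of 4 is an assumption; adjust `--tensor-parallel-size` to your hardware):

```shell
# Serve with the full 128K context window explicitly enabled.
# --tensor-parallel-size 4 is an assumption for a 4-GPU node.
vllm serve <your_s1_model_path> \
    --served-model-name s1-base-1.5-32b-128k \
    --max-model-len 131072 \
    --tensor-parallel-size 4
```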

The API request and response formats are largely compatible with the OpenAI API. Please refer to the official vLLM documentation for details.

Generate responses using OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="")
resp = client.chat.completions.create(
    model="s1-base-1.5-32b-128k",
    messages=[{"role": "user", "content": "hi"}]
)
print(resp.choices[0].message.content)
```
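When sending very long documents, it can help to check that the prompt plausibly fits in the 128K window before calling the API. A minimal sketch using a rough characters-per-token heuristic (the 3-characters-per-token ratio is an assumption, not a property of this model's tokenizer; use the actual tokenizer for precise counts):

```python
# Rough prompt-budget check before calling the API.
# CHARS_PER_TOKEN = 3 is a conservative heuristic, not the real tokenizer ratio.
CONTEXT_WINDOW = 131072   # 128K-token context length
CHARS_PER_TOKEN = 3

def fits_in_context(prompt: str, max_new_tokens: int = 4096) -> bool:
    """Return True if the prompt plus the reply budget plausibly fits."""
    est_prompt_tokens = len(prompt) // CHARS_PER_TOKEN + 1
    return est_prompt_tokens + max_new_tokens <= CONTEXT_WINDOW

print(fits_in_context("hi"))             # short prompt fits
print(fits_in_context("x" * 1_000_000))  # ~333K estimated tokens does not
```

For exact counts, tokenize the prompt with the model's tokenizer instead of estimating from character length.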

Generate responses using CURL:

```shell
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "s1-base-1.5-32b-128k", "messages": [{"role": "user", "content": "hi"}], "skip_special_tokens": false}'
```