S1-Base-1.5-32B-128K


Model Introduction

This repository contains S1-Base-1.5-32B-128K, a general scientific large language model built by post-training (SFT + GRPO) the scientific foundation model S1-Base-32B. It preserves scientific reasoning capability while significantly improving long-context understanding and reasoning, as well as complex instruction following in scientific research scenarios. The model supports a context length of 128K.

Model Weights

The S1-Base-1.5-32B-128K model is open-sourced under the Apache 2.0 license. You can download the model weights from our Huggingface or ModelScope.

| Model Name | Huggingface Link | ModelScope Link |
| --- | --- | --- |
| S1-Base-1.5-32B-128K | Download | Download |

Model Evaluation

To comprehensively validate the capabilities of S1-Base-1.5-32B-128K, we conducted systematic evaluations across three core competencies: long context ability, instruction following ability, and scientific reasoning ability. The results are shown in the table below.

| Benchmark | S1-Base-1.5-32B-128K | S1-Base-32B | Qwen3-32B | GLM-Z1-32B-0414 |
| --- | --- | --- | --- | --- |
| CLongEval | 52.95 | 44.97 | 47.71 | 32.11 |
| InfiniteBench | 40.76 | 37.54 | 40.14 | 30.45 |
| IFEval | 86.88 | 76.53 | 85.00 | 84.87 |
| GPQA | 70.77 | 69.44 | 66.04 | 55.81 |
| ChemBench | 62.30 | 63.60 | 61.81 | 55.85 |
| LLM-MSE | 88.61 | 91.26 | 88.50 | 80.97 |
| LAB bench | 36.18 | 41.52 | 34.45 | 29.89 |
| AIME2024 | 81.46 | 81.25 | 80.63 | 79.37 |
| AIME25 | 71.25 | 69.58 | 67.50 | 51.25 |

Key Highlights:

  • 📜 Enhanced Long Context Reasoning: The model leads among base models and similar-sized models on public long-context benchmarks such as CLongEval and InfiniteBench, with significant improvements in custom long-text evaluations for real-world scenarios involving papers and web pages.
  • 🎯 Improved Complex Instruction Following: Training used a scientific-literature instruction-following task system covering four major categories (document understanding, structured generation, information extraction, and chart comprehension), combined with multi-dimensional constraints on length, format, and content. The model maintains its lead on benchmarks such as IFEval.
  • 🔬 Stable Scientific Reasoning Capability: The model shows significant advantages on GPQA, a comprehensive scientific capability evaluation benchmark covering biology, physics, and chemistry. Performance on other scientific task evaluation benchmarks remains stable without significant fluctuations due to context expansion.
  • 👍 User Feedback Data Flywheel: Model performance and user experience in real-world scenarios are continuously optimized by incorporating like/dislike feedback from users of the ScienceOne platform.

Deployment

We recommend using vLLM to deploy S1-Base for efficient inference and OpenAI-compatible API services.

Quick start command example:

```shell
pip install vllm
vllm serve <your_s1_model_path> --served-model-name s1-base-1.5-32b-128k
```
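To use the full 128K context window, the serving command may need an explicit maximum model length. A minimal sketch, assuming a multi-GPU node (the GPU count of 4 is an assumption; adjust `--tensor-parallel-size` to your hardware):

```shell
# Serve with the full 128K context window explicitly enabled.
# --tensor-parallel-size 4 is an assumption for a 4-GPU node.
vllm serve <your_s1_model_path> \
    --served-model-name s1-base-1.5-32b-128k \
    --max-model-len 131072 \
    --tensor-parallel-size 4
```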

The API request and response formats are largely compatible with the OpenAI API. Please refer to the official vLLM documentation for details.

Generate responses using OpenAI Python SDK:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="")
resp = client.chat.completions.create(
    model="s1-base-1.5-32b-128k",
    messages=[{"role": "user", "content": "hi"}]
)
print(resp.choices[0].message.content)
```
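When sending very long documents, it can help to check that the prompt plausibly fits in the 128K window before calling the API. A minimal sketch using a rough characters-per-token heuristic (the 3-characters-per-token ratio is an assumption, not a property of this model's tokenizer; use the actual tokenizer for precise counts):

```python
# Rough prompt-budget check before calling the API.
# CHARS_PER_TOKEN = 3 is a conservative heuristic, not the real tokenizer ratio.
CONTEXT_WINDOW = 131072   # 128K-token context length
CHARS_PER_TOKEN = 3

def fits_in_context(prompt: str, max_new_tokens: int = 4096) -> bool:
    """Return True if the prompt plus the reply budget plausibly fits."""
    est_prompt_tokens = len(prompt) // CHARS_PER_TOKEN + 1
    return est_prompt_tokens + max_new_tokens <= CONTEXT_WINDOW

print(fits_in_context("hi"))             # short prompt fits
print(fits_in_context("x" * 1_000_000))  # ~333K estimated tokens does not
```

For exact counts, tokenize the prompt with the model's tokenizer instead of estimating from character length.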

Generate responses using CURL:

```shell
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "s1-base-1.5-32b-128k", "messages": [{"role": "user", "content": "hi"}], "skip_special_tokens": false}'
```