ComPO

community

Recent Activity

PeterLauLukCh

authored 2 papers 7 days ago

Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

Paper • 2512.16912 • Published 14 days ago • 10

GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Paper • 2512.19682 • Published 10 days ago • 15

PeterLauLukCh

submitted a paper to Daily Papers 13 days ago

Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

Paper • 2512.16912 • Published 14 days ago • 10

PeterLauLukCh

updated a collection 3 months ago

Ablation on Scaling

2 items • Updated Oct 4, 2025

PeterLauLukCh

updated a Space 3 months ago

README

Released Models of ComPO

PeterLauLukCh

updated a model 5 months ago

ComparisonPO/Mistral-7B-Instruct-A0.21B-ComPO

7B • Updated Aug 4, 2025 • 5

PeterLauLukCh

published a model 5 months ago

ComparisonPO/Mistral-7B-Instruct-A0.21B-ComPO

7B • Updated Aug 4, 2025 • 5

PeterLauLukCh

updated a model 5 months ago

ComparisonPO/Mistral-7B-Instruct-ComPO-3300pert-300iter-2

7B • Updated Jul 21, 2025 • 5

PeterLauLukCh

published a model 5 months ago

ComparisonPO/Mistral-7B-Instruct-ComPO-3300pert-300iter-2

7B • Updated Jul 21, 2025 • 5

PeterLauLukCh

updated a collection 6 months ago

Ablation on Scaling

2 items • Updated Oct 4, 2025

PeterLauLukCh

updated a model 6 months ago

ComparisonPO/Mistral-7B-Instruct-ComPO-3300pert-300iter

7B • Updated Jul 13, 2025 • 7

PeterLauLukCh

published a model 6 months ago

ComparisonPO/Mistral-7B-Instruct-ComPO-3300pert-300iter

7B • Updated Jul 13, 2025 • 7

PeterLauLukCh

authored 4 papers 6 months ago

Displacement-Sparse Neural Optimal Transport

Paper • 2502.01889 • Published Feb 3, 2025

Geometric Framework for 3D Cell Segmentation Correction

Paper • 2502.01890 • Published Feb 3, 2025

ComPO: Preference Alignment via Comparison Oracles

Paper • 2505.05465 • Published May 8, 2025 • 1

Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO

Paper • 2505.11595 • Published May 16, 2025 • 1

PeterLauLukCh

updated a model 9 months ago

ComparisonPO/Gemma-2-9b-it-SimPO-ComPO

9B • Updated Apr 10, 2025 • 3

PeterLauLukCh

updated a collection 9 months ago

SimPO+ComPO

4 items • Updated Apr 10, 2025

PeterLauLukCh

published a model 9 months ago

ComparisonPO/Gemma-2-9b-it-SimPO-ComPO

9B • Updated Apr 10, 2025 • 3
