Update README.md
README.md
CHANGED
@@ -8,6 +8,17 @@ library_name: sentence-transformers
---
<h2 align="left">Yuan-embedding-2.0-en</h2>
Yuan-embedding-2.0-en 是专门为英文文本检索任务设计的嵌入模型。我们在[Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)的基础上,针对Retrieval任务与Reranking任务进行了进一步优化。主要工作如下:
- 数据增强
@@ -18,7 +29,7 @@ Yuan-embedding-2.0-en 是专门为英文文本检索任务设计的嵌入模型
- Matryoshka Representation Learning
- Retrieval任务使用InfoNCE with in-batch-negative
- 针对Reranking任务使用InfoNCE with in-batch-negative
-
<h2 align="left">Usage</h2>
---
<h2 align="left">Yuan-embedding-2.0-en</h2>
+Yuan-embedding-2.0-en is an embedding model designed for English text retrieval. It is built on [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B) and further optimized for Retrieval and Reranking tasks. The main work is as follows:
+
+- Data augmentation
+- Hard negative sampling: a Rerank model and an LLM jointly evaluate candidates to select high-quality positive and negative samples
+- LLM-synthesized data: [Yuan2-M32](https://huggingface.co/IEITYuan/Yuan2-M32) is used to rewrite the queries in the training dataset
+- Loss function design
+- Multi-task loss
+- Matryoshka Representation Learning
+- Retrieval tasks use InfoNCE with in-batch negatives
+- Reranking tasks use InfoNCE with in-batch negatives
+
Yuan-embedding-2.0-en 是专门为英文文本检索任务设计的嵌入模型。我们在[Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)的基础上,针对Retrieval任务与Reranking任务进行了进一步优化。主要工作如下:
- 数据增强
- Matryoshka Representation Learning
- Retrieval任务使用InfoNCE with in-batch-negative
- 针对Reranking任务使用InfoNCE with in-batch-negative
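
The loss items in the list above (InfoNCE with in-batch negatives, combined with Matryoshka Representation Learning) can be illustrated with a small, dependency-free sketch. This is not the authors' training code: the function names, the temperature of 0.05, the prefix sizes `(2, 4)`, and the equal weighting of the per-prefix losses are all assumptions made for illustration, since the README does not state them.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_in_batch(queries, docs, temperature=0.05):
    """Mean InfoNCE loss where docs[i] is the positive for queries[i]
    and every other document in the batch acts as a negative.
    The temperature value is an assumption, not taken from the README."""
    q = [normalize(v) for v in queries]
    d = [normalize(v) for v in docs]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i to every document, scaled by temperature.
        logits = [sum(a * b for a, b in zip(qi, dj)) / temperature for dj in d]
        # Numerically stable log-sum-exp over the whole batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-softmax probability of the positive document.
        total += log_z - logits[i]
    return total / len(q)


def matryoshka_info_nce(queries, docs, dims=(2, 4), temperature=0.05):
    """Average the in-batch InfoNCE loss over truncated embedding prefixes.
    The prefix sizes and the equal weighting are hypothetical choices."""
    losses = [
        info_nce_in_batch([v[:k] for v in queries], [v[:k] for v in docs], temperature)
        for k in dims
    ]
    return sum(losses) / len(losses)


# Toy batch: docs[i] is the intended match for queries[i].
queries = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
docs = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

aligned = matryoshka_info_nce(queries, docs)
shuffled = matryoshka_info_nce(queries, docs[::-1])  # positives deliberately mismatched
print(aligned < shuffled)  # True
```

Truncating each embedding to a prefix before computing the loss is what lets shorter vectors remain usable on their own; a real training setup would apply the same idea to model outputs in a tensor library rather than to Python lists.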
+
<h2 align="left">Usage</h2>