IEIT-Yuan committed on
Commit 78cf570 · verified · 1 Parent(s): b2fd15d

Update README.md

Files changed (1)
  1. README.md +12 -1
README.md CHANGED
@@ -8,6 +8,17 @@ library_name: sentence-transformers
8
  ---
9
  <h2 align="left">Yuan-embedding-2.0-en</h2>
10
11
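  Yuan-embedding-2.0-en is an embedding model designed specifically for English text retrieval tasks. Building on [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B), we further optimized it for Retrieval and Reranking tasks. The main work is as follows: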
12
 
13
  - Data augmentation
@@ -18,7 +29,7 @@ Yuan-embedding-2.0-en is an embedding model designed specifically for English text retrieval tasks
18
  - Matryoshka Representation Learning
19
  - Retrieval tasks use InfoNCE with in-batch negatives
20
  - Reranking tasks use InfoNCE with in-batch negatives
21
-
22
 
23
  <h2 align="left">Usage</h2>
24
 
 
8
  ---
9
  <h2 align="left">Yuan-embedding-2.0-en</h2>
10
 
11
+ Yuan-embedding-2.0-en is an embedding model designed specifically for English text retrieval tasks. It builds on [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B), which we further optimized for both Retrieval and Reranking tasks. The main work is as follows:
12
+
13
+ - Data Augmentation
14
+ - Hard negative sampling: candidate samples are evaluated by both a Rerank model and an LLM, and only high-quality positive and negative samples are kept
15
+ - LLM-synthesized data: [Yuan2-M32](https://huggingface.co/IEITYuan/Yuan2-M32) is used to rewrite the queries in the training dataset
16
+ - Loss function design
17
+ - Multi-Task loss
18
+ - Matryoshka Representation Learning
19
+ - Retrieval tasks use InfoNCE with in-batch negatives
20
+ - Reranking tasks use InfoNCE with in-batch negatives
21
+
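The loss items listed above name InfoNCE with in-batch negatives and Matryoshka Representation Learning but give no formulas. The following is a minimal sketch of how the two are commonly combined, not the training code used for this model. The temperature of 0.05, the prefix sizes in `dims`, the equal weighting of prefixes, and the helper names `cosine`, `info_nce`, and `matryoshka_info_nce` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, docs, temperature=0.05):
    """InfoNCE with in-batch negatives.

    docs[i] is the positive for queries[i]; every other document in the
    batch acts as a negative for queries[i]. The temperature value is an
    assumption, not a setting taken from the README.
    """
    losses = []
    for i, q in enumerate(queries):
        logits = [cosine(q, d) / temperature for d in docs]
        # Numerically stable log-sum-exp over the whole batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-softmax probability of the positive document.
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


def matryoshka_info_nce(queries, docs, dims=(2, 4), temperature=0.05):
    """Average InfoNCE over nested embedding prefixes (Matryoshka-style).

    Each prefix of the embedding is trained to be usable on its own.
    The prefix sizes and the equal weighting are illustrative assumptions.
    """
    per_dim = [
        info_nce([q[:d] for q in queries], [x[:d] for x in docs], temperature)
        for d in dims
    ]
    return sum(per_dim) / len(per_dim)


# Toy batch: docs[i] is the matching document for queries[i].
queries = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
docs = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]

aligned = matryoshka_info_nce(queries, docs)
shuffled = matryoshka_info_nce(queries, docs[::-1])  # positives swapped
```

With correctly paired queries and documents the loss is close to zero, while swapping the documents makes each positive look like a negative and the loss grows. A real training setup would compute the same quantity on batched tensors from the encoder.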
22
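  Yuan-embedding-2.0-en is an embedding model designed specifically for English text retrieval tasks. Building on [Qwen/Qwen3-Embedding-0.6B](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B), we further optimized it for Retrieval and Reranking tasks. The main work is as follows: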
23
 
24
  - Data augmentation
 
29
  - Matryoshka Representation Learning
30
  - Retrieval tasks use InfoNCE with in-batch negatives
31
  - Reranking tasks use InfoNCE with in-batch negatives
32
+
33
 
34
  <h2 align="left">Usage</h2>
35