codefuse-ai/C2LLM-7B

Elvis-t9 committed 10 days ago

Commit fbf453a · verified · 1 Parent(s): a14932e

Update README.md

Files changed (1)

README.md +10 -7

@@ -5,13 +5,11 @@
     </a>
 </div>
-# Introduction
-## C2LLM: Advanced Code Embeddings for Deep Semantic Understanding
-**C2LLMs (Code Contrastive Large Language Model)** is a powerful new model for generating code embeddings, designed to capture the deep semantics of source code.
 #### Key Features
@@ -27,7 +25,7 @@ C2LLM is designed to be a go-to model for tasks like code search and Retrieval-A
 ## Usage (**HuggingFace Transformers**)
-```plain
 from transformers import AutoModel, AutoTokenizer
 import torch
@@ -36,6 +34,9 @@ model_path = "codefuse-ai/C2LLM-7B"
 # Load the model
 model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, trust_remote_code=True)
 # Prepare the data
 sentences = ['''int r = (int) params >> 8 & 0xff;
 int p = (int) params & 0xff;
@@ -63,6 +64,8 @@ return new RangeInfo(inclusive ? tempTo : tempTo + 1, tempFrom + 1, true);
 return new RangeInfo(tempFrom, inclusive ? tempTo + 1 : tempTo, false);
 }''']
 # Get the embeddings
 embeddings = model.encode(sentences)
 ```
@@ -113,7 +116,7 @@ embeddings = model.encode(sentences)
 ## Evaluation (**MTEB**)
-```plain
 from sentence_transformers import SentenceTransformer
 from mteb.models import ModelMeta
 from mteb.cache import ResultCache
@@ -141,4 +144,4 @@ If you find this project helpful, please give it a star. It means a lot to us!
 ## Correspondence to
-Jin Qin ([email protected]), Zihan Liao ([email protected]), Ziyin Zhang ([email protected]), Hang Yu ([email protected]), Peng Di ([email protected])

     </a>
 </div>
+# A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling
+**C2LLMs (Code Contrastive Large Language Models)** are powerful new models for generating code embeddings, designed to capture the deep semantics of source code.
 #### Key Features
 ## Usage (**HuggingFace Transformers**)
+```Python
 from transformers import AutoModel, AutoTokenizer
 import torch
 # Load the model
 model = AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16, trust_remote_code=True)
+# Prepare your custom instruction
+instruction = "xxxxx"
 # Prepare the data
 sentences = ['''int r = (int) params >> 8 & 0xff;
 int p = (int) params & 0xff;
 return new RangeInfo(tempFrom, inclusive ? tempTo + 1 : tempTo, false);
 }''']
+sentences = [instruction+sentence for sentence in sentences]
 # Get the embeddings
 embeddings = model.encode(sentences)
 ```
 ## Evaluation (**MTEB**)
+```python
 from sentence_transformers import SentenceTransformer
 from mteb.models import ModelMeta
 from mteb.cache import ResultCache
 ## Correspondence to
+Jin Qin ([email protected]), Zihan Liao ([email protected]), Ziyin Zhang ([email protected]), Hang Yu ([email protected]), Peng Di ([email protected])
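
Reviewer note: the instruction-prefixing step this commit adds to the usage example can be tried in isolation, without loading the model. The sketch below is only an illustration. The helper name `add_instruction` and the instruction text are hypothetical; the README itself leaves the instruction as the placeholder `"xxxxx"` and does not say which instructions the model expects.

```python
def add_instruction(instruction: str, sentences: list[str]) -> list[str]:
    """Return a new list with the instruction prepended to each input."""
    # Same plain concatenation as the README line:
    #   sentences = [instruction+sentence for sentence in sentences]
    return [instruction + sentence for sentence in sentences]


# Hypothetical instruction text, used only for illustration.
instruction = "Retrieve code similar to the following snippet: "
snippets = ["int p = (int) params & 0xff;", "return tempFrom + 1;"]

prefixed = add_instruction(instruction, snippets)
for text in prefixed:
    print(text)
```

Because the README joins the two strings with no separator, any trailing space or newline has to be part of the instruction string itself.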