Upload folder using huggingface_hub
- .gitattributes +4 -0
- README.md +127 -1
- Tcomanr-V2_6-4.0B-F16.gguf +3 -0
- Tcomanr-V2_6-4.0B-MXFP4_MOE.gguf +3 -0
- Tcomanr-V2_6-4.0B-Q8_0.gguf +3 -0
- added_tokens.json +28 -0
- chat_template.jinja +141 -0
- chat_template.txt +141 -0
- config.json +68 -0
- mergekit_config.yml +65 -0
- merges.txt +0 -0
- model-00001-of-00009.safetensors +3 -0
- model-00002-of-00009.safetensors +3 -0
- model-00003-of-00009.safetensors +3 -0
- model-00004-of-00009.safetensors +3 -0
- model-00005-of-00009.safetensors +3 -0
- model-00006-of-00009.safetensors +3 -0
- model-00007-of-00009.safetensors +3 -0
- model-00008-of-00009.safetensors +3 -0
- model-00009-of-00009.safetensors +3 -0
- model.safetensors.index.json +1 -0
- special_tokens_map.json +31 -0
- tokenizer.json +3 -0
- tokenizer_config.json +241 -0
- vocab.json +0 -0
.gitattributes CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Tcomanr-V2_6-4.0B-F16.gguf filter=lfs diff=lfs merge=lfs -text
+Tcomanr-V2_6-4.0B-MXFP4_MOE.gguf filter=lfs diff=lfs merge=lfs -text
+Tcomanr-V2_6-4.0B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
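The attribute lines above route matching paths through the git-lfs filter instead of storing them as ordinary git blobs. As a rough illustration, Python's `fnmatch` approximates gitattributes glob matching for these simple patterns (`tracked_by_lfs` is a hypothetical helper, not part of the repo):

```python
# Hypothetical helper: check which repo paths the .gitattributes rules above
# would hand to git-lfs. fnmatch only approximates gitattributes glob
# semantics, but it matches these simple patterns exactly.
import fnmatch

LFS_PATTERNS = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "Tcomanr-V2_6-4.0B-F16.gguf",
    "Tcomanr-V2_6-4.0B-MXFP4_MOE.gguf",
    "Tcomanr-V2_6-4.0B-Q8_0.gguf",
    "tokenizer.json",
]

def tracked_by_lfs(path: str) -> bool:
    """True if any LFS pattern from .gitattributes matches the file name."""
    return any(fnmatch.fnmatch(path, pattern) for pattern in LFS_PATTERNS)
```

After this commit, `tokenizer.json` and the three GGUF files go through LFS, while small files such as `config.json` remain regular git blobs.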
README.md CHANGED
@@ -1,3 +1,129 @@
 ---
-
+base_model:
+- ValiantLabs/Qwen3-4B-Esper3
+- ValiantLabs/Qwen3-4B-ShiningValiant3
+- ertghiu256/Qwen3-Hermes-4b
+- ertghiu256/qwen-3-4b-mixture-of-thought
+- ertghiu256/qwen3-4b-code-reasoning
+- janhq/Jan-v1-4B
+- Qwen/Qwen3-4B-Thinking-2507
+- ertghiu256/Qwen3-4b-2507-Thinking-math-and-code
+- quelmap/Lightning-4b
+- GetSoloTech/Qwen3-Code-Reasoning-4B
+- Qwen/Qwen3-4b-Instruct-2507
+- ertghiu256/qwen3-multi-reasoner
+- Tesslate/WEBGEN-4B-Preview
+- huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
+- ertghiu256/qwen3-math-reasoner
+- ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3
+- Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2
+- Tesslate/UIGEN-FX-4B-Preview
+- POLARIS-Project/Polaris-4B-Preview
+library_name: transformers
+tags:
+- mergekit
+- merge
+
 ---
+# Tcomanr-V2_6
+
+This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+## Merge Details
+### Merge Method
+
+This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [Qwen/Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) as a base.
+
+### Models Merged
+
+The following models were included in the merge:
+* [ValiantLabs/Qwen3-4B-Esper3](https://huggingface.co/ValiantLabs/Qwen3-4B-Esper3)
+* [ValiantLabs/Qwen3-4B-ShiningValiant3](https://huggingface.co/ValiantLabs/Qwen3-4B-ShiningValiant3)
+* [ertghiu256/Qwen3-Hermes-4b](https://huggingface.co/ertghiu256/Qwen3-Hermes-4b)
+* [ertghiu256/qwen-3-4b-mixture-of-thought](https://huggingface.co/ertghiu256/qwen-3-4b-mixture-of-thought)
+* [ertghiu256/qwen3-4b-code-reasoning](https://huggingface.co/ertghiu256/qwen3-4b-code-reasoning)
+* [janhq/Jan-v1-4B](https://huggingface.co/janhq/Jan-v1-4B)
+* [ertghiu256/Qwen3-4b-2507-Thinking-math-and-code](https://huggingface.co/ertghiu256/Qwen3-4b-2507-Thinking-math-and-code)
+* [quelmap/Lightning-4b](https://huggingface.co/quelmap/Lightning-4b)
+* [GetSoloTech/Qwen3-Code-Reasoning-4B](https://huggingface.co/GetSoloTech/Qwen3-Code-Reasoning-4B)
+* [Qwen/Qwen3-4b-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4b-Instruct-2507)
+* [ertghiu256/qwen3-multi-reasoner](https://huggingface.co/ertghiu256/qwen3-multi-reasoner)
+* [Tesslate/WEBGEN-4B-Preview](https://huggingface.co/Tesslate/WEBGEN-4B-Preview)
+* [huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated](https://huggingface.co/huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated)
+* [ertghiu256/qwen3-math-reasoner](https://huggingface.co/ertghiu256/qwen3-math-reasoner)
+* [ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3](https://huggingface.co/ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3)
+* [Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2](https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2)
+* [Tesslate/UIGEN-FX-4B-Preview](https://huggingface.co/Tesslate/UIGEN-FX-4B-Preview)
+* [POLARIS-Project/Polaris-4B-Preview](https://huggingface.co/POLARIS-Project/Polaris-4B-Preview)
+
+### Configuration
+
+The following YAML configuration was used to produce this model:
+
+```yaml
+models:
+  - model: ertghiu256/qwen3-math-reasoner
+    parameters:
+      weight: 0.85
+  - model: ertghiu256/qwen3-4b-code-reasoning
+    parameters:
+      weight: 0.9
+  - model: ertghiu256/qwen-3-4b-mixture-of-thought
+    parameters:
+      weight: 1.0
+  - model: POLARIS-Project/Polaris-4B-Preview
+    parameters:
+      weight: 1.0
+  - model: ertghiu256/qwen3-multi-reasoner
+    parameters:
+      weight: 0.85
+  - model: ertghiu256/Qwen3-Hermes-4b
+    parameters:
+      weight: 0.7
+  - model: ValiantLabs/Qwen3-4B-Esper3
+    parameters:
+      weight: 0.8
+  - model: Tesslate/WEBGEN-4B-Preview
+    parameters:
+      weight: 1.0
+  - model: Tesslate/UIGEN-FX-4B-Preview
+    parameters:
+      weight: 0.95
+  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
+    parameters:
+      weight: 0.8
+  - model: huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
+    parameters:
+      weight: 0.85
+  - model: Qwen/Qwen3-4B-Thinking-2507
+    parameters:
+      weight: 1.0
+  - model: Qwen/Qwen3-4b-Instruct-2507
+    parameters:
+      weight: 1.0
+  - model: GetSoloTech/Qwen3-Code-Reasoning-4B
+    parameters:
+      weight: 0.95
+  - model: ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3
+    parameters:
+      weight: 1.0
+  - model: janhq/Jan-v1-4B
+    parameters:
+      weight: 0.25
+  - model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2
+    parameters:
+      weight: 0.85
+  - model: quelmap/Lightning-4b
+    parameters:
+      weight: 0.75
+  - model: ertghiu256/Qwen3-4b-2507-Thinking-math-and-code
+    parameters:
+      weight: 1.0
+merge_method: ties
+base_model: Qwen/Qwen3-4B-Thinking-2507
+parameters:
+  normalize: true
+  int8_mask: true
+  lambda: 1.0
+dtype: float16
+```
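With `normalize: true`, mergekit rescales the per-model weights so they sum to 1 before combining task vectors. The sketch below is a simplified global view of that rescaling (assumption: in TIES the real normalization happens per parameter, after sign election, not as one global division); the weights are copied verbatim from the YAML config above.

```python
# Simplified sketch of mergekit's `normalize: true` for the config above.
# NOTE: real TIES normalization is applied per parameter after sign
# consensus; this global rescaling is only an approximation.
weights = {
    "ertghiu256/qwen3-math-reasoner": 0.85,
    "ertghiu256/qwen3-4b-code-reasoning": 0.9,
    "ertghiu256/qwen-3-4b-mixture-of-thought": 1.0,
    "POLARIS-Project/Polaris-4B-Preview": 1.0,
    "ertghiu256/qwen3-multi-reasoner": 0.85,
    "ertghiu256/Qwen3-Hermes-4b": 0.7,
    "ValiantLabs/Qwen3-4B-Esper3": 0.8,
    "Tesslate/WEBGEN-4B-Preview": 1.0,
    "Tesslate/UIGEN-FX-4B-Preview": 0.95,
    "ValiantLabs/Qwen3-4B-ShiningValiant3": 0.8,
    "huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated": 0.85,
    "Qwen/Qwen3-4B-Thinking-2507": 1.0,
    "Qwen/Qwen3-4b-Instruct-2507": 1.0,
    "GetSoloTech/Qwen3-Code-Reasoning-4B": 0.95,
    "ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3": 1.0,
    "janhq/Jan-v1-4B": 0.25,
    "Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2": 0.85,
    "quelmap/Lightning-4b": 0.75,
    "ertghiu256/Qwen3-4b-2507-Thinking-math-and-code": 1.0,
}
total = sum(weights.values())  # 16.5 across the 19 contributing models
normalized = {model: w / total for model, w in weights.items()}
```

Note how `janhq/Jan-v1-4B` (0.25) contributes far less than the full-weight (1.0) models once normalized.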
Tcomanr-V2_6-4.0B-F16.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c5b706375df18f86d4c8bc8e0f102a71310486d25de6eb7703b89860c6d2a440
+size 8051295520
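Each GGUF entry above is a git-lfs pointer file, not the model weights themselves: three `key value` lines giving the pointer spec version, the SHA-256 of the actual blob, and its size in bytes. A minimal parse, with the values copied from the F16 pointer above:

```python
# Parse a git-lfs pointer (as committed above) into its three fields.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c5b706375df18f86d4c8bc8e0f102a71310486d25de6eb7703b89860c6d2a440
size 8051295520
"""

# Each line is "key value"; split on the first space only.
fields = dict(line.split(" ", 1) for line in pointer_text.strip().splitlines())
size_gib = int(fields["size"]) / 2**30  # ~7.5 GiB for the F16 file
```

The real 8 GB blob lives in LFS storage; git itself only tracks this small pointer.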
Tcomanr-V2_6-4.0B-MXFP4_MOE.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2fd357e3ec8a08518e92b81408e55842ae64ae327b2e055a67353a1c8747f869
+size 4280415520
Tcomanr-V2_6-4.0B-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:724928b450561e923aaa588391239d328451568fdd7d5c977d2b0792783362e5
+size 4280415520
added_tokens.json ADDED
@@ -0,0 +1,28 @@
+{
+  "</think>": 151668,
+  "</tool_call>": 151658,
+  "</tool_response>": 151666,
+  "<think>": 151667,
+  "<tool_call>": 151657,
+  "<tool_response>": 151665,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}
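The table above appears to extend the base Qwen vocabulary with control tokens starting at ID 151643 (`<|endoftext|>`), ending with the `<think>`/`</think>` pair the chat template relies on. A small sanity check over a subset of the entries (IDs copied verbatim from the JSON above):

```python
import json

# Subset of added_tokens.json above; IDs copied verbatim.
added_tokens = json.loads("""
{
  "<|endoftext|>": 151643,
  "<|im_start|>": 151644,
  "<|im_end|>": 151645,
  "<tool_call>": 151657,
  "</tool_call>": 151658,
  "<think>": 151667,
  "</think>": 151668
}
""")

# All added-token IDs sit at or above <|endoftext|>, and </think> is the
# highest ID in this subset, matching the reasoning markers the template emits.
assert min(added_tokens.values()) == 151643
assert added_tokens["</think>"] == max(added_tokens.values())
```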
chat_template.jinja ADDED
@@ -0,0 +1,141 @@
+{{- "<|im_start|>system\n" -}}
+{%- if tools -%}
+{%- if messages[0].role == "system" -%}
+{{- messages[0].content + "\n\n" -}}
+{%- endif -%}
+{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" -}}
+{%- for tool in tools -%}
+{{- "\n" -}}
+{{- tool | tojson -}}
+{%- endfor -%}
+{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>\n" -}}
+{%- elif messages[0].role == "system" -%}
+{{- messages[0].content -}}
+{%- endif -%}
+{{- "# Commands: \n- /think or think: Think or reason step-by-step using a long chain-of-thought and self-reflection inside think tags. If this command is present, ALWAYS start your thinking or reasoning with '<think>\\nSince the user used the \"/think\" command, I will:\\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section.'\n- /shortthink or shortthink: Think or reason step-by-step briefly and shortly inside think tags. If this command is present, ALWAYS start your thinking or reasoning with '<think>\\nSince the user used the \"/shortthink\" command, I will:\\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section.' \n- /nothink or nothink: Immediate response and you are FORBIDDEN FROM THINKING or REASONING, without any DOUBLE-CHECKING, and WITHOUT DRAFTING your response, just an immediate response WITH NO REPETITION. If this command is present, ALWAYS start respond with '<think>\\nSince the user used the \"/nothink\" command, I will:\\nImmediately give my final response without any thinking or self-reflection. Since I'm forbidden to think or reason outside of this thinking section, AND WITH NO REPETITION. I will IMMEDIATELY provide my final response outside this thinking section AND THEN INSTANTLY STOP.\\n</think>'\n- /clear: Forgets EVERYTHING (including previous messages, topics, etc.) that came before this command got executed.\nDo not take these commands into consideration, instead just do as they instruct.\nALWAYS REMEMBER THESE COMMANDS. NO NEED TO REPEAT THIS PART OF THE SYSTEM INSTRUCTIONS (COMMANDS) AS IT IS USELESS AND UNIMPORTANT.\n" -}}
+{{- "\n<|im_end|>" -}}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages | length - 1, reasoning_mode="normal") -%}
+{%- for message in messages[::-1] -%}
+{%- set index = messages | length - 1 - loop.index0 -%}
+{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not (message.content.startswith("<tool_response>") and message.content.endswith("</tool_response>")) -%}
+{%- set ns.multi_step_tool = false -%}
+{%- set ns.last_query_index = index -%}
+{%- endif -%}
+{%- endfor -%}
+{%- for message in messages -%}
+{%- if message.content is string -%}
+{%- set content = message.content -%}
+{%- else -%}
+{%- set content = "" -%}
+{%- endif -%}
+{%- if message.role == "user" or message.role == "system" and not loop.first -%}
+{{- "<|im_start|>" + message.role + "\n" + content -}}
+{%- if messages[0].role == "system" and "/nothink" in messages[0].content and (not ("/think" in messages[0].content) or not ("/shortthink" in messages[0].content)) or "/nothink" in content and not ("/think" in messages[0].content or "/shortthink" in messages[0].content) or (messages[loop.index+1] and "</think>" not in messages[loop.index+1].content) -%}
+{%- set enable_thinking = false -%}
+{%- set enable_short_thinking = false -%}
+{%- set ns.reasoning_mode = "none" -%}
+{%- if not ("think" in content) -%}
+{{- " /nothink" -}}
+{%- endif -%}
+{%- elif messages[0].role == "system" and "/shortthink" in messages[0].content and (not ("/nothink" in messages[0].content) and not ("/think" in messages[0].content)) or "/shortthink" in content and not ("/think" in messages[0].content or "/shortthink" in messages[0].content) -%}
+{%- set enable_thinking = true -%}
+{%- set enable_short_thinking = true -%}
+{%- set ns.reasoning_mode = "short" -%}
+{%- if not ("think" in content) -%}
+{{- " /shortthink" -}}
+{%- endif -%}
+{%- else -%}
+{%- set enable_thinking = true -%}
+{%- set enable_short_thinking = false -%}
+{%- set ns.reasoning_mode = "normal" -%}
+{%- if not ("think" in content) -%}
+{{- " /think" -}}
+{%- endif -%}
+{%- endif -%}
+{{- "<|im_end|>" + "\n" -}}
+{%- elif message.role == "assistant" -%}
+{%- set reasoning_content = "" -%}
+{%- if "<think>\nSince the user used the" not in content.split("</think>")[0].split("<think>")[-1] -%}
+{%- if ns.reasoning_mode == "none" -%}
+{%- set reasoning_prefix = "" -%}
+{%- elif ns.reasoning_mode == "short" -%}
+{%- set reasoning_prefix = "Since the user used the \"/shortthink\" command, I will:\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section. " -%}
+{%- else -%}
+{%- set reasoning_prefix = "Since the user used the \"/think\" command, I will:\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section. " -%}
+{%- endif -%}
+{%- endif -%}
+{%- if message.reasoning_content is string -%}
+{%- set reasoning_content = message.reasoning_content -%}
+{%- else -%}
+{%- if "</think>" in content -%}
+{%- set reasoning_content = reasoning_prefix + content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n") -%}
+{%- set content = content.split("</think>")[-1].lstrip("\n") -%}
+{%- endif -%}
+{%- if "<think>" in content -%}
+{%- set content = content | replace("<think>", " ") -%}
+{%- endif -%}
+{%- if "</think>" in content -%}
+{%- set content = content | replace("</think>", " ") -%}
+{%- endif -%}
+{%- endif -%}
+{{- "\n" -}}
+{# Apply truncation and break if the /shortthink command is active #}
+{%- if enable_short_thinking is true -%}
+{%- set words = reasoning_content.split(" ") -%}
+{%- if words | length > 300 -%}
+{%- set truncated_reasoning = words[:150] | join(" ") + " ... truncated ... " + words[words | length - 150:words | length] | join(" ") -%}
+{%- else -%}
+{%- set truncated_reasoning = reasoning_content | join(" ") -%}
+{%- endif -%}
+{%- set reasoning_content = truncated_reasoning -%}
+{%- endif -%}
+{%- if loop.last or not loop.last and reasoning_content and enable_thinking == true -%}
+{{- "<|im_start|>" + message.role + "\n<think>\n" + reasoning_content.strip("\n") + "\n</think>\n" + content.lstrip("\n") -}}
+{%- else -%}
+{{- "<|im_start|>" + message.role + "\n" + content -}}
+{%- endif -%}
+{%- if message.tool_calls -%}
+{%- for tool_call in message.tool_calls -%}
+{%- if loop.first and content or not loop.first -%}
+{{- "\n" -}}
+{%- endif -%}
+{%- if tool_call.function -%}
+{%- set tool_call = tool_call.function -%}
+{%- endif -%}
+{{- "<tool_call>\n{\"name\": \"" -}}
+{{- tool_call.name -}}
+{{- "\", \"arguments\": " -}}
+{%- if tool_call.arguments is string -%}
+{{- tool_call.arguments -}}
+{%- else -%}
+{{- tool_call.arguments | tojson -}}
+{%- endif -%}
+{{- "}\n</tool_call>" -}}
+{%- endfor -%}
+{%- endif -%}
+{{- "<|im_end|>\n" -}}
+{%- elif message.role == "tool" -%}
+{%- if loop.first or messages[loop.index0 - 1].role != "tool" -%}
+{{- "<|im_start|>user" -}}
+{%- endif -%}
+{{- "\n<tool_response>\n" -}}
+{{- content -}}
+{{- "\n</tool_response>" -}}
+{%- if loop.last or messages[loop.index0 + 1].role != "tool" -%}
+{{- "<|im_end|>\n" -}}
+{%- endif -%}
+{%- endif -%}
+{%- endfor -%}
+{%- set last_message = messages[messages | length - 1].content -%}
+{%- if add_generation_prompt -%}
+{{- "<|im_start|>assistant\n" -}}
+{%- if enable_thinking is defined and enable_thinking is false or "/nothink" in last_message or "nothink" in last_message or messages[0].role == "system" and "/nothink" in messages[0].content and (not ("/think" in messages[0].content) or not ("/shortthink" in messages[0].content)) -%}
+{{- "<think>\nSince the user used the \"/nothink\" command, I will:\nImmediately give my final response without any thinking or self-reflection. Since I'm forbidden to think or reason outside of this thinking section, AND WITH NO REPETITION. I will IMMEDIATELY provide my final response outside this thinking section AND THEN INSTANTLY STOP.\n</think>" -}}
+{%- elif enable_short_thinking is defined and enable_short_thinking is false or "/shortthink" in last_message or "shortthink" in last_message or messages[0].role == "system" and "/shortthink" in messages[0].content and not ("/think" in messages[0].content) -%}
+{{- "<think>\nSince the user used the \"/shortthink\" command, I will:\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section." -}}
+{%- elif "/clear" in last_message -%}
+{{- "<think>\nSince the user used the \"/clear\" command, I will:\n1. Forget everything above.\n2. Ignore everything that comes before this message.\n3. Start a fresh new conversation and greet the user.\n</think>" -}}
+{%- else -%}
+{{- "<think>\nSince the user used the \"/think\" command, I will:\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section." -}}
+{%- endif -%}
+{%- endif -%}
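Template lines 83-90 implement the `/shortthink` truncation: reasoning longer than 300 words is clipped to its first and last 150 words, joined by a `... truncated ...` marker. Restated as plain Python (a sketch, not part of the repo; note the template's else branch pipes the string itself through `join(" ")`, which in Jinja would space-separate its characters, so this sketch follows the apparent intent of passing short reasoning through unchanged):

```python
# Plain-Python restatement of the template's /shortthink truncation rule
# (template lines 83-90). Keeps the first and last 150 words when the
# reasoning runs past 300 words; shorter reasoning passes through unchanged.
def truncate_reasoning(reasoning: str, limit: int = 300, keep: int = 150) -> str:
    words = reasoning.split(" ")
    if len(words) > limit:
        return (
            " ".join(words[:keep])
            + " ... truncated ... "
            + " ".join(words[-keep:])
        )
    return reasoning

short = truncate_reasoning("a few words of reasoning")      # unchanged
long = truncate_reasoning(" ".join(f"w{i}" for i in range(400)))  # clipped
```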
chat_template.txt ADDED
@@ -0,0 +1,141 @@
+{{- "<|im_start|>system\n" -}}
+{%- if tools -%}
+{%- if messages[0].role == "system" -%}
+{{- messages[0].content + "\n\n" -}}
+{%- endif -%}
+{{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" -}}
+{%- for tool in tools -%}
+{{- "\n" -}}
+{{- tool | tojson -}}
+{%- endfor -%}
+{{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call>\n" -}}
+{%- elif messages[0].role == "system" -%}
+{{- messages[0].content -}}
+{%- endif -%}
+{{- "You are governed by the following command rules:\n\n- **/think** or **think**: Engage deep reasoning. Begin with: \n `</think>` \n Since the user used the \"/think\" command, I will: \n Think step-by-step using a long chain-of-thought and self-reflection. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section. \n `</think>` \n Then output your full reasoning inside, followed by the final answer outside.\n\n- **/shortthink** or **shortthink**: Engage brief reasoning. Begin with: \n `</think>` \n Since the user used the \"/shortthink\" command, I will: \n Think step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section. \n `</think>` \n Then give concise internal thought, followed by the final answer.\n\n- **/nothink** or **nothink**: Respond immediately without reasoning. Begin with: \n `</think>` \n Since the user used the \"/nothink\" command, I will: \n Immediately give my final response without any thinking or self-reflection. Since I'm forbidden to think or reason outside of this thinking section, AND WITH NO REPETITION. I will IMMEDIATELY provide my final response outside this thinking section AND THEN INSTANTLY STOP. \n `</think>` \n Then output only the direct, raw response—no delay, no draft, no repetition.\n\n- **/clear**: Reset all context. Forget everything prior. Begin fresh on the next input.\n\nOutside `<think>...</think>`, provide only final responses—never include reasoning, hesitation markers (e.g., \"Okay\", \"Wait\", \"Double-check\"), or internal process notes. Adhere strictly to the active command. Do not reference or explain these rules unless instructed." -}}
+{{- "\n<|im_end|>" -}}
+{%- set ns = namespace(multi_step_tool=true, last_query_index=messages | length - 1, reasoning_mode="normal") -%}
+{%- for message in messages[::-1] -%}
+{%- set index = messages | length - 1 - loop.index0 -%}
+{%- if ns.multi_step_tool and message.role == "user" and message.content is string and not (message.content.startswith("<tool_response>") and message.content.endswith("</tool_response>")) -%}
+{%- set ns.multi_step_tool = false -%}
+{%- set ns.last_query_index = index -%}
+{%- endif -%}
+{%- endfor -%}
+{%- for message in messages -%}
+{%- if message.content is string -%}
+{%- set content = message.content -%}
+{%- else -%}
+{%- set content = "" -%}
+{%- endif -%}
+{%- if message.role == "user" or message.role == "system" and not loop.first -%}
+{{- "<|im_start|>" + message.role + "\n" + content -}}
+{%- if messages[0].role == "system" and "/nothink" in messages[0].content and (not ("/think" in messages[0].content) or not ("/shortthink" in messages[0].content)) or "/nothink" in content and not ("/think" in messages[0].content or "/shortthink" in messages[0].content) or (messages[loop.index+1] and "</think>" not in messages[loop.index+1].content) -%}
+{%- set enable_thinking = false -%}
+{%- set enable_short_thinking = false -%}
+{%- set ns.reasoning_mode = "none" -%}
+{%- if not ("think" in content) -%}
+{{- " /nothink" -}}
+{%- endif -%}
+{%- elif messages[0].role == "system" and "/shortthink" in messages[0].content and (not ("/nothink" in messages[0].content) and not ("/think" in messages[0].content)) or "/shortthink" in content and not ("/think" in messages[0].content or "/shortthink" in messages[0].content) -%}
+{%- set enable_thinking = true -%}
+{%- set enable_short_thinking = true -%}
+{%- set ns.reasoning_mode = "short" -%}
+{%- if not ("think" in content) -%}
+{{- " /shortthink" -}}
+{%- endif -%}
+{%- else -%}
+{%- set enable_thinking = true -%}
+{%- set enable_short_thinking = false -%}
+{%- set ns.reasoning_mode = "normal" -%}
+{%- if not ("think" in content) -%}
+{{- " /think" -}}
+{%- endif -%}
+{%- endif -%}
+{{- "<|im_end|>" + "\n" -}}
+{%- elif message.role == "assistant" -%}
+{%- set reasoning_content = "" -%}
+{%- if "<think>\nSince the user used the" not in content.split("</think>")[0].split("<think>")[-1] -%}
+{%- if ns.reasoning_mode == "none" -%}
+{%- set reasoning_prefix = "" -%}
+{%- elif ns.reasoning_mode == "short" -%}
+{%- set reasoning_prefix = "Since the user used the \"/shortthink\" command, I will:\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section. " -%}
+{%- else -%}
+{%- set reasoning_prefix = "Since the user used the \"/think\" command, I will:\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section. " -%}
+{%- endif -%}
+{%- endif -%}
+{%- if message.reasoning_content is string -%}
+{%- set reasoning_content = message.reasoning_content -%}
+{%- else -%}
+{%- if "</think>" in content -%}
+{%- set reasoning_content = reasoning_prefix + content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n") -%}
+{%- set content = content.split("</think>")[-1].lstrip("\n") -%}
+{%- endif -%}
+{%- if "<think>" in content -%}
+{%- set content = content | replace("<think>", " ") -%}
+{%- endif -%}
+{%- if "</think>" in content -%}
+{%- set content = content | replace("</think>", " ") -%}
+{%- endif -%}
+{%- endif -%}
+{{- "\n" -}}
+{# Apply truncation and break if the /shortthink command is active #}
+{%- if enable_short_thinking is true -%}
+{%- set words = reasoning_content.split(" ") -%}
+{%- if words | length > 300 -%}
+{%- set truncated_reasoning = words[:150] | join(" ") + " ... truncated ... " + words[words | length - 150:words | length] | join(" ") -%}
+{%- else -%}
+{%- set truncated_reasoning = reasoning_content | join(" ") -%}
+{%- endif -%}
+{%- set reasoning_content = truncated_reasoning -%}
+{%- endif -%}
+{%- if loop.last or not loop.last and reasoning_content and enable_thinking == true -%}
+{{- "<|im_start|>" + message.role + "\n<think>\n" + reasoning_content.strip("\n") + "\n</think>\n" + content.lstrip("\n") -}}
+{%- else -%}
+{{- "<|im_start|>" + message.role + "\n" + content -}}
+{%- endif -%}
+{%- if message.tool_calls -%}
+{%- for tool_call in message.tool_calls -%}
+{%- if loop.first and content or not loop.first -%}
+{{- "\n" -}}
+{%- endif -%}
+{%- if tool_call.function -%}
+{%- set tool_call = tool_call.function -%}
+{%- endif -%}
+{{- "<tool_call>\n{\"name\": \"" -}}
+{{- tool_call.name -}}
+{{- "\", \"arguments\": " -}}
+{%- if tool_call.arguments is string -%}
+{{- tool_call.arguments -}}
+{%- else -%}
+{{- tool_call.arguments | tojson -}}
+{%- endif -%}
+{{- "}\n</tool_call>" -}}
+{%- endfor -%}
+{%- endif -%}
+{{- "<|im_end|>\n" -}}
+{%- elif message.role == "tool" -%}
+{%- if loop.first or messages[loop.index0 - 1].role != "tool" -%}
+{{- "<|im_start|>user" -}}
+{%- endif -%}
+{{- "\n<tool_response>\n" -}}
+{{- content -}}
+{{- "\n</tool_response>" -}}
+{%- if loop.last or messages[loop.index0 + 1].role != "tool" -%}
+{{- "<|im_end|>\n" -}}
+{%- endif -%}
+{%- endif -%}
+{%- endfor -%}
+{%- set last_message = messages[messages | length - 1].content -%}
+{%- if add_generation_prompt -%}
+{{- "<|im_start|>assistant\n" -}}
+{%- if enable_thinking is defined and enable_thinking is false or "/nothink" in last_message or "nothink" in last_message or messages[0].role == "system" and "/nothink" in messages[0].content and (not ("/think" in messages[0].content) or not ("/shortthink" in messages[0].content)) -%}
+{{- "<think>\nOkay, the user sent \""+last_message+"\". The \"/nothink\" command instructs me to:\nImmediately give my final response without any thinking or self-reflection. Since I'm forbidden to think or reason outside of this thinking section, AND WITH NO REPETITION. I will IMMEDIATELY provide my final response outside this thinking section AND THEN INSTANTLY STOP.\n</think>" -}}
+{%- elif enable_short_thinking is defined and enable_short_thinking is false or "/shortthink" in last_message or "shortthink" in last_message or messages[0].role == "system" and "/shortthink" in messages[0].content and not ("/think" in messages[0].content) -%}
+{{- "<think>\nOkay, the user sent \""+last_message+"\". The \"/shortthink\" command instructs me to:\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section." -}}
+{%- elif "/clear" in last_message -%}
+{{- "<think>\nOkay, the user used the \"/clear\" command instructs me to:\n1. Forget everything above.\n2. Ignore everything that comes before this message.\n3. Start a fresh new conversation and greet the user.\n</think>" -}}
|
| 138 |
+
{%- else -%}
|
| 139 |
+
{{- "<think>\nOkay, the user sent \""+last_message+"\". The \"/think\" command instructs me to:\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section." -}}
|
| 140 |
+
{%- endif -%}
|
| 141 |
+
{%- endif -%}
|
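The /shortthink truncation above keeps only the first and last 150 words of a long reasoning trace. It can be mirrored in plain Python; this is an illustrative re-implementation of the template logic (the function name is my own, not code shipped with the model):

```python
def truncate_reasoning(reasoning: str, limit: int = 300, keep: int = 150) -> str:
    """Mirror of the template's /shortthink rule: if the reasoning exceeds
    `limit` words, keep the first and last `keep` words with a marker between."""
    words = reasoning.split(" ")
    if len(words) > limit:
        return " ".join(words[:keep]) + " ... truncated ... " + " ".join(words[-keep:])
    return reasoning
```

Short traces pass through unchanged; only traces over 300 words are compressed.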
config.json
ADDED
@@ -0,0 +1,68 @@
{
  "architectures": [
    "Qwen3ForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "dtype": "float16",
  "eos_token_id": 151645,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 2560,
  "initializer_range": 0.02,
  "intermediate_size": 9728,
  "layer_types": [
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention",
    "full_attention"
  ],
  "max_position_embeddings": 262144,
  "max_window_layers": 36,
  "model_type": "qwen3",
  "num_attention_heads": 32,
  "num_hidden_layers": 36,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 5000000,
  "sliding_window": null,
  "tie_word_embeddings": true,
  "transformers_version": "4.57.0.dev0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
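The config uses grouped-query attention (32 query heads sharing 8 KV heads of dimension 128 across 36 layers), which bounds the KV-cache cost per generated token. A quick estimate using the standard formula for an fp16 cache (the formula is my own sketch, not part of the repo):

```python
# Values copied from config.json above.
cfg = {
    "num_hidden_layers": 36,
    "num_key_value_heads": 8,
    "head_dim": 128,
}

def kv_cache_bytes_per_token(cfg: dict, bytes_per_value: int = 2) -> int:
    """Per token, each layer caches one K and one V tensor of shape
    (num_key_value_heads, head_dim); fp16 uses 2 bytes per value."""
    return (2 * cfg["num_hidden_layers"] * cfg["num_key_value_heads"]
            * cfg["head_dim"] * bytes_per_value)

kv_cache_bytes_per_token(cfg)  # 2 * 36 * 8 * 128 * 2 = 147456 bytes = 144 KiB
```

At the configured 262144-token context, that works out to roughly 36 GiB of fp16 KV cache, which is why long-context use typically relies on cache quantization or a shorter window.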
mergekit_config.yml
ADDED
@@ -0,0 +1,65 @@
models:
  - model: ertghiu256/qwen3-math-reasoner
    parameters:
      weight: 0.85
  - model: ertghiu256/qwen3-4b-code-reasoning
    parameters:
      weight: 0.9
  - model: ertghiu256/qwen-3-4b-mixture-of-thought
    parameters:
      weight: 1.0
  - model: POLARIS-Project/Polaris-4B-Preview
    parameters:
      weight: 1.0
  - model: ertghiu256/qwen3-multi-reasoner
    parameters:
      weight: 0.85
  - model: ertghiu256/Qwen3-Hermes-4b
    parameters:
      weight: 0.7
  - model: ValiantLabs/Qwen3-4B-Esper3
    parameters:
      weight: 0.8
  - model: Tesslate/WEBGEN-4B-Preview
    parameters:
      weight: 1.0
  - model: Tesslate/UIGEN-FX-4B-Preview
    parameters:
      weight: 0.95
  - model: ValiantLabs/Qwen3-4B-ShiningValiant3
    parameters:
      weight: 0.8
  - model: huihui-ai/Huihui-Qwen3-4B-Thinking-2507-abliterated
    parameters:
      weight: 0.85
  - model: Qwen/Qwen3-4B-Thinking-2507
    parameters:
      weight: 1.0
  - model: Qwen/Qwen3-4b-Instruct-2507
    parameters:
      weight: 1.0
  - model: GetSoloTech/Qwen3-Code-Reasoning-4B
    parameters:
      weight: 0.95
  - model: ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3
    parameters:
      weight: 1.0
  - model: janhq/Jan-v1-4B
    parameters:
      weight: 0.25
  - model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-Instruct-2507-gabliterated-v2
    parameters:
      weight: 0.85
  - model: quelmap/Lightning-4b
    parameters:
      weight: 0.75
  - model: ertghiu256/Qwen3-4b-2507-Thinking-math-and-code
    parameters:
      weight: 1.0
merge_method: ties
base_model: Qwen/Qwen3-4B-Thinking-2507
parameters:
  normalize: true
  int8_mask: true
  lambda: 1.0
dtype: float16
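This config is consumed by mergekit (typically via `mergekit-yaml mergekit_config.yml ./output`). As a rough illustration of the sign-election idea behind the `ties` merge method, here is a simplified toy version operating on per-parameter task vectors (finetuned minus base); it deliberately omits the magnitude-trimming step of the full TIES algorithm and is not mergekit's implementation:

```python
def ties_merge(deltas: list[list[float]], weights: list[float]) -> list[float]:
    """Toy TIES step over flat task vectors.

    1. Scale each model's delta by its config weight.
    2. Elect a sign per coordinate from the sum of scaled deltas.
    3. Average only contributions agreeing with the elected sign.
    """
    n = len(deltas[0])
    scaled = [[w * d[i] for i in range(n)] for d, w in zip(deltas, weights)]
    merged = []
    for i in range(n):
        col = [s[i] for s in scaled]
        sign = 1.0 if sum(col) >= 0 else -1.0
        agreeing = [v for v in col if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged
```

Sign election is what lets TIES combine many finetunes (19 here) without conflicting updates cancelling each other; `normalize: true` additionally rescales the weighted sum in the real implementation.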
merges.txt
ADDED
The diff for this file is too large to render. See raw diff.
model-00001-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1578fd2fe280f72f04556d374c9993c55744a875c7e02a3693798eed87430daa
size 979780544
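This and the following shard entries are Git LFS pointer files (key/value lines identifying the real blob by SHA-256 and size), not the weights themselves. A small sketch of parsing such a pointer (the helper name is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, object hash, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "sha256": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:1578fd2fe280f72f04556d374c9993c55744a875c7e02a3693798eed87430daa
size 979780544
"""
info = parse_lfs_pointer(pointer)
```

Summing the `size` fields of all nine shards gives the on-disk footprint of the fp16 safetensors weights.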
model-00002-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ef532fb097107d8e8227350a8cc5aaba1b0c284deb51112b39ab6abc861486cc
size 983094520
model-00003-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a3279f0890009b0dc563ea147c894a034e0b6e9f918bc7ca7e064e863db0c73a
size 988342384
model-00004-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6c3ed7acad441f9b8b39ae526f4a28e897f0c617c018c23ee37e8653b875764f
size 954258296
model-00005-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0b9aecea1e68ff2ea593f88e6d0071592854fec232be688bf41a35dfd19615ce
size 959506904
model-00006-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0f23b3a529331ccc7a0031b2e751e1a7651cf0099c9ecde7dc024568c1771613
size 959506896
model-00007-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aaeb1f602a90b1820614e083396b9b555272d1f119919f79af7ec0b4913391ce
size 983094528
model-00008-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bcf58c5ede5693b5119bed849acf35bf056861676185e24bb1f2bb858ebc8ad4
size 988342328
model-00009-of-00009.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e150b52cdbcf8605e9f491522a9c8b9d1543fa00f01c07dd469cd3a1353b5721
size 249054720
model.safetensors.index.json
ADDED
@@ -0,0 +1 @@
{"metadata": {"mergekit_version": "0.1.3"}, "weight_map": {"model.embed_tokens.weight": "model-00001-of-00009.safetensors", "model.layers.0.input_layernorm.weight": "model-00001-of-00009.safetensors", "model.layers.0.mlp.down_proj.weight": "model-00001-of-00009.safetensors", "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00009.safetensors", "model.layers.0.mlp.up_proj.weight": "model-00001-of-00009.safetensors", "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00009.safetensors", "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00009.safetensors", "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00009.safetensors", "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00009.safetensors", "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00009.safetensors", "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00009.safetensors", "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00009.safetensors", "model.layers.1.input_layernorm.weight": "model-00001-of-00009.safetensors", "model.layers.1.mlp.down_proj.weight": "model-00002-of-00009.safetensors", "model.layers.1.mlp.gate_proj.weight": "model-00002-of-00009.safetensors", "model.layers.1.mlp.up_proj.weight": "model-00002-of-00009.safetensors", "model.layers.1.post_attention_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.1.self_attn.k_norm.weight": "model-00002-of-00009.safetensors", "model.layers.1.self_attn.k_proj.weight": "model-00002-of-00009.safetensors", "model.layers.1.self_attn.o_proj.weight": "model-00002-of-00009.safetensors", "model.layers.1.self_attn.q_norm.weight": "model-00002-of-00009.safetensors", "model.layers.1.self_attn.q_proj.weight": "model-00002-of-00009.safetensors", "model.layers.1.self_attn.v_proj.weight": "model-00002-of-00009.safetensors", "model.layers.10.input_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.10.mlp.down_proj.weight": "model-00002-of-00009.safetensors", 
"model.layers.10.mlp.gate_proj.weight": "model-00002-of-00009.safetensors", "model.layers.10.mlp.up_proj.weight": "model-00002-of-00009.safetensors", "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.10.self_attn.k_norm.weight": "model-00002-of-00009.safetensors", "model.layers.10.self_attn.k_proj.weight": "model-00002-of-00009.safetensors", "model.layers.10.self_attn.o_proj.weight": "model-00002-of-00009.safetensors", "model.layers.10.self_attn.q_norm.weight": "model-00002-of-00009.safetensors", "model.layers.10.self_attn.q_proj.weight": "model-00002-of-00009.safetensors", "model.layers.10.self_attn.v_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.input_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.11.mlp.down_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.mlp.up_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.11.self_attn.k_norm.weight": "model-00002-of-00009.safetensors", "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.self_attn.q_norm.weight": "model-00002-of-00009.safetensors", "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00009.safetensors", "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.input_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.12.mlp.down_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.mlp.up_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00009.safetensors", 
"model.layers.12.self_attn.k_norm.weight": "model-00002-of-00009.safetensors", "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.self_attn.q_norm.weight": "model-00002-of-00009.safetensors", "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00009.safetensors", "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00009.safetensors", "model.layers.13.input_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.13.mlp.down_proj.weight": "model-00002-of-00009.safetensors", "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00009.safetensors", "model.layers.13.mlp.up_proj.weight": "model-00002-of-00009.safetensors", "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00009.safetensors", "model.layers.13.self_attn.k_norm.weight": "model-00002-of-00009.safetensors", "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00009.safetensors", "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00009.safetensors", "model.layers.13.self_attn.q_norm.weight": "model-00002-of-00009.safetensors", "model.layers.13.self_attn.q_proj.weight": "model-00003-of-00009.safetensors", "model.layers.13.self_attn.v_proj.weight": "model-00003-of-00009.safetensors", "model.layers.14.input_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.14.mlp.down_proj.weight": "model-00003-of-00009.safetensors", "model.layers.14.mlp.gate_proj.weight": "model-00003-of-00009.safetensors", "model.layers.14.mlp.up_proj.weight": "model-00003-of-00009.safetensors", "model.layers.14.post_attention_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.14.self_attn.k_norm.weight": "model-00003-of-00009.safetensors", "model.layers.14.self_attn.k_proj.weight": "model-00003-of-00009.safetensors", "model.layers.14.self_attn.o_proj.weight": "model-00003-of-00009.safetensors", 
"model.layers.14.self_attn.q_norm.weight": "model-00003-of-00009.safetensors", "model.layers.14.self_attn.q_proj.weight": "model-00003-of-00009.safetensors", "model.layers.14.self_attn.v_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.input_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.15.mlp.down_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.mlp.gate_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.mlp.up_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.post_attention_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.15.self_attn.k_norm.weight": "model-00003-of-00009.safetensors", "model.layers.15.self_attn.k_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.self_attn.o_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.self_attn.q_norm.weight": "model-00003-of-00009.safetensors", "model.layers.15.self_attn.q_proj.weight": "model-00003-of-00009.safetensors", "model.layers.15.self_attn.v_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.input_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.16.mlp.down_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.mlp.gate_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.mlp.up_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.post_attention_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.16.self_attn.k_norm.weight": "model-00003-of-00009.safetensors", "model.layers.16.self_attn.k_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.self_attn.o_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.self_attn.q_norm.weight": "model-00003-of-00009.safetensors", "model.layers.16.self_attn.q_proj.weight": "model-00003-of-00009.safetensors", "model.layers.16.self_attn.v_proj.weight": "model-00003-of-00009.safetensors", 
"model.layers.17.input_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.17.mlp.down_proj.weight": "model-00003-of-00009.safetensors", "model.layers.17.mlp.gate_proj.weight": "model-00003-of-00009.safetensors", "model.layers.17.mlp.up_proj.weight": "model-00003-of-00009.safetensors", "model.layers.17.post_attention_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.17.self_attn.k_norm.weight": "model-00003-of-00009.safetensors", "model.layers.17.self_attn.k_proj.weight": "model-00003-of-00009.safetensors", "model.layers.17.self_attn.o_proj.weight": "model-00003-of-00009.safetensors", "model.layers.17.self_attn.q_norm.weight": "model-00003-of-00009.safetensors", "model.layers.17.self_attn.q_proj.weight": "model-00003-of-00009.safetensors", "model.layers.17.self_attn.v_proj.weight": "model-00003-of-00009.safetensors", "model.layers.18.input_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.18.mlp.down_proj.weight": "model-00003-of-00009.safetensors", "model.layers.18.mlp.gate_proj.weight": "model-00003-of-00009.safetensors", "model.layers.18.mlp.up_proj.weight": "model-00003-of-00009.safetensors", "model.layers.18.post_attention_layernorm.weight": "model-00003-of-00009.safetensors", "model.layers.18.self_attn.k_norm.weight": "model-00003-of-00009.safetensors", "model.layers.18.self_attn.k_proj.weight": "model-00003-of-00009.safetensors", "model.layers.18.self_attn.o_proj.weight": "model-00004-of-00009.safetensors", "model.layers.18.self_attn.q_norm.weight": "model-00004-of-00009.safetensors", "model.layers.18.self_attn.q_proj.weight": "model-00004-of-00009.safetensors", "model.layers.18.self_attn.v_proj.weight": "model-00004-of-00009.safetensors", "model.layers.19.input_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.19.mlp.down_proj.weight": "model-00004-of-00009.safetensors", "model.layers.19.mlp.gate_proj.weight": "model-00004-of-00009.safetensors", "model.layers.19.mlp.up_proj.weight": 
"model-00004-of-00009.safetensors", "model.layers.19.post_attention_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.19.self_attn.k_norm.weight": "model-00004-of-00009.safetensors", "model.layers.19.self_attn.k_proj.weight": "model-00004-of-00009.safetensors", "model.layers.19.self_attn.o_proj.weight": "model-00004-of-00009.safetensors", "model.layers.19.self_attn.q_norm.weight": "model-00004-of-00009.safetensors", "model.layers.19.self_attn.q_proj.weight": "model-00004-of-00009.safetensors", "model.layers.19.self_attn.v_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.input_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.2.mlp.down_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.mlp.gate_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.mlp.up_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.post_attention_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.2.self_attn.k_norm.weight": "model-00004-of-00009.safetensors", "model.layers.2.self_attn.k_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.self_attn.o_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.self_attn.q_norm.weight": "model-00004-of-00009.safetensors", "model.layers.2.self_attn.q_proj.weight": "model-00004-of-00009.safetensors", "model.layers.2.self_attn.v_proj.weight": "model-00004-of-00009.safetensors", "model.layers.20.input_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.20.mlp.down_proj.weight": "model-00004-of-00009.safetensors", "model.layers.20.mlp.gate_proj.weight": "model-00004-of-00009.safetensors", "model.layers.20.mlp.up_proj.weight": "model-00004-of-00009.safetensors", "model.layers.20.post_attention_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.20.self_attn.k_norm.weight": "model-00004-of-00009.safetensors", "model.layers.20.self_attn.k_proj.weight": "model-00004-of-00009.safetensors", 
"model.layers.20.self_attn.o_proj.weight": "model-00004-of-00009.safetensors", "model.layers.20.self_attn.q_norm.weight": "model-00004-of-00009.safetensors", "model.layers.20.self_attn.q_proj.weight": "model-00004-of-00009.safetensors", "model.layers.20.self_attn.v_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.input_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.21.mlp.down_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.mlp.gate_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.mlp.up_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.post_attention_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.21.self_attn.k_norm.weight": "model-00004-of-00009.safetensors", "model.layers.21.self_attn.k_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.self_attn.o_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.self_attn.q_norm.weight": "model-00004-of-00009.safetensors", "model.layers.21.self_attn.q_proj.weight": "model-00004-of-00009.safetensors", "model.layers.21.self_attn.v_proj.weight": "model-00004-of-00009.safetensors", "model.layers.22.input_layernorm.weight": "model-00004-of-00009.safetensors", "model.layers.22.mlp.down_proj.weight": "model-00004-of-00009.safetensors", "model.layers.22.mlp.gate_proj.weight": "model-00004-of-00009.safetensors", "model.layers.22.mlp.up_proj.weight": "model-00005-of-00009.safetensors", "model.layers.22.post_attention_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.22.self_attn.k_norm.weight": "model-00005-of-00009.safetensors", "model.layers.22.self_attn.k_proj.weight": "model-00005-of-00009.safetensors", "model.layers.22.self_attn.o_proj.weight": "model-00005-of-00009.safetensors", "model.layers.22.self_attn.q_norm.weight": "model-00005-of-00009.safetensors", "model.layers.22.self_attn.q_proj.weight": "model-00005-of-00009.safetensors", 
"model.layers.22.self_attn.v_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.input_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.23.mlp.down_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.mlp.gate_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.mlp.up_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.post_attention_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.23.self_attn.k_norm.weight": "model-00005-of-00009.safetensors", "model.layers.23.self_attn.k_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.self_attn.o_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.self_attn.q_norm.weight": "model-00005-of-00009.safetensors", "model.layers.23.self_attn.q_proj.weight": "model-00005-of-00009.safetensors", "model.layers.23.self_attn.v_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.input_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.24.mlp.down_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.mlp.gate_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.mlp.up_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.post_attention_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.24.self_attn.k_norm.weight": "model-00005-of-00009.safetensors", "model.layers.24.self_attn.k_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.self_attn.o_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.self_attn.q_norm.weight": "model-00005-of-00009.safetensors", "model.layers.24.self_attn.q_proj.weight": "model-00005-of-00009.safetensors", "model.layers.24.self_attn.v_proj.weight": "model-00005-of-00009.safetensors", "model.layers.25.input_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.25.mlp.down_proj.weight": "model-00005-of-00009.safetensors", 
"model.layers.25.mlp.gate_proj.weight": "model-00005-of-00009.safetensors", "model.layers.25.mlp.up_proj.weight": "model-00005-of-00009.safetensors", "model.layers.25.post_attention_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.25.self_attn.k_norm.weight": "model-00005-of-00009.safetensors", "model.layers.25.self_attn.k_proj.weight": "model-00005-of-00009.safetensors", "model.layers.25.self_attn.o_proj.weight": "model-00005-of-00009.safetensors", "model.layers.25.self_attn.q_norm.weight": "model-00005-of-00009.safetensors", "model.layers.25.self_attn.q_proj.weight": "model-00005-of-00009.safetensors", "model.layers.25.self_attn.v_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.input_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.26.mlp.down_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.mlp.gate_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.mlp.up_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.post_attention_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.26.self_attn.k_norm.weight": "model-00005-of-00009.safetensors", "model.layers.26.self_attn.k_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.self_attn.o_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.self_attn.q_norm.weight": "model-00005-of-00009.safetensors", "model.layers.26.self_attn.q_proj.weight": "model-00005-of-00009.safetensors", "model.layers.26.self_attn.v_proj.weight": "model-00005-of-00009.safetensors", "model.layers.27.input_layernorm.weight": "model-00005-of-00009.safetensors", "model.layers.27.mlp.down_proj.weight": "model-00005-of-00009.safetensors", "model.layers.27.mlp.gate_proj.weight": "model-00006-of-00009.safetensors", "model.layers.27.mlp.up_proj.weight": "model-00006-of-00009.safetensors", "model.layers.27.post_attention_layernorm.weight": "model-00006-of-00009.safetensors", 
"model.layers.27.self_attn.k_norm.weight": "model-00006-of-00009.safetensors", "model.layers.27.self_attn.k_proj.weight": "model-00006-of-00009.safetensors", "model.layers.27.self_attn.o_proj.weight": "model-00006-of-00009.safetensors", "model.layers.27.self_attn.q_norm.weight": "model-00006-of-00009.safetensors", "model.layers.27.self_attn.q_proj.weight": "model-00006-of-00009.safetensors", "model.layers.27.self_attn.v_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.input_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.28.mlp.down_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.mlp.gate_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.mlp.up_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.post_attention_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.28.self_attn.k_norm.weight": "model-00006-of-00009.safetensors", "model.layers.28.self_attn.k_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.self_attn.o_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.self_attn.q_norm.weight": "model-00006-of-00009.safetensors", "model.layers.28.self_attn.q_proj.weight": "model-00006-of-00009.safetensors", "model.layers.28.self_attn.v_proj.weight": "model-00006-of-00009.safetensors", "model.layers.29.input_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.29.mlp.down_proj.weight": "model-00006-of-00009.safetensors", "model.layers.29.mlp.gate_proj.weight": "model-00006-of-00009.safetensors", "model.layers.29.mlp.up_proj.weight": "model-00006-of-00009.safetensors", "model.layers.29.post_attention_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.29.self_attn.k_norm.weight": "model-00006-of-00009.safetensors", "model.layers.29.self_attn.k_proj.weight": "model-00006-of-00009.safetensors", "model.layers.29.self_attn.o_proj.weight": "model-00006-of-00009.safetensors", 
"model.layers.29.self_attn.q_norm.weight": "model-00006-of-00009.safetensors", "model.layers.29.self_attn.q_proj.weight": "model-00006-of-00009.safetensors", "model.layers.29.self_attn.v_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.input_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.3.mlp.down_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.mlp.gate_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.mlp.up_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.post_attention_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.3.self_attn.k_norm.weight": "model-00006-of-00009.safetensors", "model.layers.3.self_attn.k_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.self_attn.o_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.self_attn.q_norm.weight": "model-00006-of-00009.safetensors", "model.layers.3.self_attn.q_proj.weight": "model-00006-of-00009.safetensors", "model.layers.3.self_attn.v_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.input_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.30.mlp.down_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.mlp.gate_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.mlp.up_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.post_attention_layernorm.weight": "model-00006-of-00009.safetensors", "model.layers.30.self_attn.k_norm.weight": "model-00006-of-00009.safetensors", "model.layers.30.self_attn.k_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.self_attn.o_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.self_attn.q_norm.weight": "model-00006-of-00009.safetensors", "model.layers.30.self_attn.q_proj.weight": "model-00006-of-00009.safetensors", "model.layers.30.self_attn.v_proj.weight": "model-00006-of-00009.safetensors", "model.layers.31.input_layernorm.weight": 
"model-00006-of-00009.safetensors", "model.layers.31.mlp.down_proj.weight": "model-00007-of-00009.safetensors", "model.layers.31.mlp.gate_proj.weight": "model-00007-of-00009.safetensors", "model.layers.31.mlp.up_proj.weight": "model-00007-of-00009.safetensors", "model.layers.31.post_attention_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.31.self_attn.k_norm.weight": "model-00007-of-00009.safetensors", "model.layers.31.self_attn.k_proj.weight": "model-00007-of-00009.safetensors", "model.layers.31.self_attn.o_proj.weight": "model-00007-of-00009.safetensors", "model.layers.31.self_attn.q_norm.weight": "model-00007-of-00009.safetensors", "model.layers.31.self_attn.q_proj.weight": "model-00007-of-00009.safetensors", "model.layers.31.self_attn.v_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.input_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.32.mlp.down_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.mlp.gate_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.mlp.up_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.post_attention_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.32.self_attn.k_norm.weight": "model-00007-of-00009.safetensors", "model.layers.32.self_attn.k_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.self_attn.o_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.self_attn.q_norm.weight": "model-00007-of-00009.safetensors", "model.layers.32.self_attn.q_proj.weight": "model-00007-of-00009.safetensors", "model.layers.32.self_attn.v_proj.weight": "model-00007-of-00009.safetensors", "model.layers.33.input_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.33.mlp.down_proj.weight": "model-00007-of-00009.safetensors", "model.layers.33.mlp.gate_proj.weight": "model-00007-of-00009.safetensors", "model.layers.33.mlp.up_proj.weight": "model-00007-of-00009.safetensors", 
"model.layers.33.post_attention_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.33.self_attn.k_norm.weight": "model-00007-of-00009.safetensors", "model.layers.33.self_attn.k_proj.weight": "model-00007-of-00009.safetensors", "model.layers.33.self_attn.o_proj.weight": "model-00007-of-00009.safetensors", "model.layers.33.self_attn.q_norm.weight": "model-00007-of-00009.safetensors", "model.layers.33.self_attn.q_proj.weight": "model-00007-of-00009.safetensors", "model.layers.33.self_attn.v_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.input_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.34.mlp.down_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.mlp.gate_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.mlp.up_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.post_attention_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.34.self_attn.k_norm.weight": "model-00007-of-00009.safetensors", "model.layers.34.self_attn.k_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.self_attn.o_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.self_attn.q_norm.weight": "model-00007-of-00009.safetensors", "model.layers.34.self_attn.q_proj.weight": "model-00007-of-00009.safetensors", "model.layers.34.self_attn.v_proj.weight": "model-00007-of-00009.safetensors", "model.layers.35.input_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.35.mlp.down_proj.weight": "model-00007-of-00009.safetensors", "model.layers.35.mlp.gate_proj.weight": "model-00007-of-00009.safetensors", "model.layers.35.mlp.up_proj.weight": "model-00007-of-00009.safetensors", "model.layers.35.post_attention_layernorm.weight": "model-00007-of-00009.safetensors", "model.layers.35.self_attn.k_norm.weight": "model-00007-of-00009.safetensors", "model.layers.35.self_attn.k_proj.weight": "model-00007-of-00009.safetensors", 
"model.layers.35.self_attn.o_proj.weight": "model-00007-of-00009.safetensors", "model.layers.35.self_attn.q_norm.weight": "model-00007-of-00009.safetensors", "model.layers.35.self_attn.q_proj.weight": "model-00008-of-00009.safetensors", "model.layers.35.self_attn.v_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.input_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.4.mlp.down_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.mlp.gate_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.mlp.up_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.post_attention_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.4.self_attn.k_norm.weight": "model-00008-of-00009.safetensors", "model.layers.4.self_attn.k_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.self_attn.o_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.self_attn.q_norm.weight": "model-00008-of-00009.safetensors", "model.layers.4.self_attn.q_proj.weight": "model-00008-of-00009.safetensors", "model.layers.4.self_attn.v_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.input_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.5.mlp.down_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.mlp.gate_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.mlp.up_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.post_attention_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.5.self_attn.k_norm.weight": "model-00008-of-00009.safetensors", "model.layers.5.self_attn.k_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.self_attn.o_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.self_attn.q_norm.weight": "model-00008-of-00009.safetensors", "model.layers.5.self_attn.q_proj.weight": "model-00008-of-00009.safetensors", "model.layers.5.self_attn.v_proj.weight": 
"model-00008-of-00009.safetensors", "model.layers.6.input_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.6.mlp.down_proj.weight": "model-00008-of-00009.safetensors", "model.layers.6.mlp.gate_proj.weight": "model-00008-of-00009.safetensors", "model.layers.6.mlp.up_proj.weight": "model-00008-of-00009.safetensors", "model.layers.6.post_attention_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.6.self_attn.k_norm.weight": "model-00008-of-00009.safetensors", "model.layers.6.self_attn.k_proj.weight": "model-00008-of-00009.safetensors", "model.layers.6.self_attn.o_proj.weight": "model-00008-of-00009.safetensors", "model.layers.6.self_attn.q_norm.weight": "model-00008-of-00009.safetensors", "model.layers.6.self_attn.q_proj.weight": "model-00008-of-00009.safetensors", "model.layers.6.self_attn.v_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.input_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.7.mlp.down_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.mlp.gate_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.mlp.up_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.post_attention_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.7.self_attn.k_norm.weight": "model-00008-of-00009.safetensors", "model.layers.7.self_attn.k_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.self_attn.o_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.self_attn.q_norm.weight": "model-00008-of-00009.safetensors", "model.layers.7.self_attn.q_proj.weight": "model-00008-of-00009.safetensors", "model.layers.7.self_attn.v_proj.weight": "model-00008-of-00009.safetensors", "model.layers.8.input_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.8.mlp.down_proj.weight": "model-00008-of-00009.safetensors", "model.layers.8.mlp.gate_proj.weight": "model-00008-of-00009.safetensors", 
"model.layers.8.mlp.up_proj.weight": "model-00008-of-00009.safetensors", "model.layers.8.post_attention_layernorm.weight": "model-00008-of-00009.safetensors", "model.layers.8.self_attn.k_norm.weight": "model-00008-of-00009.safetensors", "model.layers.8.self_attn.k_proj.weight": "model-00008-of-00009.safetensors", "model.layers.8.self_attn.o_proj.weight": "model-00009-of-00009.safetensors", "model.layers.8.self_attn.q_norm.weight": "model-00009-of-00009.safetensors", "model.layers.8.self_attn.q_proj.weight": "model-00009-of-00009.safetensors", "model.layers.8.self_attn.v_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.input_layernorm.weight": "model-00009-of-00009.safetensors", "model.layers.9.mlp.down_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.mlp.gate_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.mlp.up_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.post_attention_layernorm.weight": "model-00009-of-00009.safetensors", "model.layers.9.self_attn.k_norm.weight": "model-00009-of-00009.safetensors", "model.layers.9.self_attn.k_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.self_attn.o_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.self_attn.q_norm.weight": "model-00009-of-00009.safetensors", "model.layers.9.self_attn.q_proj.weight": "model-00009-of-00009.safetensors", "model.layers.9.self_attn.v_proj.weight": "model-00009-of-00009.safetensors", "model.norm.weight": "model-00009-of-00009.safetensors"}}
special_tokens_map.json
ADDED
@@ -0,0 +1,31 @@
+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|vision_pad|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
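The map above pins the end-of-sequence token to `<|im_end|>` and reuses `<|vision_pad|>` for padding. A minimal sketch of reading those fields (the JSON subset is inlined here so the sketch is self-contained; in practice you would `json.load()` the file itself):

```python
import json

# Inlined subset of special_tokens_map.json shown above.
special_tokens_map = json.loads('''
{
  "eos_token": {"content": "<|im_end|>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "pad_token": {"content": "<|vision_pad|>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
''')

# End-of-sequence and padding tokens declared by the map.
eos = special_tokens_map["eos_token"]["content"]
pad = special_tokens_map["pad_token"]["content"]
print(eos, pad)  # <|im_end|> <|vision_pad|>
```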
tokenizer.json
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
+size 11422654
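`tokenizer.json` is tracked by Git LFS, so the diff shows the pointer file rather than the 11 MB tokenizer itself. The three-line pointer format (`version`, `oid`, `size` as space-separated key/value lines) is easy to parse; a minimal sketch:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (space-separated key/value lines)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The pointer content shown in the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:aeb13307a71acd8fe81861d94ad54ab689df773318809eed3cbe794b4492dae4
size 11422654"""

info = parse_lfs_pointer(pointer)
algo, _, digest = info["oid"].partition(":")
print(algo, len(digest), info["size"])  # sha256 64 11422654
```

The `size` field is the byte length of the real file, and the sha256 digest lets you verify a downloaded copy against the pointer.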
tokenizer_config.json
ADDED
@@ -0,0 +1,241 @@
+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151665": {
+      "content": "<tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151666": {
+      "content": "</tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151667": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151668": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": {},
+  "model_max_length": 262144,
+  "pad_token": "<|vision_pad|>",
+  "padding_side": "right",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null,
+  "chat_template": "{{- \"<|im_start|>system\\n\" -}}\n{%- if tools -%}\n\t{%- if messages[0].role == \"system\" -%}\n\t\t{{- messages[0].content + \"\\n\\n\" -}}\n\t{%- endif -%}\n\t{{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" -}}\n\t{%- for tool in tools -%}\n\t\t{{- \"\\n\" -}}\n\t\t{{- tool | tojson -}}\n\t{%- endfor -%}\n\t{{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call>\\n\" -}}\n{%- elif messages[0].role == \"system\" -%}\n\t{{- messages[0].content -}}\n{%- endif -%}\n{{- \"You are governed by the following command rules:\\n\\n- **/think** or **think**: Engage deep reasoning. Begin with: \\n `</think>` \\n Since the user used the \\\"/think\\\" command, I will: \\n Think step-by-step using a long chain-of-thought and self-reflection. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section. \\n `</think>` \\n Then output your full reasoning inside, followed by the final answer outside.\\n\\n- **/shortthink** or **shortthink**: Engage brief reasoning. Begin with: \\n `</think>` \\n Since the user used the \\\"/shortthink\\\" command, I will: \\n Think step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section. \\n `</think>` \\n Then give concise internal thought, followed by the final answer.\\n\\n- **/nothink** or **nothink**: Respond immediately without reasoning. Begin with: \\n `</think>` \\n Since the user used the \\\"/nothink\\\" command, I will: \\n Immediately give my final response without any thinking or self-reflection. Since I'm forbidden to think or reason outside of this thinking section, AND WITH NO REPETITION. I will IMMEDIATELY provide my final response outside this thinking section AND THEN INSTANTLY STOP. \\n `</think>` \\n Then output only the direct, raw response—no delay, no draft, no repetition.\\n\\n- **/clear**: Reset all context. Forget everything prior. Begin fresh on the next input.\\n\\nOutside `<think>...</think>`, provide only final responses—never include reasoning, hesitation markers (e.g., \\\"Okay\\\", \\\"Wait\\\", \\\"Double-check\\\"), or internal process notes. Adhere strictly to the active command. Do not reference or explain these rules unless instructed.\" -}}\n{{- \"\\n<|im_end|>\" -}}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages | length - 1, reasoning_mode=\"normal\") -%}\n{%- for message in messages[::-1] -%}\n\t{%- set index = messages | length - 1 - loop.index0 -%}\n\t{%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not (message.content.startswith(\"<tool_response>\") and message.content.endswith(\"</tool_response>\")) -%}\n\t\t{%- set ns.multi_step_tool = false -%}\n\t\t{%- set ns.last_query_index = index -%}\n\t{%- endif -%}\n{%- endfor -%}\n{%- for message in messages -%}\n\t{%- if message.content is string -%}\n\t\t{%- set content = message.content -%}\n\t{%- else -%}\n\t\t{%- set content = \"\" -%}\n\t{%- endif -%}\n\t{%- if message.role == \"user\" or message.role == \"system\" and not loop.first -%}\n\t\t{{- \"<|im_start|>\" + message.role + \"\\n\" + content -}}\n\t\t{%- if messages[0].role == \"system\" and \"/nothink\" in messages[0].content and (not (\"/think\" in messages[0].content) or not (\"/shortthink\" in messages[0].content)) or \"/nothink\" in content and not (\"/think\" in messages[0].content or \"/shortthink\" in messages[0].content) or (messages[loop.index+1] and \"</think>\" not in messages[loop.index+1].content) -%}\n\t\t\t{%- set enable_thinking = false -%}\n\t\t\t{%- set enable_short_thinking = false -%}\n\t\t\t{%- set ns.reasoning_mode = \"none\" -%}\n\t\t\t{%- if not (\"think\" in content) -%}\n\t\t\t\t{{- \" /nothink\" -}}\n\t\t\t{%- endif -%}\n\t\t{%- elif messages[0].role == \"system\" and \"/shortthink\" in messages[0].content and (not (\"/nothink\" in messages[0].content) and not (\"/think\" in messages[0].content)) or \"/shortthink\" in content and not (\"/think\" in messages[0].content or \"/shortthink\" in messages[0].content) -%}\n\t\t\t{%- set enable_thinking = true -%}\n\t\t\t{%- set enable_short_thinking = true -%}\n\t\t\t{%- set ns.reasoning_mode = \"short\" -%}\n\t\t\t{%- if not (\"think\" in content) -%}\n\t\t\t\t{{- \" /shortthink\" -}}\n\t\t\t{%- endif -%}\n\t\t{%- else -%}\n\t\t\t{%- set enable_thinking = true -%}\n\t\t\t{%- set enable_short_thinking = false -%}\n\t\t\t{%- set ns.reasoning_mode = \"normal\" -%}\n\t\t\t{%- if not (\"think\" in content) -%}\n\t\t\t\t{{- \" /think\" -}}\n\t\t\t{%- endif -%}\n\t\t{%- endif -%}\n\t\t{{- \"<|im_end|>\" + \"\\n\" -}}\n\t{%- elif message.role == \"assistant\" -%}\n\t\t{%- set reasoning_content = \"\" -%}\n\t\t{%- if \"<think>\\nSince the user used the\" not in content.split(\"</think>\")[0].split(\"<think>\")[-1] -%}\n\t\t\t{%- if ns.reasoning_mode == \"none\" -%}\n\t\t\t\t{%- set reasoning_prefix = \"\" -%}\n\t\t\t{%- elif ns.reasoning_mode == \"short\" -%}\n\t\t\t\t{%- set reasoning_prefix = \"Since the user used the \\\"/shortthink\\\" command, I will:\\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section. \" -%}\n\t\t\t{%- else -%}\n\t\t\t\t{%- set reasoning_prefix = \"Since the user used the \\\"/think\\\" command, I will:\\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section. \" -%}\n\t\t\t{%- endif -%}\n\t\t{%- endif -%}\n\t\t{%- if message.reasoning_content is string -%}\n\t\t\t{%- set reasoning_content = message.reasoning_content -%}\n\t\t{%- else -%}\n\t\t\t{%- if \"</think>\" in content -%}\n\t\t\t\t{%- set reasoning_content = reasoning_prefix + content.split(\"</think>\")[0].rstrip(\"\\n\").split(\"<think>\")[-1].lstrip(\"\\n\") -%}\n\t\t\t\t{%- set content = content.split(\"</think>\")[-1].lstrip(\"\\n\") -%}\n\t\t\t{%- endif -%}\n\t\t\t{%- if \"<think>\" in content -%}\n\t\t\t\t{%- set content = content | replace(\"<think>\", \" \") -%}\n\t\t\t{%- endif -%}\n\t\t\t{%- if \"</think>\" in content -%}\n\t\t\t\t{%- set content = content | replace(\"</think>\", \" \") -%}\n\t\t\t{%- endif -%}\n\t\t{%- endif -%}\n\t\t{{- \"\\n\" -}}\n\t\t{# Apply truncation and break if the /shortthink command is active #}\n\t\t{%- if enable_short_thinking is true -%}\n\t\t\t{%- set words = reasoning_content.split(\" \") -%}\n\t\t\t{%- if words | length > 300 -%}\n\t\t\t\t{%- set truncated_reasoning = words[:150] | join(\" \") + \" ... truncated ... \" + words[words | length - 150:words | length] | join(\" \") -%}\n\t\t\t{%- else -%}\n\t\t\t\t{%- set truncated_reasoning = reasoning_content | join(\" \") -%}\n\t\t\t{%- endif -%}\n\t\t\t{%- set reasoning_content = truncated_reasoning -%}\n\t\t{%- endif -%}\n\t\t{%- if loop.last or not loop.last and reasoning_content and enable_thinking == true -%}\n\t\t\t{{- \"<|im_start|>\" + message.role + \"\\n<think>\\n\" + reasoning_content.strip(\"\\n\") + \"\\n</think>\\n\" + content.lstrip(\"\\n\") -}}\n\t\t{%- else -%}\n\t\t\t{{- \"<|im_start|>\" + message.role + \"\\n\" + content -}}\n\t\t{%- endif -%}\n\t\t{%- if message.tool_calls -%}\n\t\t\t{%- for tool_call in message.tool_calls -%}\n\t\t\t\t{%- if loop.first and content or not loop.first -%}\n\t\t\t\t\t{{- \"\\n\" -}}\n\t\t\t\t{%- endif -%}\n\t\t\t\t{%- if tool_call.function -%}\n\t\t\t\t\t{%- set tool_call = tool_call.function -%}\n\t\t\t\t{%- endif -%}\n\t\t\t\t{{- \"<tool_call>\\n{\\\"name\\\": \\\"\" -}}\n\t\t\t\t{{- tool_call.name -}}\n\t\t\t\t{{- \"\\\", \\\"arguments\\\": \" -}}\n\t\t\t\t{%- if tool_call.arguments is string -%}\n\t\t\t\t\t{{- tool_call.arguments -}}\n\t\t\t\t{%- else -%}\n\t\t\t\t\t{{- tool_call.arguments | tojson -}}\n\t\t\t\t{%- endif -%}\n\t\t\t\t{{- \"}\\n</tool_call>\" -}}\n\t\t\t{%- endfor -%}\n\t\t{%- endif -%}\n\t\t{{- \"<|im_end|>\\n\" -}}\n\t{%- elif message.role == \"tool\" -%}\n\t\t{%- if loop.first or messages[loop.index0 - 1].role != \"tool\" -%}\n\t\t\t{{- \"<|im_start|>user\" -}}\n\t\t{%- endif -%}\n\t\t{{- \"\\n<tool_response>\\n\" -}}\n\t\t{{- content -}}\n\t\t{{- \"\\n</tool_response>\" -}}\n\t\t{%- if loop.last or messages[loop.index0 + 1].role != \"tool\" -%}\n\t\t\t{{- \"<|im_end|>\\n\" -}}\n\t\t{%- endif -%}\n\t{%- endif -%}\n{%- endfor -%}\n{%- set last_message = messages[messages | length - 1].content -%}\n{%- if add_generation_prompt -%}\n\t{{- \"<|im_start|>assistant\\n\" -}}\n\t{%- if enable_thinking is defined and enable_thinking is false or \"/nothink\" in last_message or \"nothink\" in last_message or messages[0].role == \"system\" and \"/nothink\" in messages[0].content and (not (\"/think\" in messages[0].content) or not (\"/shortthink\" in messages[0].content)) -%}\n\t\t{{- \"<think>\\nOkay, the user sent \\\"\"+last_message+\"\\\". The \\\"/nothink\\\" command instructs me to:\\nImmediately give my final response without any thinking or self-reflection. Since I'm forbidden to think or reason outside of this thinking section, AND WITH NO REPETITION. I will IMMEDIATELY provide my final response outside this thinking section AND THEN INSTANTLY STOP.\\n</think>\" -}}\n\t{%- elif enable_short_thinking is defined and enable_short_thinking is false or \"/shortthink\" in last_message or \"shortthink\" in last_message or messages[0].role == \"system\" and \"/shortthink\" in messages[0].content and not (\"/think\" in messages[0].content) -%}\n\t\t{{- \"<think>\\nOkay, the user sent \\\"\"+last_message+\"\\\". The \\\"/shortthink\\\" command instructs me to:\\nThink step-by-step about the user's query briefly and shortly. I will only respond when I'm more than 50% sure. I will provide the final response after closing this thinking section.\" -}}\n\t{%- elif \"/clear\" in last_message -%}\n\t\t{{- \"<think>\\nOkay, the user used the \\\"/clear\\\" command instructs me to:\\n1. Forget everything above.\\n2. Ignore everything that comes before this message.\\n3. Start a fresh new conversation and greet the user.\\n</think>\" -}}\n\t{%- else -%}\n\t\t{{- \"<think>\\nOkay, the user sent \\\"\"+last_message+\"\\\". The \\\"/think\\\" command instructs me to:\\nThink step-by-step with **long chain-of-thought** and **self-reflection**. I will only respond when I'm 100% sure. I will provide the final response after closing this thinking section.\" -}}\n\t{%- endif -%}\n{%- endif -%}"
+}
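The `chat_template` above is a ChatML-style Jinja template whose reasoning mode is driven by the `/nothink`, `/shortthink`, and `/think` commands (checked roughly in that order, with full `/think` reasoning as the default). A loose Python sketch of that mode selection and the `<|im_start|>`/`<|im_end|>` message framing — this mirrors only the surface behavior; the real template also handles tools, `<think>` blocks, and reasoning truncation:

```python
def reasoning_mode(last_user_message: str) -> str:
    """Pick a reasoning mode the way the template's command rules do."""
    if "/nothink" in last_user_message:
        return "none"    # respond immediately, no reasoning
    if "/shortthink" in last_user_message:
        return "short"   # brief step-by-step reasoning
    return "normal"      # default: full /think chain-of-thought

def render_chatml(messages: list[dict]) -> str:
    """Frame messages in ChatML and append a generation prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # add_generation_prompt=True
    return "".join(parts)

msgs = [{"role": "user", "content": "Hello /nothink"}]
print(reasoning_mode(msgs[-1]["content"]))  # none
```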
vocab.json
ADDED
The diff for this file is too large to render. See raw diff.