Update README.md
README.md
CHANGED
@@ -38,7 +38,7 @@ This specific checkpoint (`150k` steps) represents the "pre-decay" phase of trai
 * **Num attention heads:** 16
 * **Num kv heads:** 8
 * **Head dim:** 128
-* **Tie word embeddings**:
+* **Tie word embeddings**: False
 
 ### Architecture Highlights
 NeuroBLAST differs from standard Transformers by utilizing a three-stage cortical design: